Human gait recognition is an important authentication technology because people often approach a computer system or a robot by walking. Therefore, this study proposes a multi-angle gait recognition method that uses skeletal tracking data measured by an RGB-D camera. The proposed method is a two-stage process: the first stage estimates the gait view angle from five discrete angles, and the second stage recognizes the person's gait using features specific to the estimated view angle. To evaluate the proposed method, two types of experiments were conducted: gait angle estimation and gait recognition. In the first experiment, the best estimation accuracy was 97.4%. In the second experiment, the best gait recognition accuracy was 96.4%. Finally, the best gait recognition accuracy of the full two-stage process was estimated as 93.9%.
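The abstract above does not say how the 93.9% figure for the full two-stage process was obtained. One reading that reproduces it is to multiply the two stage accuracies, which assumes that errors in the two stages are independent. The short check below is only a sketch under that assumption, and the variable names are illustrative.

```python
# Stage accuracies as reported in the abstract.
angle_accuracy = 0.974        # stage 1: gait view angle estimation
recognition_accuracy = 0.964  # stage 2: gait recognition for a known angle

# Assumption (not stated in the abstract): stage errors are independent,
# so the end-to-end accuracy is the product of the two stage accuracies.
combined_accuracy = angle_accuracy * recognition_accuracy

print(f"{combined_accuracy:.1%}")  # prints 93.9%
```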
We propose a novel user interface that enables control of a singing voice synthesizer at a live improvisational performance. The user first registers the lyrics of a song with the system before performance, and the system builds a probabilistic model that models the possible jumps within the lyrics. During performance, the user simultaneously inputs the lyrics of a song with the left hand using a vowel keyboard and the melodies with the right hand using a standard musical keyboard. Our system searches for a portion of the registered lyrics whose vowel sequence matches the current user input using the probabilistic model, and sends the matched lyrics to the singing voice synthesizer. The vowel input keys are mapped onto a standard musical keyboard, enabling experienced keyboard players to learn the system from a standard musical score. We examine the feasibility of the system through a series of evaluations and user studies.
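The vowel-matching step described above can be illustrated with a small sketch. It is not the authors' system: the probabilistic jump model is replaced by a simple distance penalty, and the names `vowel_sequence`, `find_match`, and `jump_penalty` are hypothetical. The lyrics are assumed to be given as romanized syllables that each contain exactly one vowel.

```python
VOWELS = set("aeiou")


def vowel_sequence(syllables):
    """Return the vowel of each syllable (assumes one vowel per syllable)."""
    return [next(ch for ch in s if ch in VOWELS) for s in syllables]


def find_match(lyric_syllables, typed_vowels, current_pos, jump_penalty=0.1):
    """Find where the typed vowel sequence occurs in the registered lyrics.

    Among all exact matches, the one closest to the current position wins.
    The distance penalty is a stand-in for the probabilistic jump model,
    which the abstract does not specify.
    """
    vowels = vowel_sequence(lyric_syllables)
    n = len(typed_vowels)
    best_start, best_score = None, float("-inf")
    for start in range(len(vowels) - n + 1):
        if vowels[start:start + n] == typed_vowels:
            score = -jump_penalty * abs(start - current_pos)
            if score > best_score:
                best_start, best_score = start, score
    return best_start


lyrics = ["sa", "ku", "ra", "sa", "ku", "ra", "ya", "yo", "i"]
start = find_match(lyrics, ["a", "u", "a"], current_pos=3)
print(start, lyrics[start:start + 3])  # 3 ['sa', 'ku', 'ra']
```

In this toy version the matched syllables would then be passed to the synthesizer. A real system would also have to handle ambiguous matches and input errors.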
This paper describes an automatic karaoke generation system that can suppress the singing voice in music audio signals and can also change the pitch of the song. Furthermore, the system accepts streaming input and works in real time. To the best of our knowledge, there has been no real-time audio-to-audio karaoke system that has both of these functions. This paper describes the two technical components in particular, along with some comments on the implementation. The system employs two signal processing techniques: singing voice suppression based on two-stage HPSS, a vocal enhancement technique that the authors proposed previously, and pitch shifting based on spectrogram stretching and a phase vocoder. The attached video file shows that the system works in real time and that the sound quality may be practically acceptable.
This paper describes how the short-term Fourier transform (STFT) and the inverse short-term Fourier transform (ISTFT) are integrated within the sound synthesis framework of LC, a new computer music programming language prototyped by the authors, and discusses the benefits of this integration for computer music programming. In addition to the traditional unit-generator-based sound synthesis framework, LC provides a framework for microsound synthesis that is highly independent of the unit-generator concept, and STFT and ISTFT can also be performed within the same framework. The unit-analyzer concept in the ChucK audio programming language shows a certain degree of similarity to LC's programming model for STFT and ISTFT, in that both languages allow direct access to low-level spectral data from user programs. However, because unit analyzers depend on the unit-generator-based sound synthesis framework, a ChucK program that uses them can become unnecessarily complex when the hop sizes differ among the STFT frames in the program. In contrast, because LC's microsound synthesis framework is highly independent of the unit-generator concept, it can provide a simpler and terser programming model and avoid such complications. Other unit-generator languages can exhibit problems similar to those seen in ChucK's unit analyzers, depending on the design of their sound synthesis frameworks. LC's language design would therefore be beneficial not just as a design exemplar for next-generation computer music languages, but also as a basis for reconsidering the design of existing unit-generator languages, in particular how STFT should be integrated into a unit-generator language and whether unit generators should fully synchronize audio computation with the advance of global system time.
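For readers unfamiliar with the role of the hop size in STFT and ISTFT, the following NumPy sketch shows a plain analysis and overlap-add resynthesis round trip. It is written in Python rather than LC or ChucK and says nothing about either language's API. The function names and the frame and hop values are illustrative assumptions.

```python
import numpy as np


def stft(x, frame=256, hop=64):
    """Windowed FFT of successive frames; `hop` is the step between frames."""
    window = np.hanning(frame)
    starts = range(0, len(x) - frame + 1, hop)
    return np.array([np.fft.rfft(window * x[s:s + frame]) for s in starts])


def istft(frames, frame=256, hop=64):
    """Weighted overlap-add resynthesis, normalized by the summed squared window."""
    window = np.hanning(frame)
    n = hop * (len(frames) - 1) + frame
    out = np.zeros(n)
    norm = np.zeros(n)
    for i, spectrum in enumerate(frames):
        s = i * hop
        out[s:s + frame] += window * np.fft.irfft(spectrum, n=frame)
        norm[s:s + frame] += window ** 2
    nonzero = norm > 1e-8  # the window is zero at the very first and last sample
    out[nonzero] /= norm[nonzero]
    return out


t = np.arange(2048)
signal = np.sin(2 * np.pi * t / 50.0)
rebuilt = istft(stft(signal))
# Away from the edges the round trip reproduces the input.
print(np.allclose(signal[256:-256], rebuilt[256:-256]))
```

Both functions must agree on the frame and hop sizes. Keeping them consistent when several STFT streams use different hop sizes is the kind of bookkeeping the abstract refers to.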
The latest processors exploit Instruction Level Parallelism to improve performance, but this strategy is limited by control dependency. To alleviate this problem, the most recent processors utilize branch prediction. A typical branch predictor applies prediction to all instructions; however, this means that the branch predictor requires a high energy input, especially to the BTB (branch target buffer). In this paper, we propose a method that reduces the number of BTB accesses and abolishes the BTB tag by associating the instruction cache line and BTB entry. This proposal allocates a fixed number of BTB entries to a cache line and allocates an index to the corresponding instruction in the cache line as a substitute for the BTB tag. Due to the small fixed numbers of BTB entries compared to the fetch amount and reduction of the BTB tag, our proposal can reduce BTB access energy requirements. Our proposal is anticipated to cut energy consumption, but it cannot apply a branch target prediction to the entire set of instructions if there are too many branch instructions per cache line. We therefore evaluated its effects on processor performance and energy consumption. The evaluation results show that the proposal reduces BTB access energy requirements to 47.5% without any performance loss.
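The association between cache lines and BTB entries described above can be sketched as a small data structure. This is only an illustration, not the proposed hardware. The line size, instruction width, number of entries per line, and FIFO replacement are all assumptions, and the class name `CacheLineBTB` is hypothetical. The point shown is that each entry stores the index of the instruction within its cache line instead of an address tag.

```python
LINE_BYTES = 32        # assumed cache line size
INSTR_BYTES = 4        # assumed fixed instruction width
ENTRIES_PER_LINE = 2   # assumed number of BTB entries allocated per line


class CacheLineBTB:
    def __init__(self):
        # Maps a cache line number to its short list of (index, target) pairs.
        # In hardware the entries would sit alongside the cache line itself;
        # a dict is used here only to keep the sketch short.
        self.lines = {}

    @staticmethod
    def split(pc):
        """Split a program counter into (cache line number, index within line)."""
        return pc // LINE_BYTES, (pc % LINE_BYTES) // INSTR_BYTES

    def update(self, pc, target):
        line, index = self.split(pc)
        entries = self.lines.setdefault(line, [])
        for i, (idx, _) in enumerate(entries):
            if idx == index:
                entries[i] = (index, target)
                return
        if len(entries) == ENTRIES_PER_LINE:
            entries.pop(0)  # FIFO replacement, an assumption
        entries.append((index, target))

    def predict(self, pc):
        """Return the predicted target, or None if this branch has no entry."""
        line, index = self.split(pc)
        for idx, target in self.lines.get(line, []):
            if idx == index:
                return target
        return None


btb = CacheLineBTB()
btb.update(0x100, 0x400)
btb.update(0x104, 0x800)
print(hex(btb.predict(0x100)))  # 0x400
btb.update(0x108, 0xC00)        # third branch in the same line evicts the first
print(btb.predict(0x100))       # None
```

The last two lines show the limitation the abstract mentions: when a cache line holds more branches than it has entries, some branches cannot receive a target prediction.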
The emergence of multi-core processors in smart devices promises higher performance and lower power consumption. Parallelizing applications improves their performance. However, simultaneously utilizing many cores can drastically shorten the device's battery life. This paper presents a demonstration system for real-time video processing combined with power reduction controlled by the OSCAR automatic parallelization compiler on ODROID-X2, an open Android development platform based on the Samsung Exynos4412 Prime with 4 ARM Cortex-A9 cores. We exploit the DVFS framework, core partitioning, a profiling technique, and the OSCAR parallelization and power control algorithm to reduce the total power consumption of a real-time video application. The demonstration results show that the system cuts power consumption by 42.8% for an MPEG-2 decoder application and by 59.8% for an optical flow application, using 3 cores in both applications.
Many activity recognition systems using accelerometers have been proposed. The activities recognized so far are single activities that can be expressed with one verb, such as sitting, walking, holding a mobile phone, and throwing a ball. In practice, however, combined activities that involve two or more kinds of state and movement often take place. Hand gestures, for example, are performed not only while standing but also while walking and sitting. Although the simplest way to recognize such combined activities is to construct recognition models for all possible combinations of activities, the number of combinations becomes immense. In this paper, we first propose a method that classifies activities into postures (e.g., sitting), behaviors (e.g., walking), and gestures (e.g., a punch) by using the autocorrelation of the acceleration values. Postures and behaviors are states that last for a certain length of time, whereas gestures are sporadic or one-off actions, and finding gestures buried in other activities has been a challenging task. Then, utilizing this technique, we propose a recognition method for combined activities that learns single activities only. Evaluation results confirmed that the proposed method achieved 0.84 recall and 0.86 precision, which is comparable to a method that had learned all the combined activities.
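The abstract above does not spell out how autocorrelation separates postures, behaviors, and gestures. The sketch below shows one plausible reading under stated assumptions: a nearly constant window is treated as a posture, a window with a strong autocorrelation peak at a nonzero lag as a periodic behavior, and anything else as a gesture. The thresholds, the single-axis input, and the function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def normalized_autocorrelation(x):
    """Autocorrelation of the mean-removed signal, scaled so that lag 0 equals 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    energy = np.dot(x, x)
    if energy == 0:
        return np.zeros(len(x))
    full = np.correlate(x, x, mode="full")
    return full[len(x) - 1:] / energy


def classify_window(x, var_threshold=0.01, peak_threshold=0.5, min_lag=5):
    """Label one window of acceleration values (all thresholds are assumptions)."""
    x = np.asarray(x, dtype=float)
    if x.var() < var_threshold:
        return "posture"    # almost no movement in the window
    ac = normalized_autocorrelation(x)
    if ac[min_lag:len(x) // 2].max() > peak_threshold:
        return "behavior"   # the movement repeats periodically
    return "gesture"        # movement without periodic structure


t = np.arange(200)
sitting = np.full(200, 9.8)             # constant reading
walking = np.sin(2 * np.pi * t / 20.0)  # periodic motion
punch = np.zeros(200)
punch[100:104] = [2.0, 5.0, -3.0, 1.0]  # short one-off burst

print(classify_window(sitting), classify_window(walking), classify_window(punch))
```

On these three synthetic windows the sketch prints `posture behavior gesture`. Real accelerometer data would need three axes, noise handling, and tuned thresholds.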
We analyze the increasing threats against IoT devices. We show that Telnet-based attacks that target IoT devices have rocketed since 2014. Based on this observation, we propose an IoT honeypot and sandbox, which attracts and analyzes Telnet-based attacks against various IoT devices running on different CPU architectures such as ARM, MIPS, and PPC. By analyzing the observation results of our honeypot and captured malware samples, we show that there are currently at least 5 distinct DDoS malware families targeting Telnet-enabled IoT devices and one of the families has quickly evolved to target more devices with as many as 9 different CPU architectures.
The effectiveness of punishment, in which a player pays a certain cost to punish an uncooperative player, is currently debated in the study of cooperation under non-kin relationships. Previous discussions take either a negative or a positive view of its effectiveness. In contrast to these discussions, this study proposes a novel model that introduces an alternative notion of punishment, “sanction with jealousy”. The degree of sanction is proportional to the payoff of the sanctioning player. The condition for sanction to occur reflects jealousy in that each player sanctions a neighboring player when the player's own payoff is smaller than that neighbor's payoff. Using this model, the author investigates whether introducing sanction with jealousy increases both the number of players adopting the cooperation strategy and the average payoff of all players. In addition, the author organizes the new findings from this investigation in comparison with the previous discussions.
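The sanction rule stated above can be written down in a few lines. The sketch below covers only the two properties given in the abstract: a player sanctions a neighbor whose payoff is larger than its own, and the size of the sanction is proportional to the sanctioning player's payoff. The proportionality constant `rate`, the data layout, and the function name are assumptions. Any cost borne by the sanctioning player is left out because the abstract does not specify it.

```python
def apply_sanctions(payoffs, neighbors, rate=0.1):
    """Return payoffs after one round of sanction with jealousy.

    payoffs:   dict mapping player -> payoff before sanctions
    neighbors: dict mapping player -> list of neighboring players
    rate:      assumed proportionality constant for the degree of sanction
    """
    updated = dict(payoffs)
    for player, own_payoff in payoffs.items():
        for other in neighbors[player]:
            # Jealousy condition: the neighbor earned more than this player.
            if own_payoff < payoffs[other]:
                # Degree of sanction is proportional to the sanctioner's payoff.
                updated[other] -= rate * own_payoff
    return updated


payoffs = {"A": 1.0, "B": 3.0, "C": 2.0}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
print(apply_sanctions(payoffs, neighbors))
```

In this example A and C both sanction B, reducing B's payoff by 0.1 and 0.2 respectively, while B sanctions nobody because neither neighbor earns more than B.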
Detecting the boundaries of citations in the running text of research papers is an important task for research paper summarisation, idea attribution, sentiment analysis, and other citation-based analysis research. Recently, detecting non-explicit citing sentences has garnered some attention, but can still be seen as in its infancy. We define this task as citation block determination (CBD). In this paper we propose and investigate the effects of various types of textual coherence on CBD, positing that it is a crucial aspect of identifying citation blocks, as it is fundamental to the composition of citations themselves. We demonstrate promising results, with our method outperforming previous state-of-the-art on F1 by a large margin, with an improvement in both precision and recall, and further provide an in-depth error analysis and discussion of why this is the case.
The process of nanocrystal device development is not well systematized. To support this process, analysis of the information produced by developmental experiments is required. In this study, we constructed an annotated corpus to support the extraction of experimental information from relevant publications. We designed the corpus-construction guidelines by cooperating with a domain expert. We evaluated these guidelines through corpus-construction experiments with graduate students from this domain, and then evaluated the corpus with the domain expert. In the corpus construction experiments, we achieved a sufficient level of Inter-Annotator Agreement by using a loose agreement measure that ignored the term-boundary mismatch problem, and made an agreement corpus that excluded annotations based on misunderstanding the guidelines. The domain expert evaluated this agreement corpus and modified the guidelines based on real examples. Using these guidelines, we finalized the corpus called “NaDev” (Nanocrystal Device development corpus). The NaDev corpus and its construction guidelines will be released via our website, http://nanoinfo.ist.hokudai.ac.jp/. The NaDev corpus aims to support automatic information extraction from publications relevant to nanocrystal device development. This information can be used to solve problems in the nanotechnology domain using the massive availability of fresh information. To the best of our knowledge, this is the first corpus constructed for the development of nanocrystal devices.
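The abstract above mentions a loose agreement measure that ignores term-boundary mismatches but does not define it. One common way to build such a measure is to count two annotations as matching when they carry the same label and their character spans overlap, and then compute an F-measure between the two annotators. The sketch below follows that reading. The span format, the label names, and the function names are hypothetical.

```python
def overlaps(a, b):
    """True if two (start, end, label) spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def agreement(ann_a, ann_b, loose=True):
    """F-measure between two annotators' span lists.

    loose=True  counts same-label overlapping spans as matches.
    loose=False requires identical boundaries and labels.
    """
    def matched(span, others):
        if loose:
            return any(span[2] == o[2] and overlaps(span, o) for o in others)
        return span in others

    if not ann_a or not ann_b:
        return 0.0
    precision = sum(matched(s, ann_b) for s in ann_a) / len(ann_a)
    recall = sum(matched(s, ann_a) for s in ann_b) / len(ann_b)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


annotator_1 = [(0, 10, "Material"), (15, 20, "Method")]
annotator_2 = [(2, 10, "Material"), (15, 20, "Method")]
print(agreement(annotator_1, annotator_2, loose=True))   # 1.0
print(agreement(annotator_1, annotator_2, loose=False))  # 0.5
```

The two annotators disagree only on where the first term starts, so the strict score drops to 0.5 while the loose score stays at 1.0.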
Dictionary learning is an unsupervised learning task that finds a set of template vectors that express input signals as sparse linear combinations. Several methods for dictionary learning currently exist, for example K-SVD and MOD. In this paper, a new dictionary learning method, K-normalized bilateral projections (K-NBP), is proposed, which uses a faster low-rank approximation. Experiments showed that the method was fast and that, when the number of iterations was limited, it outperformed K-SVD. This indicates that the method is particularly suited to large, high-dimensional data sets, where each iteration takes a long time. K-NBP was also applied to an image reconstruction task in which images corrupted by noise were recovered using a dictionary learned from other images.
Bugs in operating system kernels threaten system reliability and availability. Static analysis of device drivers is one of the most useful methods for finding and fixing bugs in operating systems. Unfortunately, existing tools focus on bug patterns derived from developers' ad hoc beliefs and experiences, even though developers have access to many past bug reports. The objective of this paper is to uncover particular types of real bugs in a widely used operating system. Specifically, this paper presents a case study in which six real bugs in Linux were found by drawing on 160 past bug reports about interrupt request line (IRQ) handlers in Linux. The 160 bug reports enabled us to recognize nine patterns of mishandling IRQ handlers, and our analyzer, which is based on the recognized patterns, successfully detected these bugs.
Task parallelism on large-scale distributed memory environments is still a challenging problem. Our work focuses on the flexibility of the task model and the scalability of inter-node load balancing. General task models provide functionality for suspending and resuming tasks at any program point, and such a model enables flexible task scheduling to achieve higher processor utilization, locality-aware task placement, and so on. To realize such a task model, we have to employ a thread (an execution context containing register values and stack frames) as the representation of a task, and implement thread migration for inter-node load balancing. However, an existing thread migration scheme, iso-address, has a scalability limitation: it requires virtual memory proportional to the number of processors in each node. In large-scale distributed memory environments, this results in huge virtual memory usage that exceeds the virtual address space limit of current 64-bit CPUs. Furthermore, this huge virtual memory consumption makes it impossible to implement one-sided work stealing with Remote Direct Memory Access (RDMA) operations. One-sided work stealing is a popular approach to achieving highly efficient load balancing; therefore, this also limits the scalability of distributed memory task parallelism. In prior work, we proposed uni-address, a new thread migration scheme that significantly reduces virtual memory usage for thread stacks and enables RDMA-based work stealing, and implemented a lightweight multithread library supporting RDMA-based work stealing on top of the Fujitsu FX10 system. In this paper, we port the library to an x86-64 InfiniBand cluster with the GASNet communication library. We develop one-sided and non-one-sided implementations of inter-node work stealing, and evaluate the performance and efficiency of the work stealing implementations.