About 20 years have passed since the term “Virtual Reality” became popular. During these two decades, a novel human interface technology, the so-called “multimodal interface technology,” has taken shape. In this paper, recent progress in real-time CG, BCI, and five-senses IT is first briefly reviewed. Since the life cycle of an information technology is said to be around 20 years, novel directions and paradigms of VR technology can be found in conjunction with the technologies mentioned above. At the end of the paper, these futuristic directions, such as ultra-realistic media, are briefly introduced.
We examined the electroencephalography (EEG) power spectrum and complex coherence in the theta and alpha bands while subjects performed a three-dimensional (3-D) virtual maze navigation task. Ten healthy males participated. The imaginary part of coherency was used as a functional measure of corticocortical communication. The imaginary part of coherency analysis was applied to the measured EEG data in the sections in which the frontal midline theta rhythm (6-7 Hz) appeared during the 3-D maze task. The results showed that theta-band coherency between the frontal and right temporal regions increased during the 3-D maze task. This result suggests that the frontal midline theta rhythm modulates components of the neural circuit, and that connectivity between the frontal and temporal cortical regions increases during the 3-D maze task. This finding therefore leads to the conclusion that information processing in spatial navigation involves the frontal and temporal regions.
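As a rough illustration of the imaginary part of coherency used above, the following sketch computes it from synthetic two-channel data; the sampling rate, segment length, and the 6 Hz shared component are illustrative assumptions, not the study's parameters:

```python
import numpy as np

def imag_coherency(x, y, fs, nseg=256):
    """Imaginary part of coherency, Im(Sxy / sqrt(Sxx * Syy)),
    averaged over non-overlapping windowed segments."""
    n = (len(x) // nseg) * nseg
    win = np.hanning(nseg)
    X = np.fft.rfft(x[:n].reshape(-1, nseg) * win, axis=1)
    Y = np.fft.rfft(y[:n].reshape(-1, nseg) * win, axis=1)
    Sxy = (X * np.conj(Y)).mean(axis=0)
    Sxx = (np.abs(X) ** 2).mean(axis=0)
    Syy = (np.abs(Y) ** 2).mean(axis=0)
    freqs = np.fft.rfftfreq(nseg, d=1.0 / fs)
    return freqs, (Sxy / np.sqrt(Sxx * Syy)).imag

# Two noisy channels sharing a 6 Hz "theta" component with a phase lag;
# a zero-lag shared component would give zero imaginary coherency.
fs = 256
t = np.arange(0, 20, 1 / fs)
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 6 * t) + 0.5 * rng.standard_normal(t.size)
y = np.sin(2 * np.pi * 6 * t - np.pi / 4) + 0.5 * rng.standard_normal(t.size)
freqs, ic = imag_coherency(x, y, fs)
print(ic[freqs == 6.0])  # clearly non-zero in the theta band
```

Because the imaginary part vanishes for zero-lag mixing, it is insensitive to volume-conduction artifacts, which is why it suits the corticocortical connectivity measurement described above.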
In this paper, we present results of an analysis of surface electromyogram (SEMG) signals using the Self-Organizing Map (SOM) algorithm, one of the neural network algorithms, for an unspoken vowel recognition system. Three pairs of electrodes were placed on facial muscles and SEMG signals were recorded. We examined the classification of the three pairs of muscle activity values using the SOM algorithm. The SOM algorithm is also able to translate the multi-dimensional vectors of RMS values of the SEMG signal into a two-dimensional map.
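A minimal sketch of SOM training of the kind described above, in pure NumPy; the three-dimensional "muscle RMS" inputs, map size, and training schedule are hypothetical stand-ins, not the paper's configuration:

```python
import numpy as np

def train_som(data, rows=5, cols=5, epochs=200, seed=0):
    """Train a 2-D self-organizing map on multi-dimensional vectors."""
    rng = np.random.default_rng(seed)
    w = rng.random((rows, cols, data.shape[1]))           # weight grid
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                indexing="ij"), axis=-1)  # node coordinates
    for t in range(epochs):
        lr = 0.5 * (1 - t / epochs)                       # decaying rate
        sigma = 0.5 + (max(rows, cols) / 2) * (1 - t / epochs)
        for x in rng.permutation(data):
            bmu = np.unravel_index(((w - x) ** 2).sum(-1).argmin(),
                                   (rows, cols))          # best-matching unit
            d2 = ((grid - np.array(bmu)) ** 2).sum(-1)    # map-space distance
            w += lr * np.exp(-d2 / (2 * sigma ** 2))[..., None] * (x - w)
    return w

def bmu_of(w, x):
    """Map an input vector to its best-matching map node."""
    return np.unravel_index(((w - x) ** 2).sum(-1).argmin(), w.shape[:2])

# Three synthetic "muscle RMS" clusters standing in for three vowels
rng = np.random.default_rng(1)
centers = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]])
data = np.vstack([c + 0.05 * rng.standard_normal((20, 3)) for c in centers])
som = train_som(data)
print({bmu_of(som, c) for c in centers})  # distinct regions of the 2-D map
```

After training, each vowel cluster maps to its own neighborhood of the two-dimensional grid, which is the property the recognition system relies on.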
The purpose of this study is to support communication by individuals with motor paralysis, such as those with Guillain-Barré syndrome or brain-stem infarction, who have difficulty conveying their intentions. In the present paper, a pointing device controlled by DC-coupled electrooculograms (EOGs) has been developed. The subject's gaze angle was estimated from the amplitudes of the vertical and horizontal EOGs to determine the two-dimensional pointing position on the PC screen in real time. The eye-blink artifact was reduced using a median filter. The displacement of the electrode position was compensated by considering the potential gradient. Moreover, the position error caused by drift was corrected by using head movement. The accuracy and operating speed of the proposed method were evaluated in human experiments.
In this study, we attempted to identify the influential characteristics of input data for neural decoding across different decoders. A support vector machine (SVM), the k-nearest neighbor method (KNN), and canonical discriminant analysis (CDA) were used as decoders to predict test tone frequencies from tone-induced neural activities in the rat auditory cortex. The sequential dimensionality reduction (SDR) that we previously proposed reduces the input data dimension one by one without deteriorating the prediction accuracy, in order to identify the neural activity pattern that leads to the best prediction accuracy for each decoder. We found that the accuracy of SVM and KNN improved when neural activities had high spike rates and high dispersiveness, while CDA performed better on sparse neural activities. These results suggest that the best decoder can change according to the spike rates and dispersiveness of neural activities. Since these characteristics of neural activities change depending on brain regions or test stimuli, the selection of a proper decoder is important for efficient neural decoding.
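The idea of sequential dimensionality reduction can be sketched as a greedy backward elimination: keep dropping the dimension whose removal least hurts accuracy, and stop when any removal would deteriorate it. The sketch below uses a leave-one-out 1-NN classifier and synthetic data as placeholders for the study's decoders and recordings:

```python
import numpy as np

def loo_accuracy(X, y, dims):
    """Leave-one-out 1-nearest-neighbor accuracy using only `dims`."""
    Xs = X[:, dims]
    D = ((Xs[:, None, :] - Xs[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(D, np.inf)            # a point may not vote for itself
    return (y[D.argmin(axis=1)] == y).mean()

def sequential_dim_reduction(X, y):
    """Greedily drop one input dimension at a time as long as the
    prediction accuracy does not deteriorate."""
    dims = list(range(X.shape[1]))
    best = loo_accuracy(X, y, dims)
    while len(dims) > 1:
        score, i = max((loo_accuracy(X, y, dims[:i] + dims[i + 1:]), i)
                       for i in range(len(dims)))
        if score < best:                   # every removal would hurt: stop
            break
        best = score
        dims.pop(i)
    return dims, best

# Synthetic "neural activity": only dimensions 0 and 1 carry tone information
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.standard_normal((60, 6))
X[:, 0] += 4 * y
X[:, 1] -= 4 * y
dims, acc = sequential_dim_reduction(X, y)
print(dims, acc)  # only informative dimensions survive
```

Running the same reduction with different decoders plugged into `loo_accuracy` is what lets the study compare which activity patterns each decoder prefers.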
In this research, we focused on near-infrared spectroscopy (NIRS) as an alternative technique for mental state analysis, and compared its performance with conventional techniques such as electroencephalography (EEG), heart rate variability (HRV), and peripheral arterial tonometry (PAT) during stress and healing tasks. In our experiment, we measured biological signals simultaneously with these techniques during stress or healing tasks for comparison. Our NIRS results showed that the amount of total hemoglobin (totalHb) in the frontal cortex increased during the stress task and decreased during the healing task. Conventional physiological techniques such as EEG and HRV, however, showed inconsistent results across the tasks. Only PAT gave consistent results in most subjects. Our results suggest that NIRS and PAT may correlate with mental stress and can be useful for analyzing stress conditions.
The relationship between evoked and spontaneous activity in neuronal circuits is one of the important themes for the improvement of neuroprosthetic apparatus. Spontaneous activity and evoked action potentials are mutually related in cultured neuronal networks autonomously reconstructed on a culture dish, but it remains an open question whether spontaneous and evoked action potentials each constitute a distinct state, or whether spontaneous activity is merely random background noise. Comparing the frequencies and standard deviations of spontaneous activity with those of evoked activity, we found a silent, reproducible period lasting about 1 s immediately after the primary evoked activity. In addition, repetitive stimuli suppressed the frequency of spontaneously occurring bursting activity, even when the inter-stimulus interval was more than 10 s. These results suggest that a distinct internal state of the neuronal circuit was triggered by electrical stimulation, and that a neuronal circuit has both a spontaneous mode and an evoked mode.
A method to detect the direction and distance of voluntary eye-gaze movement from EOG (electrooculogram) signals was proposed and tested. In this method, AC-amplified vertical and horizontal transient EOG signals were classified into eight direction classes and two distance classes of voluntary eye-gaze movement. The horizontal and vertical EOGs at each sampling time during an eye-gaze movement were treated as a two-dimensional vector, and the center of gravity of the sample vectors whose norms exceeded 80% of the maximum norm was used as the feature vector to be classified. Classification using the k-nearest neighbor algorithm showed that the average correct detection rates for the three subjects were 98.9%, 98.7%, and 94.4%, respectively. This method avoids strict EOG-based eye tracking, which requires DC amplification of very small signals. It would be useful for developing robust menu-selection human interfaces for severely paralyzed patients.
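The feature extraction and classification steps above can be sketched directly; the training prototypes, noise levels, and class labels below are hypothetical, invented only to exercise the pipeline:

```python
import numpy as np

def eog_feature(h, v, frac=0.8):
    """Center of gravity of the 2-D EOG sample vectors whose norms
    exceed `frac` (80%) of the maximum norm during a gaze movement."""
    vec = np.stack([h, v], axis=1)
    norms = np.linalg.norm(vec, axis=1)
    return vec[norms >= frac * norms.max()].mean(axis=0)

def knn_classify(feat, train_feats, train_labels, k=3):
    """Plain k-nearest-neighbor majority vote."""
    order = np.argsort(np.linalg.norm(train_feats - feat, axis=1))
    vals, counts = np.unique(train_labels[order[:k]], return_counts=True)
    return vals[counts.argmax()]

# Hypothetical training set: 8 directions x 2 distances = 16 classes,
# with noisy feature vectors around each ideal direction/amplitude.
rng = np.random.default_rng(0)
train_feats, train_labels = [], []
for a in range(0, 360, 45):
    for r in (1.0, 2.0):
        proto = r * np.array([np.cos(np.deg2rad(a)), np.sin(np.deg2rad(a))])
        for _ in range(5):
            train_feats.append(proto + 0.1 * rng.standard_normal(2))
            train_labels.append(f"{a}deg-short" if r == 1.0 else f"{a}deg-long")
train_feats, train_labels = np.array(train_feats), np.array(train_labels)

# A simulated short rightward gaze: transient EOG peaking near (1, 0)
t = np.linspace(0, 1, 100)
h = np.sin(np.pi * t) + 0.02 * rng.standard_normal(100)
v = 0.02 * rng.standard_normal(100)
print(knn_classify(eog_feature(h, v), train_feats, train_labels))
```

Keeping only the near-peak samples makes the center-of-gravity feature robust to the AC-coupled transient shape, which is the point of the 80%-of-maximum-norm rule.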
Methods for extracting features of motor imagery from one-channel bipolar EEG were evaluated. The EEG power spectra used as feature vectors were calculated with a filter bank, the FFT, and an AR model, and were then classified by linear discriminant analysis (LDA) to discriminate between motor imagery and resting states. The extraction method using the AR model gave the best result, with an average true positive rate of 83% (σ = 7%). Furthermore, when principal component analysis (PCA) was applied to the feature vectors, their dimension could be reduced without decreasing the discrimination accuracy.
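The AR-model spectral estimate named above can be sketched via the Yule-Walker equations; the model order, sampling rate, and the 10 Hz test rhythm are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def ar_psd(x, order=8, nfft=256, fs=128.0):
    """AR power spectrum via the Yule-Walker equations:
    solve R a = r for the AR coefficients, then evaluate
    sigma^2 / |1 - sum_k a_k e^{-j w k}|^2 on a frequency grid."""
    x = x - x.mean()
    r = np.correlate(x, x, "full")[len(x) - 1:] / len(x)   # autocorrelation
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])                 # AR coefficients
    sigma2 = r[0] - a @ r[1:order + 1]                     # innovation variance
    freqs = np.arange(nfft // 2 + 1) * fs / nfft
    z = np.exp(-2j * np.pi * freqs / fs)
    denom = np.abs(1 - sum(a[k] * z ** (k + 1) for k in range(order))) ** 2
    return freqs, sigma2 / denom

# A 10 Hz rhythm in noise should yield a spectral peak near 10 Hz
rng = np.random.default_rng(0)
fs = 128.0
t = np.arange(0, 4, 1 / fs)
x = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(t.size)
freqs, psd = ar_psd(x, order=8)
print(freqs[np.argmax(psd)])
```

The resulting spectrum (or the coefficient vector itself) would then serve as the feature vector fed to LDA.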
In the real world there are countless objects, so it is impossible to make a system memorize all knowledge about the real world in advance. The system should therefore learn knowledge about its environment autonomously. We propose a system that autonomously acquires concepts derived from statistical relations between audio-visual events. First, the system determines correspondences between audio-visual events after extracting patterns from the external world, and accumulates them as cases. Second, it applies canonical correlation analysis to the cases and categorizes them using the K-means method. Finally, it identifies an unknown image or sound and associates it with the corresponding sound or image. In our experiments, the identification success rate for concepts exceeded 83.2%, and the association success rate exceeded 81.5%. These results confirm the effectiveness of the method.
This paper presents an experimental evaluation of the effects of auditory cognition on the visual cognition of video. The influence of seven auditory stimuli on visual recognition is investigated based on experimental data from key-down operations. The key-down operations for locating a moving target by visual and auditory images were monitored by a purpose-built experimental system comprising a VTR, CRT, data recorder, and other devices. Regression analysis and the EM algorithm were applied to the data from 350 key-down operations, collected from 50 participants and 7 auditory stimulus types. The following characteristics of the influence of auditory stimuli on visual recognition were derived. First, seven participants responded too early in every experiment; the average and standard deviation of their response times were 439 ms and 231 ms, respectively. Second, the other forty-three participants responded about 10 ms late in cases where the auditory images were presented 30 ms or 60 ms before the visual images, and about 10 ms early in the other cases. Third, since the visual image was the dominant information used for the key-down decision, no pronounced effects of auditory images on the key-down operation were measured. The averages and standard deviations of the distributions estimated by the EM algorithm for the 7 auditory stimulus types are examined and verified against Card's Model Human Processor (MHP) of human response.
Usability, or the ease of operating a console such as a universal remote console (URC), has been investigated in terms of the optimum number of buttons that minimizes the operation time. Console operation consists of two major processes: recognizing a button and moving the hand to press it. Cognitive workload increases with the number of buttons, because finding the correct button becomes more difficult. Conversely, physical workload increases when the number of buttons decreases, because the same buttons must be pressed many times with different meanings. Thus an optimum number of buttons that minimizes the total operation time should exist. To verify this hypothesis, several virtual consoles with different numbers of buttons were implemented on a PC. Subjects were asked to input designated family names in the Roman alphabet. Ease of operation, i.e., usability, was evaluated by the operation time of 49 subjects. The operation time reached its minimum when the number of buttons was approximately 18.
Beginners find it difficult to operate in an immersive virtual environment (IVE) with standard devices, because such devices are not intuitive. The haptic sense is very important for intuitive operation, but existing haptic devices are not well suited to IVEs in terms of the senses they can display and their own size: portable devices can display only force feedback, while devices that can display the tactile sense cannot be mounted on a hand. In this paper we propose a Haptic display Actuated with Magnetorheological fluid and Artificial muscle (HAMA device). It is a portable haptic device that can display both force feedback and the tactile sense. The device consists of two small units, one for displaying force feedback and one for displaying the tactile sense, which use artificial muscle and magnetorheological fluid as actuators. As a trial, we developed the index-finger part of the device and evaluated it.
Lip motion features are of practical use in identifying individuals; it is therefore important to develop a non-contact interface based on them. An interface using lip motion features should accommodate individual differences in commands, such as accents and dialects. In this paper, we propose a method to identify commands by analyzing three kinds of lip motion features: lip width, lip length, and the ratio of width to length. The analysis is based on the relative values of these features obtained from the primary and object frames. The proposed method has three steps. First, we extract the lip motion features from the positions and shapes of the lips in each frame of the facial images. Second, standard patterns are created from the features of six utterances per command; the standard pattern reduces the relative differences in the lip motion features. Third, similarities among commands are computed by Dynamic-Programming (DP) matching, and the command with the highest similarity is selected as the target. Our experimental results suggest that the proposed method is useful for constructing a non-contact command-input interface using lip motion features.
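The DP matching step can be sketched as a standard dynamic-time-warping distance between feature sequences; the two "command" patterns below are invented toy trajectories of (width, length, ratio), not the paper's standard patterns:

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic-programming (DP) matching distance between two
    feature sequences (rows = frames)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)   # length-normalized

# Hypothetical standard patterns for two commands, as sequences of
# (width, length, width/length ratio) relative values over 30 frames
t = np.linspace(0, 1, 30)
cmd_open = np.stack([1 + t, 1 - 0.5 * t, (1 + t) / (1 - 0.5 * t)], axis=1)
cmd_close = np.stack([1 - 0.5 * t, 1 + t, (1 - 0.5 * t) / (1 + t)], axis=1)

# An utterance resembling "open" but spoken more slowly (40 frames)
t2 = np.linspace(0, 1, 40)
utt = np.stack([1 + t2, 1 - 0.5 * t2, (1 + t2) / (1 - 0.5 * t2)], axis=1)

d_open = dtw_distance(utt, cmd_open)
d_close = dtw_distance(utt, cmd_close)
print(d_open < d_close)  # the smaller distance selects the command
```

DP matching absorbs the speaking-rate difference between the 40-frame utterance and the 30-frame standard pattern, which plain frame-by-frame comparison could not.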
This research studies the possibility of an intuitive interface for an electric wheelchair operated with parts of the human body other than the hands. For this purpose, we focused on the body motion that accompanies actions or behavior. This motion arises from the human stabilization function, which counteracts the predictable loss of balance caused by voluntary motion. It can thus be regarded as a characteristic of human motion and is unconsciously linked to intentions. Therefore, an interface that does not require conscious or complex motion can be realized by applying this body motion to the control of an electric wheelchair. In this paper, we first conducted an experiment to find the part of the seat whose pressure changes most clearly reflect body motion. As a result, it was confirmed that pressure changes on the seat back clearly reflect body motion. Next, we designed a prototype based on this finding. Finally, an experiment with 10 subjects using the semantic differential (SD) method was conducted to evaluate the feeling of operation. The results showed that all subjects felt the proposed interface was intuitive, i.e., that the wheelchair moved in the intended direction. It was therefore confirmed that a body motion interface is a promising interface for electric wheelchairs.
This paper proposes an algorithm for estimating emotion from facial images of e-Learning users. Its characteristics are as follows. The criteria used to relate an e-Learning user's emotion to a representative emotion were obtained from a time-sequential analysis of the user's facial expressions. Based on an examination of the users' emotions and the positional changes of their facial feature points in the experimental results, the following procedures are introduced to improve estimation reliability: (1) effective feature points are selected for emotion estimation; (2) subjects are divided into two groups according to the change rates of their facial feature points; (3) eigenvectors of the variance-covariance matrices are selected (cumulative contribution rate >= 95%); and (4) emotion is computed using the Mahalanobis distance.
This paper proposes a pitch estimation method suitable for singing evaluation that can be incorporated into karaoke machines. Professional singers and musicians have sharp hearing for music and the singing voice: they can tell whether a singer's pitch is “a little off key” or “in tune.” A pitch estimation method with correspondingly high frequency resolution is therefore necessary to evaluate singing. This paper proposes such a method, which achieves high frequency resolution by exploiting the harmonic structure of the autocorrelation function. The proposed method can estimate a fundamental frequency in the range of 50-1700 Hz with a resolution finer than 3.6 cents at a light processing load.
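As a baseline for the kind of high-resolution autocorrelation pitch estimation described above, the sketch below refines the peak-lag estimate by parabolic interpolation to reach sub-sample resolution; this is a standard trick assumed for illustration, not necessarily the paper's exact harmonic method:

```python
import numpy as np

def estimate_pitch(x, fs, fmin=50.0, fmax=1700.0):
    """Autocorrelation pitch estimate with parabolic interpolation
    around the peak lag for sub-sample frequency resolution."""
    r = np.correlate(x, x, "full")[len(x) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + np.argmax(r[lo:hi])            # integer peak lag
    # Fit a parabola through (lag-1, lag, lag+1) and take its vertex
    y0, y1, y2 = r[lag - 1], r[lag], r[lag + 1]
    delta = 0.5 * (y0 - y2) / (y0 - 2 * y1 + y2)
    return fs / (lag + delta)

fs = 16000
t = np.arange(0, 0.1, 1 / fs)
# Harmonic "singing voice" at 440.3 Hz (fundamental plus two overtones)
f0 = 440.3
x = sum(np.sin(2 * np.pi * f0 * k * t) / k for k in (1, 2, 3))
print(estimate_pitch(x, fs))
```

Without interpolation the lag grid alone limits resolution to roughly fs/lag² Hz (tens of cents at 440 Hz for fs = 16 kHz), which is far too coarse for judging whether a singer is "a little off key."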
To establish a universal communication environment, computer systems should recognize various modalities of communication languages. In conventional sign language recognition, recognition is performed word by word using gesture information on hand shape and movement, and each feature is given the same weight when calculating the recognition probability. We consider hand position to be very important for sign language recognition, since the meaning of a word differs according to hand position. In this study, we propose a sign language recognition method using a multi-stream HMM technique to examine the importance of position and movement information. We conducted recognition experiments using 28,200 sign language word samples. As a result, 82.1% recognition accuracy was obtained with the appropriate weights (position:movement = 0.2:0.8), while 77.8% was obtained with equal weights. We thus demonstrated that it is necessary to weight movement more heavily than position in sign language recognition.
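The core of the multi-stream formulation is that each stream's log output probability is scaled by its weight before being combined. A minimal sketch with single Gaussians standing in for HMM output densities (the word models and observations are invented):

```python
import numpy as np

def log_gauss(x, mu, var):
    """Diagonal-Gaussian log-likelihood of one observation vector."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def multi_stream_score(pos, mov, word_model, w_pos=0.2, w_mov=0.8):
    """Stream-weighted log output probability, as in multi-stream HMMs:
    w_pos * log b_pos(o) + w_mov * log b_mov(o)."""
    (mu_p, var_p), (mu_m, var_m) = word_model
    return (w_pos * log_gauss(pos, mu_p, var_p)
            + w_mov * log_gauss(mov, mu_m, var_m))

# Two hypothetical word models with identical position statistics but
# different movement statistics: weighting movement higher separates them.
word_A = ((np.zeros(2), np.ones(2)), (np.zeros(2), np.ones(2)))
word_B = ((np.zeros(2), np.ones(2)), (np.full(2, 3.0), np.ones(2)))
pos_obs, mov_obs = np.array([0.1, -0.1]), np.array([2.9, 3.1])
sA = multi_stream_score(pos_obs, mov_obs, word_A)
sB = multi_stream_score(pos_obs, mov_obs, word_B)
print(sB > sA)  # movement evidence favors word B
```

In a full recognizer this weighted score replaces the state output probability inside Viterbi decoding, so the 0.2:0.8 weighting biases the search toward movement evidence.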
An optical system having a set of light emitting diodes with different wavelengths has been applied to the characteristic parameter extraction from a seaweed-water mixture or slurry in a seaweed processing plant. The system or extraction method is based on the assumption that the optical transmissions of the mixture at particular wavelengths could be related to seaweed-product nature or quality. Using this system, we have made optical transmission measurements in a plant and conducted analyses for the parameter extraction, including pigment content estimation and principal component analysis, on the data. The results imply that the principal components or pigment relative content derived from the optical transmission data of the mixture could be useful in characterizing the nature of the final seaweed products. One could utilize the obtained information when providing feedback control of the process for improving product quality.
Today it is easy to upload and download files to and from Web servers or file servers on the Internet. However, it is difficult for users to select the optimal server, because servers rarely disclose their performance specifications or current processing loads. There have been many studies on server-side approaches to optimal server selection, but few on user-side approaches. A novel indicator that enables the user to select the optimal server is therefore desirable. For this purpose, we propose to use a pilot file that is small enough to be downloaded quickly. The proposed method is compared with the conventional method, which measures the round-trip time between the user and the server, and the effectiveness of both methods is demonstrated. The possibility of using these methods to estimate the time to download a target file (one the user actually desires) is also discussed.
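A minimal sketch of the pilot-file idea, under the assumption (ours, not necessarily the paper's model) that throughput scales linearly and each request carries a fixed overhead; the server names and timings are hypothetical:

```python
def estimate_download_time(pilot_bytes, pilot_seconds, target_bytes,
                           overhead_seconds=0.05):
    """Estimate the time to download a target file from a measured
    pilot-file download, assuming linear throughput scaling plus a
    fixed per-request overhead (hypothetical model)."""
    throughput = pilot_bytes / max(pilot_seconds - overhead_seconds, 1e-9)
    return overhead_seconds + target_bytes / throughput

# Pick the server whose pilot download predicts the shortest time:
# each entry is (pilot size in bytes, measured pilot download seconds)
servers = {"A": (64_000, 0.30), "B": (64_000, 0.18)}
target_bytes = 10_000_000
est = {name: estimate_download_time(p, s, target_bytes)
       for name, (p, s) in servers.items()}
print(min(est, key=est.get))
```

Unlike a bare round-trip-time probe, the pilot download reflects the server's actual transfer throughput under its current load, which is why it can be the better indicator.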
The retinotectal projection, together with the well-known retinogeniculocortical pathway, plays an important role in visual information processing. In this study, we attempt in vitro reconstruction of retina-superior colliculus (SC) pathways on microelectrode arrays (MEAs). First, retinal tissue and SC slices were prepared from newborn rats and cultured individually on MEA substrates. Spontaneous electrical activity was recorded in both retina and SC cultures. Continuous firing was observed in the cultured retina, with the frequency increasing from a few Hz to more than 10 Hz over the culture period. Evoked responses were also recorded from the cultured retinal tissue: a single biphasic pulse successfully elicited spike trains. Spontaneous SC activity was observed in cultures at 6 days in vitro (DIV). Finally, retina and SC were co-cultured under the conditions established for the SC-slice cultures. Within a few days, neurite outgrowth from both tissues was observed and connections were established morphologically. Spontaneous activity was recorded from both the retina and SC areas in 11 DIV cultures. The next step will be spatio-temporal analysis of the signal-propagation patterns of spontaneous activity, as well as of SC responses to retinal stimulation.
Time-delay systems pose mathematical difficulties for modeling, controller design, and so on. In the continuous-time approach, a time-delay system is infinite-dimensional. In the discrete-time approach, on the other hand, an input time-delay system is finite-dimensional, and many controller design methods for finite-dimensional systems can be applied. However, when the delay length is not an integer multiple of the sampling period, additional methods are needed; modeling and controller design for this case have been studied by several authors. In this paper, we focus on the case in which the delay length is not an integer multiple of the sampling period, and derive a discrete-time model for input and unilateral time-delay systems without any approximation.
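For a scalar input-delay system dx/dt = a x + b u(t - L) with L = (d + m)h (0 < m < 1), the zero-order-hold discretization splits each sampling interval between two past inputs, giving the exact model x[k+1] = A x[k] + B1 u[k-d-1] + B0 u[k-d]. The sketch below (our scalar worked example, not the paper's general derivation) checks this against a fine-grained simulation of the continuous system:

```python
import math

def discretize_fractional_delay(a, b, h, L):
    """Exact ZOH discretization of dx/dt = a*x + b*u(t - L) when the
    delay L is NOT an integer multiple of the sampling period h:
      A  = e^{ah}
      B1 = (b/a) (e^{ah} - e^{a(1-m)h})   (contribution of u[k-d-1])
      B0 = (b/a) (e^{a(1-m)h} - 1)        (contribution of u[k-d])"""
    d, m = divmod(L / h, 1.0)              # integer part d, fractional part m
    A = math.exp(a * h)
    B1 = (b / a) * (math.exp(a * h) - math.exp(a * (1 - m) * h))
    B0 = (b / a) * (math.exp(a * (1 - m) * h) - 1.0)
    return int(d), A, B1, B0

def u(k):                                  # unit step held over each period
    return 1.0 if k >= 0 else 0.0

a, b, h, L = -1.0, 2.0, 0.1, 0.25          # delay = 2.5 sampling periods
d, A, B1, B0 = discretize_fractional_delay(a, b, h, L)

x, xs = 0.0, [0.0]                         # discrete-time model
for k in range(50):
    x = A * x + B1 * u(k - d - 1) + B0 * u(k - d)
    xs.append(x)

sub = 1000                                 # fine Euler run of the true ODE
dt = h / sub
xc = 0.0
for n in range(50 * sub):
    t = n * dt
    uk = u(math.floor((t - L) / h + 1e-12))
    xc += dt * (a * xc + b * uk)

print(abs(xs[-1] - xc))                    # small: model matches the ODE
```

The two B terms arise because within one sampling interval the delayed input switches from u[k-d-1] to u[k-d] at the fractional instant mh, which is exactly the complication that integer-multiple delays avoid.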
Image hallucination is a super-resolution technique: it estimates unobservable high-frequency components in order to restore a proper signal from band-limited observations, and has been developed as a method for creating high-resolution images from low-resolution ones. The purpose of this paper is to propose a technique for creating high-definition images from low-resolution images using the bilateral filter. High-frequency primitives are inferred from low-frequency images by training. To obtain effective training data, we use a non-linear edge-preserving filter called the bilateral filter, which decomposes the original image into a high-frequency texture component and a low-frequency illumination component. Finally, simulation results are presented to show the effectiveness of the proposed method.
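The bilateral-filter decomposition named above can be sketched as follows: filtering gives the low-frequency illumination layer, and subtracting it isolates the high-frequency texture layer used as training data. The image, kernel sizes, and noise level are illustrative choices:

```python
import numpy as np

def bilateral_filter(img, sigma_s=2.0, sigma_r=0.1, radius=4):
    """Edge-preserving bilateral filter for a grayscale image in [0, 1]:
    each pixel is a weighted mean where weights fall off with both
    spatial distance and intensity difference."""
    h, w = img.shape
    pad = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img)
    ax = np.arange(-radius, radius + 1)
    gs = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * sigma_s ** 2))
    for i in range(h):
        for j in range(w):
            win = pad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            gr = np.exp(-((win - img[i, j]) ** 2) / (2 * sigma_r ** 2))
            wgt = gs * gr
            out[i, j] = (wgt * win).sum() / wgt.sum()
    return out

# A sharp step edge plus fine texture: the filter smooths the texture
# but keeps the edge, so (img - low) isolates the high-frequency layer.
rng = np.random.default_rng(0)
img = np.zeros((32, 32))
img[:, 16:] = 1.0
img += 0.05 * rng.standard_normal(img.shape)
low = bilateral_filter(img)
texture = img - low
print(low[:, 20:].std(), img[:, 20:].std())  # smoothed vs original
```

Because the range kernel suppresses contributions from across the step, the decomposition does not blur the edge into the texture layer, which is why it yields cleaner training pairs than a plain Gaussian blur.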
This paper proposes a new robust method for tracking moving objects in images. Image processing techniques for detecting and tracking moving objects in time-series images are an important subject in intelligent transport systems and security systems. In the past, detecting moving objects was difficult because of brightness changes caused by sunlight and other factors. Several methods have been proposed to solve this problem, but they require a background image or a template of the moving object. Here, we propose new detection and tracking methods based on the Radial Reach Filter, and then introduce a process that recognizes the situation of the moving object. The usefulness of our method is shown through experiments on real images.
This paper describes an image processing method for the detection of the lateral displacement of an agricultural vehicle, such as a tractor. It achieves this by tracking the movement of the ground on images while detecting a lamp placed at a target point in a field by means of a computer vision system mounted on the tractor. The aim of this technology is to enable tractors to automatically travel in a straight line with high accuracy towards the target lamp. Some experiments showed the effectiveness of the proposed method.
Many scheduling problems arise in the railway industry; one typical example is the crew scheduling problem. Although many researchers have studied this problem, few studies have addressed it in the context of Japanese railways. In this paper, we consider a railway crew scheduling problem in Japan. The problem can be formulated as a Set Covering Problem (SCP), in which a row corresponds to a trip representing a minimal task, and a column corresponds to a pairing representing a sequence of trips performed by one crew. Many algorithms have been developed for the SCP; in practice, however, it is important to investigate how these algorithms behave on a given class of problems. We therefore focus on Wedelin's algorithm, which is based on Lagrangian relaxation and is known as one of the high-performance algorithms for the SCP, and examine its basic idea. Furthermore, we show the effectiveness of this procedure through computational experiments on instances from a Japanese railway.
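To make the SCP formulation concrete, the sketch below solves a toy crew instance with the classic greedy heuristic (cost per newly covered trip); this is only an illustrative baseline, not Wedelin's Lagrangian-based algorithm, and the trips, pairings, and costs are invented:

```python
def greedy_set_cover(trips, pairings, costs):
    """Greedy baseline for the Set Covering Problem: repeatedly pick
    the pairing with the lowest cost per newly covered trip."""
    uncovered = set(trips)
    chosen = []
    while uncovered:
        best = min(
            (j for j in range(len(pairings)) if pairings[j] & uncovered),
            key=lambda j: costs[j] / len(pairings[j] & uncovered))
        chosen.append(best)
        uncovered -= pairings[best]
    return chosen

# Toy instance: 5 trips (rows) and 4 candidate pairings (columns)
trips = range(5)
pairings = [{0, 1}, {1, 2, 3}, {3, 4}, {0, 2, 4}]
costs = [2.0, 3.0, 2.0, 3.0]
sol = greedy_set_cover(trips, pairings, costs)
print(sorted(sol), sum(costs[j] for j in sol))
```

Wedelin's algorithm instead adjusts the column costs via Lagrangian multipliers on the covering constraints, but the row/column structure it operates on is exactly the one shown here.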
A novel power reduction technique for a variable gain amplifier (VGA) with a two-stage operational amplifier is proposed. The technique reduces the power consumption of the VGA by dynamically optimizing the bandwidth and the phase margin over the entire gain range through control of the input transconductance of the operational amplifier. A VGA using the proposed technique shows a 40% reduction in power consumption compared with a conventional VGA under the best-case gain condition.
This paper proposes a method for photographing moving images in dark places. When the scene is dark, a flash is commonly used, but it is sometimes ineffective. The proposed method generates an image with good contrast. The experimental results showed that even in cases where the flash could not produce a good image, the image generated by the proposed method had good contrast.