A brain injury can cause many cognitive dysfunctions, including aphasia, memory disorder, attention deficit, and so on. Although cooking training programs are popular as a rehabilitation method for people with cognitive dysfunctions, it is difficult to have such individuals perform cooking alone. Ordinary recipes (such as those found in books or TV programs) and existing cooking support systems are not designed with cognitive dysfunctions in mind and require too much capability in terms of comprehension and memory. We have developed a multimedia cooking navigation system that can effectively support individuals with cognitive dysfunctions. The system integrates and displays recipes available in different media, and users can easily choose which media they receive and therefore better understand the cooking procedures. We tested the system by using it in the rehabilitation of a patient, with encouraging results: the patient became more positive about cooking and her feeling of independence was increased.
We investigated the influence of network latency on the quality of experience (QoE) of a system that transfers haptic media, sound, and video. Under network delay jitter, we subjectively assessed the operability of the haptic interface device, sound output quality, video output quality, inter-stream synchronization quality, and comprehensive quality as QoE measures. We also evaluated the application-level quality of service (QoS). The assessment results demonstrated that QoE parameters can be estimated from QoS parameters with a high degree of accuracy.
Wearing cosmetics or makeup is important to many people. However, it is very hard for visually impaired people to apply makeup because the process relies on visual feedback. We propose a wind pressure display that indicates a given position on the face. The proposed system captures a face image with a camera installed in front of the person and determines a given position, such as a makeup spot, in the image. Wind pressure then conveys that position on the face to the person so that they can apply the makeup effectively. In our experiment, we measured the two-point thresholds of the wind pressure display. We also evaluated position presentation on the face using the proposed system. The results of these experiments showed that the proposed system could effectively present the necessary position on the face.
Recent studies have demonstrated a close relationship between the frequency of microsaccades and covert attention shifts. However, the effect of attention on drift movements during attentive or inattentive fixation has not been discussed. We examined whether visual attention has any influence on drift eye movements using statistical analysis of a time series of fixation eye movements. The results indicated that the power in the 3-4 Hz frequency range was enhanced when visual attention was dispersed over the parafoveal visual field. Furthermore, the amplitudes of the drift eye movements were reinforced immediately after the microsaccades, especially after square-wave jerks. These results suggest that microsaccades and drift eye movements are both controlled by higher-order brain functions to acquire detailed visual information from peripheral vision.
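The band-power measure mentioned above can be illustrated with a minimal sketch. The abstract does not state which spectral estimator was used, so the plain discrete Fourier transform below, the function name `band_power`, and the synthetic 3.5 Hz test signal are all assumptions made for illustration only.

```python
import cmath
import math

def band_power(samples, fs, f_lo, f_hi):
    """Mean-square power of `samples` inside [f_lo, f_hi] Hz.

    Uses a direct DFT (assumption: the paper's estimator is not specified).
    The mean is removed first, and the Nyquist bin is ignored.
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]
    power = 0.0
    for k in range(1, (n - 1) // 2 + 1):
        f = k * fs / n  # frequency of DFT bin k in Hz
        if f_lo <= f <= f_hi:
            coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                        for t in range(n))
            # Factor 2 accounts for the matching negative-frequency bin.
            power += 2 * abs(coeff) ** 2 / n ** 2
    return power

# Synthetic stand-in for an eye-position trace: a unit sine at 3.5 Hz,
# sampled at a hypothetical 100 Hz for 2 s.
trace = [math.sin(2 * math.pi * 3.5 * t / 100) for t in range(200)]
print(band_power(trace, 100, 3.0, 4.0))
```

Comparing this quantity between attention conditions is the kind of contrast the abstract describes; real recordings would also need microsaccades separated from drift before the analysis.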
While several characteristics of human sight have been elucidated, mainly in the fields of medicine and psychology, many remain unclear. In particular, clarifying the details of the pro-sight mechanism is expected to help elucidate the causes of optical illusions. We carried out a multi-scale analysis of the Hermann grid image using wavelet transforms. We used the results of physiological research on neurons to identify channels that exhibit frequency selectivity. In the Hermann grid, illusory dark spots appear at the intersections of white lines. The illusion in the Hermann grid image is caused by lateral inhibition ('side restraint') and involves high black-and-white contrast. Therefore, a model that produces the dark spots can be explained by lateral inhibition.
Detecting the movement of the pupils and corneal reflections in video camera images that capture the entire face is very important for developing systems that detect eye gaze precisely and remotely. However, this was difficult with the conventional method under intense illumination because the pupil becomes small, which decreases its brightness in the image. In our method, bright and dark pupil images and their corresponding non-lighting images are obtained consecutively while two light sources are turned on and off. Using the divided and multiplied images of the bright and dark pupil images, each differenced with its non-lighting image, enables the pupil and corneal reflection, as well as the face, to be detected more sensitively. This means that the region around the eyes can be extracted without defects. The proposed method showed better experimental results under intense illumination than the conventional method did.
Since 3D display methods and viewing environments vary widely, high-quality content is expected to become multi-purpose. Meanwhile, there is increasing interest in the bio-medical effects of various types of image content and there are moves toward international standardization, so safety and comfort need to be considered in 3D content production. The aim of the authors' research is to contribute to the production and application of 3D content that is safe and comfortable. In this paper, the authors focus on the process of changing the screen size, examining a scalable 3D conversion algorithm and evaluating its effectiveness. The authors evaluated the visual load imposed during the viewing of various 3D contents converted by the prototype algorithm by comparing them with ideal conditions and with content expanded without conversion. To examine the effects of screen size reduction on viewers, changes in user impression and experience were elucidated using an evaluation grid method.
In this paper, we present an authentication mechanism for ISDB-T broadcast streams, especially a One-Seg broadcast stream, which is suitable for low-power devices. Our method makes it possible to authenticate data streams at a low computational cost. The method requires a small memory for buffering to process the broadcast stream and is resistant to packet-loss. We evaluated the computational cost of our method by computer simulation and theoretical estimation, and we show here that our method achieved good properties for authenticating data streams broadcast through lossy channels, e.g. wireless channels. Furthermore, we developed a mobile phone that can authenticate One-Seg broadcast streams with our method, and we report the effectiveness of our scheme here.
Today, demand is increasing for comic content on cellular phones and for speech software for the visually impaired. When speech software reads a comic character's speech aloud, it is useful for both visually impaired and sighted users if the character's voice conveys the character's feelings, which can be inferred from the type of speech balloon. As a result, comic content comes to life. In this research, a method has been developed to detect speech balloons on comic pages and then classify them into four types. In this method, speech balloon candidates are extracted based on speech text information detected by AdaBoost, and then speech balloons are selected and classified using an SVM. Experimental results show that the proposed method successfully detected and classified 86 percent of 2844 speech balloons.
We describe an automatic video digesting technique for broadcast baseball games based on information entropies derived from an overlaid score ticker. The overlaid score ticker of a baseball game displays the current score of each team, the inning, the number of balls, strikes, and outs, and whether there are runners on base. This information is used to estimate specific events in baseball that we defined: 3 runner situations, 8 batting results, and 4 types of scores. Rare events such as a “home run” have higher information entropy, and frequent events such as an “out” have lower entropy. We assume that an event with higher information entropy corresponds to a more important scene for video digesting. Thus, scenes with higher entropies are selected with priority and embedded into a limited timeline. In our experiment, we applied our technique to broadcast baseball games and compared it with a conventional technique based on sound volume. The two techniques were evaluated subjectively by watching the generated videos. We found that our technique performed better.
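The entropy-based selection described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: event probabilities are estimated from relative frequencies, each event is scored by its self-information -log2(p), and scenes are packed greedily into the time budget. The function names, event labels, and scene tuples are hypothetical.

```python
import math
from collections import Counter

def self_information(events):
    """Map each event label to -log2(p), with p estimated as the
    relative frequency of the label in `events` (an assumption)."""
    counts = Counter(events)
    total = len(events)
    return {label: -math.log2(c / total) for label, c in counts.items()}

def select_scenes(scenes, info, time_budget):
    """Greedily keep the highest-information scenes that fit the budget.

    scenes: list of (start_sec, duration_sec, event_label) tuples.
    Returns the chosen scenes in chronological order.
    """
    ranked = sorted(scenes, key=lambda s: info[s[2]], reverse=True)
    chosen, used = [], 0.0
    for scene in ranked:
        if used + scene[1] <= time_budget:
            chosen.append(scene)
            used += scene[1]
    return sorted(chosen)

# Hypothetical event log: frequent outs, occasional singles, one home run.
log = ["out"] * 4 + ["single"] * 3 + ["home_run"]
info = self_information(log)
scenes = [(0, 20, "out"), (100, 30, "home_run"), (200, 25, "single")]
print(select_scenes(scenes, info, time_budget=60))
```

With this toy log the home run scores 3 bits and the out scores 1 bit, so the out is the scene dropped when the 60-second budget is exceeded.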
We discuss the relationship between resolution and pixel pitch, taking into account the optical diffraction caused principally by the circular lens aperture, for an image sensor required to secure a high SNR and full well capacity for each pixel. The light diffraction of some high-luminance patterns was calculated. Based on the results, we discuss a methodology for optimizing pixel pitch scaling that considers the influence of light diffraction. A CMOS image sensor with shared small pixels and a lateral overflow integration capacitor was developed based on our results. The sensor, with a 1/3.3-inch optical format, 3-μm pixel pitch, and 1280 (H) × 960 (V) pixels, was fabricated using a 0.18-μm 2P3M CMOS technology with a buried pinned photodiode process. It achieved a photo-electric conversion gain of 84 μV/e-, a full well capacity of 6.9×10^4 e-, and a dynamic range of 90 dB in one exposure.
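To give a feel for why diffraction at a circular aperture limits pixel pitch scaling, the sketch below evaluates the standard Airy disk diameter, d = 2.44 λ N, and compares it with the 3-μm pitch quoted above. The wavelength (0.55 μm) and f-number (2.8) are illustrative assumptions, not values from the abstract, and the abstract's own calculation on high-luminance patterns is likely more detailed than this.

```python
def airy_disk_diameter_um(wavelength_um, f_number):
    """Diameter of the Airy disk (to its first dark ring) in micrometres
    for an ideal circular aperture: d = 2.44 * wavelength * f-number."""
    return 2.44 * wavelength_um * f_number

# Assumed green light and a hypothetical f/2.8 lens.
diameter = airy_disk_diameter_um(0.55, 2.8)
pixel_pitch_um = 3.0  # pitch reported in the abstract
print(f"Airy disk: {diameter:.2f} um, "
      f"{diameter / pixel_pitch_um:.2f} pixels wide")
```

Under these assumed optics the diffraction spot is already slightly wider than one pixel, which is the kind of trade-off a pitch-optimization methodology has to weigh.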
We propose a method of using a random forest algorithm to quickly detect semantically high-level features such as specific objects. The random forest has a lower computational cost than common algorithms such as the support vector machine (SVM). However, it cannot cope well with training data in which the numbers of negative and positive examples are heavily imbalanced. We improve the conventional training algorithm so that data are sampled with equal probability from each class when bootstrap samples are created, which increases the classification accuracy. Experiments on the Caltech-101 dataset resulted in a recall of 64.3% and a precision of 71.1%, which were comparable to those of conventional methods. The average times needed for training and for detection were reduced to one sixteenth and one twenty-seventh of those of the SVM, respectively.
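The class-balanced bootstrap described above can be sketched in a few lines. This is one plausible reading of "equal probability from each class" (pick a class uniformly, then pick an example of that class uniformly, with replacement); the function name and the toy label set are hypothetical, and the authors' exact sampling scheme may differ.

```python
import random

def balanced_bootstrap(labels, n_samples, rng):
    """Draw `n_samples` indices with replacement so that every class is
    equally likely on each draw, regardless of how many examples it has."""
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    classes = sorted(by_class)
    sample = []
    for _ in range(n_samples):
        chosen_class = rng.choice(classes)        # uniform over classes
        sample.append(rng.choice(by_class[chosen_class]))  # uniform within class
    return sample

# Toy data with a 5:95 positive-to-negative imbalance.
labels = [1] * 5 + [0] * 95
indices = balanced_bootstrap(labels, 2000, random.Random(0))
positive_share = sum(labels[i] for i in indices) / len(indices)
print(f"share of positives in the bootstrap sample: {positive_share:.2f}")
```

Each tree in the forest would be trained on its own sample drawn this way, so the minority class is no longer swamped when split criteria are evaluated.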
We propose the use of a hologram plate that can reconstruct an active 3-dimensional image when the position of the light sources is changed. In an experiment with an optical system, we reconstructed two kinds of 3-dimensional images by changing the position of the light sources. Theoretically, various other 3-dimensional images can also be reconstructed. We then ran a computer simulation to confirm our results and obtained the same result as with the optical system. Furthermore, we divided the hologram into segments, ran a computer simulation, and obtained reconstructed images from the segmented hologram. We are now investigating the possibility of using the display of a cellular phone as a light source for the plate.
We present a simple method for improving photo composition based on the rule of thirds using a Cartesian resizing technique to vary the width and height of images while preserving the shape of principal objects. First, salient objects are extracted from an input image, which is then partitioned into nine rectangles. These rectangles are resized and recombined to form an image where the salient objects are moved to intersections of trisection lines.
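The resizing step above can be sketched along a single axis. The assumptions here are ours, not the abstract's: the salient object is given as one bounding interval, the object keeps its size, the overall image length stays fixed, and the object centre is moved to the nearest trisection line. The function name `retarget_axis` is hypothetical. Applying it once to the columns and once to the rows gives the new sizes of the nine rectangles.

```python
def retarget_axis(length, obj_start, obj_end):
    """Return new (before, object, after) sizes along one axis.

    The object keeps its size, the total stays `length`, and the object
    centre lands on the nearest trisection line (length/3 or 2*length/3).
    """
    obj_size = obj_end - obj_start
    center = (obj_start + obj_end) / 2
    target = min((length / 3, 2 * length / 3),
                 key=lambda line: abs(line - center))
    before = target - obj_size / 2
    after = length - before - obj_size
    if before < 0 or after < 0:
        raise ValueError("object is too large to centre on a trisection line")
    return before, obj_size, after

# A 900-pixel-wide image whose object spans columns 500..600.
before, middle, after = retarget_axis(900, 500, 600)
print(before, middle, after)
# Horizontal scale factors for the left and right column of rectangles:
print(before / 500, after / (900 - 600))
```

Only the outer rectangles are stretched or squeezed, which is how the shape of the principal object is preserved while its position changes.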