We developed a model that operates similarly to human categorical color perception. The color of an object is not determined exclusively by the reflection spectrum from the object's surface; it is greatly affected by the ambient illumination and depends upon color constancy. The mechanism of color constancy, however, has not been explained in detail, so acquiring the categorical color name of an object under different illuminations is difficult. To that end, the relationship between chromaticity and the categorical color perception of colored chips under different illuminations, obtained from a categorical color-naming experiment, was learned by a neural network. The results showed that the trained neural network has characteristics similar to those of the human visual system.
We developed a robust method for estimating lip position that is independent of camera zoom and the direction of the face. The method requires no prior information about position or shape, and it was carried out without restrictive conditions such as lipstick or controlled lighting, because the lips are generally redder than the skin. The method makes use of the psychometric quantities of the rectangular coordinate (a*) and the color difference (Δa*), which are defined in the CIE 1976 L*a*b* color space. This color space separates the chroma channels from lightness and is nearly perceptually uniform. The method extracts the outer lip contour based on the Δa* values, and the lip position is then estimated with the aid of the a* color histogram. The results indicated the effectiveness of this method: the lip position was estimated from the image data with about 96% accuracy.
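The a*-based discrimination underlying this approach can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sRGB-to-XYZ matrix and D65 white point are the standard published constants, while the Δa* threshold of 8 and the sample pixel values are assumptions chosen only to demonstrate that reddish lip pixels score higher on a* than skin pixels.

```python
import math

def srgb_to_lab(r, g, b):
    """Convert an 8-bit sRGB pixel to CIE 1976 L*a*b* (D65 white point)."""
    def lin(c):
        # sRGB inverse transfer function: gamma-encoded -> linear
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    rl, gl, bl = lin(r), lin(g), lin(b)
    # Linear RGB -> XYZ (standard sRGB/D65 matrix)
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl
    # Normalize by the D65 reference white
    xn, yn, zn = x / 0.95047, y / 1.0, z / 1.08883
    def f(t):
        return t ** (1.0 / 3.0) if t > 0.008856 else 7.787 * t + 16.0 / 116.0
    fx, fy, fz = f(xn), f(yn), f(zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)  # L*, a*, b*

def lip_mask(pixels, skin_a_mean, delta_a_threshold=8.0):
    """Flag pixels whose a* exceeds the skin mean a* by more than a threshold
    (threshold value is illustrative, not from the paper)."""
    return [srgb_to_lab(*p)[1] - skin_a_mean > delta_a_threshold for p in pixels]
```

Because a* encodes the red-green opponent axis, reddish lip pixels yield a markedly larger a* than surrounding skin, so a simple Δa* threshold separates the two without lipstick or controlled lighting.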
This paper presents experiments conducted to evaluate a video editing rule based on participants' gaze for clearly conveying face-to-face multiparty conversations, such as meetings, to viewers. Systems that record meetings and systems that support teleconferences are attracting considerable interest because of their potential for facilitating human communication. Conventional systems use a fixed-viewpoint camera and simple camera selection based on the participants' utterances. Unfortunately, such systems fail to adequately convey to viewers who is talking to whom, as well as nonverbal information about the participants. To solve these problems, we previously proposed a video editing rule that uses the majority decision of the participants' gaze directions. This paper describes experiments in which videos of entire conversations among three participants were presented to viewers. The results confirm the effectiveness of our video editing rule.
We developed an “Opto-Navi” system as a multi-purpose visual remote controller for information appliances. The system can be implemented in mobile information appliances such as cellular phones or PDAs. The “Opto-Navi” system incorporates a customized image sensor capable of capturing high-speed ID signals during normal scene imaging. The CMOS image sensor architecture for the “Opto-Navi” system includes a fast-readout function for multiple regions of interest (ROIs). In our implementation, pilot signals slower than the video rate were used to define the ROIs. We developed a new sensor architecture to suppress the increase in power consumption during high-speed ID readout. Furthermore, the sensor has a simple pixel circuit that enables high-resolution imaging. We fabricated a QVGA image sensor using standard 0.35-μm CMOS technology and demonstrated the acquisition of 7 IDs with 5 × 5 pixels at 1.2 kfps/ID with simultaneous normal scene imaging at 30 fps. The device power consumption was 3.6 mW.
Pictures taken by a camera with a shutter look better and sharper than those taken without one because there is less blurring of moving objects. This has been explained by apparent motion, a mystery of the human visual system. Similarly, pictures on liquid crystal displays are clearer with blinking lighting, which has been explained by eye tracking of moving objects. In this paper, spatio-temporal sampling is used to analyze and clarify the effects of the shutter and of blinking lighting, instead of appealing to apparent motion and eye tracking.
We are developing a “virtual acoustic environment” in which listeners hear a car crossing in front of them and judge the right time to cross a driveway. People with severe visual impairment (referred to here as the blind) must make a great effort to travel independently because they have difficulty crossing driveways. We conducted a preliminary experiment to demonstrate that the reverberation and reflection of sound are an effective cue for judging the arrival time of a car. Ten sighted subjects were asked to listen to a car passing by them in the virtual acoustic environment and to press a computer button when they perceived the car passing in front of them. The results of this study suggested that the reverberation and reflection of sound are useful for the blind in perceiving when crossing a driveway is safe. The system we are developing and the findings obtained from the psychoacoustic experiments will be used to develop education and rehabilitation programs at schools for the blind.
To determine the differences in visual characteristics between people in their twenties and those in their sixties under low-illuminance conditions, assuming night-driving conditions, we measured the recognition ratios and recognition times of subjects while changing the presentation position, size, luminance, and contrast of a Landolt ring. A Landolt ring is a circle with a gap, like the letter “C”, and is used as an eye-test chart. By analyzing the data, we found that the recognition ratio of both age groups increased as the background luminance increased at a constant contrast, or as the contrast increased at a constant background luminance.
We developed a method for presenting auditory information to drivers that even elderly drivers can hear easily. Our method corrects the high-frequency sounds that are important for arousing attention during emergencies, and it is based on the physiological characteristics of elderly people with impaired hearing. We verified the effectiveness of our method by conducting experiments in which the subjects included elderly drivers whose hearing was significantly impaired.
We investigated temporal summation properties in the peripheral retina using luminance and chromatic double pulse lights. The sizes of the stimuli were 1° at the fovea and 10° at 30° temporal visual field, being scaled according to the cortical magnification factor. Consistent acceleration of the characteristic peaks and troughs of the summation properties with retinal eccentricity suggested that a double-duty mechanism might be preserved in the periphery, whereas a marked elevation of the thresholds and the summation indexes in the peripheral chromatic channel suggested a significant change of the chromatic mechanism in the periphery.
In this paper, a selective visual attention module based on motion stimuli is introduced for the purpose of detecting the region of interest (ROI), or focus of attention (FOA), in motion pictures. Previous approaches rarely analyze motion fields and incorporate the results into feature integration for ROI detection; the analysis of motion fields in our approach is in direct contrast to some of those previous studies of selective visual attention. Motion, which presents temporal visual saliency between two successive frames, is analyzed based on psychological studies of double-opponent receptive fields (DORF) and noise filtration (NF) in the middle temporal cortex (MT). The analyzed results are integrated based on the theory of motion integration in MT to obtain a single conspicuous region. Experiments with human subjective evaluation showed generally acceptable results.
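The temporal-saliency idea can be sketched very roughly as a frame difference followed by center-surround contrast. This is a crude stand-in for the DORF/NF stages, whose actual formulations are not given in the abstract; the 3 × 3 surround window is an assumption for illustration only.

```python
def motion_saliency(prev, curr, surround=1):
    """Toy temporal saliency: per-pixel frame difference, then suppression of
    pixels whose change is no larger than the mean change in their surround
    (a simplified center-surround stage, not the paper's DORF/NF model)."""
    h, w = len(curr), len(curr[0])
    # Temporal stage: absolute difference between two successive frames
    diff = [[abs(curr[i][j] - prev[i][j]) for j in range(w)] for i in range(h)]
    sal = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Mean difference over the surrounding window, excluding the center
            vals = [diff[y][x]
                    for y in range(max(0, i - surround), min(h, i + surround + 1))
                    for x in range(max(0, j - surround), min(w, j + surround + 1))
                    if (y, x) != (i, j)]
            surround_mean = sum(vals) / len(vals) if vals else 0.0
            sal[i][j] = max(0.0, diff[i][j] - surround_mean)
    return sal
```

On a pair of frames where a single region moves, only that region's pixels change between frames, so the saliency map peaks there and the ROI can be taken around the maximum.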
We developed a phase-code multiplexing method using the discrete cosine transform. In theory, parts of a phase code must be changed because of symmetric phase conditions, so we created a 3 × 3 phase-code condition. Our method places no restriction on the pixel number, in contrast with the Walsh-Hadamard transform, whose pixel number must be 2^n. In addition, it has a faster data transfer rate because the distance from one phase to another is shorter than in the Walsh-Hadamard case. We fabricated a phase modulator using liquid crystal and measured the multiplexing recording and reproduction characteristics. We found that multiplexed images could be reproduced with some crosstalk; it is important to control the phase precisely.
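The size advantage over Walsh-Hadamard codes can be illustrated with the orthonormal DCT-II basis, whose rows are mutually orthogonal for any size n, including n = 3, whereas Hadamard matrices exist only for powers of two. The mapping from basis values to modulator phases below is purely an assumption for illustration; the paper's actual phase assignment is not specified in the abstract.

```python
import math

def dct_basis(n):
    """Orthonormal DCT-II basis matrix: rows are mutually orthogonal for ANY n,
    unlike Walsh-Hadamard matrices, which exist only for n = 2^k."""
    basis = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        basis.append([scale * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                      for j in range(n)])
    return basis

def phase_code(n):
    """Map each basis value linearly into [0, 2*pi) as a modulator phase
    (an illustrative mapping, not the authors' phase assignment)."""
    b = dct_basis(n)
    lo = min(min(row) for row in b)
    hi = max(max(row) for row in b)
    return [[2 * math.pi * (v - lo) / (hi - lo) for v in row] for row in b]
```

Orthogonality of the rows is what keeps the multiplexed recordings separable on reproduction; residual crosstalk then comes from imperfect phase control in the liquid-crystal modulator rather than from the code itself.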
Stripe ribs make the address discharge more stable than box ribs, presumably because of the priming particles provided from vertically neighboring cells. A discharge deactivation film (DDF) of a low-γi material, which selectively covers the MgO surface along the corresponding bus lines, improves the address discharge response dramatically because the DDF pushes the address discharge area closer to the surface discharge gap. A TiO2 under-layer for the phosphor also significantly improves the response. Presumably the electrification characteristic of this layer, which may depend on the surface treatment of the TiO2 grains, contributes much to the formation of an electric field for generating face-to-face address discharge. Using these techniques in our 46-in. HD-PDP, 15% Xe (by volume) is acceptable for practical TV driving if the dielectric layer is made thin to prevent an extremely high sustain voltage. Thus, a peak white luminance of 1220 cd/m2 and a luminous efficiency of 2.16 lm/W can be achieved.