This paper describes a color reproduction method that improves the white balance of a color image when no information is available about the illuminant color in the scene or the color characteristics of the imaging device that captured it. Based on the assumption that human beings use the colors of specific objects for which they have memory colors as a key to recognizing colors in an image, the illuminant color and object colors in the scene are estimated from the color information of such objects in the image. In this study, the human face is used as one of the most important objects for which we have memory colors. Color correction of the image is carried out by transposing the estimated illuminant color to an illuminant color that improves the white balance. The target illuminant color is determined by introducing the concept that incomplete chromatic adaptation in human color vision produces the desired results. Subjective experiments confirm the validity of the proposed method.
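The correction step, remapping colors from the estimated illuminant to the target illuminant, can be sketched as a diagonal per-channel gain in the style of a von Kries transform. This is a minimal illustration under the assumption of simple per-channel scaling; the paper's actual transform and adaptation model are not specified here:

```python
import numpy as np

def transpose_illuminant(image, est_white, target_white):
    """Von Kries-style diagonal correction (illustrative assumption).

    image        : H x W x 3 array, linear RGB in [0, 1]
    est_white    : RGB of the illuminant estimated from memory-color objects
    target_white : RGB of the illuminant chosen for good white balance
    """
    gains = np.asarray(target_white, float) / np.asarray(est_white, float)
    return np.clip(image * gains, 0.0, 1.0)

# A neutral gray captured under a warm illuminant is pulled back toward neutral.
img = np.full((1, 1, 3), 0.5)
corrected = transpose_illuminant(img, est_white=[1.0, 0.8, 0.6],
                                 target_white=[1.0, 1.0, 1.0])
```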
We have developed an HDTV (high-definition television) high-speed camera that uses three 2.2-Mpixel complementary metal oxide semiconductor image devices. The camera enables image acquisition at up to 300 frames per second and stores the image sequences directly in semiconductor memory in the camera head. High-quality slow-motion playback is available because the images are stored without compression. The compact, lightweight (7.7 kg) handheld camera head can be controlled from a camera control unit over a single standard HDTV hybrid fiber-optic camera cable (max. 2 km).
We propose Si-CMOS image sensors that can detect near-infrared (NIR) light, including the eye-safe region (λ: 1.4-2.0 μm), while the sensor's capability to capture visible images remains completely intact. The proposed sensor consists of a conventional Si complementary metal oxide semiconductor (CMOS) image sensor and a Ge photodiode (PD) array formed underneath the CMOS image sensor. The operating principle for NIR detection is based on photo-carrier injection into the Si substrate from the Ge PD. A process for forming n-regions on the reverse side of a fabricated CMOS sensor is also discussed.
We have developed a 2.0-μm-pixel-pitch 2-Mpixel MOS image sensor. The key technologies used to achieve the high sensitivity of the MOS image sensor are a new pixel circuit configuration, a fine design rule of 0.15 μm, and a thin amorphous-silicon-film color filter. In the new pixel circuit configuration, a unit pixel consists of a photodiode, a transfer transistor, and an amplifier circuit with two transistors that are shared by four pixels; the unit pixel thus has an effective count of 1.5 transistors. The fine design rule of 0.15 μm enables a 40% reduction in wiring area, resulting in an aperture ratio of 30%. A new color filter made of amorphous silicon is 1/10 the thickness of a conventional organic pigment color filter and gives rise to highly efficient light collection. The high sensitivity of the image sensor is achieved using these three technologies.
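The effective transistor count per pixel follows directly from the sharing scheme described above: each pixel keeps its own transfer transistor, while the two amplifier transistors are divided among the four pixels that share them.

```python
# Effective transistors per pixel in the shared-amplifier configuration:
# 1 dedicated transfer transistor + 2 amplifier transistors shared by 4 pixels.
transfer_per_pixel = 1
amplifier_per_pixel = 2 / 4
transistors_per_pixel = transfer_per_pixel + amplifier_per_pixel  # 1.5
```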
A newly developed camera is proposed that uses two types of devices in the photocathode of a proximity-type image intensifier (I.I.): gallium arsenide phosphide (GaAsP), which provides high quantum efficiency in the visible region, and gallium arsenide (GaAs), which provides high quantum efficiency in the near-infrared region. We discovered that image burning, which has been a problem with the photoelectric surface of conventional I.I.s, can be halved by increasing the resistance of the micro-channel plate. We developed a noise-reduction circuit capable of eliminating feedback ion noise by taking its characteristics into account. We also developed a lens and a prism with enhanced near-infrared transmittance. In actual imaging experiments, we were able to capture high-quality images of animals under moonless conditions.
We have investigated a method for adapting a CMOS image sensor to scene luminosity. The sensor exploits negative feedback to apply a mid-voltage to the photodiode (PD) capacitance while the pixel circuit operates. We also used quasi-holding of each pixel value by resetting the pixel-output voltage onto the PD capacitance. Luminosity adaptation is achieved by selecting, for each pixel individually, either the mid-voltage or the quasi-holding operation. The output data follow a polygonal-line characteristic that changes from high to low sensitivity as the luminosity increases, and can be output without modification. Using this method, the sensor was built from pixel circuits consisting of three nMOSFETs and a PD, greatly reducing the flags and frame buffers required in external memory.
We have been investigating a smart image sensor on which an image compression function can be integrated. On-sensor image compression relieves the communication bottleneck between the sensor and peripheral circuits, enabling high-frame-rate imaging. We designed a new image compression sensor that significantly reduces the amount of pixel data output from the sensor by exploiting spatial and temporal correlation. We evaluated the proposed sensor both by computer simulation and with a prototype chip.
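One simple way to exploit temporal correlation is conditional replenishment: transmit only the pixels that changed noticeably since the previous frame. This sketch is an illustrative assumption, not the sensor's actual coding scheme, which is not detailed here:

```python
def compress_frame(prev, curr, thresh=8):
    """Emit (index, value) pairs only for pixels that changed by more than
    'thresh' since the previous frame (exploits temporal correlation)."""
    return [(i, v) for i, (p, v) in enumerate(zip(prev, curr))
            if abs(v - p) > thresh]

def decompress_frame(prev, updates):
    """Reconstruct the current frame from the previous frame plus updates."""
    frame = list(prev)
    for i, v in updates:
        frame[i] = v
    return frame

prev = [10, 10, 10, 200, 200]
curr = [10, 12, 10, 50, 200]   # only pixel 3 changed beyond the threshold
updates = compress_frame(prev, curr)
```

Small frame-to-frame changes below the threshold are dropped, so the scheme is lossy but drastically reduces the data rate for mostly static scenes.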
To determine an effective way of displaying language information, the ability of a subject to appropriately understand the information must be investigated. The event-related potential (ERP), which appears to reflect cognitive processes in the brain, is a suitable measure for such investigations. Therefore, to investigate the perception processes of kanji (Chinese characters), we analyzed the ERPs of subjects performing a kanji discrimination task that used known kanji, unknown kanji, and symbols as stimuli. P100, N170, and P250 at occipito-temporo-parietal electrodes and the vertex-positive potential (VPP) at fronto-central electrodes were elicited. There were significant amplitude differences between unknown and known kanji and between symbols and known kanji in both P100 and N170. Significant latency differences were also observed in both N170 and VPP in the same comparisons. In addition, similar significant differences existed in the comparison between symbols and unknown kanji. These results suggest that P100, N170, and VPP are related to the kanji perception process, especially to the comparison stage.
We investigated the effects of pixel density, anti-aliasing, and stroke width on the subjective image quality of characters displayed on high-density liquid-crystal displays (LCDs). We used 6×4.5 photographs to simulate high-density LCDs. The simulated LCDs had 70 different display conditions that consisted of seven pixel densities (100 to 400 ppi stepped by 50 ppi), five stroke widths, and two font types (anti-aliased fonts and bi-level fonts). The character size was fixed at about 3 mm. The display luminance was set to 150 cd/m2 and a contrast ratio of 30:1 was used. At a 30-cm viewing distance, 30 subjects assessed the character image quality of the 70 simulated LCDs. The results indicated that subjective evaluation of character image quality became saturated at about 250 ppi for anti-aliased fonts and 350 ppi for bi-level fonts. Therefore, the required pixel density for computer displays is 250 ppi for anti-aliased fonts and 350 ppi for bi-level fonts.
This paper proposes a new method for integrating range images with different resolutions to generate a complete 3-D model. The modeling method is based on the fact that range images need an overlapping area between images for the registration process. We therefore integrate data from two range images into one surface model and represent the overlapping area with triangular patches. Few conventional methods have paid attention to integrating images with different resolutions. Moreover, most conventional methods find corresponding points using the distance between points, which sometimes produces incorrect correspondences on convex and concave surfaces. We propose a new integration method that uses geometric and topological information. Experimental results showed that this method is effective for integrating multiple range images.
Pictures are sometimes a good way to convey concepts. When a picture is used as an index of its referent to explain a concept, the understandability of the concept depends on the relationship between the picture and its referent. In this article, we investigated the reason for this "relationship influence" by using two seemingly heterogeneous measures: subjective distance and the reaction time to connect a picture with its referent. Participants used word association and perceptual simulation to determine what the picture represented and whether it was relevant to its referent, and the combination of these two strategies affected reaction time. For word association, the relevant conditions are the ease of forming a path from the referent to the picture and whether the referent belongs to the basic level. For perceptual simulation, the strength of bodily sense, especially tactile sense and adjacency, governs how the relationship affects the understandability of a concept.
The SD (semantic differential) method is an effective way to evaluate subjective impressions of AV (audiovisual) content. However, because the SD method is administered after viewers watch the content, it is difficult to examine their impressions of individual scenes in real time. To measure viewers' impressions of AV content in real time, we developed a "Taikan sensor" that precisely measures the viewer's grip strength and body temperature. Based on the Taikan sensor measurements, we investigated the relationship between the viewers' impressions and their grip strength. Factor analysis of the viewers' impressions was performed on the scenes during which grip strength increased. The results showed that the viewers' impressions influenced their grip strength and that viewers received an "active" impression from the content when their grip strength increased. The Taikan sensor therefore enables real-time measurement of a viewer's "active" impression of AV content.
We found a new depth effect in a DFD (depth-fused 3-D) display, which consists of two overlapping 2-D screens, when the blur in the images was changed. In ordinary DFD displays, the two images displayed on the front and rear screens are identical but differ in luminance, and observers perceive an apparent 3-D depth that depends on the luminance ratio between the front and rear images. The purpose of our study was to examine the depth perception cues of a DFD display when the two identical images are made slightly different by changing the blur. We found that the depth could be controlled solely by changing the luminance distribution around the edges of an image, without any change in the luminance ratio. We propose a perception model for the depth change produced by blur. The perceived depth was qualitatively evaluated using a predetermined value, and we suggest that the binocular disparity between the effective edges determines the perceived depth.
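The baseline luminance-ratio cue of an ordinary DFD display can be sketched as a weighted interpolation between the two screen depths. This is a common first-order model, stated here as an assumption; the paper's blur-based model goes beyond it:

```python
def perceived_depth(l_front, l_rear, z_front, z_rear):
    """First-order DFD model: perceived depth interpolates between the two
    screen planes according to the front/rear luminance ratio.
    (Illustrative assumption; the paper's blur-based model differs.)"""
    w = l_front / (l_front + l_rear)   # weight toward the front screen
    return w * z_front + (1.0 - w) * z_rear

# Equal luminance fuses the two images to the midpoint between the screens.
mid = perceived_depth(0.5, 0.5, z_front=0.0, z_rear=10.0)
```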
In this paper, a remote visualization service is proposed that provides generality and interactivity for easy use of volume data. The proposed service consists of renderers, a master, and clients; the master mediates between two or more renderers and the clients to provide generality. On a client, intermediate frames are generated by image-based rendering from a result image transmitted via the master, so that the client can display result images in response to user operations. Interactive visualization is thus provided through a WWW browser. The proposed service achieves a high display frame rate and reduces the size of the transmitted data, independent of the size of the target volume data.
To create a new virtual reality space, it is important to obtain human characteristics through psychophysiological analysis of human sensation (KANSEI) and involuntary physiological responses. We propose a new method of psychophysiological assessment based on bodily sensation, which is essential to virtual reality.
When thresholding is performed to isolate the facial area of an image from the background, many defect areas (holes) remain in the facial area. These holes must be eliminated before facial features such as the eyes, mouth, and nose can be extracted with a high degree of accuracy. We describe an algorithm that eliminates holes quickly, without the conventional propagating and shrinking operations, and works independently of hole size.
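For context, a straightforward hole-elimination baseline (which, unlike the proposed algorithm, still relies on a propagation-style fill) flood-fills the background from the image border; any background pixel the fill cannot reach is a hole inside the face and is set to foreground. This sketch is an assumed baseline, not the paper's method:

```python
from collections import deque

def fill_holes(mask):
    """Baseline hole filling: 1 = face, 0 = background. Background pixels
    not connected to the image border are holes and are set to 1."""
    h, w = len(mask), len(mask[0])
    visited = [[False] * w for _ in range(h)]
    dq = deque()
    # Seed the fill with every background pixel on the border.
    for y in range(h):
        for x in range(w):
            if (y in (0, h - 1) or x in (0, w - 1)) and mask[y][x] == 0:
                visited[y][x] = True
                dq.append((y, x))
    # 4-connected flood fill over the border-connected background.
    while dq:
        y, x = dq.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not visited[ny][nx] \
                    and mask[ny][nx] == 0:
                visited[ny][nx] = True
                dq.append((ny, nx))
    # Unreached background pixels are holes: promote them to foreground.
    return [[1 if mask[y][x] == 1 or not visited[y][x] else 0
             for x in range(w)] for y in range(h)]

mask = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
filled = fill_holes(mask)
```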
We propose a technique for estimating user preferences for TV programs from channel operations. The estimated preferences can be used for automatic program recommendation. For practical use, it is important to learn a user's preferences automatically from the channel-selection history. However, obtaining such information is difficult because most users change channels frequently and do not watch programs from beginning to end. Automatic learning under such conditions requires an appropriate hypothesis describing the relationship between viewing time and the degree of preference for a program. We proposed three hypotheses and compared their utility in our program recommendation system. Experimental results showed that the preference for a TV program is not proportional to the viewing time but becomes either 1 (like) or 0 (dislike) about 30 minutes after channel selection.
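The winning hypothesis, a step function of viewing time, can be written as a one-line rule; the 30-minute threshold comes from the experimental result above, while the program titles below are purely hypothetical:

```python
def preference(viewing_minutes, threshold=30.0):
    """Step-function hypothesis: a program is labeled 'like' (1) once the
    viewer has stayed on it for about 30 minutes, otherwise 'dislike' (0)."""
    return 1 if viewing_minutes >= threshold else 0

# Label each program in a (hypothetical) channel-selection history.
history = {"news": 45.0, "drama": 5.0, "sports": 31.0}
labels = {title: preference(t) for title, t in history.items()}
```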
Many 3-D visualization technologies have been invented, but none has achieved broad success. We improved a stereoscopic display that uses a convex lens. With a conventional field-lens display with eye tracking, several viewers can perceive a stereo pair simultaneously and move independently without special equipment. However, superimposing full-screen left and right images for high-resolution viewing requires a complicated optical system. To overcome this problem, we developed a field-lens display using a dual LCD panel, which consists of two liquid-crystal layers and modulates orthogonally polarized illumination. This 3-D display uses the field lens to direct each image into the corresponding eye, enabling observers to view full-screen, high-resolution 3-D images.