In this paper we propose a novel method of acquiring arbitrarily focused images from multiple differently focused images. Based on the assumption that the depth of the scene changes stepwise, we derive a formula relating the desired arbitrarily focused image to the multiple acquired images; iterative application of this formula reconstructs the arbitrarily focused image. We show that arbitrarily focused images can be reconstructed for natural scenes. In other words, we can simulate virtual cameras and synthesize images focused at arbitrary depths. Our method needs only the Point Spread Functions (PSFs) of the acquired images for reconstruction and does not require any spatial segmentation. We also present a method for analyzing camera characteristics: we analyze the relation between PSFs and camera parameters for test scenes. Using this relation, we can then generate the desired arbitrarily focused images without any preprocessing to determine the PSFs of the acquired images.
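The iterative reconstruction step can be illustrated with a minimal 1-D sketch. This is not the authors' actual formula: it substitutes a single known Gaussian-like blur kernel for the per-depth PSFs and uses a Landweber-style fixed-point iteration, with an illustrative kernel, test signal, and iteration count.

```python
# Minimal 1-D sketch: iterative recovery of a focused signal from a
# blurred observation with a known PSF (Landweber-style fixed point).
# The PSF, test signal, and iteration count are illustrative assumptions.

def convolve(signal, kernel):
    """Discrete 1-D convolution with zero padding ('same' size)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, k in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += k * signal[idx]
        out.append(acc)
    return out

# Known normalized PSF (stand-in for a measured defocus kernel).
psf = [0.25, 0.5, 0.25]

# Ground-truth focused signal and its simulated defocused observation.
focused = [0, 0, 1, 4, 1, 0, 0, 3, 0, 0]
observed = convolve(focused, psf)

# Fixed-point iteration: g <- g + (observed - psf * g).
estimate = list(observed)
for _ in range(500):
    residual = [o - b for o, b in zip(observed, convolve(estimate, psf))]
    estimate = [g + r for g, r in zip(estimate, residual)]

error = max(abs(g - f) for g, f in zip(estimate, focused))
print(round(error, 4))
```

Because the blur kernel is normalized and symmetric, the iteration contracts the residual at every step, so the estimate converges to the sharp signal without any segmentation of the scene.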
First, this paper introduces the concept of “Integrated 3-D Visual Communication”. The key feature of this concept is the display-independent neutrality and flexibility of the representation of visual data. Secondly, a ray-based approach is examined as a way to realize this concept. The advantage of a ray-based approach is that any view can be synthesized from ray data independently of any geometric representation. All visual data are represented by a set of rays defined in a five-dimensional data space. Assuming that rays travel in straight lines without changing direction, each ray can be stored efficiently in fewer dimensions. This paper proposes and formulates three methods for projecting the five-dimensional data space onto spaces with fewer dimensions. The experimental results show how the proposed methods could be used in next-generation 3-D image communication and photo-realistic virtual reality systems.
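The dimensionality reduction can be sketched as follows: assuming straight-line propagation, a ray sampled at any point (x, y, z) with direction (θ, φ) can be slid along its direction to a reference plane z = 0, so only four coordinates need to be stored. The parameterization below is an illustrative assumption, not the paper's exact formulation of its three projection methods.

```python
import math

def project_ray(x, y, z, theta, phi):
    """Slide a ray to the reference plane z = 0 and keep 4 coordinates.

    (theta, phi) are polar/azimuthal direction angles; since the ray is
    assumed to propagate in a straight line, every sample point on it
    maps to the same 4-D entry (x0, y0, theta, phi).
    """
    # Direction vector of the ray.
    dx = math.sin(theta) * math.cos(phi)
    dy = math.sin(theta) * math.sin(phi)
    dz = math.cos(theta)
    # Parameter t at which the ray crosses z = 0.
    t = -z / dz
    return (x + t * dx, y + t * dy, theta, phi)

# Two samples of the same physical ray, taken at different depths,
# reduce to the same 4-D entry.
theta, phi = 0.3, 1.1
p0 = (1.0, 2.0, 0.0)
step = 5.0
p1 = (p0[0] + step * math.sin(theta) * math.cos(phi),
      p0[1] + step * math.sin(theta) * math.sin(phi),
      p0[2] + step * math.cos(theta))

r0 = project_ray(*p0, theta, phi)
r1 = project_ray(*p1, theta, phi)
print(all(abs(a - b) < 1e-9 for a, b in zip(r0, r1)))  # → True
```

This is exactly why the straight-propagation assumption buys a dimension: the redundant depth coordinate collapses, and distinct rays remain distinct in the 4-D store.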
A ray-based method has been proposed as a neutral description method for various kinds of 3-D images. In this paper, we propose a new method for converting a set of range and color data of 3-D objects obtained by a range finder into ray data. This method enables us to handle range data in the same manner as other kinds of 3-D images and to use conventional image processing techniques to improve the quality of reconstructed images. Furthermore, to compensate for missing range data caused by restrictions inherent in a range finder, we examine a ray-based method for merging multiple sets of range and color data of the same object obtained under different conditions. Experimental results show the usefulness of the proposed method in making a range finder an input device for 3-D information in an integrated 3-D communication system.
3D Active Net is an energy-minimizing surface model that extracts a volume around features of interest from 3D volume data. It is deformable and evolves in 3D space, attracted to salient features according to its internal and image energies. The net can be fitted to the contours of a target by defining an image energy suited to the contour's properties. We present test results of extracting a muscle from the Visible Human Data by two methods: manual segmentation and the application of 3D Active Net. We use principal component analysis of the color information in the 3D volume data to define the otherwise ill-defined contour of the muscle, and then apply 3D Active Net. The extracted object has a smooth, natural contour, in contrast to that of a comparable manual segmentation, demonstrating the advantage of our approach.
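The color-analysis step can be sketched with a toy principal component analysis: from RGB samples of the tissue of interest, the leading eigenvector of the color covariance matrix gives the axis of dominant color variation, onto which voxels can be projected to form a scalar feature for the image energy. The synthetic color samples and the power-iteration solver below are illustrative stand-ins, not the paper's pipeline.

```python
import random

random.seed(0)

# Synthetic RGB samples: colors vary mainly along a "reddish" axis
# (an illustrative stand-in for muscle-tissue colors in volume data).
samples = []
for _ in range(500):
    t = random.gauss(0.0, 1.0)        # dominant variation
    n = random.gauss(0.0, 0.05)       # small off-axis noise
    samples.append((0.9 * t + n, 0.3 * t - n, 0.1 * t + n))

# Mean-center the samples and form the 3x3 color covariance matrix.
means = [sum(c[i] for c in samples) / len(samples) for i in range(3)]
centered = [[c[i] - means[i] for i in range(3)] for c in samples]
cov = [[sum(v[i] * v[j] for v in centered) / len(centered)
        for j in range(3)] for i in range(3)]

# Power iteration for the leading eigenvector (first principal component).
v = [1.0, 1.0, 1.0]
for _ in range(100):
    w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
    norm = sum(x * x for x in w) ** 0.5
    v = [x / norm for x in w]

# The recovered axis should align with the generating direction (0.9, 0.3, 0.1).
d = (0.9 ** 2 + 0.3 ** 2 + 0.1 ** 2) ** 0.5
axis = (0.9 / d, 0.3 / d, 0.1 / d)
cos_sim = abs(sum(a * b for a, b in zip(v, axis)))
print(round(cos_sim, 2))
```

Projecting each voxel's color onto this axis turns the three-channel data into one contrast channel in which the muscle boundary is better defined, which is what the image energy of the net then attaches to.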
A new method for estimating human posture from multiple images using a genetic algorithm is proposed. In the proposed algorithm, the posture parameters to be estimated are assigned to the genes of an individual in the population. The fitness of each individual evaluates how well the multiple human images synthesized by deforming a 3D human model according to its gene values register with the real multiple human images. Genetic operations such as natural selection, crossover, and mutation generate the individuals of the next generation. After a certain number of repetitions of this process, the estimated parameter values are obtained from the individual with the best fitness. Experiments using multiple synthesized images show promising results for estimating 17 joint angles, one per degree of freedom of the joints, as well as the three translational and three rotational degrees of freedom.
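The estimation loop can be sketched with a toy genetic algorithm: individuals carry candidate "joint angle" genes, fitness is the negated squared registration error against the observation, and selection, crossover, and mutation breed the next generation. The forward model here is a trivial identity stand-in for rendering a 3D human model, and all population sizes and rates are illustrative assumptions.

```python
import random

random.seed(1)

TRUE_ANGLES = [0.4, -1.2, 0.9]       # ground-truth "joint angles"

def synthesize(angles):
    """Trivial stand-in for rendering images of a deformed 3D model:
    here the 'image' is just the parameter vector itself."""
    return angles

def fitness(ind):
    """Negative squared registration error against the observation."""
    synth = synthesize(ind)
    return -sum((s - t) ** 2 for s, t in zip(synth, TRUE_ANGLES))

# Initial population of random candidate postures.
pop = [[random.uniform(-2, 2) for _ in range(3)] for _ in range(40)]

for _generation in range(60):
    pop.sort(key=fitness, reverse=True)
    survivors = pop[:20]             # natural selection (elitist)
    children = []
    while len(children) < 20:
        a, b = random.sample(survivors, 2)
        cut = random.randint(1, 2)   # one-point crossover
        child = a[:cut] + b[cut:]
        if random.random() < 0.3:    # mutation
            i = random.randrange(3)
            child[i] += random.gauss(0.0, 0.1)
        children.append(child)
    pop = survivors + children

best = max(pop, key=fitness)
print(best)
```

Because the best individual always survives, the registration error is non-increasing across generations, and the final best individual carries the estimated parameter values.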
We describe a method for detecting hand position, posture, and finger bending using multiple camera images. Stable detection is achieved using distance-transformed images. We detect the maximum point in each distance-transformed image as the center-of-gravity (COG) point of the hand region and calculate its 3D position by stereo matching. The distance value at a COG point varies with the angle between the camera axis and the normal of the hand plane. The hand rotation angle can therefore be determined by maximum likelihood estimation from the distance values in all camera images. Using the detected position and posture, the best camera for hand shape detection can be selected. This camera selection makes hand shape detection simple and stable. The system can be used as a user interface device in a virtual environment, replacing glove-type devices and overcoming most of the disadvantages of contact-type devices.
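The COG detection can be sketched on a binary hand mask: a two-pass chamfer distance transform assigns each foreground pixel its city-block distance to the nearest background pixel, and the maximum of the transform marks the center of the hand region. The tiny 7×7 mask below is an illustrative stand-in for a segmented hand silhouette.

```python
# Two-pass chamfer distance transform on a binary mask; the maximum
# of the transform is taken as the hand-region center (COG point).

mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
H, W = len(mask), len(mask[0])
INF = 10 ** 9

# Initialize: 0 on background, "infinity" on foreground.
dist = [[0 if mask[y][x] == 0 else INF for x in range(W)] for y in range(H)]

# Forward pass (top-left to bottom-right).
for y in range(H):
    for x in range(W):
        if y > 0:
            dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
        if x > 0:
            dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)

# Backward pass (bottom-right to top-left).
for y in range(H - 1, -1, -1):
    for x in range(W - 1, -1, -1):
        if y < H - 1:
            dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
        if x < W - 1:
            dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)

# The maximum point of the distance transform is the COG candidate.
cog = max(((y, x) for y in range(H) for x in range(W)),
          key=lambda p: dist[p[0]][p[1]])
print(cog, dist[cog[0]][cog[1]])  # → (3, 3) 3
```

The peak distance value itself is the quantity that shrinks as the hand tilts away from the camera, which is why it can feed the maximum likelihood estimate of hand rotation across views.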
To acquire an accurately magnified image with an active camera system that can control zoom, focus, iris, and viewing angle, the focal length must be calibrated. We show that in calibration with a pinhole camera model, the arrangement of the measurement points influences accuracy, and good camera parameter estimation is difficult using a single target. In addition, we show that with the zoom lens camera model we propose, good camera parameter estimation can be achieved even when the number of measurement points on a single plane is small and the image-coordinate error is large.
The range finder proposed in this paper is based on a color-pattern projection method, so it can measure small objects. The color pattern consists of stripes whose color changes continuously such that hue varies linearly. The advantages of the system are that it requires only a commonly available optical system, it uses a low-cost commercial CCD camera and an LCD projector, and it can measure the whole field of view simultaneously.
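The stripe coding can be sketched with the standard HSV conversion from the Python standard library: each stripe index maps linearly to a hue, the projector emits the corresponding RGB color, and the camera-side decoder recovers the stripe index by converting the observed color back to hue. The stripe count is an illustrative assumption, and real decoding would also have to contend with surface reflectance and sensor noise.

```python
import colorsys

N_STRIPES = 64  # illustrative stripe count

def stripe_color(i):
    """Projector side: map stripe index linearly to hue, then to RGB."""
    hue = i / N_STRIPES          # hue in [0, 1)
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)

def decode_stripe(rgb):
    """Camera side: recover the stripe index from the observed color."""
    hue, _s, _v = colorsys.rgb_to_hsv(*rgb)
    return round(hue * N_STRIPES) % N_STRIPES

# Round trip: every stripe index survives encoding and decoding.
ok = all(decode_stripe(stripe_color(i)) == i for i in range(N_STRIPES))
print(ok)  # → True
```

Because the hue varies linearly and continuously across the pattern, each camera pixel identifies its projector stripe from a single captured frame, which is what lets the system measure the whole field of view simultaneously.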
We propose a novel image sensor that compresses image data on its sensor plane, significantly reducing the amount of pixel data output from the sensor. The proposed sensor is intended to overcome the communication bottleneck in high-pixel-rate imaging, such as high-frame-rate and high-resolution imaging. The compression algorithm is based on conditional replenishment: it detects motion and encodes only the pixels in moving areas. This paper describes the design and implementation of the proposed sensor based on a column-parallel architecture, in which the fill factor and power dissipation are similar to those of conventional MOS sensors. We show the results of experiments using a prototype chip.
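Conditional replenishment can be sketched at the pixel level: each new frame is compared with a reference frame, and only pixels whose change exceeds a threshold are emitted (as address/value pairs) and written back into the reference. The threshold and frame contents below are illustrative, and this software sketch ignores the column-parallel circuit implementation.

```python
THRESHOLD = 8  # illustrative change-detection threshold

def encode_frame(frame, reference):
    """Emit (index, value) pairs only for pixels that moved; update the
    reference in place (conditional replenishment)."""
    events = []
    for i, (new, old) in enumerate(zip(frame, reference)):
        if abs(new - old) > THRESHOLD:
            events.append((i, new))
            reference[i] = new
    return events

# A static background with one small moving object.
reference = [10] * 16
frame = list(reference)
frame[5] = 200      # object enters pixel 5
frame[6] = 180      # and pixel 6

events = encode_frame(frame, reference)
print(events)                              # → [(5, 200), (6, 180)]
print(len(events), "of", len(frame), "pixels transmitted")
```

For mostly static scenes the event list is far smaller than the full frame, which is the source of the bandwidth reduction at high frame rates.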
We propose a motion adaptive sensor for image enhancement and wide dynamic range. The motion adaptive sensor can control the integration time pixel by pixel. The integration time of each pixel is controlled by saturation and by temporal changes in the detected incident light. By controlling the integration time, we can expect high temporal resolution in moving areas, high SNR in static areas, and a wide dynamic range. In this paper, we present a design and implementation of the proposed sensor based on a column-parallel architecture, in which integrating the processing does not significantly sacrifice fill factor or power dissipation. We have fabricated a prototype using a 1.0 µm CMOS process, and we show the results of our experiments.
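The per-pixel control loop can be sketched as follows: each pixel integrates light for its own exposure time, and if the accumulated value saturates or changes strongly between frames, the integration time is shortened; otherwise it is lengthened for better SNR. All constants and the doubling/halving policy below are illustrative assumptions, not the fabricated sensor's actual control law.

```python
FULL_WELL = 255        # saturation level (illustrative)
CHANGE_THRESH = 40     # temporal-change threshold (illustrative)

def update_integration_time(t, value, prev_value):
    """Halve the exposure on saturation or strong temporal change,
    otherwise lengthen it (clamped) for better SNR in static areas."""
    if value >= FULL_WELL or abs(value - prev_value) > CHANGE_THRESH:
        return max(1, t // 2)
    return min(64, t * 2)

# One pixel under bright light: the initial long exposure saturates,
# so the controller shortens it until the reading is within range.
t, prev = 64, 0
light = 10             # photocurrent per unit time
readings = []
for _frame in range(6):
    value = min(FULL_WELL, light * t)
    readings.append((t, value))
    t, prev = update_integration_time(t, value, prev), value

print(readings)
```

Dividing each reading by its own integration time recovers the scene radiance, so bright pixels with short exposures and dim pixels with long exposures coexist in one frame, which is where the wide dynamic range comes from.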
The high-resolution and comprehensive volume rendering algorithm described in this paper handles three types of optical structures within a volume: boundaries, fuzzy boundaries, and volume intervals. After these structures are detected by isosurface-ray intersection tests, boundaries are rendered with sophisticated shading techniques that have been intensively investigated in the ray tracing community. Fuzzy boundaries are visualized using multiple-threshold semitransparent isosurfaces to weaken visible artifacts. The absorption of a volume interval is also visualized, using an absorption ratio resampled at the midpoint of the interval. The images produced are free from aliasing and blur. We also show that the processing time and memory requirements are reasonable given the image quality.
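The isosurface-ray intersection test can be sketched in 1-D along a single ray: the scalar field is sampled at increasing depth, a sign change of (sample − threshold) brackets a crossing, and bisection refines the intersection point before shading. The scalar field, threshold, and step size below are illustrative assumptions.

```python
def field(t):
    """Scalar volume value sampled along the ray at depth t
    (illustrative smooth field with a peak near t = 2)."""
    return 100.0 / (1.0 + (t - 2.0) ** 2)

THRESHOLD = 50.0  # iso-value defining the boundary

def find_isosurface(t0, t1, step=0.25, iters=40):
    """March along the ray; on a sign change of (field - threshold),
    refine the crossing by bisection."""
    t = t0
    prev = field(t) - THRESHOLD
    while t < t1:
        t_next = t + step
        cur = field(t_next) - THRESHOLD
        if prev * cur <= 0.0:            # crossing bracketed
            lo, hi = t, t_next
            for _ in range(iters):
                mid = 0.5 * (lo + hi)
                if (field(mid) - THRESHOLD) * prev <= 0.0:
                    hi = mid             # crossing lies in [lo, mid]
                else:
                    lo = mid             # crossing lies in [mid, hi]
            return 0.5 * (lo + hi)
        t, prev = t_next, cur
    return None

hit = find_isosurface(0.0, 4.0)
print(round(hit, 4))  # first crossing: field(t) = 50 exactly at t = 1
```

Locating the crossing to sub-sample precision is what allows the surface normal and shading to be evaluated at the true boundary position, avoiding the aliasing and blur of fixed-step resampling.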