This paper presents a novel texture-mapping method, called harmonized texture mapping (HTM), that reduces the blurring and ghosting artifacts produced by most existing visualization methods, which do not account for inaccuracies in the generated 3D shape and camera parameters. HTM significantly reduces such artifacts by dynamically adjusting textures generated from multi-viewpoint images according to both the textural features of the images and the geometrical features of the generated 3D shape. The performance of HTM is evaluated through quantitative and qualitative experiments on several 3D video sequences.
We present a computer simulation for reconstructing gray-level images from computer-generated Fresnel holograms (Fresnel CGHs) that use a two-dimensional (2D) image as the object. The interference fringes formed by spherical object waves emitted from the 2D-image pixels and a plane reference wave are recorded as a sum of Fresnel zone plates at the CGH plane. The amplitude of each object wave is assumed to be proportional to the intensity of the corresponding 2D-image pixel. Reconstructed images are obtained by applying the fast Fourier transform algorithm to the Fresnel-Kirchhoff diffraction integral. Simulations were run while varying parameters such as the bit depth used for recording the CGH data and the sampling interval of the interference fringes at the CGH plane; the results indicate that recording the data with a resolution of 5 or more bits yields high-quality reconstructed images.
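The recording-and-reconstruction pipeline above can be sketched numerically. The following is a minimal illustration, not the paper's implementation; the wavelength, propagation distance, and pixel pitch are assumed values chosen only so the sketch is self-contained.

```python
import numpy as np

def record_fresnel_cgh(image, wavelength=633e-9, z=0.05, pitch=10e-6):
    """Record a hologram as the sum of Fresnel zone plates: each bright
    pixel of the 2D image emits a (paraxial) spherical wave whose
    amplitude is proportional to the pixel intensity, and its
    interference with an on-axis plane reference wave is recorded."""
    ny, nx = image.shape
    yy, xx = np.indices((ny, nx))
    x = (xx - nx / 2) * pitch
    y = (yy - ny / 2) * pitch
    k = 2 * np.pi / wavelength
    field = np.zeros((ny, nx), dtype=complex)
    for (py, px), amp in np.ndenumerate(image):
        if amp == 0:
            continue
        dx = x - (px - nx / 2) * pitch
        dy = y - (py - ny / 2) * pitch
        # Fresnel zone plate of one object point (paraxial spherical wave)
        field += amp * np.exp(1j * k * (dx**2 + dy**2) / (2 * z))
    reference = 1.0  # unit-amplitude plane reference wave
    return np.abs(field + reference) ** 2  # recorded fringe intensity

def quantize(hologram, bits):
    """Requantize the fringe data to a given bit depth, the parameter
    varied in the simulations."""
    levels = 2**bits - 1
    h = hologram / hologram.max()
    return np.round(h * levels) / levels

def reconstruct(hologram, wavelength=633e-9, z=0.05, pitch=10e-6):
    """Single-FFT Fresnel transform approximating the Fresnel-Kirchhoff
    diffraction integral, propagating the hologram over distance z."""
    ny, nx = hologram.shape
    yy, xx = np.indices((ny, nx))
    x = (xx - nx / 2) * pitch
    y = (yy - ny / 2) * pitch
    k = 2 * np.pi / wavelength
    chirp = np.exp(-1j * k * (x**2 + y**2) / (2 * z))
    return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(hologram * chirp)))
```

Requantizing the hologram before reconstruction makes it straightforward to compare image quality across bit depths.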
We propose a method that reconstructs both the facial pose and the facial shape from freehand multi-viewpoint snapshots. The method is based on Active Shape Models (ASM), a technique that uses a facial-shape database. Most ASM methods require images in which the facial pose is known, but ours does not. First, we choose an initial shape by selecting the database model most similar to the input images. Then, we improve the model by morphing it to better fit the input images. Next, we estimate the facial pose using the morphed model. Finally, we repeat this process, refining both the facial shape and the facial pose, until the error between the input and the computed result is minimized. Experiments show that our method reconstructs the facial shape to within 3.5 mm of the ground truth.
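The alternating refinement of pose and shape can be sketched with a toy 2D analogue. This is only an illustration of the iterate-until-converged idea, not the paper's ASM fitting: it assumes a rotation-only pose, perfectly observed landmarks, and a simple blending step standing in for the database-constrained morph.

```python
import numpy as np

def estimate_pose(shape, observed):
    """Least-squares rotation aligning model landmarks to the observed
    ones (Kabsch/Procrustes), standing in for pose estimation."""
    h = shape.T @ observed
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))
    return vt.T @ np.diag([1.0, d]) @ u.T

def fit(observed, initial_shape, iters=50, step=0.5):
    """Alternate pose estimation and shape morphing, driving down the
    residual between the observations and the posed model."""
    shape = initial_shape.copy()
    for _ in range(iters):
        r = estimate_pose(shape, observed)          # pose from current shape
        target = observed @ r                       # un-pose the observations
        shape = (1 - step) * shape + step * target  # morph toward them
    r = estimate_pose(shape, observed)
    err = np.linalg.norm(observed - shape @ r.T)
    return shape, r, err
```

Starting from a perturbed initial shape, the residual shrinks geometrically as pose and shape correct each other, mirroring the paper's stopping criterion of a minimized input-vs-result error.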
We present an automatic topology-change detection method and a mesh-editing interface for Time-Varying Meshes (TVMs). A TVM is a sequence of 3D mesh models of real-world objects generated from multiple-view images. Topology changes are detected by extracting skeletons from the meshes and tracking the skeletons' motion. We have also developed an interactive mesh-editing tool. As a result, the temporal consistency of the mesh topology can be maintained, enabling several mesh-processing algorithms to be applied to TVMs. The validity of our system was confirmed by applying an edited TVM sequence to motion tracking. Experimental results demonstrate that the tracking accuracy was improved by 859.4 mm in the best case.
We present an efficient coding method for multiview video plus multiview depth maps. This representation has been discussed as a potential standard for advanced visual media such as free-viewpoint TV and 3D video. The method predicts three kinds of information: disparity vectors, coding modes, and image signals, by exploiting pixel correspondences among views that can be calculated from the coded multiview depth maps. Experiments show that the proposed method reduces the bitrate by up to 22.5% relative to simulcast multiview video coding of the video and depth maps. The results also show that coding performance is not significantly affected by encoding noise in the depth maps. Furthermore, the bitrate is reduced by up to 11.8% for multiview video coding when the bitrate for the depth maps is treated purely as overhead.
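The pixel correspondences that drive the prediction can be illustrated with a standard depth-based warp: back-project a pixel with its depth, transform it into the neighbouring camera's frame, and reproject. The camera parameters below are hypothetical; this sketch shows only the geometry, not the codec integration.

```python
import numpy as np

def warp_pixel(u, v, depth, K_src, K_dst, R, t):
    """Find the pixel in a neighbouring view corresponding to (u, v)
    in the source view, given the pixel's depth and the relative pose
    (R, t) from the source to the destination camera frame."""
    p = np.array([u, v, 1.0])
    X = depth * (np.linalg.inv(K_src) @ p)  # back-project to 3D
    Xd = R @ X + t                          # move into the dst frame
    q = K_dst @ Xd                          # reproject
    return q[:2] / q[2]
```

For two parallel cameras with identical intrinsics separated by a baseline b, this reduces to the familiar horizontal disparity f·b/depth, which is why coded depth maps can substitute for transmitted disparity vectors.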
We propose a dynamic-range extension method for CMOS imagers with an in-pixel lateral overflow structure, in which either an active or a passive readout scheme is selected at each pixel on the basis of the illumination. Under high illumination, the charges stored in both the photodiode and an area-effective in-pixel MOS overflow capacitor are read with high linearity by a charge amplifier in each column, extending the range of readable voltage. The column-circuit area is kept small, and pixels are accessed row-wise, by sharing the pixel control signals and the column amplifier between the two readout modes. We also describe an image-reproduction scheme that applies electric calibration to the pixel-wise mixed-mode image, and we discuss its signal-to-noise ratio. Preliminary experiments with a 128 × 96-pixel prototype imager fabricated in 0.35 μm CMOS technology showed a dynamic-range extension of 18.5 dB (72.8 dB in total) and offline wide-dynamic-range image reproduction. Optimizing the capacitances in the pixels and the column amplifier and reducing the noise in the column amplifiers should yield a total dynamic range of around 120 dB.
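The offline reproduction step can be sketched as a per-pixel merge: passive-mode samples are mapped back onto the active-mode scale by a linear calibration and combined with the active-mode samples into one wide-dynamic-range image. The gain and offset below are illustrative placeholders, not measured calibration values.

```python
import numpy as np

def reproduce_wide_dr(readout, passive_mode, gain, offset):
    """Merge a pixel-wise mixed-mode image: pixels read in the passive
    (overflow) mode are linearized by an electric calibration
    (gain/offset), so the merged image is linear across the extended
    range; active-mode pixels pass through unchanged."""
    calibrated = gain * readout + offset
    return np.where(passive_mode, calibrated, readout)

def dynamic_range_db(v_max, v_noise):
    """Dynamic range as the ratio of the largest readable signal to
    the noise floor, in decibels."""
    return 20.0 * np.log10(v_max / v_noise)
```

On this definition, the reported 18.5 dB extension corresponds to roughly an 8.4-fold increase in the readable signal range (10^(18.5/20) ≈ 8.4).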
When sharp and blurred TV pictures (or sampled motion) are shown in alternation, the sequence still looks sharp. We clarify this mechanism by analyzing motion perception in the spatio-temporal frequency domain. On the basis of this mechanism, we propose two applications: more efficient picture coding, and frame interpolation with blurred pictures for double-frame-rate displays.