Understanding the 3D world from a 2D image is an underconstrained problem. To solve it, models and knowledge have to be used. Recent research is introduced, such as automatic model synthesis and the learning of knowledge by neural networks. Research on the Human Image Reader (understanding of human images) for a user-friendly human interface is also introduced.
We have been studying a personalized information environment and its architecture, and have proposed a new basic concept called "metaware". Metaware is a system created by synthesizing multiplexed metaphors. In this paper we argue that metaware and image sequences in TV programs and movies are alike in that both dynamically and visually represent a produced world. Next, we take a TV program as an example, analyze its structure, and propose a framework for the produced world and a manner of visualizing it. Lastly, we discuss how "metaware", a human interface as a produced world, helps the user recognize a viewpoint.
We present an approach to describing moving pictures in terms of their structural properties for video coding, video handling, and video indexing. The description contains the 2D shape, motion, spatial relation, and relative depth of each region. To obtain the description, we develop an incremental segmentation scheme. The scheme has been designed along the analysis-by-synthesis approach, and uses a sequence of images to estimate object boundaries and motion information incrementally. By combining the information from extended (longer) image sequences, and also by treating segmentation and dynamic occlusion analysis simultaneously, the scheme attempts to improve the accuracy of the object boundary and motion information successively over time. Some preliminary results are shown.
This paper proposes video handling techniques based on structured video information. A video handling architecture that includes a feature extraction process and video structure management is discussed. The main mechanism of this architecture is video indexing based on video features, which assures efficient video handling without sophisticated pattern recognition. Applications using communication networks are also proposed.
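As a rough illustration of indexing from simple video features without pattern recognition, the sketch below marks shot boundaries where the gray-level histogram changes sharply between consecutive frames. The abstract does not say which features the paper uses, so the histogram feature, the threshold value, and the names `gray_histogram`, `histogram_distance`, and `index_shots` are all assumptions made for this example.

```python
def gray_histogram(frame, bins=8):
    """Normalized intensity histogram of a frame.

    Assumption: a frame is a non-empty flat list of integer gray levels (0-255).
    """
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [c / total for c in counts]


def histogram_distance(h1, h2):
    """Half the L1 distance between two histograms, so the result lies in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def index_shots(frames, threshold=0.5):
    """Return the frame indices at which a new shot is assumed to start.

    Assumption: `frames` is non-empty and the threshold is chosen by hand.
    """
    boundaries = [0]  # the first frame always opens a shot
    previous = gray_histogram(frames[0])
    for i in range(1, len(frames)):
        current = gray_histogram(frames[i])
        if histogram_distance(previous, current) > threshold:
            boundaries.append(i)
        previous = current
    return boundaries


# Toy sequence: three dark frames, two bright frames, one dark frame.
dark = [20] * 64
bright = [220] * 64
frames = [dark, dark, dark, bright, bright, dark]
shot_starts = index_shots(frames)
```

The resulting list of shot start indices can serve as a coarse index into the video. A real system would likely combine several features and store them alongside the video structure.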
We have already proposed automatic facial motion image synthesis schemes driven by speech and text as media conversion schemes. The purpose of these schemes is to realize an intelligent human-machine interface or an intelligent communication system with talking head images. The human face is reconstructed with a 3D surface model and a texture mapping technique. In this paper, we apply these schemes to a multimedia human-machine interface. One example is a multimedia e-mail system. A scenario expression tool and a real-time image playback system are realized in a workstation window.
We have recently proposed a new method for acquiring high-resolution pictures by processing two images taken with two different cameras. The proposed method integrates the two images into one higher-resolution image. The integration consists of two processes, "registration" and "restoration". We herein study the restoration process, and demonstrate that the iterative algorithm is the most suitable for the restoration processing.
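To make the idea of iterative restoration concrete, the sketch below applies a Landweber-style iteration to a blurred 1D signal. The abstract does not specify which iterative algorithm the paper uses, so the choice of Landweber iteration, the known symmetric blur kernel, the circular boundary handling, and the names `blur`, `landweber_restore`, and `squared_error` are all assumptions made for illustration.

```python
def blur(signal, kernel):
    """Circular filtering of `signal` with a symmetric, odd-length kernel."""
    n, half = len(signal), len(kernel) // 2
    return [
        sum(kernel[j] * signal[(i + j - half) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def landweber_restore(observed, kernel, iterations=200, step=1.0):
    """Iteratively refine an estimate x so that blur(x) approaches `observed`.

    Update rule: x <- x + step * H^T (y - H x).
    Assumption: the kernel is symmetric, so applying H^T equals applying H.
    """
    estimate = list(observed)  # start from the degraded signal
    for _ in range(iterations):
        residual = [o - b for o, b in zip(observed, blur(estimate, kernel))]
        correction = blur(residual, kernel)
        estimate = [e + step * c for e, c in zip(estimate, correction)]
    return estimate


def squared_error(a, b):
    """Sum of squared differences between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Toy example: a box signal degraded by a small, known blur.
sharp = [0.0] * 6 + [1.0] * 4 + [0.0] * 6
kernel = [0.2, 0.6, 0.2]  # hypothetical blur kernel, sums to 1
observed = blur(sharp, kernel)
restored = landweber_restore(observed, kernel)
```

With a non-negative kernel that sums to 1, a step size of 1.0 keeps the iteration stable. In this noise-free toy case, the restored signal is much closer to the sharp one than the observed signal is. With noisy data, the number of iterations would act as a regularization parameter and would need to be chosen with care.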