Active Search is a general algorithm for fast search of image, audio, or video data. The method speeds up search by using an upper bound on the matching similarity to skip the matching process at some locations without sacrificing accuracy. In this talk, I explain the basic idea of Active Search and its recent research directions: (1) its application to efficient control of pan-tilt-zoom cameras to find objects in a scene, (2) a very high-speed search algorithm based on global upper-bound pruning, and (3) its application to searching for background music in broadcast audio data.
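The skipping idea can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that the signal is a sequence of quantized feature symbols, that similarity is the histogram intersection between the query and a sliding window, and that a window matches when the intersection count reaches a threshold. Under those assumptions the count can grow by at most one per one-position shift, which gives a safe skip width. The function names and the threshold parameter `theta` are hypothetical.

```python
from collections import Counter
import math


def intersection(query_hist, window):
    """Histogram intersection count between the query and one window."""
    window_hist = Counter(window)
    return sum(min(count, window_hist[sym]) for sym, count in query_hist.items())


def exhaustive_search(stream, query, theta):
    """Baseline: evaluate the similarity at every window position."""
    n = len(query)
    need = math.ceil(theta * n)  # intersection count required for a match
    query_hist = Counter(query)
    return [t for t in range(len(stream) - n + 1)
            if intersection(query_hist, stream[t:t + n]) >= need]


def active_search(stream, query, theta):
    """Sliding-window search that skips positions ruled out by an upper bound.

    Shifting the window by one position removes one symbol and adds one,
    so the intersection count rises by at most 1 per shift. If the current
    count is `score`, no window closer than `need - score` positions ahead
    can reach `need`, so those positions are skipped without evaluation.
    """
    n = len(query)
    need = math.ceil(theta * n)
    query_hist = Counter(query)
    matches, evaluated, t = [], 0, 0
    while t + n <= len(stream):
        score = intersection(query_hist, stream[t:t + n])
        evaluated += 1
        if score >= need:
            matches.append(t)
        t += max(1, need - score)  # safe skip width from the upper bound
    return matches, evaluated


if __name__ == "__main__":
    # Toy data (assumed): symbol 0 is background, 1..6 is the target segment.
    stream = [0] * 50 + [1, 2, 3, 4, 5, 6] + [0] * 50 + [1, 2, 3, 4, 5, 6] + [7] * 20
    query = [1, 2, 3, 4, 5, 6]
    found, evaluated = active_search(stream, query, theta=0.9)
    print(found, evaluated, len(stream) - len(query) + 1)
```

Because the skip width comes from a bound rather than a heuristic, this sketch returns the same matches as the exhaustive scan while evaluating fewer windows on data where most windows score low.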
In this paper we introduce a novel vision-based interface which captures gestures of the user's mouth and uses these together with a pen and tablet interface for multimodal control of drawing and painting. With small motions of the mouth the user can control pen attributes such as line thickness, hardness, ink colour and so on. Simultaneous control of mouth and hand gestures is relatively easy, and we have found that it is not difficult for an artist to learn how to use the new interface.
We propose a new compressed-domain image retrieval scheme exploiting the similarity of PIFS (partitioned iterated function system) codes. In PIFS encoding, the compressed code contains contractive mapping information between similar regions in an image. This mapping information can be represented as vectors carrying structural features of the image. We introduce a new similarity measure between two vector sets, and exploit it in image categorization and retrieval. In this report, we explain this scheme and demonstrate its usefulness through experiments.
This paper describes a web-based system that automatically visualizes users' touring records and supports context-awareness and communication. Touring records are records of visits to exhibitions at museums, academic conferences, and so on. The system creates virtual cylindrical structures from users' touring records using Web3D technologies. This information visualization technique clarifies a user's experience, including interactions with other users and articles. We present a first prototype of this system after considering related research, and our video summary serves as a pilot study.
RoboCup is an international joint project to promote AI, robotics, and related fields. In this paper, we first briefly introduce the RoboCup organization and then discuss the configuration of the RoboDragons system, our team participating in the small-size robot league. RoboDragons was one of the best eight teams in the RoboCup 2002 Fukuoka competition. We discuss the system configuration of RoboDragons, with an emphasis on the importance of balanced system design.
This paper presents technical ideas and improvements from the viewpoints of image processing, mechanical devices, and strategy. Our new robot system introduces a high-speed, high-resolution image processing system and a multi-camera system in order to realize high-speed self-localization and to avoid occlusion of the ball by the body of the robot, respectively. Each robot has four omni-directional wheels, a kicking device controlled by a solenoid, and a dribbling device to keep the ball. The software systems, namely the image processing system and the strategy system, control the robots to realize cooperative play such as defense, passing, assists, and so on.
Scalable video coding is expected to be useful in video transmission applications because it adapts to the wide variety of video terminal capabilities and network conditions. SNR scalability, where video quality is adjusted without changing the number of coded pixels, is a suitable method for such scenarios, and several types of techniques have been proposed. In particular, layered coding has been studied, in which the original image is coded as the base layer and the differential image between the decoded base-layer image and the original image is then coded as the enhancement layer. Recently the FGS (Fine Granular Scalable) technique has attracted attention, since it makes finer rate scalability possible: quantization is multistage, and the differential image can be decoded from a part of the enhancement-layer bitstream. In MPEG-4 FGS, quantization is performed in the DCT domain; therefore subjective quality is not always improved, since quantization noise (e.g. blocking noise) occurs in the enhancement layer. This paper proposes a PDS (Pixel Domain Scalable) video coding technique, in which subjective quality can be adjusted more directly. As a fundamental study, this paper proposes multistage scalar quantization PDS and discusses its effectiveness with the preliminary results of subjective tests.
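The layered structure described above can be sketched as follows. This is only an illustration of multistage scalar quantization applied to a pixel-domain residual, not the coder proposed in the paper: the uniform quantizer, the step sizes (16, 4, 1), and the function names are all assumptions. Each stage quantizes what the previous stages left over, so decoding only the first few stages corresponds to receiving a truncated enhancement layer.

```python
import math


def encode_stages(residual, steps):
    """Quantize a pixel-domain residual in successive stages.

    `residual` is original minus decoded base layer, per pixel.
    `steps` lists the quantizer step size of each stage (assumed values).
    Returns one list of quantization indices per stage.
    """
    remaining = list(residual)
    stages = []
    for step in steps:
        # Uniform scalar quantizer: index of the nearest multiple of `step`.
        indices = [math.floor(r / step + 0.5) for r in remaining]
        stages.append(indices)
        # The next stage refines whatever error this stage left behind.
        remaining = [r - i * step for r, i in zip(remaining, indices)]
    return stages


def decode_stages(base, stages, steps, n_stages):
    """Reconstruct pixels from the base layer plus the first `n_stages` stages."""
    recon = list(base)
    for indices, step in zip(stages[:n_stages], steps[:n_stages]):
        recon = [p + i * step for p, i in zip(recon, indices)]
    return recon


if __name__ == "__main__":
    # Toy pixel values (assumed) standing in for one image row.
    original = [52, 200, 131, 7, 90, 255]
    base = [48, 192, 128, 0, 96, 240]      # coarse base-layer reconstruction
    residual = [o - b for o, b in zip(original, base)]
    steps = [16, 4, 1]
    stages = encode_stages(residual, steps)
    for k in range(len(steps) + 1):
        recon = decode_stages(base, stages, steps, k)
        max_err = max(abs(o - r) for o, r in zip(original, recon))
        print(k, recon, max_err)
```

With these assumed step sizes, the reconstruction error after each decoded stage is bounded by half of that stage's step, so quality improves as more of the enhancement layer is decoded.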
Under the 21st Century Center of Excellence (COE) Program organized by the Ministry of Education, Culture, Sports, Science and Technology, a research group led by Dr. Yasuhito Suenaga, Professor at the Graduate School of Information Science, Nagoya University, was selected as a COE in the category of Information Sciences, Electrical and Electronic Engineering, with the research subject of "Intelligent Media Integration (IMI) for Social Information Infrastructure." This IMI COE promotes empirical research into media information processing through the intelligent integration of speech and image, with a view to creating the "ears," "eyes," and "brain" of an information-oriented society. At the IMI COE, about 40 researchers from the Graduate School of Information Science (newly established in April 2003), the Graduate School of Engineering, the Center for Information Media Studies, and the Information Technology Center work together to promote world-class research and educational activities, with a focus on providing scope for young researchers who will be major players in the next generation.
Pattern recognition of medical images is a main theme of medical image processing, especially in computer-aided diagnosis (CAD) of medical images. The author summarizes the historical changes and expansion of research themes in this field, and discusses the present state and future themes.