The Journal of the Institute of Image Electronics Engineers of Japan

This paper proposes a selfie system that uses an omnidirectional camera and takes facial expressions into account. In the proposed system, a user captures a few seconds of video with an omnidirectional camera. Face detection is then performed on all the captured omnidirectional frames; false detections are eliminated and missed faces are filled in by enforcing consistency across all frames. Next, facial expression recognition is performed on the faces in every frame, and the frame in which a particular facial expression is most apparent is extracted. Finally, a perspective projection image is generated from the omnidirectional image such that all the target participants are within the angle of view. Experiments on a variety of scenes demonstrate the effectiveness of the proposed method, and a prototype system shows its practical feasibility.
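
The frame-extraction step can be illustrated with a minimal sketch. The abstract does not say how expression scores are aggregated over several faces, so averaging the per-face scores is an assumption made here, and `select_best_frame` and the score values are hypothetical names and data, not part of the proposed system.

```python
from statistics import mean


def select_best_frame(frame_scores):
    """Return the index of the frame with the highest mean expression score.

    frame_scores[i][j] is a hypothetical score in [0, 1] describing how
    strongly face j in frame i shows the target expression (e.g. a smile).
    Frames without any detected face are skipped. Averaging over faces is
    an assumption; the abstract does not specify the aggregation rule.
    """
    best_index, best_value = None, float("-inf")
    for index, scores in enumerate(frame_scores):
        if not scores:
            continue  # no faces in this frame
        value = mean(scores)
        if value > best_value:
            best_index, best_value = index, value
    return best_index


# Three frames, two faces each (illustrative values only).
scores = [[0.2, 0.9], [0.7, 0.8], [0.4, 0.5]]
print(select_best_frame(scores))  # prints 1
```

Using the minimum score instead of the mean would favor frames in which no participant looks away or blinks; which rule suits a selfie application better is a design choice the abstract leaves open.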

3D point clouds represent object shapes in 3D space as a collection of points, typically acquired using equipment such as LiDAR. When data are acquired from multiple directions, the resulting 3D point clouds must be registered with one another. If the overlapping area between them is small, however, that area must first be identified. Several existing methods take the overlapping area into account, but they are difficult to optimize according to the features of the 3D point cloud. We aimed to establish a fast, highly accurate method for registering 3D point clouds, and introduced a weighted constraint function into the patch selection step of a registration method that applies a genetic algorithm to SVPs (Super Voxel Patches). Experimental results confirm high registration accuracy, with an average translational error of 0.05 cm and an average rotational error of 0.26 deg, even when registering 3D point clouds from the Bunny dataset whose positions could not be estimated in previous studies because of convergence to local solutions.
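
The abstract does not state how the translational and rotational errors are computed. The sketch below shows one common convention, assumed here: the Euclidean distance between estimated and reference translations, and the angle of the relative rotation between estimated and reference rotation matrices. The function names and the test rotations are hypothetical.

```python
import math


def translation_error(t_est, t_ref):
    """Euclidean distance between estimated and reference translations."""
    return math.dist(t_est, t_ref)


def rotation_error_deg(r_est, r_ref):
    """Angle (degrees) of the relative rotation r_est^T * r_ref.

    For 3x3 rotation matrices, trace(A^T B) equals the sum of the
    elementwise products, and the rotation angle is acos((trace - 1) / 2).
    """
    trace = sum(r_est[i][j] * r_ref[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # guard rounding
    return math.degrees(math.acos(cos_angle))


def rot_z(deg):
    """Rotation about the z axis, used only to build a test case."""
    a = math.radians(deg)
    return [
        [math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0],
    ]


# An estimate that is off by 0.26 deg about z and 0.05 units along x.
print(rotation_error_deg(rot_z(10.26), rot_z(10.0)))
print(translation_error((1.05, 0.0, 0.0), (1.0, 0.0, 0.0)))
```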

A method has already been proposed that classifies sound sources by converting sound into images and using the image information. In this paper, the authors applied this method to the problem of classifying complex sounds in which multiple sound sources were sounding simultaneously, and attempted to improve the classification accuracy. In addition to conventional spectrogram images, mel-spectrogram and scalogram images were used as sound features. The authors proposed two methods: (1) a method using a learning model based on a color composite of these images, and (2) a method using three learning models, one created from each image type, together with the likelihood information obtained from them. A total of 14 classes of sound sources were classified, consisting of four types of room sounds and their simultaneous combinations, and a maximum classification accuracy of 90.7% was confirmed. In addition, 15 people were asked to identify these sound sources by ear. Their classification rate was about 30%, indicating that the proposed image-based method outperformed human hearing for the target sound sources.
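
Method (2) combines likelihoods from three models. The abstract does not give the combination rule, so the sketch below assumes a simple average of the per-class likelihoods followed by an argmax; `fuse_likelihoods` and the three-class example values are hypothetical and serve only to illustrate the idea.

```python
def fuse_likelihoods(per_model_likelihoods):
    """Average per-class likelihoods across models and pick the best class.

    per_model_likelihoods is a list with one entry per model (for example
    spectrogram, mel-spectrogram, scalogram); each entry is a list of
    per-class likelihoods of equal length. Averaging is an assumed rule,
    not necessarily the one used in the paper.
    """
    n_models = len(per_model_likelihoods)
    n_classes = len(per_model_likelihoods[0])
    if any(len(m) != n_classes for m in per_model_likelihoods):
        raise ValueError("all models must score the same number of classes")
    fused = [
        sum(m[c] for m in per_model_likelihoods) / n_models
        for c in range(n_classes)
    ]
    best_class = max(range(n_classes), key=fused.__getitem__)
    return best_class, fused


# Illustrative likelihoods for three classes from three models.
spectrogram = [0.5, 0.3, 0.2]
mel_spectrogram = [0.2, 0.6, 0.2]
scalogram = [0.3, 0.5, 0.2]
best, fused = fuse_likelihoods([spectrogram, mel_spectrogram, scalogram])
print(best)  # prints 1
```

In this example the spectrogram model alone would choose class 0, while the fused score selects class 1, which is the kind of disagreement a likelihood-based combination is meant to resolve.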

QR Code uses Reed-Solomon (RS) codes to provide an error correction function and achieves correction performance up to its correction limit. Its maximum correction rate is about 30% of all data code words. However, this rate of about 30% applies only to data code words and does not cover errors in the finder patterns and format information that are essential for reading the data. Nor can it handle burst errors that span a wide area. Therefore, in the proposed method, finder patterns and format information are added and arranged in a dispersed manner while maintaining compatibility with existing reading devices. Furthermore, the data code word portion is duplicated. A shell structure is used for the additional arrangement of finder patterns and format information, and a chimera structure is used for the duplication of data code words. This makes it possible to handle burst errors covering up to about 60% of the 2D symbol area.
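
The "about 30%" limit follows from a general property of Reed-Solomon codes: a block of n code words carrying k data code words can correct at most floor((n - k) / 2) erroneous code words. The sketch below only evaluates that bound; the block size n = 26, k = 9 is an illustrative assumption (a small block with heavy redundancy), and the rate is expressed relative to the whole block, which may differ slightly from the convention used in the abstract.

```python
def rs_correctable_errors(n_total, k_data):
    """Upper bound on correctable code words for a Reed-Solomon (n, k) block."""
    if not 0 < k_data <= n_total:
        raise ValueError("require 0 < k_data <= n_total")
    return (n_total - k_data) // 2


def correction_rate(n_total, k_data):
    """Correctable code words as a fraction of the whole block."""
    return rs_correctable_errors(n_total, k_data) / n_total


# Illustrative block: 26 code words in total, 9 of them data.
n, k = 26, 9
print(rs_correctable_errors(n, k))      # prints 8
print(round(correction_rate(n, k), 3))  # prints 0.308
```

Because the bound counts erroneous code words wherever they fall inside the data region, it says nothing about damage to finder patterns or format information, which is the gap the added and duplicated structures described above are meant to close.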
