This paper addresses a unified tone mapping operation (TMO) for HDR images with fixed-point arithmetic. A TMO generates a low dynamic range (LDR) image from a high dynamic range (HDR) image by compressing its dynamic range. A unified TMO can perform tone mapping for various HDR image formats with a single common TMO. Since HDR images are generally expressed in a floating-point data format, a TMO also deals with floating-point data even though the resulting LDR images consist of integer data. As a result, conventional TMOs require substantial resources in terms of both computation and memory. To reduce these resources, a method that replaces a floating-point number with two 8-bit integers was proposed. However, this method limits the available input HDR image formats. The proposed unified TMO can be applied to various formats, such as RGBE and OpenEXR, by introducing an intermediate format. Moreover, the method can conduct all calculations in the TMO with fixed-point arithmetic. By using both integer data and fixed-point arithmetic, the method reduces not only the memory cost but also the computational cost. The experimental and evaluation results show that the proposed method reduces the computational and memory cost and yields almost the same quality of LDR images as the conventional method with floating-point arithmetic.
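The key idea of representing a floating-point value as two 8-bit integers can be illustrated with an RGBE-style mantissa/exponent split. The function names and the exponent bias below are illustrative assumptions, not the paper's actual encoding:

```python
import math

def encode_u8_pair(v: float) -> tuple[int, int]:
    """Encode a non-negative float as an 8-bit mantissa and a biased 8-bit
    exponent, in the spirit of shared-exponent formats such as RGBE
    (illustrative sketch only)."""
    if v <= 0.0:
        return 0, 0
    m, e = math.frexp(v)           # v = m * 2**e with 0.5 <= m < 1
    mantissa = min(255, int(m * 256))
    exponent = e + 128             # bias so the exponent fits in [0, 255]
    return mantissa, exponent

def decode_u8_pair(mantissa: int, exponent: int) -> float:
    """Invert encode_u8_pair; relative error is bounded by the 8-bit
    mantissa resolution (about 1/256)."""
    if mantissa == 0:
        return 0.0
    return (mantissa / 256.0) * 2.0 ** (exponent - 128)
```

Storing two 8-bit integers instead of a 32-bit float halves the memory per value, at the cost of the mantissa's limited precision.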
Essential information contained in an original image may deteriorate if the image is highly compressed without considering the importance of each region in the image. Assuming that textual information contained in an image is important, we propose a method for image compression that maintains the readability of characters through automatic evaluation of character readability. The proposed automatic evaluation classifies character images as either readable or unreadable by using machine learning, and this evaluation is used in quantization table optimization to ensure character readability while minimizing the overall image data size. In addition, less important information, in view of image recognition, in the background region is reduced. Through several subjective experiments, we confirm that the proposed method maintains character readability relative to standard JPEG compression while retaining the image quality of background regions required for sufficient recognition of content and situations.
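The trade-off described above, stronger quantization against classifier-judged readability, can be sketched as a search over a single quality factor. The predicate, and the assumption that readability is monotone in quality, are simplifications of the paper's per-table optimization:

```python
def lowest_acceptable_quality(is_readable, lo: int = 5, hi: int = 95) -> int:
    """Binary-search the smallest compression quality factor for which the
    (hypothetical) readability classifier still accepts the decoded text
    regions, assuming readability is monotone in quality."""
    while lo < hi:
        mid = (lo + hi) // 2
        if is_readable(mid):
            hi = mid          # still readable: try compressing harder
        else:
            lo = mid + 1      # unreadable: back off
    return lo
```

In practice `is_readable` would re-encode the image at the candidate quality, decode it, and run the learned readability classifier on the character regions.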
Image inpainting has been widely investigated as a way to remove undesired parts of images. One effective approach is exemplar-based inpainting, which uses texture patterns in an image as exemplars for filling in missing regions. As one such exemplar-based method, an inpainting method based on automatic perspective correction using vanishing points was recently proposed. However, its target scenes are limited to artificial ones in which vanishing points are easily detectable. Although some other methods for automatic perspective correction have also been proposed, their effect on image inpainting has not yet been evaluated in detail. This paper analyzes the effect of multiple automatic perspective correction methods on image inpainting by developing a method that combines a variety of automatic perspective correction methods with image inpainting. Specifically, we examine the influence of the amount of perspective distortion and the characteristics of textures on inpainting results by using images distorted by simulation. We also examine the effect using real images. In addition, we demonstrate, based on the analyzed results, the advantage of employing multiple criteria for perspective correction over the conventional method.
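Perspective correction of the kind discussed above is typically expressed as a 3x3 homography applied before exemplar matching and inverted afterwards. The helper below is a minimal point-mapping sketch of that mechanism, not the paper's pipeline:

```python
import numpy as np

def apply_homography(H: np.ndarray, pts: np.ndarray) -> np.ndarray:
    """Map N 2-D points with a 3x3 homography H: lift to homogeneous
    coordinates, multiply, and divide by the last coordinate."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

Rectifying with `H` and warping the filled result back with `np.linalg.inv(H)` round-trips exactly, which is why distortion introduced by the correction step itself can be factored out when analyzing inpainting quality.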
We have developed a reflective color liquid crystal display with a high reflectivity of 60%, a good contrast ratio of 50:1, and a wide color gamut covering 50% of the NTSC color standard. We fabricated two light diffusion layers with different diffusion angle ranges in a single polymer film by using a two-step UV irradiation process, achieving high reflectivity across a wide angle range. Our reflective display offers high image quality as well as low power consumption, and it is therefore expected to contribute to the development of future low-power display applications, including smart watches, e-book readers, and digital signage.
We have been developing a new type of Virtual Museum that enables users to participate in the space in both active and passive modes of operation. In the “active mode”, the new virtual museum provides a user walkthrough using a realistic 3DCG-modeled museum space and the artifacts in it. In the “passive mode”, the system adds desired visual and audio effects, such as camerawork, superimposed text, synthesized voice narration, post-production processes, and background music, to give users a TV-commentary-style CG animation. Users can easily transition back and forth between the two modes, actively walking through the space and passively watching the video content. This paper describes the details of the system design and implementation, followed by a discussion of the functioning prototype.
This paper presents an accurate and efficient method for extracting the hierarchical structure of Web communities, i.e., sets of Web videos with similar topics, for Web video retrieval. First, an efficient canonical correlation analysis (CCA), named sub-sampled CCA, is derived to obtain link relationships that represent similarities between latent features of Web videos. The obtained link relationships then enable application of an algorithm based on recursive modularity optimization to extract the hierarchical structure of Web communities. Unlike previously reported methods, our method can extract the hierarchical structure for the whole target dataset since the algorithm enables recursive reduction of its processing targets. This means screening of Web videos becomes unnecessary, and we can avoid the performance degradation caused by discarding relevant Web videos during screening, which occurred in previously reported methods. Consequently, our method enables extraction of the hierarchical structure with high accuracy as well as low computational cost.
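Standard CCA, which sub-sampled CCA accelerates, finds maximally correlated linear projections of two feature sets. The QR-based computation below, with optional row sub-sampling standing in for the paper's derivation, is an illustrative sketch only:

```python
import numpy as np

def cca_first_correlation(X, Y, subsample=None, seed=0):
    """First canonical correlation between feature matrices X and Y
    (rows = samples). Optionally computed on a random row subset, a crude
    stand-in for the paper's sub-sampled CCA."""
    rng = np.random.default_rng(seed)
    if subsample is not None:
        idx = rng.choice(len(X), size=subsample, replace=False)
        X, Y = X[idx], Y[idx]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormalize each block; the singular values of Qx.T @ Qy are the
    # cosines of the principal angles, i.e., the canonical correlations.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return s[0]
```

Pairwise correlations like this one can serve as the link weights between videos that the modularity-based community extraction then operates on.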
OFDM (Orthogonal Frequency Division Multiplexing) is a modulation scheme widely used in broadband power line communication (PLC). In OFDM transmission, impulsive noise is one of the major factors that limit the performance of PLC systems. To sufficiently reduce the influence of impulsive noise, precise estimation of the channel transfer function is necessary. In this paper, a scheme for estimating the channel transfer function in the presence of additive impulsive noise is proposed. In the proposed scheme, two successive pilot OFDM symbols are used to reduce the influence of impulsive noise. As a result, precise estimation of the channel transfer function becomes possible.
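As a simplified stand-in for the two-pilot scheme (the paper's exact combining rule is not given here), each pilot symbol yields a per-subcarrier least-squares channel estimate, and averaging the two halves the contribution of an impulse that corrupts only one of them:

```python
import numpy as np

def estimate_channel(rx_pilot_1, rx_pilot_2, tx_pilot):
    """Per-subcarrier least-squares estimates H_k = Y_k / X_k from two
    successive pilot OFDM symbols (frequency-domain complex arrays),
    combined by simple averaging."""
    h1 = rx_pilot_1 / tx_pilot
    h2 = rx_pilot_2 / tx_pilot
    return 0.5 * (h1 + h2)
```

With no noise the averaged estimate recovers the transfer function exactly; an impulse hitting one pilot perturbs the result by only half of its per-subcarrier error.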
A new decision-level fusion (DLF)-based speech segment detection method and its application to audio noise removal for video conferences are presented in this paper. The proposed method calculates visual and audio features from the video sequences and audio signals, respectively, obtained in video conferences. Features extracted from the mouth regions of participants and attribution degrees of the speech class are used as visual and audio features, respectively, and Support Vector Machine (SVM)-based classification is performed using each kind of feature. The SVM classifier performs two-class classification of speech and non-speech segments to realize speech segment detection. From the detection results obtained from the visual and audio features, DLF based on Supervised Learning from Multiple Experts is performed to obtain the final detection results with a focus on the accuracy of each detection result. Then, from the audio signals in the non-speech segments detected by our method, we can extract noise information to realize accurate audio noise removal in the speech segments.
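The fusion step weights each detector by its reliability. The accuracy-weighted vote below is a minimal sketch of that idea, not the paper's Supervised Learning from Multiple Experts formulation:

```python
def fuse_decisions(score_visual: float, score_audio: float,
                   acc_visual: float, acc_audio: float) -> bool:
    """Combine two per-segment speech scores in [0, 1] with weights
    proportional to each expert's validation accuracy, then threshold
    at 0.5 to decide speech vs. non-speech."""
    w_visual = acc_visual / (acc_visual + acc_audio)
    fused = w_visual * score_visual + (1.0 - w_visual) * score_audio
    return fused >= 0.5
```

With this weighting, the more accurate expert dominates when the two detectors disagree on a segment.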
A three-dimensional measurement scheme based on a time-of-flight method using a multi-aperture image capturing system is proposed. Temporal coding of the exposure time for each aperture is provided individually by equipping every sub-block of the pixel array in an image sensor with an electronic shutter controller. The measurement frequency is enhanced by capturing multiple temporally coded images simultaneously. The results of preliminary experiments confirming measurement of the phase of an optical signal modulated at a frequency higher than the frame rate of the image sensor are presented.
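Phase measurement in continuous-wave time-of-flight systems is commonly performed with four exposures shifted by 90 degrees of the modulation period. This standard four-bucket retrieval (not necessarily the paper's coding scheme) illustrates how temporally coded exposures yield phase and hence depth:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def tof_phase(q0, q1, q2, q3):
    """Standard four-bucket phase retrieval: q_k integrates the return
    signal with the exposure window delayed by k * 90 degrees. A constant
    ambient offset cancels in the differences."""
    return math.atan2(q1 - q3, q0 - q2) % (2 * math.pi)

def depth_from_phase(phase, f_mod):
    """Convert round-trip phase to distance; the unambiguous range is
    C / (2 * f_mod)."""
    return C * phase / (4.0 * math.pi * f_mod)
```

At a 20 MHz modulation frequency, for example, the full 2*pi phase range corresponds to an unambiguous depth of about 7.5 m.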