ITE Transactions on Media Technology and Applications
Online ISSN : 2186-7364
Volume 4 , Issue 1
Special Section on Advanced Image Technology
  • Kazuhito Murakami
    2016 Volume 4 Issue 1 Pages 1
    Published: 2016
    Released: January 01, 2016
    JOURNALS FREE ACCESS
  • Hitoshi Kiya, Toshiyuki Dobashi
    2016 Volume 4 Issue 1 Pages 2-9
    Published: 2016
    Released: January 01, 2016
    JOURNALS FREE ACCESS
    This paper addresses a unified tone mapping operation (TMO) for HDR images with fixed-point arithmetic. A TMO generates a low dynamic range (LDR) image from a high dynamic range (HDR) image by compressing its dynamic range. A unified TMO can perform tone mapping for various HDR image formats with a single common operation. Since HDR images are generally expressed in a floating-point data format, a TMO must also handle floating-point data even though the resulting LDR images consist of integer data. As a result, conventional TMOs are costly in both computation and memory. To reduce these costs, a method that replaces each floating-point number with two 8-bit integers was previously proposed. However, that method restricts the HDR image formats it can accept as input. The proposed unified TMO can be applied to various formats, such as RGBE and OpenEXR, by introducing an intermediate format. Moreover, the method can carry out all calculations in the TMO with fixed-point arithmetic. By using both integer data and fixed-point arithmetic, the method reduces not only the memory cost but also the computational cost. Experimental and evaluation results show that the proposed method reduces the computational and memory cost and yields LDR images of almost the same quality as the conventional method with floating-point arithmetic.
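    The idea of tone mapping with integer data only can be illustrated with a minimal sketch. It is not the paper's unified TMO: it assumes the common Radiance RGBE convention (value = mantissa * 2^(exponent - 136)), applies the global Reinhard curve x / (1 + x) per channel rather than on luminance, and uses a Q16 fixed-point layout. The names FRAC_BITS, rgbe_to_fixed, reinhard_fixed and tone_map_pixel are hypothetical.

```python
FRAC_BITS = 16            # assumed Q16 fixed-point layout (illustrative choice)
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0


def rgbe_to_fixed(mantissa, exponent):
    """Convert one RGBE channel to Q16 fixed point using shifts only.

    Assumes the Radiance convention: value = mantissa * 2**(exponent - 136),
    with exponent == 0 meaning a black pixel.
    """
    if exponent == 0:
        return 0
    shift = exponent - (128 + 8) + FRAC_BITS
    return mantissa << shift if shift >= 0 else mantissa >> -shift


def reinhard_fixed(x):
    """Map a Q16 value through x / (1 + x) and scale to 0..255, integers only."""
    denom = ONE + x
    # Adding denom // 2 before the integer division rounds to nearest.
    return (255 * x + denom // 2) // denom


def tone_map_pixel(r, g, b, e):
    """Tone-map one RGBE pixel to an 8-bit RGB triple without floating point."""
    return tuple(reinhard_fixed(rgbe_to_fixed(m, e)) for m in (r, g, b))


if __name__ == "__main__":
    # Mantissa 128 with exponent 129 encodes the value 1.0 under the assumed convention.
    print(tone_map_pixel(128, 64, 0, 129))
```

    Python integers are unbounded, so a real fixed-point implementation would additionally have to bound the word length of the intermediate values.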
  • Shota Kaneko, Yoshihiro Sugaya, Shinichiro Omachi
    2016 Volume 4 Issue 1 Pages 10-20
    Published: 2016
    Released: January 01, 2016
    JOURNALS FREE ACCESS
    Essential information contained in an original image may be degraded if the image is highly compressed without considering the importance of each region in the image. Assuming that textual information contained in an image is important, we propose an image compression method that maintains the readability of characters by automatically evaluating character readability. The proposed automatic evaluation uses machine learning to classify character images as either readable or unreadable, and it is used in quantization table optimization to ensure character readability while minimizing the overall image data size. In addition, information in the background region that is less important for image recognition is reduced. Through several subjective experiments, we confirm that the proposed method maintains character readability relative to the standard JPEG compression method while retaining the image quality of background regions required to maintain sufficient recognition of content and situations.
  • Hiroto Sasao, Norihiko Kawai, Tomokazu Sato, Naokazu Yokoya
    2016 Volume 4 Issue 1 Pages 21-32
    Published: 2016
    Released: January 01, 2016
    JOURNALS FREE ACCESS
    Image inpainting has been widely investigated as a way to remove undesired parts of images. One effective approach is exemplar-based inpainting, which uses texture patterns in an image as exemplars for filling in missing regions. As one such exemplar-based method, an inpainting method based on automatic perspective correction using vanishing points was recently proposed. However, its target scenes are limited to artificial ones in which vanishing points are easily detectable. Although some other methods for automatic perspective correction have also been proposed, their effect on image inpainting has not been evaluated in detail so far. This paper analyzes the effect of multiple methods for automatic perspective correction on image inpainting by developing a method that combines a variety of automatic perspective correction methods with image inpainting. Specifically, we examine the influence of the amount of perspective distortion and the characteristics of textures on image inpainting results by using images distorted by simulation. We also examine the effect using real images. In addition, based on the analyzed results, we demonstrate the advantage of employing multiple criteria for perspective correction over the conventional method.
Special Section on ITE Awards Selection
Regular Section
  • Masaki Hayashi, Steven Bachelder, Masayuki Nakajima, Akihiko Iguchi
    2016 Volume 4 Issue 1 Pages 41-48
    Published: 2016
    Released: January 01, 2016
    JOURNALS FREE ACCESS
    We have been developing a new type of Virtual Museum which enables users to participate in the space with both active and passive modes of operation. In the “active mode”, the new virtual museum provides a user walkthrough using the realistic 3DCG-modeled museum space and the artifacts in it. In the “passive mode”, the system adds desired visual and audio effects such as camerawork, superimposed text, synthesized voice narration, post-production processes, and background music to give users a TV-commentary type of CG animation. Users can easily switch back and forth between actively walking through the space and passively watching the video content. This paper describes the details of the system design and the implementation, followed by a discussion of the functioning prototype.
  • Ryosuke Harakawa, Takahiro Ogawa, Miki Haseyama
    2016 Volume 4 Issue 1 Pages 49-59
    Published: 2016
    Released: January 01, 2016
    JOURNALS FREE ACCESS
    This paper presents an accurate and efficient method for extracting the hierarchical structure of Web communities, i.e., Web video sets with similar topics, for Web video retrieval. First, an efficient canonical correlation analysis (CCA), named sub-sampled CCA, is derived to obtain link relationships that represent similarities between latent features of Web videos. The obtained link relationships then enable application of an algorithm based on recursive modularity optimization to extract the hierarchical structure of Web communities. Unlike previously reported methods, our method can extract the hierarchical structure for the whole target dataset, since the algorithm enables recursive reduction of its processing targets. This makes it unnecessary to screen Web videos, so we can avoid the performance degradation caused by discarding relevant Web videos during screening, which occurred in previously reported methods. Consequently, our method enables extraction of the hierarchical structure with high accuracy as well as low computational cost.
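    The quantity that modularity-based community extraction tries to maximize can be sketched as follows. This is the standard Newman modularity for an unweighted, undirected graph, not the paper's recursive algorithm or its CCA-derived links; the function name modularity and the toy graph are hypothetical.

```python
def modularity(edges, community_of):
    """Newman modularity Q of a partition of an unweighted, undirected graph.

    edges        -- list of (u, v) pairs, each undirected edge listed once
    community_of -- dict mapping every node to its community label
    """
    m = len(edges)
    internal = {}  # number of edges with both ends inside each community
    degree = {}    # sum of node degrees inside each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    # Q = sum over communities of (internal edge fraction - expected fraction).
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (toy data).
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    print(modularity(edges, split))
```

    A hierarchical scheme would typically re-apply such an optimization to the communities found at the previous level; how the paper does this is not specified in the abstract.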
  • Hirotomo Yasui, Akira Nakamura, Kohei Ohno, Makoto Itami
    2016 Volume 4 Issue 1 Pages 60-67
    Published: 2016
    Released: January 01, 2016
    JOURNALS FREE ACCESS
    OFDM (Orthogonal Frequency Division Multiplexing) is a modulation scheme that is widely used in broadband power line communication (PLC). In OFDM transmission, impulsive noise is one of the major factors that limit the performance of PLC systems. To sufficiently reduce the influence of impulsive noise, precise estimation of the channel transfer function is necessary. In this paper, an estimation scheme for the channel transfer function in the presence of additive impulsive noise is proposed. In the proposed scheme, two successive pilot OFDM symbols are used to reduce the influence of impulsive noise. As a result, precise estimation of the channel transfer function becomes possible.
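    As a generic illustration of pilot-based estimation, the sketch below computes a per-subcarrier least-squares estimate H = Y / X from each of two pilot symbols and averages the two. The averaging step is an assumption made for illustration only; the abstract does not state how the paper combines the two pilot symbols. The names ls_estimate and averaged_estimate are hypothetical.

```python
def ls_estimate(received, pilot):
    """Per-subcarrier least-squares channel estimate H[k] = Y[k] / X[k]."""
    return [y / x for y, x in zip(received, pilot)]


def averaged_estimate(received_1, received_2, pilot):
    """Average the LS estimates from two successive pilot symbols.

    Assumption: both symbols carry the same known pilot sequence and the
    channel does not change between them.
    """
    h1 = ls_estimate(received_1, pilot)
    h2 = ls_estimate(received_2, pilot)
    return [(a + b) / 2 for a, b in zip(h1, h2)]


if __name__ == "__main__":
    pilot = [1 + 0j, -1 + 0j, 1 + 0j]              # known pilot symbols (toy data)
    channel = [1 + 1j, 0.5 - 0.2j, -0.3 + 0.8j]    # true channel (toy data)
    noise = [0.1 - 0.05j, -0.2 + 0.1j, 0.05 + 0.05j]
    rx1 = [h * x + n for h, x, n in zip(channel, pilot, noise)]
    rx2 = [h * x - n for h, x, n in zip(channel, pilot, noise)]
    print(averaged_estimate(rx1, rx2, pilot))
```

    Plain averaging only halves the contribution of a noise burst that hits one symbol, so a scheme aimed at impulsive noise would likely need to detect and exclude corrupted samples instead.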
  • Kazuto Sasaki, Takahiro Ogawa, Sho Takahashi, Miki Haseyama
    2016 Volume 4 Issue 1 Pages 68-77
    Published: 2016
    Released: January 01, 2016
    JOURNALS FREE ACCESS
    A new decision-level fusion (DLF)-based speech segment detection method and its application to audio noise removal for video conferences are presented in this paper. The proposed method calculates visual and audio features from video sequences and audio signals, respectively, obtained in video conferences. Features extracted from mouth regions of participants and attribution degrees of speech class are used as visual and audio features, respectively, and Support Vector Machine (SVM)-based classification is performed by using each kind of feature. The SVM classifier performs two-class classification of speech and non-speech segments to realize speech segment detection. From the detection results obtained from the visual and audio features, DLF based on Supervised Learning from Multiple Experts is performed to successfully obtain the final detection results with focus on the accuracy of each detection result. Then, from audio signals in the non-speech segments detected by our method, we can extract noise information to realize accurate audio noise removal in the speech segments.
  • Koshiro Moriguchi, Daisuke Miyazaki, Takaaki Mukai, Futa Mochizuki, Ke ...
    2016 Volume 4 Issue 1 Pages 78-83
    Published: 2016
    Released: January 01, 2016
    JOURNALS FREE ACCESS
    A three-dimensional measurement scheme based on a time-of-flight method using a multi-aperture image capturing system is proposed. Temporal coding of the exposure time is provided individually for each aperture by equipping every sub-block of the pixel array in the image sensor with an electronic shutter controller. The measurement frequency is enhanced by capturing multiple temporally coded images simultaneously. The results of preliminary experiments are presented, confirming that the phase of an optical signal modulated at a frequency higher than the frame rate of the image sensor can be measured.
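    The link between a measured modulation phase and distance in continuous-wave time-of-flight sensing can be sketched as below. This is the textbook four-phase scheme under the signal model c(theta) = offset + amplitude * cos(phase - theta), not the paper's temporally coded multi-aperture method; all function names are hypothetical.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def simulate_samples(phase, offset, amplitude):
    """Ideal correlation samples at 0, 90, 180 and 270 degrees (assumed model)."""
    return [offset + amplitude * math.cos(phase - k * math.pi / 2)
            for k in range(4)]


def phase_from_samples(c0, c90, c180, c270):
    """Recover the phase in [0, 2*pi); offset and amplitude cancel out."""
    # c0 - c180 = 2*A*cos(phase) and c90 - c270 = 2*A*sin(phase).
    return math.atan2(c90 - c270, c0 - c180) % (2 * math.pi)


def distance_from_phase(phase, mod_freq_hz):
    """Distance for a round-trip phase delay: d = c * phase / (4 * pi * f)."""
    return SPEED_OF_LIGHT * phase / (4 * math.pi * mod_freq_hz)


if __name__ == "__main__":
    samples = simulate_samples(phase=1.0, offset=10.0, amplitude=3.0)
    phi = phase_from_samples(*samples)
    print(phi, distance_from_phase(phi, mod_freq_hz=10e6))
```

    The unambiguous range of such a measurement is c / (2 f), which is why the choice of modulation frequency trades range against depth resolution.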