IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E100.D, Issue 9
Special Section on Picture Coding and Image Media Processing
  • Toshiaki Fujii
    2017 Volume E100.D Issue 9 Pages 1943
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS
  • Shin KURIHARA, Suguru HIROKAWA, Hisakazu KIKUCHI
    Type: PAPER
    2017 Volume E100.D Issue 9 Pages 1944-1952
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Compressive sensing is attractive for distributed video coding in two respects: low encoding complexity and a low transmission data rate. In this paper, a novel compressive sensing-based distributed video coding system is presented that combines predictive coding and Wyner-Ziv difference coding of compressively sampled frames. Experimental results show that the data volume transmitted by the proposed method is less than one tenth of that of distributed compressive video sensing. The quality of the decoded video was evaluated in terms of PSNR and the structural similarity index as well as by visual inspection.
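The abstract above reports decoded quality in terms of PSNR. As a rough illustration only, the standard PSNR definition can be sketched as follows; the function name `psnr`, the nested-list frame layout, and the 8-bit peak value of 255 are assumptions made for this example and are not taken from the paper.

```python
import math


def psnr(reference, decoded, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are given as lists of rows of pixel intensities (an assumed layout).
    """
    ref = [p for row in reference for p in row]
    dec = [p for row in decoded for p in row]
    if len(ref) != len(dec) or not ref:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(ref, dec)) / len(ref)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the decoded frame is closer to the reference; identical frames give an infinite PSNR under this definition.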

  • Saori TAKEYAMA, Shunsuke ONO, Itsuo KUMAZAWA
    Type: PAPER
    2017 Volume E100.D Issue 9 Pages 1953-1961
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Existing image deblurring methods with a blurred/noisy image pair take a two-step approach: blur kernel estimation and image restoration. They can achieve better and much more stable blur kernel estimation than single image deblurring methods. On the other hand, in the image restoration step, they do not exploit the information in the noisy image, or they require ad hoc tuning of interdependent parameters. This paper focuses on the image restoration step and proposes a new restoration method that uses a blurred/noisy image pair. In our method, the image restoration problem is formulated as a constrained convex optimization problem, where data fidelity to a blurred image and to a noisy image are properly taken into account as multiple hard constraints. This offers (i) high quality restoration when the blurred image also contains noise; (ii) robustness to the estimation error of the blur kernel; and (iii) easy parameter setting. We also provide an efficient algorithm for solving our optimization problem based on the so-called alternating direction method of multipliers (ADMM). Experimental results support our claims.

  • Somchai PHATTHANACHUANCHOM, Rawesak TANAWONGSUWAN
    Type: PAPER
    2017 Volume E100.D Issue 9 Pages 1962-1970
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Color transfer is a simple process that changes the color tone of one image (source) to look like that of another image (target). In transferring colors between images, several issues need to be considered, including partial color transfer, trial and error, and multiple-target color transfer. Our approach enables users to transfer colors partially and locally by letting users select their regions of interest from image segmentation. Since there are many ways to transfer colors from a set of target regions to a set of source regions, we introduce a region exploration and navigation approach in which users can choose their preferred color tones to transfer one region at a time and gradually customize towards their desired results. The preferred color tones can sometimes come from more than one image; therefore, our method is extended to allow users to select their preferred color tones from multiple images. Our experimental results show the flexibility of our approach in generating reasonable segmented regions of interest and in enabling users to explore the possible results more conveniently.

  • Motoharu SONOGASHIRA, Masaaki IIYAMA, Michihiko MINOH
    Type: PAPER
    2017 Volume E100.D Issue 9 Pages 1971-1983
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Blind deconvolution (BD) is the problem of restoring sharp images from blurry images when convolution kernels are unknown. While it has a wide range of applications and has been extensively studied, traditional shift-invariant (SI) BD focuses on uniform blur caused by kernels that do not spatially vary. However, real blur caused by factors such as motion and defocus is often nonuniform and thus beyond the ability of SI BD. Although specialized methods exist for nonuniform blur, they can only handle specific blur types. Consequently, the applicability of BD for general blur remains limited. This paper proposes a shift-variant (SV) BD method that models nonuniform blur using a field of kernels that assigns a local kernel to each pixel, thereby allowing pixelwise variation. This concept is realized as a Bayesian model that involves SV convolution with the field of kernels and smoothing of the field for regularization. A variational-Bayesian inference algorithm is derived to jointly estimate a sharp latent image and a field of kernels from a blurry observed image. Owing to the flexibility of the field-of-kernels model, the proposed method can deal with a wider range of blur than previous approaches. Experiments using images with nonuniform blur demonstrate the effectiveness of the proposed SV BD method in comparison with previous SI and SV approaches.

  • Takahiro SUZUKI, Keita TAKAHASHI, Toshiaki FUJII
    Type: PAPER
    2017 Volume E100.D Issue 9 Pages 1984-1993
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Structure tensor analysis on epipolar plane images (EPIs) is a successful approach to estimate disparity from a light field, i.e., a dense set of multi-view images. However, the disparity range allowable for the light field is limited because the estimation becomes less accurate as the range of disparities becomes larger. To overcome this limitation, we developed a new method called sheared EPI analysis, where EPIs are sheared before the structure tensor analysis. The results of analysis obtained with different shear values are integrated into a final disparity map through a smoothing process, which is the key idea of our method. In this paper, we closely investigate the performance of sheared EPI analysis and demonstrate the effectiveness of the smoothing process by extensively evaluating the proposed method with 15 datasets that have large disparity ranges.

  • Huu-Noi DOAN, Tien-Dat NGUYEN, Min-Cheol HONG
    Type: PAPER
    2017 Volume E100.D Issue 9 Pages 1994-2004
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    This paper presents a new hole-filling method that uses extrapolated spatio-temporal background information to obtain a synthesized free-view. A new background codebook for extracting reliable temporal background information is introduced. In addition, the paper addresses estimating spatial local background to distinguish background and foreground regions so that spatial background information can be extrapolated. Background holes are filled by combining spatial and temporal background information. Finally, exemplar-based inpainting is applied to fill in the remaining holes using a new priority function. The experimental results demonstrated that satisfactory synthesized views can be obtained using the proposed algorithm.

  • Kohei TATENO, Takahiro OGAWA, Miki HASEYAMA
    Type: PAPER
    2017 Volume E100.D Issue 9 Pages 2005-2016
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    A novel dimensionality reduction method, Fisher Discriminant Locality Preserving Canonical Correlation Analysis (FDLP-CCA), for visualizing Web images is presented in this paper. FDLP-CCA can integrate two modalities and discriminate target items in terms of their semantics by considering unique characteristics of the two modalities. In this paper, we focus on Web images with text uploaded on Social Networking Services as these two modalities. Specifically, text features have high discriminative power in terms of semantics. On the other hand, visual features of images give their perceptual relationships. In order to consider both of the above unique characteristics of these two modalities, FDLP-CCA estimates the correlation between the text and visual features with consideration of the cluster structure based on the text features and the local structures based on the visual features. Thus, FDLP-CCA can integrate the different modalities and provide separated manifolds to organize enhanced compactness within each natural cluster.

  • Peng DAI, Shengchun WANG, Yaping HUANG, Hao WANG, Xinyu DU, Qiang HAN
    Type: PAPER
    2017 Volume E100.D Issue 9 Pages 2017-2026
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Train-borne video captured by a camera installed at the front or back of a train has been used for railway environment surveillance, including the detection of missing communication units and bolts on the track, broken fences, and unpredictable objects falling into the rail area or hanging on wires above the rails. Moreover, the track condition can be perceived visually from the video by observing and analyzing the train-swaying arising from track irregularity. However, frequently examining entire large-scale videos of up to dozens of hours is time-consuming and labor-intensive work. In this paper, we propose a simple and effective method to detect the train-swaying quickly and automatically. We first generate a long rail track panorama (RTP) by stitching the stripes cut from the video frames, and then extract the track profile to perform the unevenness detection algorithm on the RTP. The experimental results show that the RTP, a compact video representation, allows fast examination of the visual train-swaying information for track condition perception; on it we detect the irregular spots with 92.86% recall and 82.98% precision in only 2 minutes of computation for a video close to 1 hour long.

  • Goshiro YAMAMOTO, Luiz SAMPAIO, Takafumi TAKETOMI, Christian SANDOR, H ...
    Type: PAPER
    2017 Volume E100.D Issue 9 Pages 2027-2036
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    We present a novel method to enable users to experience mobile interaction with digital content on external displays by embedding markers imperceptibly on the screen. Our method consists of two parts: marker embedding on external displays and marker detection. To embed markers, similar to previous work, we display complementary colors in alternating frames, which are selected by considering the L*a*b* color space in order to make the markers harder for humans to detect. Our marker detection process does not require mobile devices to be synchronized with the display, although certain constraints on the relation between the camera and display update rates need to be fulfilled. In this paper, we have conducted three experiments. The results show that 1) selecting complementary colors in the a*b* color plane maximizes imperceptibility, 2) our method is extremely robust when used with static contents and can handle animated contents up to certain optical flow levels, and 3) our method works well for small movements, but large movements can lead to loss of tracking.
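The idea of showing complementary colors in alternating frames can be illustrated with a minimal sketch. It assumes, purely for illustration, that the two frame colors are symmetric offsets around a base color in the a*b* plane with lightness held fixed, and that a simple average stands in for what a viewer perceives at a high refresh rate; the function names and the offset parameters are hypothetical and do not come from the paper.

```python
def complementary_pair(lab, delta_a, delta_b):
    """Return two L*a*b* colors offset symmetrically around `lab` in the a*b* plane.

    Lightness (L*) is left unchanged; only a* and b* are shifted (an assumption).
    """
    lightness, a, b = lab
    frame_even = (lightness, a + delta_a, b + delta_b)
    frame_odd = (lightness, a - delta_a, b - delta_b)
    return frame_even, frame_odd


def temporal_average(color_1, color_2):
    """Component-wise mean of two colors, a crude stand-in for temporal fusion."""
    return tuple((x + y) / 2.0 for x, y in zip(color_1, color_2))
```

Under these assumptions the average of the two frame colors equals the base color, while a camera that captures the individual frames can still see the difference between them.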

  • Takeshi CHUJOH
    Type: LETTER
    2017 Volume E100.D Issue 9 Pages 2037-2038
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    In video coding, layered coding is beneficial for applications, because it can encode a number of input sources efficiently and achieve scalability functions. However, in order to achieve the functions, some specific codecs are needed. Meanwhile, although the coding efficiency is insufficient, simulcast that encodes a number of input sources independently is versatile. In this paper, we propose postprocessing for simulcast video coding that can improve picture quality and coding efficiency without using any layered coding. In particular, with a view to achieving spatial scalability, we show that the overlapped filtering (OLF) improves picture quality of the high-resolution layer by using the low-resolution layer.

  • Shota KASAI, Yusuke KAMEDA, Tomokazu ISHIKAWA, Ichiro MATSUDA, Susumu ...
    Type: LETTER
    2017 Volume E100.D Issue 9 Pages 2039-2043
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    We propose a method of interframe prediction in depth map coding that uses pixel-wise 3D motion estimated from encoded textures and depth maps. By using the 3D motion, an approximation of the depth map frame to be encoded is generated and used as a reference frame of block-wise motion compensation.

  • Yukihiro BANDOH, Seishi TAKAMURA, Atsushi SHIMIZU
    Type: LETTER
    2017 Volume E100.D Issue 9 Pages 2044-2047
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    In current video encoding systems, the acquisition process is independent from the video encoding process. In order to compensate for the independence, pre-filters prior to the encoder are used. However, conventional pre-filters are designed under constraints on the temporal resolution, so they are not sufficiently optimized in terms of coding efficiency. By relaxing the restriction on the temporal resolution of current video encoding systems, there is a good possibility of generating a video signal suitable for the video encoding process. This paper proposes a video generation method with an adaptive temporal filter that utilizes a temporally over-sampled signal. The filter is designed based on dynamic programming. Experimental results show that the proposed method can reduce the encoding rate by 3.01% on average compared to the constant mean filter.
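For orientation, the baseline mentioned in the abstract, a constant mean filter over a temporally over-sampled signal, might look like the sketch below. This shows only the baseline, not the proposed adaptive filter, and it assumes that each output frame is the plain average of `factor` consecutive input frames; the function name, the flat-list frame layout, and the `factor` parameter are assumptions for this example.

```python
def constant_mean_filter(frames, factor):
    """Reduce the frame rate by averaging each group of `factor` consecutive frames.

    Each frame is a flat list of pixel values; all frames must have the same length.
    """
    if factor <= 0 or len(frames) % factor != 0:
        raise ValueError("frame count must be a positive multiple of factor")
    output = []
    for start in range(0, len(frames), factor):
        group = frames[start:start + factor]
        # Average co-located pixels across the frames in this group.
        output.append([sum(pixels) / factor for pixels in zip(*group)])
    return output
```

An adaptive filter would replace the fixed, equal weights used here with weights chosen per frame.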

  • Kazuki SHIBATA, Mehrdad PANAHPOUR TEHERANI, Keita TAKAHASHI, Toshiaki ...
    Type: LETTER
    2017 Volume E100.D Issue 9 Pages 2048-2051
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Several applications for 3-D visualization require dense detection of correspondence for displacement estimation among heterogeneous multi-view images. Due to differences in resolution or sampling density and field of view in the images, estimation of dense displacement is not straightforward. Therefore, we propose a scale-invariant polynomial expansion method that can estimate dense displacement between two heterogeneous views. Evaluation on heterogeneous images verifies the accuracy of our approach.

  • Shu KONDO, Yuto KOBAYASHI, Keita TAKAHASHI, Toshiaki FUJII
    Type: LETTER
    2017 Volume E100.D Issue 9 Pages 2052-2055
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    A layered light-field display based on light-field factorization is considered. In the original work, the factorization is formulated under the assumption that the light field is captured with orthographic cameras. In this paper, we introduce a generalized framework for light-field factorization that can handle both the orthographic and perspective camera projection models. With our framework, a light field captured with perspective cameras can be displayed accurately.

Regular Section
  • Ning FU, Yingfeng ZHANG, Lijun SHAN, Zhiqiang LIU, Han PENG
    Type: PAPER
    Subject area: Software System
    2017 Volume E100.D Issue 9 Pages 2056-2067
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    With the in-depth development of service computing, it has become clear that when constructing service applications in an open dynamic network environment, greater attention must be paid to trustworthiness on the premise that the required functions are realized. Trustworthy computing requires theories for business process modeling in terms of both behavior and trustworthiness. In this paper, a calculus for ensuring the satisfaction of trustworthiness requirements in service-oriented systems is proposed. We investigate a calculus called QPi for representing both the behavior and the trustworthiness property of concurrent systems. QPi is the combination of pi-calculus and a constraint semiring, which is advantageous when problems with multi-dimensional properties must be tackled. The concept of the quantified bisimulation of processes provides us with a measure of the degree of equivalence of processes based on the bisimulation distance. The QPi-related properties of bisimulation and bisimilarity are also discussed. A specific modeling example is given to illustrate the effectiveness of the algebraic method.

  • Hyeongboo BAEK, Jinkyu LEE
    Type: PAPER
    Subject area: Software System
    2017 Volume E100.D Issue 9 Pages 2068-2080
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    While conventional studies on real-time systems have mostly considered only the real-time constraint, recent research initiatives are trying to incorporate a security constraint into real-time scheduling due to the recognition that the violation of either of the two constraints can cause catastrophic losses for humans, the system, and even the environment. The focus of most studies, however, is on single-criticality systems, while the security of mixed-criticality systems has received scant attention, even though security is also a critical issue for the design of mixed-criticality systems. In this paper, we address the problem of information leakage that arises from the shared resources that are used by tasks with different security levels in mixed-criticality systems. We define a new concept of the security constraint employing a pre-flushing mechanism to cleanse the state of shared resources whenever there is a possibility of information leakage through them. Then, we propose a new non-preemptive real-time scheduling algorithm and a schedulability analysis, which incorporate the security constraint for mixed-criticality systems. Our evaluation demonstrated that a large number of real-time tasks can be scheduled without a significant performance loss under the new security constraint.

  • Hideaki OHASHI, Toshiyuki SHIMIZU, Masatoshi YOSHIKAWA
    Type: PAPER
    Subject area: Data Engineering, Web Information Systems
    2017 Volume E100.D Issue 9 Pages 2081-2091
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    In this study, we focus on a method to search for similar trajectories. In the majority of previous works on searching for similar trajectories, only raw trajectory data were used. However, to obtain deeper insights, additional time-dependent trajectory features should be utilized depending on the search intent. For instance, to identify similar combination plays in soccer games, such additional features include the movements of the team players. In this paper, we develop a framework to flexibly search for similar trajectories associated with time-dependent features, which we call enriched trajectories. In this framework, weights, which represent the relative importance of each feature, can be flexibly given by users. Moreover, to facilitate fast searching, we first propose a lower bounding measure of the DTW distance between enriched trajectories, and then we propose algorithms based on this lower bounding measure. We evaluate the effectiveness of the lower bounding measure and compare the performances of the algorithms under various conditions using soccer data and synthetic data. Our experimental results suggest that the proposed lower bounding measure is superior to the existing measure, and one of the proposed algorithms, which is based on the threshold algorithm, is suitable for practical use.
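To make the notion of a DTW distance over enriched trajectories with user-given feature weights more concrete, here is a minimal sketch of textbook dynamic time warping. The per-point cost (a weighted sum of absolute feature differences), the function name, and the tuple-based trajectory layout are assumptions for illustration; the paper's exact distance definition and its lower bounding measure are not reproduced here.

```python
def weighted_dtw(traj_a, traj_b, weights):
    """Dynamic time warping distance between two sequences of feature tuples.

    The cost of matching two points is a weighted sum of absolute feature
    differences (an assumed cost; the paper may define it differently).
    """
    n, m = len(traj_a), len(traj_b)
    if n == 0 or m == 0:
        raise ValueError("trajectories must be non-empty")
    inf = float("inf")
    # dp[i][j] = best cost of aligning the first i points of A with the first j of B.
    dp = [[inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = sum(
                w * abs(x - y)
                for w, x, y in zip(weights, traj_a[i - 1], traj_b[j - 1])
            )
            dp[i][j] = cost + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
    return dp[n][m]
```

Because this computation is quadratic in the trajectory lengths, a cheap lower bound of the kind the abstract describes is useful for discarding candidates before running the full computation.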

  • Lianyong QI, Zhili ZHOU, Jiguo YU, Qi LIU
    Type: PAPER
    Subject area: Data Engineering, Web Information Systems
    2017 Volume E100.D Issue 9 Pages 2092-2099
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    With the ever-increasing number of web services registered in service communities, many users are apt to find web services of interest through various recommendation techniques, e.g., Collaborative Filtering (CF)-based recommendation. Generally, CF-based recommendation approaches can work well when a target user has similar friends or the target services (i.e., services preferred by the target user) have similar services. However, when the available user-service rating data is very sparse, it is possible that a target user has no similar friends and the target services have no similar services; in this situation, traditional CF-based recommendation approaches fail to generate a satisfying recommendation result. In view of this challenge, we combine Social Balance Theory (abbreviated as SBT; e.g., the “enemy's enemy is a friend” rule) and CF to put forward a novel data-sparsity tolerant recommendation approach, Ser_RecSBT+CF. During the recommendation process, a pruning strategy is adopted to reduce the search space and improve the recommendation efficiency. Finally, through a set of experiments conducted on a real web service quality dataset, WS-DREAM, we validate the feasibility of our proposal in terms of recommendation accuracy, recall and efficiency. The experimental results show that our proposed Ser_RecSBT+CF approach outperforms other up-to-date approaches.

  • Daisuke ANDO, Fumio TERAOKA, Kunitake KANEKO
    Type: PAPER
    Subject area: Information Network
    2017 Volume E100.D Issue 9 Pages 2100-2117
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    With the rapid growth in the production of high-resolution digital contents such as Full HD, 4K, and 8K movies, the demand for low-cost and high-throughput sharing of content files is increasing at digital content productions. In order to meet this demand, we have proposed DRIP (Distributed chunks Retrieval and Integration Procedure), a storage and retrieval mechanism for large file sharing using forward error correction (FEC) and global dispersed storage. DRIP has been confirmed to contribute to low-cost and high-throughput sharing. This paper describes the design and implementation of Content Espresso, a distributed large file sharing system for digital content productions using DRIP, and presents performance evaluations. We set up an experimental environment using 79 physical machines, including 72 inexpensive storage servers, and evaluate file metadata access performance, file storage/retrieval performance, FEC block size, and system availability by emulating global environments. The results confirm that Content Espresso can handle 15,000 requests per second, achieves 1 Gbps for file storage, and achieves more than 3 Gbps for file retrieval. File storage and retrieval performance are not significantly affected by the network conditions. Thus, we conclude that Content Espresso can serve as a global-scale file sharing system for digital content productions.
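The abstract does not say which FEC code DRIP uses. As a hedged illustration of how forward error correction lets a file be rebuilt when a stored chunk is unavailable, the sketch below uses the simplest possible scheme, a single XOR parity chunk that tolerates one missing chunk. The function names and the use of `None` to mark a lost chunk are assumptions for this example and should not be read as DRIP's actual design.

```python
def xor_parity(chunks):
    """Return the byte-wise XOR of equally sized chunks."""
    size = len(chunks[0])
    if any(len(chunk) != size for chunk in chunks):
        raise ValueError("all chunks must have the same length")
    parity = bytearray(size)
    for chunk in chunks:
        for index, byte in enumerate(chunk):
            parity[index] ^= byte
    return bytes(parity)


def recover_missing(chunks, parity):
    """Rebuild the single chunk marked as None using the parity chunk."""
    missing = [i for i, chunk in enumerate(chunks) if chunk is None]
    if len(missing) != 1:
        raise ValueError("exactly one chunk must be missing")
    present = [chunk for chunk in chunks if chunk is not None]
    # XOR of all surviving chunks and the parity yields the lost chunk.
    restored = xor_parity(present + [parity])
    result = list(chunks)
    result[missing[0]] = restored
    return result
```

Codes used in practice usually tolerate more than one loss per block, at the price of more redundancy and computation.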

  • Toshinori HOSOKAWA, Atsushi HIRAI, Yukari YAMAUCHI, Masayuki ARAI
    Type: PAPER
    Subject area: Dependable Computing
    2017 Volume E100.D Issue 9 Pages 2118-2125
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    In at-speed scan testing, capture power is a serious problem because the high power dissipation that can occur when the response for a test vector is captured by flip-flops results in excessive voltage drops, known as IR-drops, which may cause significant capture-induced yield loss. In low capture power test generation, the test vectors that violate capture power constraints in an initial test set are defined as capture-unsafe test vectors, while faults that are detected solely by capture-unsafe test vectors are defined as unsafe faults. It is necessary to regenerate the test vectors used to detect unsafe faults in order to prevent unnecessary yield losses. In this paper, we propose a new low capture power test generation method based on fault simulation that uses capture-safe test vectors in an initial test set. Experimental results show that the use of this method reduces the number of unsafe faults by 94% while requiring just 18% more test vectors on average and less test generation time than the conventional low capture power test generation method.

  • Youwei LU, Shogo OKADA, Katsumi NITTA
    Type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2017 Volume E100.D Issue 9 Pages 2126-2137
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    We propose a novel method, built upon the hierarchical Dirichlet process hidden semi-Markov model, to reveal the content structures of unstructured domain-specific texts. The content structures of texts consisting of sequential local contexts are useful for tasks such as text retrieval, classification, and text mining. The prominent feature of our model is the use of recursive uniform partitioning, a stochastic process that takes a view different from existing HSMMs in modeling state duration. We show that recursive uniform partitioning plays an important role in avoiding rapid switching between hidden states. Remarkably, our method greatly outperforms others in terms of ranking performance in our text retrieval experiments, and provides more accurate features for SVM to achieve higher F1 scores in our text classification experiments. These experimental results suggest that our method can yield improved representations of domain-specific texts. Furthermore, we present a method of automatically discovering the local contexts that serve to account for why a text is classified as a positive instance in supervised learning settings.

  • Eun-kyung KIM, Key-Sun CHOI
    Type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2017 Volume E100.D Issue 9 Pages 2138-2146
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Entity descriptions have been growing exponentially in community-generated knowledge databases, such as DBpedia. However, many of those descriptions are not useful for identifying the underlying characteristics of their corresponding entities because semantically redundant facts or triples are included in the descriptions that represent the connections between entities without any semantic properties. Entity summarization is applied to filter out such non-informative triples and meaning-redundant triples and rank the remaining informative facts within the size of the triples for summarization. This study proposes an entity summarization approach based on pre-grouping the entities that share a set of attributes that can be used to characterize the entities we want to summarize. Entities are first grouped according to projected multilingual categories that provide the multi-angled semantics of each entity into a single entity space. Key facts about the entity are then determined through in-group-based rankings. As a result, our proposed approach produced summary information of significantly better quality (p-values of 1.52×10⁻³ and 2.01×10⁻³ for the top-10 and top-5 summaries, respectively) than the state-of-the-art method that requires additional external resources.

  • Zhiming WU, Hongyan XU, Tao LIN
    Type: PAPER
    Subject area: Human-computer Interaction
    2017 Volume E100.D Issue 9 Pages 2147-2155
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Researchers have already attributed a certain amount of variability and “drift” in an individual's handwriting pattern to mental workload, but this phenomenon has not been explored adequately. In particular, an automated method for accurately predicting mental workload using handwriting features is still lacking. To solve the problem, we first conducted an experiment to collect handwriting data under different mental workload conditions. Then, a predictive model (called SVM-GA) on two-level handwriting features (i.e., sentence- and stroke-level) was created by combining support vector machines and genetic algorithms. The results show that (1) the SVM-GA model can differentiate three mental workload conditions with accuracies of 87.36% and 82.34% for the child and adult data sets, respectively, and (2) children demonstrate different changes in handwriting features from adults when experiencing mental workload.

  • Mohammad Nehal HASNINE, Masatoshi ISHIKAWA, Yuki HIRAI, Haruko MIYAKOD ...
    Type: PAPER
    Subject area: Educational Technology
    2017 Volume E100.D Issue 9 Pages 2156-2164
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Vocabulary acquisition based on the traditional pen-and-paper approach is outdated, and has been superseded by the multimedia-supported approach. In a multimedia-supported foreign language learning environment, a learning material comprising a still image, a text, and the corresponding sound data is considered to be the most effective way to memorize a noun. However, extraction of an appropriate still image for a noun has always been a challenging and time-consuming process for learners. Learners' burden would be reduced if a system could extract an appropriate image for representing a noun. Therefore, the present study aimed to extract an appropriate image for each noun in order to assist foreign language learners in the acquisition of foreign vocabulary. This study presumed that a learning material created with the help of an appropriate image would be more effective for memory recall than one created with an inappropriate image. As the first step toward finding appropriate images for nouns, concrete nouns were considered as the subject of investigation. Therefore, this study first proposed a definition of an appropriate image for a concrete noun. After that, an image re-ranking algorithm was designed and implemented that is able to extract an appropriate image from a finite set of corresponding images for each concrete noun. Finally, the immediate-after, short- and long-term learning effects of those images with regard to learners' memory retention rates were examined by conducting immediate-after, delayed and extended delayed posttests. The experimental results revealed that participants in the experimental group significantly outperformed the control group in long-term memory retention, while no significant differences were observed in immediate-after and short-term memory retention. This result indicates that our algorithm could extract images that have a higher learning effect. Furthermore, this paper briefly discusses an on-demand learning system that has been developed to assist foreign language learners in the creation of vocabulary learning materials.

  • Kou TANAKA, Tomoki TODA, Satoshi NAKAMURA
    Type: PAPER
    Subject area: Rehabilitation Engineering and Assistive Technology
    2017 Volume E100.D Issue 9 Pages 2165-2173
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    This paper presents a novel speaking aid system to help laryngectomees produce more natural-sounding electrolaryngeal (EL) speech. An electrolarynx is an external device that generates excitation signals in place of vocal fold vibration. Although conventional EL speech is quite intelligible, its naturalness suffers from the unnatural fundamental frequency (F0) patterns of the mechanically generated excitation signals. To improve the naturalness of EL speech, we have proposed EL speech enhancement methods using statistical F0 pattern prediction. In these methods, the original EL speech recorded by a microphone is presented from a loudspeaker after speech enhancement is performed. These methods are effective in some situations, such as telecommunication, but they are not suitable for face-to-face conversation because both the enhanced EL speech and the original EL speech are presented to listeners. In this paper, to develop an EL speech enhancement method that is also effective for face-to-face conversation, we propose a method for directly controlling the F0 patterns of the excitation signals generated by the electrolarynx using statistical F0 prediction. To get an "actual feel" of the proposed system, we also implement a prototype system. Using the prototype system, we find latency issues caused by real-time processing. To address these latency issues, we further propose segmental continuous F0 pattern modeling and forthcoming F0 pattern modeling. Through simulation-based evaluations, we demonstrate that our proposed system is capable of effectively addressing the issues of latency and those of the electrolarynx in terms of naturalness.

    Download PDF (750K)
  • Richeng DUAN, Tatsuya KAWAHARA, Masatake DANTSUJI, Jinsong ZHANG
    Type: PAPER
    Subject area: Speech and Hearing
    2017 Volume E100.D Issue 9 Pages 2174-2182
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Aiming at detecting pronunciation errors produced by second language learners and providing corrective feedback related to articulation, we address effective articulatory models based on deep neural networks (DNNs). Articulatory attributes are defined for manner and place of articulation. In order to efficiently train these models of non-native speech without such data, which is difficult to collect on a large scale, several transfer learning based modeling methods are explored. We first investigate three closely related secondary tasks which aim at effective learning of DNN articulatory models. We also propose to exploit large speech corpora of the native and target languages to model inter-language phenomena. This kind of transfer learning can provide a better feature representation of non-native speech. Related-task transfer and language transfer learning are further combined at the network level. Compared with the conventional DNN used as the baseline, all proposed methods improved performance. In the native attribute recognition task, the network-level combination method reduced the recognition error rate by more than 10% relative for all articulatory attributes. The method was also applied to pronunciation error detection in Mandarin Chinese pronunciation learning by native Japanese speakers, and achieved relative improvements of up to 17.0% in detection accuracy and up to 19.9% in F-score, which is also better than the lattice-based combination.

    Download PDF (1824K)
  • Taravichet TITIJAROONROJ, Kuntpong WORARATPANYA
    Type: PAPER
    Subject area: Image Recognition, Computer Vision
    2017 Volume E100.D Issue 9 Pages 2183-2196
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Bi-dimensional empirical mode decomposition (BEMD) is a powerful method for decomposing non-linear and non-stationary signals without a prior function. It can be applied in many applications such as feature extraction, image compression, and image filtering. Although several modified BEMDs have been proposed, the computational cost and the quality of their bi-dimensional intrinsic mode functions (BIMFs) still require improvement. In this paper, an iteration-free computation method for bi-dimensional empirical mode decomposition, called iBEMD, is proposed. The locally partial correlation for principal component analysis (LPC-PCA) is a novel technique to extract BIMFs from an original signal without using extrema detection. This dramatically reduces the computation time. The LPC-PCA technique also enhances the quality of BIMFs by reducing artifacts. The experimental results, when compared with state-of-the-art methods, show that the proposed iBEMD method achieves faster BIMF extraction and higher-quality BIMF images. Furthermore, the iBEMD method can clearly remove the illumination component of natural scene images under illumination change, thereby improving the performance of text localization and recognition.

    Download PDF (4316K)
  • Bin YAO, Lifeng HE, Shiying KANG, Xiao ZHAO, Yuyan CHAO
    Type: PAPER
    Subject area: Image Recognition, Computer Vision
    2017 Volume E100.D Issue 9 Pages 2197-2204
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    The Euler number of a binary image is an important topological property for pattern recognition, image analysis, and computer vision. A well-known method for computing the Euler number of a binary image is to count certain patterns of bit-quads in the image, which has been improved by scanning three rows at once to process two bit-quads simultaneously. This paper studies the bit-quad-based Euler number computing problem. We show that for a bit-quad-based Euler number computing algorithm, as the number of bit-quads processed simultaneously increases, on the one hand, the average number of pixels to be checked for processing a bit-quad decreases in theory, and on the other hand, the length of the code implementing the algorithm increases, which makes the algorithm less efficient in practice. Experimental results on various types of images demonstrated that scanning five rows at once and processing four bit-quads simultaneously is the optimal tradeoff, and that the optimal bit-quad-based Euler number computing algorithm is more efficient than other Euler number computing algorithms.

    Download PDF (1587K)
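The bit-quad counting idea mentioned in the abstract above can be illustrated with a minimal sketch. It uses the classical bit-quad formula (one 2x2 window at a time), not the multi-row optimization studied in the paper; the function name `euler_number` and the list-of-lists image representation are assumptions made for illustration only.

```python
def euler_number(image, connectivity=8):
    """Euler number of a binary image by counting 2x2 bit-quad patterns.

    Classical formula: E = (Q1 - Q3 -/+ 2*QD) / 4, with '-' for
    8-connectivity and '+' for 4-connectivity, where
      Q1 = windows with exactly one foreground pixel,
      Q3 = windows with exactly three foreground pixels,
      QD = windows with two foreground pixels on a diagonal.
    """
    if connectivity not in (4, 8):
        raise ValueError("connectivity must be 4 or 8")
    rows = len(image)
    cols = len(image[0]) if rows else 0

    # Pad with a one-pixel background border so every foreground pixel
    # is covered by four 2x2 windows.
    padded = [[0] * (cols + 2)]
    for row in image:
        padded.append([0] + [1 if v else 0 for v in row] + [0])
    padded.append([0] * (cols + 2))

    q1 = q3 = qd = 0
    for r in range(rows + 1):
        for c in range(cols + 1):
            a, b = padded[r][c], padded[r][c + 1]
            d, e = padded[r + 1][c], padded[r + 1][c + 1]
            total = a + b + d + e
            if total == 1:
                q1 += 1
            elif total == 3:
                q3 += 1
            elif total == 2 and a == e:
                # Two pixels with equal diagonal corners: a diagonal pair.
                qd += 1

    sign = -1 if connectivity == 8 else 1
    return (q1 - q3 + sign * 2 * qd) // 4
```

For example, a single pixel gives 1, and a 3x3 ring with a hole gives 0 (one component minus one hole). Two diagonally touching pixels give 1 under 8-connectivity and 2 under 4-connectivity.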
  • Chunpeng MA, Akihiro TAMURA, Lemao LIU, Tiejun ZHAO, Eiichiro SUMITA
    Type: PAPER
    Subject area: Natural Language Processing
    2017 Volume E100.D Issue 9 Pages 2205-2214
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Conventional feature-rich parsers based on manually tuned features have achieved state-of-the-art performance. However, these parsers are not good at handling long-term dependencies using only the clues captured by a prepared feature template. On the other hand, recurrent neural network (RNN)-based parsers can encode unbounded history information effectively, but they do not perform well for small tree structures, especially when low-frequency words are involved, and they cannot use prior linguistic knowledge. In this paper, we propose a simple but effective framework to combine the merits of feature-rich transition-based parsers and RNNs. Specifically, the proposed framework incorporates RNN-based scores into the feature template used by a feature-rich parser. On the English WSJ treebank and the SPMRL 2014 German treebank, our framework achieves state-of-the-art performance (91.56 F-score for English and 83.06 F-score for German), without requiring any additional unlabeled data.

    Download PDF (575K)
  • Lei ZHANG, Qingfu FAN, Wen LI, Zhizhen LIANG, Guoxing ZHANG, Tongyang ...
    Type: LETTER
    Subject area: Data Engineering, Web Information Systems
    2017 Volume E100.D Issue 9 Pages 2215-2218
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Existing trajectory prediction algorithms for moving objects suffer from the data sparsity problem, which degrades the accuracy of trajectory prediction. To address this problem, we present an Entropy-based Sparse Trajectories Prediction method enhanced by Matrix Factorization (ESTP-MF). First, we perform trajectory synthesis based on trajectory entropy and put the synthesized trajectories into the trajectory space. This can resolve the sparsity problem of trajectory data and make the new trajectory space more reliable. Second, under the new trajectory space, we introduce matrix factorization into Markov models to improve sparse trajectory prediction. Matrix factorization is used to infer the transition probabilities of missing regions from the corresponding existing elements in the transition probability matrix, which further alleviates the data sparsity problem. Experiments with a real trajectory dataset show that ESTP-MF generally improves prediction accuracy by as much as 6% and 4% compared to the SubSyn algorithm and the STP-EE algorithm, respectively.

    Download PDF (605K)
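To make the data sparsity problem in the abstract above concrete, here is a minimal sketch of a first-order Markov transition model estimated from region sequences. It is not ESTP-MF: the entropy-based synthesis and the matrix factorization step are omitted, and the names `transition_probabilities` and `predict_next` and the region labels are hypothetical.

```python
from collections import defaultdict


def transition_probabilities(trajectories):
    """Estimate first-order transition probabilities between regions.

    Each trajectory is a sequence of region identifiers. The result maps
    a source region to a dict of {destination region: probability}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for traj in trajectories:
        for src, dst in zip(traj, traj[1:]):
            counts[src][dst] += 1

    probs = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        probs[src] = {dst: n / total for dst, n in dsts.items()}
    return probs


def predict_next(probs, region):
    """Return the most probable next region, or None if no data exists."""
    candidates = probs.get(region)
    if not candidates:
        # Sparse data: this region was never observed as a source, so the
        # plain Markov model cannot make a prediction from it.
        return None
    return max(candidates, key=candidates.get)
```

With trajectories `[["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"]]`, the model predicts `"C"` after `"B"`, but returns `None` for `"D"` because no transition out of `"D"` was ever observed. Gaps of this kind are what the abstract describes filling in by inferring missing transition probabilities.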
  • Gunhee LEE, Cheeha KIM
    Type: LETTER
    Subject area: Information Network
    2017 Volume E100.D Issue 9 Pages 2219-2223
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    The IEEE 802.11 wireless local area network (WLAN) is the most widely deployed communication standard in the world. Currently, the IEEE 802.11ax draft standard is one of the most advanced and promising among future wireless network standards. However, the suggested uplink-OFDMA (UL-OFDMA) random access method, based on trigger frame-random access (TF-R) from task group ax (TGax), does not yet show satisfying system performance. To enhance the UL-OFDMA capability of the IEEE 802.11ax draft standard, we propose a centralized contention-based MAC (CC-MAC) and describe its detailed operation. In this paper, we analyze the performance of CC-MAC by solving the Markov chain model and evaluating BSS throughput compared to other methods, such as DCF and TF-R, by computer simulation. Our results show that CC-MAC is a scalable and efficient scheme for improving the system performance in a UL-OFDMA random access situation in IEEE 802.11ax.

    Download PDF (385K)
  • Yoshinobu HIGAMI, Senling WANG, Hiroshi TAKAHASHI, Shin-ya KOBAYASHI, ...
    Type: LETTER
    Subject area: Dependable Computing
    2017 Volume E100.D Issue 9 Pages 2224-2227
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    In this paper, we propose a method to diagnose a bridging fault between a clock line and a gate signal line. Assuming that scan based flush tests are applied, we perform fault simulation to deduce candidate faults. By analyzing fault behavior, it is revealed that faulty clock waveforms depend on the timing of the signal transition on a gate signal line which is bridged. In the fault simulation, a backward sensitized path tracing approach is introduced to calculate the timing of signal transitions. Experimental results show that the proposed method deduces candidate faults more accurately than our previous method.

    Download PDF (568K)
  • Joobeom YUN, Junbeom HUR, Youngjoo SHIN, Dongyoung KOO
    Type: LETTER
    Subject area: Dependable Computing
    2017 Volume E100.D Issue 9 Pages 2228-2231
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Ransomware is becoming an increasingly serious threat. In this paper, we propose CLDSafe, a novel and efficient file backup system against ransomware. It keeps shadow copies of files and provides secure restoration using cloud storage when a computer is infected by ransomware. Our system measures the similarity between a new file on the client and an old file on the server, and securely backs up the old file on the server when the new file has changed substantially. Only authenticated users can then restore the backup files, using a challenge-response mechanism. As a result, our proposed solution will be helpful in recovering systems from ransomware damage.

    Download PDF (603K)
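The similarity-triggered backup rule in the abstract above can be sketched as follows. The abstract does not say which similarity measure CLDSafe uses, so this sketch substitutes the standard-library `difflib.SequenceMatcher` ratio as a stand-in; the function name `should_backup` and the threshold of 0.5 are likewise assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical threshold: below this similarity, the new file is treated
# as "changed substantially" relative to the old one.
SIMILARITY_THRESHOLD = 0.5


def should_backup(old_bytes, new_bytes, threshold=SIMILARITY_THRESHOLD):
    """Decide whether the old server-side copy should be preserved.

    Returns True when the new file differs so much from the old file
    that it may have been encrypted or otherwise overwritten.
    """
    # autojunk=False disables difflib's popularity heuristic so the
    # ratio reflects the actual byte-level overlap.
    matcher = SequenceMatcher(None, old_bytes, new_bytes, autojunk=False)
    return matcher.ratio() < threshold
```

An ordinary edit, such as appending a line, keeps the ratio high and triggers no backup. A file whose bytes are replaced wholesale, as encryption would do, drops the ratio toward zero and triggers one. `SequenceMatcher` is slow on large inputs, so a real system would need a cheaper measure.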
  • Hideo FUJIWARA, Katsuya FUJIWARA
    Type: LETTER
    Subject area: Dependable Computing
    2017 Volume E100.D Issue 9 Pages 2232-2236
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    In our previous work, we introduced two new concepts of secure scan design: shift register equivalent circuits (SR-equivalents, for short) and strongly secure circuits. We also introduced generalized shift registers (GSRs, for short) to apply these concepts to secure scan design. In this paper, we combine the concepts of SR-equivalents and strongly secure circuits, apply them to GSRs, and consider the synthesis problem of strongly secure SR-equivalents using GSRs. We also consider the enumeration problem of GSRs that are strongly secure and SR-equivalent, i.e., the cardinality of the class of strongly secure SR-equivalent GSRs, to clarify the security level of the secure scan architecture.

    Download PDF (480K)
  • Chen CHEN, Chunyan HOU, Jiakun XIAO, Yanlong WEN, Xiaojie YUAN
    Type: LETTER
    Subject area: Artificial Intelligence, Data Mining
    2017 Volume E100.D Issue 9 Pages 2237-2240
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    In the era of e-commerce, purchase behavior prediction is one of the most important issues for promoting both online companies' sales and consumers' experience. Previous research usually uses traditional features based on the statistics and temporal dynamics of items. Those features lead to the loss of detailed item information. In this study, we propose a novel kind of feature based on temporally popular items to improve the prediction. Experiments on a real-world dataset have demonstrated the effectiveness and efficiency of our proposed method. Features based on temporally popular items are compared with traditional features associated with statistics, temporal dynamics, and collaborative filtering of items. We find that temporally popular items are an effective and irreplaceable supplement to traditional features. Our study sheds light on the effectiveness of combining the popularity and temporal dynamics of items, which can be widely used for a variety of recommendations on e-commerce sites.

    Download PDF (441K)
  • Sanay MUHAMMAD UMAR SAEED, Syed MUHAMMAD ANWAR, Muhammad MAJID
    Type: LETTER
    Subject area: Human-computer Interaction
    2017 Volume E100.D Issue 9 Pages 2241-2244
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    A study on the quantification of human stress using low beta waves of electroencephalography (EEG) is presented. For the first time, the importance of low beta waves as a feature for the quantification of human stress is highlighted. In this study, twenty-eight participants completed the Perceived Stress Scale (PSS) questionnaire and had their EEG recorded in the eyes-closed condition using a commercially available single-channel EEG headset placed at a frontal site. From regression analysis of the beta waves extracted from the recorded EEG, it was observed that low beta waves can predict PSS scores with a confidence level of 94%. Consequently, when the low beta wave is used as a feature with the Naive Bayes algorithm for classification of stress level, it not only reduces the computational cost sevenfold but also improves the accuracy to 71.4%.

    Download PDF (741K)
  • Yeo-Jin YOON, Jaechun NO, Soo-Mi CHOI
    Type: LETTER
    Subject area: Human-computer Interaction
    2017 Volume E100.D Issue 9 Pages 2245-2248
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    The quality of visual comfort and depth perception is a crucial requirement for virtual reality (VR) applications. This paper investigates major causes of visual discomfort and proposes a novel virtual camera controlling method using visual saliency to minimize visual discomfort. We extract the saliency of each scene and properly adjust the convergence plane to preserve realistic 3D effects. We also evaluate the effectiveness of our method on free-form architecture models. The results indicate that the proposed saliency-guided camera control is more comfortable than typical camera control and gives more realistic depth perception.

    Download PDF (2694K)
  • Seongkyu MUN, Minkyu SHIN, Suwon SHON, Wooil KIM, David K. HAN, Hanseo ...
    Type: LETTER
    Subject area: Speech and Hearing
    2017 Volume E100.D Issue 9 Pages 2249-2252
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Recent acoustic event classification research has focused on training suitable filters to represent acoustic events. However, due to limited availability of target event databases and linearity of conventional filters, there is still room for improving performance. By exploiting the non-linear modeling of deep neural networks (DNNs) and their ability to learn beyond pre-trained environments, this letter proposes a DNN-based feature extraction scheme for the classification of acoustic events. The effectiveness and robustness to noise of the proposed method are demonstrated using a database of indoor surveillance environments.

    Download PDF (1116K)
  • Takaaki OKABE, Masahiro OKUDA
    Type: LETTER
    Subject area: Image Processing and Video Processing
    2017 Volume E100.D Issue 9 Pages 2253-2256
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    The Retinex theory assumes that large intensity changes correspond to reflectance edges, while smoothly varying regions are due to shading. Some algorithms based on the theory adopt simple thresholding schemes and achieve adequate results for reflectance estimation. In this paper, we present a practical reflectance estimation technique for hyperspectral images. Our method is realized simply by thresholding singular values of a matrix calculated from scaled pixel values. In the method, we estimate the reflectance image by measuring spectral similarity between two adjacent pixels. We demonstrate that our thresholding scheme effectively estimates the reflectance and outperforms the Retinex-based thresholding. In particular, our method can precisely distinguish between edges caused by reflectance change and those caused by shadows.

    Download PDF (854K)
  • Li WANG, Xiaoan TANG, Junda ZHANG, Dongdong GUAN
    Type: LETTER
    Subject area: Computer Graphics
    2017 Volume E100.D Issue 9 Pages 2257-2260
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    Volume segmentation is of great significance for feature visualization and feature extraction. Essentially, volume segmentation can be viewed as generalized clustering. This paper proposes a hybrid approach to volume segmentation via symmetric region growing (SRG) and information diffusion estimation (IDE). The volume dataset is first over-segmented into a series of subsets by SRG, and the subsets are then clustered by K-Means based on a distance metric derived from IDE. Experiments illustrate the superiority of the hybrid approach, which achieves better segmentation performance.

    Download PDF (365K)
  • Takashi WATANABE, Takumi TADANO
    Type: LETTER
    Subject area: Biological Engineering
    2017 Volume E100.D Issue 9 Pages 2261-2264
    Published: September 01, 2017
    Released: September 01, 2017
    JOURNALS FREE ACCESS

    A fuzzy controller can be useful for realizing a practical closed-loop FES controller, because it makes it easy to design the FES controller and to determine its parameter values, especially for controlling multi-joint movements by stimulating many muscles including antagonistic muscle pairs. This study focused on using a fuzzy controller for closed-loop control of cycling speed during FES cycling with a pedaling wheelchair. However, a designed fuzzy controller has to be tested experimentally for control performance. In this paper, a closed-loop fuzzy FES controller was designed and tested in knee extension movements in comparison with a PID controller with healthy subjects before being applied to FES cycling. The developed fuzzy controller showed good overall control performance compared with the PID controller, and its parameter values were determined through simple control tests of the target movement.

    Download PDF (321K)