IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E95.D, Issue 5
Special Section on Recent Advances in Multimedia Signal Processing Techniques and Applications
  • Haizhou LI
    2012 Volume E95.D Issue 5 Pages 1181
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
  • Sadaoki FURUI
    Article type: INVITED PAPER
    Subject area: Speech Processing
    2012 Volume E95.D Issue 5 Pages 1182-1194
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    This paper presents our recent work in regard to building Large Vocabulary Continuous Speech Recognition (LVCSR) systems for the Thai, Indonesian, and Chinese languages. For Thai, since there is no word boundary in the written form, we have proposed a new method for automatically creating word-like units from a text corpus, and applied topic and speaking style adaptation to the language model to recognize spoken-style utterances. For Indonesian, we have applied proper noun-specific adaptation to acoustic modeling, and rule-based English-to-Indonesian phoneme mapping to solve the problem of large variation in proper noun and English word pronunciation in a spoken-query information retrieval system. In spoken Chinese, long organization names are frequently abbreviated, and abbreviated utterances cannot be recognized if the abbreviations are not included in the dictionary. We have proposed a new method for automatically generating Chinese abbreviations, and by expanding the vocabulary using the generated abbreviations, we have significantly improved the performance of spoken query-based search.
  • Kuan-Yu CHEN, Hsin-Min WANG, Berlin CHEN
    Article type: PAPER
    Subject area: Speech Processing
    2012 Volume E95.D Issue 5 Pages 1195-1205
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    This paper describes the application of two attractive categories of topic modeling techniques to the problem of spoken document retrieval (SDR), viz. document topic model (DTM) and word topic model (WTM). Apart from using the conventional unsupervised training strategy, we explore a supervised training strategy for estimating these topic models, imagining a scenario that user query logs along with click-through information of relevant documents can be utilized to build an SDR system. This attempt has the potential to associate relevant documents with queries even if they do not share any of the query words, thereby improving on retrieval quality over the baseline system. Likewise, we also study a novel use of pseudo-supervised training to associate relevant documents with queries through a pseudo-feedback procedure. Moreover, in order to lessen SDR performance degradation caused by imperfect speech recognition, we investigate leveraging different levels of index features for topic modeling, including words, syllable-level units, and their combination. We provide a series of experiments conducted on the TDT (TDT-2 and TDT-3) Chinese SDR collections. The empirical results show that the methods deduced from our proposed modeling framework are very effective when compared with a few existing retrieval approaches.
  • Xiaoxuan WANG, Lei XIE, Mimi LU, Bin MA, Eng Siong CHNG, Haizhou LI
    Article type: PAPER
    Subject area: Speech Processing
    2012 Volume E95.D Issue 5 Pages 1206-1215
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    In this paper, we propose the integration of multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness; acoustic features involve pause duration, pitch, speaker change and audio event type; and visual features contain shot boundaries, anchor faces and news title captions. These features are extracted at a sequence of boundary candidate positions in the broadcast news. A linear-chain CRF is used to label each candidate as boundary or non-boundary based on the multimodal features. Important interlabel relations and contextual feature information are effectively captured by the sequential learning framework of CRFs. Story segmentation experiments show that the CRF approach outperforms other popular classifiers, including decision trees (DTs), Bayesian networks (BNs), naive Bayesian classifiers (NBs), multilayer perceptron (MLP), support vector machines (SVMs) and maximum entropy (ME) classifiers.
  • Sungjin LEE, Hyungjong NOH, Jonghoon LEE, Kyusong LEE, Gary Geunbae LE ...
    Article type: PAPER
    Subject area: Speech Processing
    2012 Volume E95.D Issue 5 Pages 1216-1228
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Although there have been enormous investments in English education all around the world, little has changed in the style of English instruction. Considering the shortcomings of the current teaching-learning methodology, we have been investigating advanced computer-assisted language learning (CALL) systems. This paper aims at summarizing a set of POSTECH approaches including theories, technologies, systems, and field studies and providing relevant pointers. On top of state-of-the-art spoken dialog system technologies, a variety of adaptations have been applied to overcome problems caused by the numerous errors and variations naturally produced by non-native speakers. Furthermore, a number of methods have been developed for generating educational feedback that helps learners become proficient. Integrating these efforts resulted in intelligent educational robots (Mero and Engkey) and a virtual 3D language learning game, Pomy. To verify the effects of our approaches on students' communicative abilities, we have conducted a field study at an elementary school in Korea. The results showed that our CALL approaches can be enjoyable and fruitful activities for students. Although the results of this study bring us a step closer to understanding computer-based education, more studies are needed to consolidate the findings.
  • Yi Ren LENG, Huy Dat TRAN, Norihide KITAOKA, Haizhou LI
    Article type: PAPER
    Subject area: Audio Processing
    2012 Volume E95.D Issue 5 Pages 1229-1237
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Conventional features for Automatic Speech Recognition and Sound Event Recognition such as Mel-Frequency Cepstral Coefficients (MFCCs) have been shown to perform poorly in noisy conditions. We introduce an auditory feature based on the gammatone filterbank, the Selective Gammatone Envelope Feature (SGEF), for Robust Sound Event Recognition, where channel selection and the filterbank envelope are used to reduce the effect of noise for specific noise environments. In experiments with Hidden Markov Model (HMM) recognizers, we show that our feature outperforms MFCCs significantly in four different noisy environments at various signal-to-noise ratios.
  • Jae Gon KIM, Jun-Dong CHO
    Article type: PAPER
    Subject area: Image Processing
    2012 Volume E95.D Issue 5 Pages 1238-1247
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    In this paper, we propose an optimized virtual re-convergence system designed to reduce the visual fatigue caused by binocular stereoscopy. Our idea for reducing visual fatigue is to apply virtual re-convergence based on an optimized disparity map that contains more depth information in the negative disparity area than in the positive area. Our system therefore uses a dedicated search-range scheme, especially for negative disparity exploration. In addition, we use a dedicated method based on a so-called Global-Shift Value (GSV), which is the total shift value of each image in the stereo pair, to converge on the main object that most affects visual fatigue. The experimental result, which is a subjective assessment by participants, shows that the proposed method makes stereoscopy significantly more comfortable and attractive to view than existing methods.
  • Zhuo YANG, Sei-ichiro KAMATA
    Article type: PAPER
    Subject area: Image Processing
    2012 Volume E95.D Issue 5 Pages 1248-1255
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Polar and Spherical Fourier analysis can be used to extract rotation-invariant features for image retrieval and pattern recognition tasks. They have been shown to outperform other methods in describing rotation-invariant features of two- and three-dimensional images. Based on mathematical properties of trigonometric functions and associated Legendre polynomials, fast algorithms are proposed to increase the computation speed for multimedia applications such as real-time systems and large multimedia databases. The symmetric points are computed simultaneously. Inspired by relative prime number theory, a systematic analysis is given in this paper, and a novel algorithm is derived that provides even faster speed. The proposed method is 9-15% faster than previous work. Experimental results on two- and three-dimensional images are given to illustrate the effectiveness of the proposed method. Multimedia signal processing applications that need real-time polar and spherical Fourier analysis can benefit from this work.
  • Kitti KOONSANIT, Chuleerat JARUSKULCHAI
    Article type: PAPER
    Subject area: Image Processing
    2012 Volume E95.D Issue 5 Pages 1256-1263
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Nowadays, clustering is a popular tool for exploratory data analysis, with one technique being K-means clustering. Determining the appropriate number of clusters is a significant problem in K-means clustering because the results of the K-means technique depend on the number of clusters chosen. The appropriate number of clusters is often needed in advance as an input parameter to the K-means algorithm, so its automatic determination is desirable. We propose a new method for automatic determination of the appropriate number of clusters using an extended co-occurrence matrix technique called a tri-co-occurrence matrix technique for multispectral imagery in the pre-clustering steps. The proposed method was tested using a dataset with a known number of clusters. The experimental results were compared with ground truth images and evaluated in terms of accuracy, with the numerical result of the tri-co-occurrence providing an accuracy of 84.86%. The results from the tests confirmed the effectiveness of the proposed method in finding the appropriate number of clusters and were compared with the original co-occurrence matrix technique and other algorithms.
  • Kenjiro SUGIMOTO, Koji INOUE, Yoshimitsu KUROKI, Sei-ichiro KAMATA
    Article type: PAPER
    Subject area: Image Processing
    2012 Volume E95.D Issue 5 Pages 1264-1271
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    This paper presents a color-based method for medicine package recognition, called a linear manifold color descriptor (LMCD). It describes a color distribution (a set of color pixels) of a color package image as a linear manifold (an affine subspace) in the color space, and recognizes an anonymous package by linear manifold matching. Mainly due to the low dimensionality of color spaces, LMCD can provide a more compact description and faster computation than description styles based on histograms and dominant colors. This paper also proposes distance-based dissimilarities for linear manifold matching. Specially designed for color distribution matching, the proposed dissimilarities are theoretically more appropriate than J-divergence and canonical angles. Experiments on medicine package recognition validate that LMCD outperforms competitors including MPEG-7 color descriptors in terms of description size, computational cost and recognition rate.
  • Toshiyuki UTO, Yuka TAKEMURA, Hidekazu KAMITANI, Kenji OHUE
    Article type: PAPER
    Subject area: Image Processing
    2012 Volume E95.D Issue 5 Pages 1272-1279
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    This paper describes a blind watermarking scheme through cyclic signal processing. Due to the spread of various high-speed networks, there is a growing demand for copyright protection of multimedia data. For efficient watermarking of images, there exist two major approaches: a quantization-based method and a correlation-based method. In this paper, we propose a correlation-based watermarking technique for three-dimensional (3-D) polygonal models using fast Fourier transforms (FFTs). To generate a watermark with desirable properties, similar to a pseudonoise signal, an impulse signal on a two-dimensional (2-D) space is spread through the FFT, multiplication by a complex sinusoid signal, and the inverse FFT. This watermark, i.e., the spread impulse signal, in a transform domain is converted to a spatial domain by an inverse wavelet transform, and embedded into 3-D data aligned by principal component analysis (PCA). In the detection procedure, after realigning the watermarked mesh model through PCA, we map the 3-D data onto the 2-D space via block segmentation and an averaging operation. The 2-D data are processed by the inverse system, i.e., the FFT, division by the complex sinusoid signal, and the inverse FFT. From the resulting 2-D signal, we detect the position of the maximum value as a signature. For 3-D bunny models, detection rates and information capacity are shown to evaluate the performance of the proposed method.
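The spread-and-detect round trip described in this abstract (transform, multiply by a complex signal, inverse transform; then transform, divide, inverse transform, and locate the maximum) can be illustrated with a small sketch. This is not the authors' implementation: the 8x8 grid, the naive DFT, the function names, and above all the quadratic-phase sequence `PHASE` standing in for the "complex sinusoid signal" are assumptions made for illustration, and the wavelet, PCA alignment, and 3-D embedding steps are omitted.

```python
import cmath

N = 8  # tiny grid so the naive DFT below stays fast (assumed size)

def dft2(x, inverse=False):
    """Naive separable 2-D DFT of an N x N list of lists."""
    n = len(x)
    sign = 1 if inverse else -1
    w = [[cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n)]
         for k in range(n)]
    # Transform along the column index, then along the row index.
    rows = [[sum(x[r][m] * w[k][m] for m in range(n)) for k in range(n)]
            for r in range(n)]
    out = [[sum(rows[m][c] * w[k][m] for m in range(n)) for c in range(n)]
           for k in range(n)]
    if inverse:
        out = [[v / (n * n) for v in row] for row in out]
    return out

# Hypothetical spreading sequence: unit-magnitude, quadratic phase.
PHASE = [[cmath.exp(1j * cmath.pi * (u * u + v * v) / N) for v in range(N)]
         for u in range(N)]

def spread_impulse(row, col):
    """Spread a unit impulse at (row, col): DFT, multiply, inverse DFT."""
    impulse = [[1.0 if (r, c) == (row, col) else 0.0 for c in range(N)]
               for r in range(N)]
    spec = dft2(impulse)
    spec = [[spec[u][v] * PHASE[u][v] for v in range(N)] for u in range(N)]
    return dft2(spec, inverse=True)

def detect_peak(signal):
    """Inverse system: DFT, divide, inverse DFT, then find the maximum."""
    spec = dft2(signal)
    spec = [[spec[u][v] / PHASE[u][v] for v in range(N)] for u in range(N)]
    rec = dft2(spec, inverse=True)
    cells = ((r, c) for r in range(N) for c in range(N))
    return max(cells, key=lambda rc: abs(rec[rc[0]][rc[1]]))
```

Because the assumed sequence has unit magnitude, the division in `detect_peak` exactly undoes the multiplication, so the impulse position (the embedded signature in this toy setting) is recovered in the noise-free case.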
  • Pengyi HAO, Sei-ichiro KAMATA
    Article type: PAPER
    Subject area: Video Processing
    2012 Volume E95.D Issue 5 Pages 1280-1287
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    We are interested in retrieving video shots or videos containing particular people from a video dataset. Owing to the large variations in pose, illumination conditions, occlusions, hairstyles and facial expressions, face tracks have recently been researched in the fields of face recognition, face retrieval and name labeling from videos. However, when the number of face tracks is very large, conventional methods, which match all or some pairs of faces in face tracks, will not be effective. Therefore, in this paper, an efficient method for finding a given person from a video dataset is presented. In our study, in addition to performing research on face tracks in a single video, we also consider how to organize all the faces in videos in a dataset and how to improve the search quality in the query process. Different videos may include the same person; thus, the management of individuals in different videos will be useful for their retrieval. The proposed method includes the following three points. (i) Face tracks of the same person appearing for a period in each video are first connected on the basis of scene information with a time constraint, then all the people in one video are organized by a proposed hierarchical clustering method. (ii) After obtaining the organizational structure of all the people in one video, the people are organized into an upper layer by affinity propagation. (iii) Finally, in the process of querying, a remeasuring method based on the index structure of videos is performed to improve the retrieval accuracy. We also build a video dataset that contains six types of videos: films, TV shows, educational videos, interviews, press conferences and domestic activities. The formation of face tracks in the six types of videos is first researched, then experiments are performed on this video dataset containing more than 1 million faces and 218,786 face tracks. The results show that the proposed approach has high search quality and a short search time.
  • Ichiro IDE, Tomoyoshi KINOSHITA, Tomokazu TAKAHASHI, Hiroshi MO, Norio ...
    Article type: PAPER
    Subject area: Video Processing
    2012 Volume E95.D Issue 5 Pages 1288-1300
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Recent advances in digital storage technology have enabled us to archive a large volume of video data. Thanks to this trend, we have archived more than 1,800 hours of video data from a daily Japanese news show in the last ten years. When considering the effective use of such a large news video archive, we assume that analysis of its chronological and semantic structure becomes important. We also consider that showing users how news topics develop is more important for helping them understand current affairs than providing a list of relevant news stories, as most current news video retrieval systems do. Therefore, in this paper, we propose a structuring method for a news video archive, together with an interface that visualizes the structure, so that users can efficiently track the development of news topics according to their interest. The proposed news video structure, namely the “topic thread structure”, is obtained as a result of an analysis of the chronological and semantic relation between news stories. Meanwhile, the proposed interface, namely “mediaWalker II”, allows users to track the development of news topics along the topic thread structure, and at the same time watch the video footage corresponding to each news story. Analyses of the topic thread structures obtained by applying the proposed method to actual news video footage revealed interesting and comprehensible relations between news topics in the real world. At the same time, analyses of their size quantified the efficiency of tracking a user's topic-of-interest based on the proposed topic thread structure. We consider this as a first step towards facilitating video authoring by users based on existing contents in a large-scale news video archive.
  • Takayuki NAKACHI, Kan TOYOSHIMA, Yoshihide TONOMURA, Tatsuya FUJII
    Article type: PAPER
    Subject area: Video Processing
    2012 Volume E95.D Issue 5 Pages 1301-1312
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    In this paper, we propose a layered multicast encryption scheme that provides flexible access control to motion JPEG2000 code streams. JPEG2000 generates layered code streams and offers flexible scalability in characteristics such as resolution and SNR. The layered multicast encryption proposal allows a sender to multicast the encrypted JPEG2000 code streams such that only designated groups of users can decrypt the layered code streams. While keeping the layering functionality, the proposed method offers useful properties such as 1) video quality control using only one private key, 2) guaranteed security, and 3) low computational complexity comparable to conventional non-layered encryption. Simulation results show the usefulness of the proposed method.
  • Lei SUN, Jie LENG, Jia SU, Yiqing HUANG, Hiroomi MOTOHASHI, Takeshi IK ...
    Article type: PAPER
    Subject area: Video Processing
    2012 Volume E95.D Issue 5 Pages 1313-1323
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Scalable Video Coding (SVC) was standardized as an extension of H.264/AVC with the intention of providing flexible adaptation to heterogeneous networks and different end-user requirements, which provides great scalability in multi-point applications such as video conferencing. However, due to the existence of H.264/AVC-based systems, transcoding between AVC and SVC becomes necessary. Most existing works focus on temporal transcoding, quality transcoding or SVC-to-AVC spatial transcoding, while the straightforward re-encoding method incurs a high computational cost. This paper proposes a low-complexity AVC-to-SVC spatial transcoder based on coarse-level mode mapping for video conferencing scenes. First, to omit unnecessary motion estimations (ME) for layers with reduced resolution, an ME skipping scheme based on AVC mode distribution is proposed with an adaptive search range. Then a probability-profile based scheme is proposed for further mode skipping. After that, three coarse-level mode-mapping methods are presented for fast mode decision, and the adaptive usage of the three methods is discussed. Finally, motion vector (MV) refinement is introduced for further lower-layer time reduction. As for the top layer, direct encapsulation is proposed to preserve better quality, and another scheme involving inter-layer predictions is also provided for bandwidth-crucial applications. Simulation results show that the proposed transcoder achieves up to 92.6% time reduction without significant coding efficiency loss compared to the re-encoding method.
  • Peng SONG, Shuhong XU, Wee Teck FONG, Ching Ling CHIN, Gim Guan CHUA, ...
    Article type: PAPER
    Subject area: Signal Processing
    2012 Volume E95.D Issue 5 Pages 1324-1331
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    The development of new technologies has undoubtedly promoted the advances of modern education, among which Virtual Reality (VR) technologies have made the education more visually accessible for students. However, classroom education has been the focus of VR applications whereas not much research has been done in promoting sports education using VR technologies. In this paper, an immersive VR system is designed and implemented to create a more intuitive and visual way of teaching tennis. A scalable system architecture is proposed in addition to the hardware setup layout, which can be used for various immersive interactive applications such as architecture walkthroughs, military training simulations, other sports game simulations, interactive theaters, and telepresent exhibitions. Realistic interaction experience is achieved through accurate and robust hybrid tracking technology, while the virtual human opponent is animated in real time using shader-based skin deformation. Potential future extensions are also discussed to improve the teaching/learning experience.
  • Jing WANG, Guangda SU
    Article type: LETTER
    Subject area: Image Processing
    2012 Volume E95.D Issue 5 Pages 1332-1335
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Sparse representation based classification (SRC) has emerged as a new paradigm for solving face recognition problems. Further research found that the main limitation of SRC is the assumption of pixel-accurate alignment between the test image and the training set. A. Wagner used a series of linear programs that iteratively minimize the sparsity of the registration error. In this paper, we propose another face registration method called three-point positioning method. Experiments show that our proposed method achieves better performance.
  • Chien-Sheng CHEN, Jium-Ming LIN, Wen-Hsiung LIU, Ching-Lung CHI
    Article type: LETTER
    Subject area: Signal Processing
    2012 Volume E95.D Issue 5 Pages 1336-1340
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    To achieve more accurate measurements of the mobile station (MS) location, it is possible to integrate many kinds of measurements. In this paper we propose several simpler methods that utilize the time of arrival (TOA) at three base stations (BSs) and the angle of arrival (AOA) information at the serving BS to estimate the location of the MS in non-line-of-sight (NLOS) environments. From a geometric viewpoint, each TOA value measured at a BS generates a circle. Rather than applying the nonlinear circular lines of position (LOP), the proposed methods use linear LOP, which makes determining the MS location much easier. Numerical results demonstrate that the calculation time using linear LOP is much less than that using circular LOP. Although the location precision is reduced slightly when linear LOP is used, the proposed methods can still provide a precise solution for the MS location while greatly reducing the computational effort. In addition, the proposed methods can mitigate the NLOS effect with less effort, simply by applying the weighted sum of the intersections between different linear LOP and the AOA line, without requiring a priori knowledge of NLOS error statistics. Simulation results show that the proposed methods can always yield superior performance in comparison with the Taylor series algorithm (TSA) and the hybrid lines of position algorithm (HLOP).
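To illustrate what a linear LOP is, the sketch below subtracts the circle equations of two BSs, which cancels the quadratic terms and leaves a straight line through the circles' intersection points; two such lines then meet at the MS. This is only the textbook noise-free construction under assumed names (`linear_lop`, `intersect`) and made-up coordinates. It does not include the AOA line, the weighted sum of intersections, or any NLOS mitigation from the paper.

```python
import math

def linear_lop(bs_i, r_i, bs_j, r_j):
    """Line a*x + b*y = c obtained by subtracting two TOA circle equations.

    Circle k: (x - xk)^2 + (y - yk)^2 = rk^2. Subtracting circle j from
    circle i cancels x^2 and y^2, leaving a linear equation in x and y.
    """
    (xi, yi), (xj, yj) = bs_i, bs_j
    a = 2.0 * (xj - xi)
    b = 2.0 * (yj - yi)
    c = r_i**2 - r_j**2 + xj**2 + yj**2 - xi**2 - yi**2
    return a, b, c

def intersect(line1, line2):
    """Intersection of two lines given as (a, b, c), via Cramer's rule."""
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel; base stations may be collinear")
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

def locate(bss, ranges):
    """Estimate the MS position from three BS positions and TOA ranges."""
    l12 = linear_lop(bss[0], ranges[0], bss[1], ranges[1])
    l13 = linear_lop(bss[0], ranges[0], bss[2], ranges[2])
    return intersect(l12, l13)

# Illustrative geometry (assumed values): three BSs and a true MS position.
BSS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
TRUE_MS = (3.0, 4.0)
RANGES = [math.dist(TRUE_MS, bs) for bs in BSS]
ESTIMATE = locate(BSS, RANGES)
```

Solving a 2x2 linear system replaces the nonlinear problem of intersecting circles, which is the source of the lower computation time the abstract reports for linear LOP.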
Special Section on Formal Approach
  • Shoji YUEN
    2012 Volume E95.D Issue 5 Pages 1341
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
  • Yasuhito ARIMOTO, Shusaku IIDA, Kokichi FUTATSUGI
    Article type: PAPER
    Subject area: Formal Methods
    2012 Volume E95.D Issue 5 Pages 1342-1354
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    It has been an important issue to deal with risks in business processes for achieving companies' goals. This paper introduces a method for applying a formal method to analysis of risks and control activities in business processes in order to evaluate control activities consistently, exhaustively, and to give us potential to have scientific discussion on the result of the evaluation. We focus on document flows in business activities and control activities and risks related to documents because documents play important roles in business. In our method, document flows including control activities are modeled and it is verified by OTS/CafeOBJ Method that risks about falsification of documents are avoided by control activities in the model. The verification is done by interaction between humans and CafeOBJ system with theorem proving, and it raises potential to discuss the result scientifically because the interaction gives us rigorous reasons why the result is derived from the verification.
  • Jefferson O. ANDRADE, Yukiyoshi KAMEYAMA
    Article type: PAPER
    Subject area: Model Checking
    2012 Volume E95.D Issue 5 Pages 1355-1364
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Multi-valued Model Checking extends classical, two-valued model checking to multi-valued logic such as Quasi-Boolean logic. The added expressivity is useful in dealing with such concepts as incompleteness and uncertainty in target systems, while it comes with the cost of time and space. Chechik and others proposed an efficient reduction from multi-valued model checking problems to two-valued ones, but to the authors' knowledge, no study was done for multi-valued bounded model checking. In this paper, we propose a novel, efficient algorithm for multi-valued bounded model checking. A notable feature of our algorithm is that it is not based on reduction of multi-values into two-values; instead, it generates a single formula which represents multi-valuedness by a suitable encoding, and asks a standard SAT solver to check its satisfiability. Our experimental results show a significant improvement in the number of variables and clauses and also in execution time compared with the reduction-based one.
  • Kenji HASHIMOTO, Hiroto KAWAI, Yasunori ISHIHARA, Toru FUJIWARA
    Article type: PAPER
    Subject area: Database Security
    2012 Volume E95.D Issue 5 Pages 1365-1374
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    This paper discusses verification of the security against inference attacks on XML databases in the presence of a functional dependency. So far, we have provided a verification method for k-secrecy, which is a metric for the security against inference attacks on databases. Intuitively, k-secrecy means that the number of candidates of sensitive data (i.e., the result of an unauthorized query) of a given database instance cannot be narrowed down to k-1 by using available information such as authorized queries and their results. In this paper, we consider a functional dependency on database instances as one piece of the available information. Functional dependencies help attackers to reduce the number of candidates for the sensitive information. The verification method we have provided cannot be naively extended to the k-secrecy problem with a functional dependency. The method requires that the candidate set can be captured by a tree automaton, but the candidate set when a functional dependency is considered cannot always be captured by a tree automaton. We show that the ∞-secrecy problem in the presence of a functional dependency is decidable when a given unauthorized query is represented by a deterministic top-down tree transducer, without explicitly computing the candidate set.
  • Shingo YAMAGUCHI
    Article type: LETTER
    Subject area: Formal Methods
    2012 Volume E95.D Issue 5 Pages 1375-1379
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    A workflow net (WF-net for short) is a Petri net which represents a workflow. There are two important subclasses of WF-nets: extended free-choice (EFC for short) and well-structured (WS for short). It is known that most actual workflows can be modeled as EFC WF-nets; Acyclic WS is a subclass of acyclic EFC but has more analysis methods. An acyclic EFC WF-net may be transformed to an acyclic WS WF-net without changing the external behavior of the net. We name such a transformation Acyclic EFC WF-net refactoring. We give a formal definition of acyclic EFC WF-net refactoring problem. We also give a necessary condition and a sufficient condition for solving the problem. Those conditions can be checked in polynomial time. These result in the enhancement of the analysis power of acyclic EFC WF-nets.
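As a small illustration of the extended free-choice (EFC) subclass mentioned in this abstract, the sketch below checks the usual textbook condition: whenever two places share an output transition, they must have exactly the same set of output transitions. The dictionary representation and the function name are assumptions for illustration. The check covers only the EFC property, not the WF-net conditions, well-structuredness, or the refactoring problem studied in the letter.

```python
def is_extended_free_choice(place_outputs):
    """Check the EFC condition on a Petri net.

    place_outputs maps each place name to the set of transitions it
    feeds (its post-set). The net is EFC when any two post-sets are
    either identical or disjoint.
    """
    posts = [frozenset(ts) for ts in place_outputs.values()]
    return all(a == b or not (a & b) for a in posts for b in posts)

# Two places that both feed t1 and t2: identical post-sets, so EFC holds.
EFC_NET = {"p1": {"t1", "t2"}, "p2": {"t1", "t2"}, "p3": {"t3"}}

# p1 and p2 share t2 but p1 also feeds t1: post-sets overlap without
# being equal, so the EFC condition fails.
NON_EFC_NET = {"p1": {"t1", "t2"}, "p2": {"t2"}}
```

The pairwise comparison is quadratic in the number of places, which is consistent with the abstract's remark that such structural conditions can be checked in polynomial time.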
  • Hyung Goo PAEK, Jeong Mo YEO, Kyong Hoon KIM, Wan Yeon LEE
    Article type: LETTER
    Subject area: System Analysis
    2012 Volume E95.D Issue 5 Pages 1380-1383
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    The proposed scheduling scheme minimizes the mean power consumption of real-time tasks with probabilistic computation amounts while meeting their deadlines. Our study formally solves the minimization problem under finitely many discrete clock frequencies with irregular power consumptions, whereas state-of-the-art studies did so under infinitely many continuous clock frequencies with regular power consumptions.
Regular Section
  • Woosung JUNG, Eunjoo LEE, Chisu WU
    Article type: SURVEY PAPER
    Subject area: Software Engineering
    2012 Volume E95.D Issue 5 Pages 1384-1406
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    This paper presents fundamental concepts, overall process and recent research issues of Mining Software Repositories. The data sources such as source control systems, bug tracking systems or archived communications, data types and techniques used for general MSR problems are also presented. Finally, evaluation approaches, opportunities and challenge issues are given.
    Download PDF (424K)
  • Yangjie CAO, Hongyang SUN, Depei QIAN, Weiguo WU
    Article type: PAPER
    Subject area: Fundamentals of Information Systems
    2012 Volume E95.D Issue 5 Pages 1407-1416
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    The proliferation of many-core architectures has led to the explosive development of parallel applications using programming models such as OpenMP, TBB, and Cilk/Cilk++. With an increasing number of cores, however, it becomes even harder to efficiently schedule parallel applications on these resources, since current many-core runtime systems still lack effective mechanisms to support collaborative scheduling of these applications. In this paper, we study feedback-driven adaptive scheduling based on work stealing, which provides an efficient solution for concurrently executing a set of applications on many-core systems. To dynamically estimate the number of cores desired by each application, a stable feedback-driven adaptive algorithm, called SAWS, is proposed using active workers and the length of active deques, which well captures the runtime characteristics of the applications. Furthermore, a prototype system is built by extending the Cilk runtime system, and the experimental results, which are obtained on a Sun Fire server, show that SAWS has more advantages for scheduling concurrent parallel applications. Specifically, compared with the existing algorithms A-Steal and WS-EQUI, SAWS improves performance by up to 12.43% and 21.32%, respectively, with respect to mean response time, and by up to 25.78% and 46.98%, respectively, with respect to processor utilization.
    Download PDF (622K)
  • Kuo-Yi CHEN, Chin-Yang LIN, Tien-Yan MA, Ting-Wei HOU
    Article type: PAPER
    Subject area: Software System
    2012 Volume E95.D Issue 5 Pages 1417-1426
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    With more digital home appliances and network devices having OSGi as the software management platform, the power-saving capability of the OSGi platform has become a critical issue. This paper is aimed at improving the power-efficiency of the OSGi platform, i.e. reducing the energy consumption with minimum performance degradation. The key to this study is an efficient power-saving technique which exploits the runtime information already available in a Java virtual machine (JVM), the base software of the OSGi platform, to best determine the timing of performing DVFS (Dynamic Voltage and Frequency Scaling). This, technically, involves a phase detection scheme that identifies the memory phase of the OSGi-enabled device/server in a correct and almost effortless way. The overhead of the power-saving procedure is thus minimized, and the system performance is well maintained. We have implemented and evaluated the proposed power-saving approach on an OSGi server, where the Apache Felix OSGi implementation and the DaCapo benchmarks were applied. The results show that this approach can achieve real power-efficiency for the OSGi platform, in which the power consumption is significantly reduced and the performance remains highly competitive, compared with the other power-saving techniques.
    Download PDF (1849K)
  • Chang-Sup PARK, Jun Pyo PARK, Yon Dohn CHUNG
    Article type: PAPER
    Subject area: Data Engineering, Web Information Systems
    2012 Volume E95.D Issue 5 Pages 1427-1435
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Wireless broadcasting of heterogeneous XML data has become popular in many applications, where energy-efficient processing of user queries at the mobile client is a critical issue. This paper proposes a new index structure for wireless stream of heterogeneous XML data to enhance tuning time performance in processing path queries on the stream. The index called PrefixSummary stores for each location path in the XML data the address of a bucket in the stream which contains an XML node satisfying the location path and appearing first in the stream. We present algorithms to generate broadcast stream with the proposed index and to process a path query on the stream efficiently by exploiting the index. We also suggest a replication scheme of PrefixSummary within a broadcast cycle to reduce latency in query processing. By analysis and experiment we show the proposed PrefixSummary approach can reduce tuning time for processing path queries significantly while it can also achieve reasonable access time performance by means of replication of the index over the broadcast stream.
    Download PDF (1025K)
  • Xuemin ZHAO, Yuhong GUO, Jian LIU, Yonghong YAN, Qiang FU
    Article type: PAPER
    Subject area: Information Network
    2012 Volume E95.D Issue 5 Pages 1436-1445
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    In this paper, a logarithmic adaptive quantization projection (LAQP) algorithm for digital watermarking is proposed. Conventional quantization index modulation uses a fixed quantization step in the watermarking embedding procedure, which leads to poor fidelity. Moreover, the conventional methods are sensitive to value-metric scaling attack. The LAQP method combines the quantization projection scheme with a perceptual model. In comparison to some conventional quantization methods with a perceptual model, the LAQP only needs to calculate the perceptual model in the embedding procedure, avoiding the decoding errors introduced by the difference of the perceptual model used in the embedding and decoding procedure. Experimental results show that the proposed watermarking scheme keeps a better fidelity and is robust against the common signal processing attack. More importantly, the proposed scheme is invariant to value-metric scaling attack.
    Download PDF (601K)
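    The fixed-step quantization index modulation (QIM) that the abstract above contrasts with LAQP can be sketched as follows. This is a minimal illustration of conventional scalar QIM only, not the LAQP algorithm; the function names `qim_embed` and `qim_decode`, the step size, and the sample values are assumptions chosen for the example.

```python
def qim_embed(x: float, bit: int, step: float) -> float:
    """Quantize x onto the lattice assigned to `bit`.

    Bit 0 uses multiples of `step`; bit 1 uses the same lattice
    shifted by step / 2 (a common textbook form of scalar QIM).
    """
    offset = bit * step / 2.0
    return step * round((x - offset) / step) + offset


def qim_decode(y: float, step: float) -> int:
    """Return the bit whose lattice has the point closest to y."""
    d0 = abs(y - qim_embed(y, 0, step))
    d1 = abs(y - qim_embed(y, 1, step))
    return 0 if d0 <= d1 else 1


if __name__ == "__main__":
    step = 4.0                          # fixed quantization step (assumed value)
    host = [10.3, -2.7, 7.9, 0.4]       # hypothetical host coefficients
    bits = [1, 0, 1, 1]                 # hypothetical watermark bits

    marked = [qim_embed(x, b, step) for x, b in zip(host, bits)]
    print("decoded, no attack:", [qim_decode(y, step) for y in marked])

    # Amplitude scaling moves the samples off their lattices, so a decoder
    # that still assumes the original step can pick the wrong bit.
    scaled = [1.4 * y for y in marked]
    print("decoded after scaling:", [qim_decode(y, step) for y in scaled])
```

    Because the decoder keeps using the original step, any gain applied to the marked signal shifts samples relative to both lattices, which is the scaling sensitivity the abstract attributes to conventional methods.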
  • HyunYong LEE, Masahiro YOSHIDA, Akihiro NAKAO
    Article type: PAPER
    Subject area: Information Network
    2012 Volume E95.D Issue 5 Pages 1446-1453
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Despite its great success, BitTorrent suffers from the content unavailability problem where peers cannot complete their content downloads due to some missing chunks, which is caused by a shortage of seeders who hold the content in its entirety. The multi-swarm collaboration approach is a natural choice for improving content availability, since content unavailability cannot be overcome by one swarm easily. Most existing multi-swarm collaboration approaches, however, suffer from content-related limitations, which limit their application scopes. In this paper, we introduce a new kind of multi-swarm collaboration utilizing a swarm as temporal storage. In a nutshell, the collaborating swarms cache some chunks of each other that are likely to be unavailable before the content unavailability happens and share the cached chunks when the content unavailability happens. Our approach enables any swarms to collaborate with each other without the content-related limitations. Simulation results show that our approach increases the number of download completions by over 50% (26%) compared to normal BitTorrent (existing bundling approach) with low overhead. In addition, our approach shows around 30% improved download completion time compared to the existing bundling approach. The results also show that our approach enables the peers participating in our approach to enjoy better performance than other peers, which can be a peer incentive.
    Download PDF (856K)
  • Kai LI, Yanmeng GUO, Qiang FU, Junfeng LI, Yonghong YAN
    Article type: PAPER
    Subject area: Speech and Hearing
    2012 Volume E95.D Issue 5 Pages 1454-1464
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Traditional two-microphone noise reduction algorithms that deal with highly nonstationary directional noises generally use the direction of arrival or phase difference information. The performance of these algorithms deteriorates when diffuse noises coexist with nonstationary directional noises in realistic adverse environments. In this paper, we present a two-channel noise reduction algorithm using a spatial-information-based speech estimator and a spatial-information-controlled soft-decision noise estimator to improve the noise reduction performance in realistic nonstationary noisy environments. A target presence probability estimator based on Bayes rules using both phase difference and magnitude squared coherence is proposed for soft-decision of noise estimation, so that they can share complementary advantages when both directional noises and diffuse noises are present. The performance of the proposed two-microphone noise reduction algorithm is evaluated by noise reduction, log-spectral distance (LSD) and word recognition rate (WRR) of a distant-talking ASR system in a real room's noisy environment. Experimental results show that the proposed algorithm achieves better noise suppression than the comparative dual-channel noise reduction algorithms without further distorting the desired signal components.
    Download PDF (1437K)
  • Takanobu OBA, Takaaki HORI, Atsushi NAKAMURA, Akinori ITO
    Article type: PAPER
    Subject area: Speech and Hearing
    2012 Volume E95.D Issue 5 Pages 1465-1474
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    This paper describes a technique for overcoming the model shrinkage problem in automatic speech recognition (ASR), which allows application developers and users to control the model size with less degradation of accuracy. Recently, models for ASR systems tend to be large, and this can constitute a bottleneck for developers and users without special knowledge of ASR with respect to introducing the ASR function. Specifically, discriminative language models (DLMs) are usually designed in a high-dimensional parameter space, although DLMs have gained increasing attention as an approach for improving recognition accuracy. Our proposed method can be applied to linear models including DLMs, in which the score of an input sample is given by the inner product of its features and the model parameters, and it can shrink models with an easy computation by obtaining simple statistics, which are square sums of feature values appearing in a data set. Our experimental results show that our proposed method can shrink a DLM with little degradation in accuracy and perform properly whether or not the data for obtaining the statistics are the same as the data for training the model.
    Download PDF (467K)
  • Nitin SINGHAL, Jin Woo YOO, Ho Yeol CHOI, In Kyu PARK
    Article type: PAPER
    Subject area: Image Processing and Video Processing
    2012 Volume E95.D Issue 5 Pages 1475-1484
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    In this paper, we analyze the key factors underlying the implementation, evaluation, and optimization of image processing and computer vision algorithms on embedded GPU using OpenGL ES 2.0 shader model. First, we present the characteristics of the embedded GPU and its inherent advantage when compared to embedded CPU. Additionally, we propose techniques to achieve increased performance with optimized shader design. To show the effectiveness of the proposed techniques, we employ cartoon-style non-photorealistic rendering (NPR), speeded-up robust feature (SURF) detection, and stereo matching as our example algorithms. Performance is evaluated in terms of the execution time and speed-up achieved in comparison with the implementation on embedded CPU.
    Download PDF (1631K)
  • Jong-Min LEE, Whoi-Yul KIM
    Article type: PAPER
    Subject area: Image Recognition, Computer Vision
    2012 Volume E95.D Issue 5 Pages 1485-1493
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Determining the rotation angle between two images is essential when comparing images that may include rotational variation. While there are three representative methods that utilize the phases of Zernike moments (ZMs) to estimate rotation angles, very little work has been done to compare the performances of these methods. In this paper, we compare the performances of these three methods and propose a new, angular radial transform (ART)-based method. Our method extends Revaud et al.'s method [1] and uses the phase of angular radial transform coefficients instead of ZMs. We show that our proposed method outperforms the ZM-based method using the MPEG-7 shape dataset when computation times are compared or in terms of the root mean square error vs. coverage.
    Download PDF (2089K)
  • Wei ZHOU, Alireza AHRARY, Sei-ichiro KAMATA
    Article type: PAPER
    Subject area: Image Recognition, Computer Vision
    2012 Volume E95.D Issue 5 Pages 1494-1505
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    In this paper, we propose a novel approach for representing the local features of a digital image using 1D Local Patterns by Multi-Scans (1DLPMS). We also consider the extensions and simplifications of the proposed approach for facial image analysis. The proposed approach consists of three steps. At the first step, the gray values of pixels in the image are represented as a vector giving the local neighborhood intensity distributions of the pixels. Then, multi-scans are applied to capture different spatial information on the image with the advantage of less computation than other traditional ways, such as Local Binary Patterns (LBP). The second step is encoding the local features based on different encoding rules using 1D local patterns. This transformation is expected to be less sensitive to illumination variations besides preserving the appearance of images embedded in the original gray scale. At the final step, Grouped 1D Local Patterns by Multi-Scans (G1DLPMS) is applied to make the proposed approach computationally simpler and easy to extend. Next, we further formulate a boosted algorithm to extract the most discriminant local features. The evaluated results demonstrate that the proposed approach outperforms the conventional approaches in terms of accuracy in applications of face recognition, gender estimation and facial expression.
    Download PDF (2594K)
  • Kazuhiro TOKUNAGA, Nobuyuki KAWABATA, Tetsuo FURUKAWA
    Article type: PAPER
    Subject area: Biocybernetics, Neurocomputing
    2012 Volume E95.D Issue 5 Pages 1506-1518
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    We propose a novel modular network called the Self-Evolving Modular Network (SEEM). The SEEM has a modular network architecture with a graph structure and the following advantages: (1) new modules are added incrementally to allow the network to adapt in a self-organizing manner, and (2) the graph's paths are formed based on the relationships between the models represented by modules. The SEEM is expected to be applicable to evolving functions of an autonomous robot in a self-organizing manner through interaction with the robot's environment and categorizing large-scale information. This paper presents the architecture and an algorithm for the SEEM. Moreover, the performance characteristics and effectiveness of the network are shown by simulations using cubic functions and a set of 3D objects.
    Download PDF (2221K)
  • Chaochao FENG, Zhonghai LU, Axel JANTSCH, Minxuan ZHANG
    Article type: LETTER
    Subject area: Computer System
    2012 Volume E95.D Issue 5 Pages 1519-1522
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    In this paper, we propose a 1-cycle high-performance 3D bufferless router with a 3-stage permutation network. The proposed router utilizes the 3-stage permutation network instead of the serialized switch allocator and 7×7 crossbar to achieve a frequency of 1.25 GHz in TSMC 65 nm technology. Compared with the other two 3D bufferless routers, the proposed router occupies less area and consumes less power. Simulation results under both synthetic and application workloads illustrate that the proposed router achieves lower average packet latency than the other two 3D bufferless routers.
    Download PDF (371K)
  • Hwan Sik YUN, Kiho CHO, Nam Soo KIM
    Article type: LETTER
    Subject area: Information Network
    2012 Volume E95.D Issue 5 Pages 1523-1526
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Acoustic data transmission is a technique which embeds data in a sound wave imperceptibly and detects it at a receiver. The data are embedded in an original audio signal and transmitted through the air by playing back the data-embedded audio using a loudspeaker. At the receiver, the data are extracted from the received audio signal captured by a microphone. In our previous work, we proposed an acoustic data transmission system designed based on phase modification of the modulated complex lapped transform (MCLT) coefficients. In this paper, we propose the spectral magnitude adjustment (SMA) technique which not only enhances the quality of the data-embedded audio signal but also improves the transmission performance of the system.
    Download PDF (301K)
  • Xiaodong DENG, Mengtian RONG, Tao LIU
    Article type: LETTER
    Subject area: Information Network
    2012 Volume E95.D Issue 5 Pages 1527-1530
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    As RFID technology is being more widely adopted, it is fairly common to read mobile tags using RFID systems, such as packages on a conveyor belt and unit loads on a pallet jack or forklift truck. In RFID systems, multiple tags use a shared medium for communicating with a reader. It is quite possible that tags will exit the reading area without being read, which results in tag leaking. In this letter, a reliable tag anti-collision algorithm for mobile tags is proposed. It reliably estimates the expectation of the number of tags arriving during a time slot when new tags continually enter the reader's reading area and no tag leaves without being read. In addition, it gives priority to tags that arrived early among read cycles and applies the expectation of the number of tags arriving during a time slot to the determination of the number of slots in the initial inventory round of the next read cycle. Simulation results show that the reliability of the proposed algorithm is close to that of the DFSA algorithm when the expectation of the number of tags entering the reading area during a time slot is given, and is better than that of the DFSA algorithm when the number of time slots in the initial inventory round of the next read cycle is set to 1, assuming that the number of tags arriving during a time slot follows a Poisson distribution.
    Download PDF (125K)
  • Jeonghun YOON, Dae-Won KIM
    Article type: LETTER
    Subject area: Artificial Intelligence, Data Mining
    2012 Volume E95.D Issue 5 Pages 1531-1535
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Classification based on predictive association rules (CPAR) is a widely used associative classification method. Despite its efficiency, the analysis results obtained by CPAR will be influenced by missing values in the data sets, and thus it is not always possible to correctly analyze the classification results. In this letter, we improve CPAR to deal with the problem of missing data. The effectiveness of the proposed method is demonstrated using various classification examples.
    Download PDF (241K)
  • Shi-Ze GUO, Zhe-Ming LU, Guang-Yu KANG, Zhe CHEN, Hao LUO
    Article type: LETTER
    Subject area: Artificial Intelligence, Data Mining
    2012 Volume E95.D Issue 5 Pages 1536-1538
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Small-world is a common property existing in many real-life social, technological and biological networks. Small-world networks distinguish themselves from others by their high clustering coefficient and short average path length. In the past dozen years, many probabilistic small-world networks and some deterministic small-world networks have been proposed utilizing various mechanisms. In this Letter, we propose a new deterministic small-world network model by first constructing a binary-tree structure and then adding links between each pair of brother nodes and links between each grandfather node and its four grandson nodes. Furthermore, we give the analytic solutions to several topological characteristics, which show that the proposed model is a small-world network.
    Download PDF (95K)
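    The construction described in the abstract above (binary tree, brother links, grandfather-to-grandson links) can be sketched numerically. This is an illustrative sketch under assumptions: it assumes a complete binary tree with heap-style node numbering, and the function names are hypothetical. The Letter itself gives analytic solutions, which this code does not reproduce.

```python
from collections import deque


def build_network(depth: int) -> dict[int, set[int]]:
    """Complete binary tree of the given depth (root = node 1, children of v
    are 2v and 2v + 1), plus brother links and grandfather-grandson links."""
    n = 2 ** (depth + 1) - 1
    adj: dict[int, set[int]] = {v: set() for v in range(1, n + 1)}

    def link(a: int, b: int) -> None:
        if a <= n and b <= n:
            adj[a].add(b)
            adj[b].add(a)

    for v in range(1, n + 1):
        link(v, 2 * v)               # tree edge to left child
        link(v, 2 * v + 1)           # tree edge to right child
        link(2 * v, 2 * v + 1)       # link between the two brothers
        for g in range(4 * v, 4 * v + 4):
            link(v, g)               # grandfather to each of four grandsons
    return adj


def average_clustering(adj: dict[int, set[int]]) -> float:
    """Mean local clustering coefficient over all nodes."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj: dict[int, set[int]]) -> float:
    """Mean shortest-path length over all node pairs, via BFS from each node."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


if __name__ == "__main__":
    for depth in (2, 4, 6, 8):
        net = build_network(depth)
        print(depth, len(net), average_clustering(net), average_path_length(net))
```

    Printing both quantities for growing depths gives a quick empirical check of the two small-world properties the abstract names, high clustering and short average path length.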
  • Kwanho KIM, Jae-Yoon JUNG, Jonghun PARK
    Article type: LETTER
    Subject area: Office Information Systems, e-Business Modeling
    2012 Volume E95.D Issue 5 Pages 1539-1542
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Information diffusion analysis in social networks is of significance since it enables us to deeply understand dynamic social interactions among users. In this paper, we introduce approaches to discovering the information diffusion process in social networks based on process mining. Process mining techniques are applied from three perspectives: social network analysis, process discovery and community recognition. We then present experimental results using real-life social network data. The proposed techniques are expected to be employed as new analytical tools in online social networks such as blogs and wikis by company marketers, politicians, news reporters and online writers.
    Download PDF (354K)
  • Jin Soo SEO
    Article type: LETTER
    Subject area: Speech and Hearing
    2012 Volume E95.D Issue 5 Pages 1543-1546
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Speaker change detection involves the identification of the time indices of an audio stream, where the identity of the speaker changes. This paper proposes novel measures for speaker change detection over the centroid model, which divides the feature space into non-overlapping clusters for effective speaker-change comparison. The centroid model is a computationally-efficient variant of the widely-used mixture-distribution based background models for speaker recognition. Experiments on both synthetic and real-world data were performed; the results show that the proposed approach yields promising results compared with the conventional statistical measures.
    Download PDF (132K)
  • Cagatay KARABAT, Hakan ERDOGAN
    Article type: LETTER
    Subject area: Image Recognition, Computer Vision
    2012 Volume E95.D Issue 5 Pages 1547-1551
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Face image hashing is an emerging method used in biometric verification systems. In this paper, we propose a novel face image hashing method based on a new technique called discriminative projection selection. We apply the Fisher criterion for selecting the rows of a random projection matrix in a user-dependent fashion. Moreover, another contribution of this paper is to employ a bimodal Gaussian mixture model at the quantization step. Our simulation results on three different databases demonstrate that the proposed method has superior performance in comparison to previously proposed random projection based methods.
    Download PDF (191K)
  • Chenbo SHI, Guijin WANG, Xiaokang PEI, Bei HE, Xinggang LIN
    Article type: LETTER
    Subject area: Image Recognition, Computer Vision
    2012 Volume E95.D Issue 5 Pages 1552-1555
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    In this paper, we propose an interleaving updating framework of disparity and confidence map (IUFDCM) for stereo matching to eliminate the redundant and interfering information from unreliable pixels. Compared with other propagation algorithms that use matching cost as messages, IUFDCM instead updates the disparity map and the confidence map in an interleaving manner. Based on the Confidence-based Support Window (CSW), the disparity map is updated adaptively to alleviate the effect of input parameters. The reassignment for unreliable pixels with larger probability keeps ground truth depending on reliable messages. Consequently, the confidence map is updated according to the previous disparity map and the left-right consistency. The top ranks on the Middlebury benchmark corresponding to different error thresholds demonstrate that our algorithm is competitive with the best stereo matching algorithms at present.
    Download PDF (837K)
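    The left-right consistency mentioned in the abstract above is commonly checked as sketched below. This is a generic version of the standard check, not the IUFDCM confidence update itself; the function name, the tolerance parameter, and the convention that a left pixel at column x with disparity d matches right column x - d are assumptions.

```python
def left_right_consistent(disp_left, disp_right, tol=1):
    """Return a boolean mask marking pixels of the left disparity map that
    agree with the right disparity map.

    A left pixel (y, x) with disparity d is assumed to match the right pixel
    (y, x - d); it is consistent when the two disparities differ by at most
    `tol`. Pixels whose match falls outside the image are marked False.
    """
    height, width = len(disp_left), len(disp_left[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d = disp_left[y][x]
            xr = x - d
            if 0 <= xr < width:
                mask[y][x] = abs(d - disp_right[y][xr]) <= tol
    return mask


if __name__ == "__main__":
    # Hypothetical one-row integer disparity maps.
    left = [[0, 1, 1, 2]]
    right = [[1, 1, 0, 0]]
    print(left_right_consistent(left, right, tol=0))
```

    Pixels that fail this check are typically the ones treated as unreliable, which is where a confidence map of the kind the abstract describes would come into play.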
  • Hong BAO, De XU, Yingjun TANG
    Article type: LETTER
    Subject area: Image Recognition, Computer Vision
    2012 Volume E95.D Issue 5 Pages 1556-1559
    Published: May 01, 2012
    Released on J-STAGE: May 01, 2012
    JOURNAL FREE ACCESS
    Visual saliency detection provides an alternative methodology to image description in many applications such as adaptive content delivery and image retrieval. One of the main aims of visual attention in computer vision is to detect and segment the salient regions in an image. In this paper, we employ matrix decomposition to detect salient objects in natural images. To efficiently eliminate high-contrast noise regions in the background, we integrate global context information into saliency detection. Therefore, the most salient region can be easily selected as the one which is globally most isolated. The proposed approach intrinsically provides an alternative methodology to model attention with low implementation complexity. Experiments show that our approach achieves much better performance than the existing state-of-the-art methods.
    Download PDF (868K)