IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E94.D, Issue 12
Showing 1-34 articles out of 34 articles from the selected issue
Special Section on Parallel and Distributed Computing and Networking
  • Shuichi ICHIKAWA
    2011 Volume E94.D Issue 12 Pages 2297
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Download PDF (53K)
  • Hideki MIWA, Ryutaro SUSUKITA, Hidetomo SHIBAMURA, Tomoya HIRAO, Jun M ...
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2298-2308
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In the near future, interconnection networks of massively parallel computer systems will connect more than a hundred thousand computing nodes. Performance evaluation of these interconnection networks can provide real insight to guide the development of efficient communication libraries. Evaluating such networks therefore requires simulation tools that model the networks in sufficient detail, offer a user-friendly interface for describing communication patterns, give users enough performance information, and complete simulations within a reasonable time. This paper introduces NSIM, a novel simulator for evaluating the performance of extreme-scale interconnection networks. The simulator implements a simplified simulation model so as to run faster without loss of accuracy. Unlike existing simulators, NSIM is built on an execution-driven simulation approach and provides an MPI-compatible programming interface. The simulator can therefore emulate parallel program execution and correctly simulate point-to-point and collective communications that change dynamically with network congestion. The experimental results in this paper show the simulator's accuracy to be sufficient by comparing it with a real machine. We also confirmed that the simulator can evaluate ultra-large-scale interconnection networks, consumes less memory, and runs faster than an existing simulator. This paper also introduces a simulation service built on a cloud environment: without installing NSIM, users can simulate interconnection networks with various configurations from a web browser.
    Download PDF (1545K)
  • Jung-Lok YU, Hee-Jung BYUN
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2309-2318
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Coscheduling has seen a resurgence of interest as an effective technique for enhancing the performance of parallel applications in multi-programmed clusters. However, existing coscheduling schemes do not adequately handle priority boost conflicts, leading to significantly degraded performance. To address this problem, in our previous study we devised a novel algorithm that reorders the scheduling sequence of conflicting processes based on the rescheduling latency of their correspondents in remote nodes. In this paper, we exhaustively explore the design issues and implementation details of our contention-aware coscheduling scheme on a Myrinet-based cluster system. We also analyze in practice the impact of various system parameters and job characteristics on the performance of all considered schemes on a heterogeneous Linux cluster, using a generic coscheduling framework. The results show that our approach outperforms existing schemes (by up to 36.6% in average job response time), reducing both the boost conflict ratio and overall message delay.
    Download PDF (1004K)
  • Junichi OHMURA, Takefumi MIYOSHI, Hidetsugu IRIE, Tsutomu YOSHINAGA
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2319-2327
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In this paper, we propose an approach to enhancing the performance of the Linpack benchmark on a GPU-accelerated PC cluster connected via relatively slow inter-node links. For one node with a quad-core Intel Xeon W3520 processor and an NVIDIA Tesla C1060 GPU card, we implement a CPU-GPU parallel double-precision general matrix-matrix multiplication (dgemm) operation and achieve a performance improvement of 34% over the GPU-only case and 64% over the CPU-only case. For an entire 16-node cluster, each node of which is the same as above and is connected by two gigabit Ethernet links, we use a computation-communication overlap scheme with GPU acceleration for the Linpack benchmark and achieve a performance improvement of 28% over the GPU-accelerated high-performance Linpack benchmark (HPL) without overlapping. In our solution, the main inter-node communication and the data transfer to GPU device memory are overlapped with the main computation task on the CPU cores. These overlaps exploit the multi-core processors found in almost all of today's high-performance computers: as well as dedicating a CPU core to communication tasks, we simultaneously use the other CPU cores and the GPU for computation tasks. To enable overlap between inter-node communication and computation, we eliminate their close dependence by breaking the main computation task into smaller tasks and rescheduling them. Based on this scheme, in which part of the CPU computation power is simultaneously used for tasks other than computation, we experimentally find the optimal computation ratio for the CPUs; this ratio differs from that of the single-node parallel dgemm operation.
    Download PDF (638K)
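A rough sketch of the task-splitting idea behind the overlap above (illustrative Python, not the authors' HPL code): the product is broken into independent column-block tiles, so each tile update could be scheduled between or alongside communication tasks. The tile width and the plain triple-loop kernel are arbitrary choices for the example.

```python
def matmul(a, b):
    """Plain triple-loop product for small lists-of-lists matrices."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for p in range(k):
            aip = a[i][p]
            for j in range(m):
                c[i][j] += aip * b[p][j]
    return c

def matmul_tiled(a, b, tile):
    """Same product, with B processed in column blocks of width `tile`.
    Each block update is an independent sub-task, which is what makes
    overlap with inter-node communication possible."""
    n, m = len(a), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for j0 in range(0, m, tile):
        j1 = min(j0 + tile, m)
        block = [row[j0:j1] for row in b]   # this sub-task's slice of B
        partial = matmul(a, block)          # could run while another core communicates
        for i in range(n):
            c[i][j0:j1] = partial[i]
    return c
```

Because every tile touches a disjoint slice of the result, the tiles can be reordered or interleaved with transfer tasks without changing the answer.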
  • Pulung WASKITO, Shinobu MIWA, Yasue MITSUKURA, Hironori NAKAJO
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2328-2337
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In off-line analysis, the demand for high-precision signal processing has introduced a method called Empirical Mode Decomposition (EMD), which is used for analyzing complex data sets. Unfortunately, EMD is highly compute-intensive. In this paper, we present a parallel implementation of EMD on a GPU. We propose a “partial+total” switching method to increase performance while preserving precision. We also focus on reducing the computational complexity of the above method from O(N) on a single CPU to O(N/P log(N)) on a GPU. Evaluation results show that our single-GPU implementation on a Tesla C2050 (Fermi architecture) achieves a 29.9x speedup in the partial case and an 11.8x speedup in the total case compared with a single Intel dual-core CPU.
    Download PDF (911K)
  • Sho ENDO, Jun SONODA, Motoyuki SATO, Takafumi AOKI
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2338-2344
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The finite-difference time-domain (FDTD) method has been accelerated on the Cell Broadband Engine (Cell B.E.). However, a problem has arisen: in large-scale analysis, speedup is limited by the bandwidth of the main memory. As described in this paper, we propose a novel algorithm and implement FDTD using it. We compared the novel algorithm with results obtained using region segmentation, demonstrating that the proposed algorithm yields shorter calculation times than region segmentation.
    Download PDF (405K)
  • Ling XU, Ryusuke EGAWA, Hiroyuki TAKIZAWA, Hiroaki KOBAYASHI
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2345-2352
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The social network model has been regarded as a promising mechanism for defending against Sybil attacks. This model assumes that honest peers and Sybil peers are connected by only a small number of attack edges, so detecting the attack edges plays a key role in restraining the power of Sybil peers. In this paper, an attack-resistant distributed algorithm, named Random walk and Social network model-based Clustering (RSC), is proposed to detect the attack edges. In RSC, peers disseminate random-walk packets to each other; for each edge, the number of times the packets pass that edge reflects its betweenness. RSC exploits the observation that the betweenness of attack edges is higher than that of non-attack edges, so the attack edges can be identified. To show the effectiveness of RSC, it is integrated into an existing social network model-based algorithm called SOHL. Simulations with real-world social network datasets show that RSC remarkably improves the performance of SOHL.
    Download PDF (2095K)
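As a toy illustration of the packet-dissemination step (not the paper's protocol; the graph, walk count, and walk length here are invented), the sketch below launches fixed-length random walks from every peer and counts edge traversals. RSC ranks edges by such counts; how well the counts separate attack edges depends on topology and walk parameters, which is what the paper studies.

```python
import random

def edge_traversal_counts(adj, walks=50, length=6, seed=1):
    """Launch `walks` random walks of `length` steps from every node and
    count how many times each undirected edge is crossed; RSC uses such
    counts as a betweenness estimate for flagging attack edges."""
    rng = random.Random(seed)
    counts = {}
    for start in adj:
        for _ in range(walks):
            node = start
            for _ in range(length):
                nxt = rng.choice(adj[node])
                edge = tuple(sorted((node, nxt)))
                counts[edge] = counts.get(edge, 0) + 1
                node = nxt
    return counts
```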
  • Chunghan LEE, Hirotake ABE, Toshio HIROTSU, Kyoji UMEMURA
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2353-2361
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Network testbeds have been used for network measurement and experiments. In such testbeds, resources such as CPU, memory, and I/O interfaces are shared and virtualized to maximize node utility for many users. A few studies have investigated the impact of virtualization on precise network measurement and on Internet traffic characteristics observed on virtualized testbeds. Although scheduling latency and heavy loads reportedly affect precise network measurement, no clear conditions or criteria have been established. Moreover, because the virtualization technology used in a provided testbed can hardly be replaced, empirical-statistical criteria and methods that identify anomalous cases for precise network experiments are required in userland. In this paper, we show that ‘oversize packet spacing’, which can be caused by CPU scheduling latency, is a major cause of throughput instability on a virtualized network testbed, even when no significant changes occur in well-known network metrics. These anomalies are peculiar to virtualized network environments. Our empirical-statistical analysis results accord with those of previous work. If network throughput is decreased by these anomalies, the measurement results should be reviewed carefully. Our empirical approach enables such anomalous cases to be identified, and we present CPU availability as an important criterion for estimating the anomalies.
    Download PDF (1101K)
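The ‘oversize packet spacing’ signal lends itself to a very simple userland filter. The sketch below is our illustration only, with an arbitrary threshold factor rather than the paper's empirical-statistical criterion: it flags inter-packet gaps far larger than the nominal spacing.

```python
def oversize_spacing_ratio(timestamps, nominal_gap, factor=3.0):
    """Return the fraction of inter-packet gaps exceeding `factor`
    times the nominal spacing -- the kind of gap that CPU scheduling
    latency injects into an otherwise smooth packet train."""
    gaps = [t2 - t1 for t1, t2 in zip(timestamps, timestamps[1:])]
    oversize = [g for g in gaps if g > factor * nominal_gap]
    return len(oversize) / len(gaps)
```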
  • Ryusuke UEDERA, Satoshi FUJITA
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2362-2369
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In this paper, we consider Peer-to-Peer Video-on-Demand (P2P VoD) systems based on the BitTorrent file-sharing protocol. Since the Rarest First policy adopted in the original BitTorrent protocol frequently fails to collect the pieces of a video file by their playback times, we need a new piece-selection rule designed specifically for P2P VoD. In the proposed scheme, we assume a media server that can upload any piece upon request and try to bound its load with two techniques. The first is to estimate which pieces are not held by any peer and prefetch them from the media server. The second is to switch the mode of each peer according to the estimated size of the P2P network. The performance of the proposed scheme is evaluated by simulation.
    Download PDF (911K)
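A minimal sketch of the two techniques above (our toy model; the piece ids, deadlines, and window size are invented for illustration): urgent missing pieces held by no peer become server prefetch candidates, while the rest are requested from the swarm in deadline order.

```python
def select_piece(have, deadlines, now, swarm_availability, window=5):
    """Pick the next piece to request from peers, and list urgent pieces
    that no peer holds (candidates to prefetch from the media server).
    `deadlines` maps piece id -> playback time; `swarm_availability`
    maps piece id -> number of peers holding it."""
    missing = [p for p in sorted(deadlines) if p not in have]
    urgent = [p for p in missing if deadlines[p] - now <= window]
    from_server = [p for p in urgent if swarm_availability.get(p, 0) == 0]
    from_peers = [p for p in urgent if swarm_availability.get(p, 0) > 0]
    next_piece = from_peers[0] if from_peers else None
    return next_piece, from_server
```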
  • Yao-Hung WU, Wei-Mei CHEN
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2370-2377
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Wireless sensor networks are composed of several sensor nodes that communicate via wireless technology. Locating the sensor nodes is a fundamental problem in developing applications for wireless sensor networks. In this paper, we introduce a distributed localization scheme, called the Rectangle Overlapping Approach (ROA), that uses a mobile beacon equipped with GPS and a directional antenna. Node locations are computed by simple operations that rely on the rotation angle and position of the mobile beacon. Simulation results show that the proposed scheme is very efficient and that node positions can be determined accurately when the beacon follows a random waypoint movement model.
    Download PDF (864K)
  • Md. Nazrul Islam MONDAL, Koji NAKANO, Yasuaki ITO
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2378-2388
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Most FPGAs have Configurable Logic Blocks (CLBs) to implement combinational and sequential circuits, and block RAMs to implement Random Access Memories (RAMs) and Read-Only Memories (ROMs). Designing a circuit that minimizes the number of clock cycles is easy if asynchronous read operations can be used. However, most FPGAs support only synchronous read operations, not asynchronous ones. The main contribution of this paper is a practical approach to resolving this problem. We assume that a circuit using asynchronous ROMs, designed by a non-expert or quickly designed by an expert, is given. Our goal is to convert this circuit into an equivalent circuit with synchronous ROMs, which can then be embedded into FPGAs. We also discuss several techniques for decreasing the latency and increasing the clock frequency of the resulting circuits.
    Download PDF (477K)
  • Wan Yeon LEE, Hyogon KIM, Heejo LEE
    Type: LETTER
    2011 Volume E94.D Issue 12 Pages 2389-2392
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The proposed scheduling scheme minimizes the energy consumption of a real-time task on a multi-core processor with dynamic voltage and frequency scaling capability. The scheme allocates a pertinent number of cores to the task execution, deactivates unused cores, and assigns the lowest frequency that meets the deadline. For a periodic real-time task with consecutive real-time instances, the scheme prepares the minimum-energy solutions for all input cases offline and applies one of the prepared solutions to each real-time instance at runtime.
    Download PDF (308K)
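The core-allocation and frequency-assignment step can be pictured as a small search over (cores, frequency) pairs. This is only a sketch under a textbook DVFS model (active power cubic in frequency, linear speedup by default, idle cores switched off); the paper's actual task and energy models may differ.

```python
def min_energy_config(cycles, deadline, freqs, max_cores, speedup=lambda m: m):
    """Enumerate (cores, frequency) pairs, discard those missing the
    deadline, and return the (energy, cores, frequency) triple with the
    lowest modeled energy, or None if no pair meets the deadline."""
    best = None
    for m in range(1, max_cores + 1):
        for f in freqs:
            t = cycles / (f * speedup(m))   # parallel execution time
            if t > deadline:
                continue
            energy = m * (f ** 3) * t       # m active cores, power ~ f^3
            if best is None or energy < best[0]:
                best = (energy, m, f)
    return best
```

Precomputing this table for every input case offline and looking the result up per real-time instance mirrors the prepare-then-apply structure described in the abstract.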
Regular Section
  • Takeyuki TAMURA, Yang CONG, Tatsuya AKUTSU, Wai-Ki CHING
    Type: PAPER
    Subject area: Fundamentals of Information Systems
    2011 Volume E94.D Issue 12 Pages 2393-2399
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The impact degree is a measure of the robustness of a metabolic network against the deletion of one or more reactions. Although such a measure is useful for mining important enzymes/genes, it was previously defined only for networks without cycles. In this paper, we extend the impact degree to metabolic networks containing cycles and develop a simple algorithm to calculate it. Furthermore, we improve this algorithm to reduce the computation time of the impact degree under deletions of multiple reactions. We applied our method to the metabolic network of E. coli, including reference pathways, consisting of 3281 reaction nodes and 2444 compound nodes downloaded from the KEGG database, and calculated the distribution of the impact degree. Our computational experiments show that the improved algorithm is 18.4 times faster than the simple algorithm for deletion of reaction pairs and 11.4 times faster for deletion of reaction triplets. We also enumerate genes with high impact degrees for single and multiple reaction deletions.
    Download PDF (422K)
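The flavor of the impact-degree computation can be sketched as a fixpoint over reaction activity. This is our simplified semantics only (a reaction needs all its substrates; a compound is available if it is a source or has at least one active producer); the paper's treatment of cycles and its faster multiple-deletion algorithm are more involved.

```python
def impact_degree(reactions, sources, deleted):
    """Count reactions knocked out (deleted ones included) after
    propagating a deletion to a fixpoint.
    `reactions` maps name -> (substrates, products)."""
    active = {r for r in reactions if r not in deleted}
    changed = True
    while changed:
        available = set(sources)
        for r in active:
            available.update(reactions[r][1])   # products of running reactions
        still = {r for r in active if set(reactions[r][0]) <= available}
        changed = still != active
        active = still
    return len(reactions) - len(active)
```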
  • Hisayoshi KANO, Shingo YOSHIZAWA, Takashi GUNJI, Shougo OKAMOTO, Morio ...
    Type: PAPER
    Subject area: Computer System
    2011 Volume E94.D Issue 12 Pages 2400-2408
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The IEEE 802.11ac task group has announced the use of a wider channel that extends the channel bandwidth to more than 80MHz. We present an experimental platform consisting of a baseband unit and an RF unit in a 2×2 MIMO-OFDM system for the wider channel and report system performance results from a field experiment. The MIMO-OFDM transceiver in the baseband unit performs real-time MIMO detection and provides a maximum data rate of 600Mbps. OFDM tends to exhibit a high peak-to-average power ratio (PAPR) on wider channels, which distorts the power amplifier output in the RF unit. We have mitigated this non-linear distortion by optimizing the OFDM preamble and evaluated the performance through a simulation that integrates baseband processing and the RF unit. In the field experiment, our platform was tested for communication performance in a farm and in a passage environment.
    Download PDF (3724K)
  • Yong-Luo SHEN, Seok-Jae KIM, Sang-Woo SEO, Hyun-Goo LEE, Hyeong-Cheol ...
    Type: PAPER
    Subject area: Computer System
    2011 Volume E94.D Issue 12 Pages 2409-2417
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    This paper introduces a hardware engine for rendering two-dimensional vector graphics based on the OpenVG standard in portable devices. We focus on two design challenges posed by such rendering engines: the number of vertices needed to represent the images and the amount of memory used. Redundant vertices are eliminated using adaptive tessellation, in which redundancy is judged by a proposed cost-per-quality measure. A simplified edge-flag rendering algorithm and a scanline-based rendering scheme are adopted to reduce external memory access. The designed rendering engine occupies approximately 173K gates and satisfies the real-time requirements of many applications when implemented in a 0.18µm, 1.8V CMOS standard cell library. An FPGA prototype using a system-on-a-chip platform has been developed and tested.
    Download PDF (1989K)
  • Kazunori SAKAMOTO, Fuyuki ISHIKAWA, Hironori WASHIZAKI, Yoshiaki FUKAZ ...
    Type: PAPER
    Subject area: Software Engineering
    2011 Volume E94.D Issue 12 Pages 2418-2430
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Test coverage is an important indicator of whether software has been sufficiently tested. However, existing measurement tools for test coverage have several problems, such as their cost of development and maintenance, inconsistency, and inflexibility in measurement. We propose a consistent and flexible measurement framework for test coverage that we call the Open Code Coverage Framework (OCCF). It supports multiple programming languages by extracting their commonalities through an abstract syntax tree, which helps in developing test-coverage measurement tools for new programming languages. OCCF allows users to add programming-language support independently of the test-coverage criteria, and to add test-coverage-criterion support independently of the programming languages, so that measurements are consistent across languages. Moreover, OCCF provides two methods for making more flexible measurements: changing the measured ranges and elements using XPath, and adding user code. We implemented a sample tool for C, Java, and Python using OCCF that can measure four test-coverage criteria, and we confirmed that OCCF can also support C#, Ruby, JavaScript, and Lua. Moreover, in an experiment comparing OCCF with conventional non-framework-based tools, OCCF reduced the lines of code (LOC) required to implement test-coverage measurement tools by approximately 90% and the time to implement a new test-coverage criterion by over 80%.
    Download PDF (935K)
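For a feel of what a statement-coverage measurement boils down to, here is a minimal Python-only probe using the interpreter's trace hook. OCCF itself works on an abstract syntax tree shared across languages, so this illustrates only the measured quantity, not the framework; `branchy` is an invented example function.

```python
import sys

def measure_line_coverage(func, *args):
    """Run `func` and record which lines of its body execute, as
    offsets from the `def` line."""
    covered = set()
    code = func.__code__

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is code:
            covered.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return covered

def branchy(x):
    if x > 0:           # offset 1
        return "pos"    # offset 2
    return "neg"        # offset 3
```

Running `branchy` with a positive argument covers offsets 1 and 2 but not 3; a statement-coverage criterion asks whether the union over all tests covers every offset.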
  • Yuqing LAN, Mingxia KUANG, Wenbin ZHOU
    Type: PAPER
    Subject area: Software Engineering
    2011 Volume E94.D Issue 12 Pages 2431-2439
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    A Linux operating system release is composed of a large number of software packages with complex dependencies. Managing these dependency relationships is the foundation of building and maintaining a Linux release, and checking the integrity of the dependencies is the key to dependency management. The widespread adoption of Linux in many areas of information technology has drawn attention to the issues of how to check the integrity of the complex dependencies of Linux packages and how to manage a huge number of packages in a consistent and effective way. Linux distributions already provide tools for installing, removing, and upgrading the packages they are made of, and a number of tools handle these tasks on the client side. However, there is a lack of tools that help distribution editors maintain the integrity of Linux package dependencies on the server side. In this paper, we present a conflict-based method for checking the integrity of Linux package dependencies. From the perspective of conflicts, this method checks the integrity of package dependencies on the server side by removing the conflicts associated with the packages. Our contribution provides an effective and automatic way to support distribution editors in handling these issues. Experiments using this method have been very successful in checking the integrity of package dependencies in Linux software distributions.
    Download PDF (339K)
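A toy server-side integrity check in the spirit described above (our own simplification: packages are dicts of `deps` and `conflicts` name lists; real package metadata with versions and virtual provides is far richer):

```python
def check_integrity(packages):
    """Report dependencies that name no package in the repository, and
    dependency pairs where either side declares a conflict with the
    other. Returns (missing_deps, conflicting_pairs)."""
    missing, conflicting = [], []
    for name, meta in packages.items():
        for dep in meta.get("deps", []):
            if dep not in packages:
                missing.append((name, dep))
            elif name in packages[dep].get("conflicts", []) \
                    or dep in meta.get("conflicts", []):
                conflicting.append((name, dep))
    return missing, conflicting
```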
  • Shayma ALKOBAISI, Wan D. BAE, Sada NARAYANAPPA
    Type: PAPER
    Subject area: Data Engineering, Web Information Systems
    2011 Volume E94.D Issue 12 Pages 2440-2459
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The increase in advanced location-based services such as traffic coordination and management necessitates advanced models for tracking the positions of moving objects (MOs) such as vehicles. Because of processing limitations, it is impossible for MOs to update their locations continuously, which makes an MO's location between any two reported positions inherently uncertain. Efficiently managing and quantifying the uncertainty regions of MOs is needed in order to support different types of queries and to improve query response time. This challenging problem of modeling the uncertainty regions associated with MOs was recently addressed by researchers, resulting in models that range from linear ones, which require only a few properties of MOs as input, to non-linear ones, which can represent uncertainty regions more accurately by considering higher-degree input. This paper summarizes and discusses approaches to modeling the uncertainty regions associated with MOs. It further illustrates the need for appropriate approximations, especially for non-linear models, as the uncertainty regions become rather irregularly shaped and difficult to manage. Finally, we demonstrate through several experimental sets the advantage of non-linear models over linear models when the uncertainty regions of MOs are approximated by two different approximations: the Minimum Bounding Box (MBB) and the Tilted Minimum Bounding Box (TMBB).
    Download PDF (1537K)
  • HyunYong LEE, Akihiro NAKAO
    Type: PAPER
    Subject area: Information Network
    2011 Volume E94.D Issue 12 Pages 2460-2467
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In network operator-friendly P2P traffic control techniques such as P4P, peers are supposed to select their communication partners by following guidance issued by the network operator. The guidance thus has a significant impact on traffic control; however, a detailed performance study of the available guidances is missing. Most existing approaches focus on inter-domain traffic control and do not show in detail how they affect intra-domain traffic control. In this paper, we try to understand how the guidances affect intra- and inter-domain traffic control, with the aim of better guidance that improves traffic control. Through simulations, we reveal the following. Performance-based guidance, which reflects the network status, shows attractive results in distributing traffic over intra-domain links and in reducing the cross-domain traffic and the charging volume of inter-domain links, compared to distance-based guidance, which enforces simple localization. However, performance-based guidance has one limitation that can cause unstable traffic control. To overcome this limitation, we propose a peer-assisted measurement and traffic estimation approach, which we then verify through simulations.
    Download PDF (840K)
  • Morihiro HAYASHIDA, Tatsuya AKUTSU
    Type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2011 Volume E94.D Issue 12 Pages 2468-2478
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Several compression-based methods have been developed for measuring the similarity of biological sequences and structures such as DNA sequences, protein sequences, and tertiary structures. However, they are based on compression algorithms designed only for sequential data. Protein structures, for instance, can be represented by two-dimensional distance matrices, so image compression is expected to be useful for measuring the similarity of protein structures because image compression algorithms compress data both horizontally and vertically. This paper proposes a series of methods for measuring the similarity of protein structures. In these methods, an original protein structure is transformed into a distance matrix, which is regarded as a two-dimensional image, and the similarity of two protein structures is measured by a kind of compression ratio of the concatenated image. We employed several image compression algorithms: JPEG, GIF, PNG, IFS, and SPC. Since SPC often gave better results than the other image compression methods, and it is simple and easy to modify, we modified SPC to obtain MSPC. We applied the proposed methods to clustering of protein structures and performed Receiver Operating Characteristic (ROC) analysis. The results of computational experiments suggest that MSPC has the best performance among existing compression-based methods. We also present some theoretical results on the time complexity and Kolmogorov complexity of image compression-based protein structure comparison.
    Download PDF (895K)
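The "compression ratio of the concatenation" idea is usually formalized as the normalized compression distance (NCD). The sketch below uses zlib on byte strings purely to illustrate the measure itself; the paper instead applies image codecs (JPEG, GIF, PNG, IFS, SPC/MSPC) to distance-matrix images.

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: small when compressing the
    concatenation is barely harder than compressing the parts."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)
```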
  • Teruyoshi SASAYAMA, Tetsuo KOBAYASHI
    Type: PAPER
    Subject area: Human-computer Interaction
    2011 Volume E94.D Issue 12 Pages 2479-2486
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    We developed a novel movement-imagery-based brain-computer interface (BCI) for untrained subjects that does not employ machine learning techniques. The development of the BCI consisted of several steps. First, spline Laplacian analysis was performed. Next, time-frequency analysis was applied to determine the optimal frequency range and latencies of the electroencephalograms (EEGs). Finally, trials were classified as right or left based on β-band event-related synchronization, using the cumulative distribution function of pre-trigger EEG noise. To test the performance of the BCI, EEGs during the execution and imagination of right/left wrist-bending movements were measured at 63 locations over the entire scalp in eight healthy subjects. The highest classification accuracies were 84.4% and 77.8% for real movements and their imageries, respectively. This accuracy is significantly higher than that of previously reported machine-learning-based BCIs on movement imagery tasks (paired t-test, p < 0.05). Notably, the highest accuracy was achieved even though the subjects had never participated in movement imageries before.
    Download PDF (712K)
  • Katsutoshi UEAOKI, Kazunori IWATA, Nobuo SUEMATSU, Akira HAYASHI
    Type: PAPER
    Subject area: Pattern Recognition
    2011 Volume E94.D Issue 12 Pages 2487-2494
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    A two-dimensional shape is generally represented by line drawings or object contours in a digital image. Shapes can be divided into two types, ordered and unordered: an ordered shape is an ordered set of points, while an unordered shape is an unordered set. As a result, each type typically uses different attributes to define the local descriptors that represent the local distributions of points sampled from the shape. Throughout this paper, we focus on unordered shapes. Since most local descriptors of unordered shapes are not scale-invariant, the shapes in an image data set are usually made the same size through scale normalization before shape-matching procedures are applied. Shapes obtained through scale normalization are suitable for such descriptors if the original whole shapes are similar, but not if parts of each original shape are drawn at different scales. Thus, in this paper, we present a scale-invariant descriptor constructed from von Mises distributions to deal with such shapes. Since this descriptor has the merits of being both scale-invariant and a probability distribution, it does not require scale normalization and can employ an arbitrary measure on probability distributions when matching shape points. In experiments on shape matching and retrieval, we show the effectiveness of our descriptor compared to several conventional descriptors.
    Download PDF (700K)
  • Tomoko NARIAI, Kazuyo TANAKA
    Type: PAPER
    Subject area: Speech and Hearing
    2011 Volume E94.D Issue 12 Pages 2495-2502
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Certain irregularities often occur in the utterance of words or phrases in English spoken by native Japanese subjects, referred to in this article as Japanese English. Japanese English is linguistically presumed to reflect the phonetic characteristics of Japanese. We consider prosodic feature patterns to be one of the most common causes of irregularities in Japanese English, and Japanese English would have better prosodic patterns if its particular characteristics were modified. This study investigates prosodic differences between Japanese English and English speakers' English and presents the quantitative results of a statistical analysis of pitch. The analysis leads to rules for modifying Japanese English so that its pitch patterns are closer to those of English speakers. On the basis of these rules, the pitch patterns of test speech samples of Japanese English were modified and then re-synthesized. The modified speech was evaluated in a listening experiment by native English subjects. On average, the English subjects supported the proposed modification over the original speech by more than three to one, which provides practical verification of the validity of the rules. Additionally, the results suggest that irregularities of prominence remain in Japanese English sentences, which can be explained by the transfer of first-language prosodic characteristics onto second-language prosodic patterns.
    Download PDF (677K)
  • Omid DEHZANGI, Bin MA, Eng Siong CHNG, Haizhou LI
    Type: PAPER
    Subject area: Speech and Hearing
    2011 Volume E94.D Issue 12 Pages 2503-2512
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    This paper investigates a new method for fusing the scores generated by multiple classification sub-systems to further reduce the classification error rate in Spoken Language Recognition (SLR). In recent studies, a variety of effective classification algorithms have been developed for SLR. Hence, it has been common practice in the National Institute of Standards and Technology (NIST) Language Recognition Evaluations (LREs) to fuse the results from several classification sub-systems to boost the performance of SLR systems. In this work, we introduce a discriminative performance measure to optimize the fusion of the 7 language classifiers developed as IIR's submission to the 2009 NIST LRE. We present an Error-Corrective Fusion (ECF) method in which we iteratively learn the fusion weights so as to minimize the error rate of the fusion system. Experiments conducted on the 2009 NIST LRE corpus demonstrate a significant improvement over the individual sub-systems. A comparison study is also conducted to show the effectiveness of the ECF method.
    Download PDF (1044K)
  • Tsung-Han TSAI, Chung-Yuan LIN
    Type: PAPER
    Subject area: Image Recognition, Computer Vision
    2011 Volume E94.D Issue 12 Pages 2513-2522
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Emerging video surveillance technologies rely on foreground detection to achieve automatic event detection. Integrating foreground detection into a modern multi-camera surveillance system can significantly increase surveillance efficiency. However, foreground detection carries a high computational load and increases the cost of the surveillance system when a mass deployment of end cameras is needed. This paper proposes a DSP-based foreground detection algorithm. Our algorithm incorporates a temporal data correlation predictor (TDCP), which exploits the correlation of the data to reduce computation. Building on the DSP-oriented foreground detection, an adaptive frame rate control is developed as a low-cost solution for multi-camera surveillance systems. The adaptive frame rate control automatically detects the computational load of foreground detection across multiple video sources and adaptively tunes the TDCP to meet the real-time specification. Therefore, no additional hardware cost is required as the number of deployed cameras increases. Our method has been validated on a demonstration platform, achieving real-time CIF frame processing for a 16-camera surveillance system on a single DSP chip. Quantitative evaluation demonstrates that our solution provides a satisfactory detection rate while significantly reducing hardware cost.
    Download PDF (3694K)
  • Won-young CHUNG, Jae-won PARK, Seung-Woo LEE, Won Woo RO, Yong-surk LE ...
    Type: LETTER
    Subject area: Computer System
    2011 Volume E94.D Issue 12 Pages 2523-2527
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The message passing interface (MPI) broadcast communication commonly causes a severe performance bottleneck in multicore systems that use distributed memory. Thus, in this paper, we propose a novel algorithm and hardware structure for MPI broadcast communication to alleviate this bottleneck. The transmission order is set based on the state of each processing node in the multicore system, so the algorithm minimizes the performance degradation caused by conflicts. The proposed scoreboard MPI unit is evaluated by modeling it in SystemC and is implemented in Verilog HDL. The scoreboard MPI unit occupies less than 1.03% of the whole chip and improves performance by up to 75.48% with 16 processing nodes. Hence, with respect to low-cost design and scalability, the scoreboard MPI unit is particularly useful for increasing the overall performance of embedded MPSoCs.
    Download PDF (848K)
  • Seungjae BAEK, Heekwon PARK, Jongmoo CHOI
    Type: LETTER
    Subject area: Software System
    2011 Volume E94.D Issue 12 Pages 2528-2532
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In this paper, we propose three techniques that improve the performance of YAFFS (Yet Another Flash File System) while enhancing the reliability of the system. Specifically, we first propose managing metadata and user data separately on segregated blocks. This modification reduces both the mount time and the garbage collection time. Second, we tailor wear-leveling to the segregated metadata and user data blocks: worn-out blocks are swapped between the segregated groups, which wears blocks more evenly and increases the lifetime of the system. Finally, we devise an analytic model to predict the expected garbage collection time. By accurately predicting the garbage collection time, the system can perform garbage collection at opportune moments when the user's perceived performance is not negatively affected. Performance evaluation results based on real implementations show that our modifications enhance performance and reliability without incurring additional overhead. Specifically, YAFFS with our proposed techniques outperforms the original YAFFS by six times in mount speed and five times in benchmark performance, while reducing the average erase count of blocks by 14%.
    Download PDF (940K)
  • Yong-Jun YOU, Sung-Do CHI, Jae-Ick KIM
    Type: LETTER
    Subject area: Artificial Intelligence, Data Mining
    2011 Volume E94.D Issue 12 Pages 2533-2536
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In most existing warship combat simulation systems, the tactics of a warship are controlled by human operators. For this reason, the simulation results are limited by the capabilities of those operators. To address this, we employ a genetic algorithm to support an evolutionary simulation environment, in which tactical decisions by human operators are replaced by a human model with a rule-based chromosome representing tactics. A population of simulations is created, and hundreds of simulation runs proceed under the genetic algorithm without any human intervention until emergent tactics showing the best performance are found. Several simulation tests demonstrate the proposed techniques.
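    A minimal sketch of the evolutionary loop the letter describes, under strong simplifying assumptions: chromosomes are bit vectors of tactical rules, and the simulation-based performance measure is abstracted into a caller-supplied `fitness` function. This is a generic GA skeleton, not the authors' simulator.

    ```python
    import random

    def evolve(fitness, n_rules=8, pop_size=20, gens=30, seed=0):
        """Toy GA over rule-based chromosomes (one bit per tactical rule):
        keep the top half, breed the rest by one-point crossover, and
        apply occasional bit-flip mutation."""
        rng = random.Random(seed)
        pop = [[rng.randint(0, 1) for _ in range(n_rules)]
               for _ in range(pop_size)]
        for _ in range(gens):
            elite = sorted(pop, key=fitness, reverse=True)[:pop_size // 2]
            children = []
            while len(elite) + len(children) < pop_size:
                a, b = rng.sample(elite, 2)
                cut = rng.randrange(1, n_rules)
                child = a[:cut] + b[cut:]          # one-point crossover
                if rng.random() < 0.1:             # bit-flip mutation
                    child[rng.randrange(n_rules)] ^= 1
                children.append(child)
            pop = elite + children                 # elitist replacement
        return max(pop, key=fitness)
    ```

    Elitism guarantees the best chromosome found so far is never lost between generations.
    
    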
    Download PDF (1182K)
  • Yu Gwang JIN, Nam Soo KIM, Joon-Hyuk CHANG
    Type: LETTER
    Subject area: Speech and Hearing
    2011 Volume E94.D Issue 12 Pages 2537-2540
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In this letter, we propose a novel speech enhancement algorithm based on data-driven residual gain estimation. The system consists of two stages. In the first stage, a conventional speech enhancement algorithm enhances the input signal while estimating several signal-to-noise ratio (SNR)-related parameters. In the second stage, a residual gain, estimated by a data-driven method, is applied to further enhance the signal. Experimental results show that the proposed algorithm outperforms both the conventional speech enhancement technique based on soft decision and the data-driven approach using an SNR grid look-up table.
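    The two-stage structure can be sketched as follows. The Wiener-style first-stage gain, the SNR quantization, and the table indexing are illustrative assumptions; the letter's residual table would be trained offline by the data-driven method, so here it is simply passed in.

    ```python
    import numpy as np

    def two_stage_enhance(spec, noise_psd, residual_table, n_bins=10):
        """Hypothetical two-stage gain: a conventional Wiener-style gain,
        followed by a data-driven residual gain looked up from a table
        indexed by the quantized a-priori SNR."""
        # Stage 1: conventional gain from an a-priori SNR estimate.
        snr = np.maximum(np.abs(spec) ** 2 / noise_psd - 1.0, 1e-3)
        g1 = snr / (1.0 + snr)
        # Stage 2: residual gain from a pre-trained lookup table,
        # indexed by SNR in dB quantized into n_bins cells over [-20, 20].
        idx = np.clip((10 * np.log10(snr) + 20) / 40 * n_bins,
                      0, n_bins - 1).astype(int)
        g2 = residual_table[idx]
        return spec * g1 * g2
    ```

    With a residual table of all ones, the second stage is a no-op and the output reduces to the first-stage enhancement alone.
    
    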
    Download PDF (174K)
  • Bei HE, Guijin WANG, Xinggang LIN, Chenbo SHI, Chunxiao LIU
    Type: LETTER
    Subject area: Image Processing and Video Processing
    2011 Volume E94.D Issue 12 Pages 2541-2544
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    This paper proposes a high-accuracy sub-pixel registration framework based on phase correlation for noisy images. First, we introduce a denoising module that adopts an edge-preserving filter. This strategy not only filters out the noise but also preserves most of the original image signal. A confidence-weighted optimization module is then proposed to fit the linear phase plane discriminatively and obtain the sub-pixel shifts. Experiments demonstrate the effectiveness of the combined modules and show improved accuracy and robustness against noise compared with other sub-pixel phase correlation methods in the Fourier domain.
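    The core idea of fitting the linear phase plane with confidence weights can be sketched as below. Using spectral magnitude as the confidence weight and restricting the fit to low frequencies (to sidestep phase wrapping) are assumptions of this sketch, not necessarily the paper's exact weighting or frequency selection.

    ```python
    import numpy as np

    def phase_plane_shift(a, b, radius=5):
        """Estimate the (dy, dx) translation taking image a to b by a
        confidence-weighted least-squares fit of the cross-power-spectrum
        phase plane, restricted to low frequencies to avoid phase wrapping."""
        N, M = a.shape
        R = np.fft.fft2(a) * np.conj(np.fft.fft2(b))   # cross-power spectrum
        u = np.broadcast_to(np.fft.fftfreq(N)[:, None], (N, M))  # cycles/pixel
        v = np.broadcast_to(np.fft.fftfreq(M)[None, :], (N, M))
        mask = (np.abs(u) * N <= radius) & (np.abs(v) * M <= radius)
        mask[0, 0] = False                     # drop the DC bin
        phi = np.angle(R)[mask]                # phase samples to fit
        w = np.abs(R)[mask]                    # spectral magnitude as confidence
        X = 2 * np.pi * np.stack([u[mask], v[mask]], axis=1)
        # weighted normal equations: shift = (X^T W X)^{-1} X^T W phi
        WX = X * w[:, None]
        dy, dx = np.linalg.solve(X.T @ WX, WX.T @ phi)
        return dy, dx
    ```

    For a pure translation, the cross-power-spectrum phase is exactly the plane 2π(u·dy + v·dx), so the weighted fit recovers sub-pixel shifts directly from its slope.
    
    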
    Download PDF (853K)
  • Xin HE, Huiyun JING, Qi HAN, Xiamu NIU
    Type: LETTER
    Subject area: Image Recognition, Computer Vision
    2011 Volume E94.D Issue 12 Pages 2545-2548
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    We propose a novel saliency detection model based on Bayes' theorem. The model integrates the two parts of Bayes' equation to measure saliency, each of which was considered separately in previous models. The proposed model measures saliency by computing a local kernel density estimate of features in the center-surround region and a global kernel density estimate of features at each pixel across the whole image. Under the proposed model, a saliency detection method is presented that extracts the DCT (Discrete Cosine Transform) magnitude of the local region around each pixel as the feature. Experiments show that the proposed model not only performs competitively on psychological patterns and better than current state-of-the-art models on human visual fixation data, but is also robust against signal uncertainty.
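    A toy sketch of the local-plus-global density idea, using a scalar per-pixel feature instead of the paper's DCT magnitudes, and a simple "rare both locally and globally is salient" combination rather than the paper's exact Bayesian formulation; all names and parameters are illustrative.

    ```python
    import numpy as np

    def kde(x, samples, bw=0.1):
        """Gaussian kernel density estimate of scalar x under 1-D samples."""
        z = (x - samples) / bw
        return np.exp(-0.5 * z ** 2).mean() / (bw * np.sqrt(2 * np.pi))

    def saliency_sketch(feat, win=3, bw=0.1):
        """Toy per-pixel saliency combining local (center-surround) and
        global feature rarity, each measured by kernel density estimation."""
        H, W = feat.shape
        glob = feat.ravel()
        sal = np.empty_like(feat)
        for i in range(H):
            for j in range(W):
                local = feat[max(0, i - win):i + win + 1,
                             max(0, j - win):j + win + 1].ravel()
                p_local = kde(feat[i, j], local, bw)    # density in the surround
                p_global = kde(feat[i, j], glob, bw)    # density over the image
                # a feature that is rare both locally and globally is salient
                sal[i, j] = 1.0 / (p_local * p_global + 1e-12)
        return sal
    ```

    On a flat image with a single outlier pixel, both density estimates are low at the outlier, so the sketch assigns it the highest saliency.
    
    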
    Download PDF (1147K)
  • Quan MIAO, Guijin WANG, Xinggang LIN
    Type: LETTER
    Subject area: Image Recognition, Computer Vision
    2011 Volume E94.D Issue 12 Pages 2549-2552
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Object tracking is a major technique in image processing and computer vision, and tracking speed directly determines the quality of applications. This paper presents a parallel implementation of a recently proposed scale- and rotation-invariant on-line object tracking system. The implementation targets NVIDIA Graphics Processing Units (GPUs) using the Compute Unified Device Architecture (CUDA), following the single-instruction, multiple-thread model. Specifically, we analyze the original algorithm and propose a GPU-based parallel design, with emphasis on exploiting data parallelism and memory usage. In addition, we apply optimization techniques to maximize the utilization of the GPU and reduce data transfer time. Experimental results show that our GPU-based method running on a GTX480 graphics card achieves up to a 12X speed-up over an equivalent implementation on an Intel E8400 3.0GHz CPU, including I/O time.
    Download PDF (792K)
  • Bong-Soo SOHN
    Type: LETTER
    Subject area: Computer Graphics
    2011 Volume E94.D Issue 12 Pages 2553-2556
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    We describe an efficient algorithm that extracts a connected component of an isosurface, or a contour, from 3D rectilinear volume data. The efficiency of the algorithm derives from three factors: (i) working directly with rectilinear grids, (ii) parallel use of a multi-core CPU to extract the active cells (the cells containing the contour), and (iii) parallel use of a many-core GPU, via CUDA, to compute the geometry of the contour surface in each active cell. Experimental results show that our hybrid parallel implementation achieves up to a 20x speedup over existing methods on an ordinary PC. Coupled with the Contour Tree framework, our work is useful for quickly segmenting, displaying, and analyzing a feature of interest in 3D rectilinear volume data without being distracted by other features.
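    The two data-parallel phases can be illustrated sequentially as follows: an active-cell test over the rectilinear grid (the part the paper runs on the multi-core CPU) and a connected-component walk that keeps only the cells of the single contour through a seed (the per-cell geometry computed on the GPU is omitted). This is a NumPy/BFS stand-in, not the authors' parallel implementation.

    ```python
    from collections import deque
    import numpy as np

    def active_cells(vol, iso):
        """Flag cells whose 8 corner values straddle the isovalue,
        i.e. the cells that contain a piece of the isosurface."""
        Z, Y, X = vol.shape
        corners = [vol[dz:Z - 1 + dz, dy:Y - 1 + dy, dx:X - 1 + dx]
                   for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)]
        lo = np.minimum.reduce(corners)
        hi = np.maximum.reduce(corners)
        return (lo <= iso) & (iso <= hi)

    def contour_component(active, seed):
        """BFS over face-adjacent active cells: the set of cells holding
        the single connected contour through the seed cell."""
        comp = np.zeros_like(active)
        if not active[seed]:
            return comp
        comp[seed] = True
        q = deque([seed])
        Z, Y, X = active.shape
        while q:
            z, y, x = q.popleft()
            for dz, dy, dx in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                               (0, -1, 0), (0, 0, 1), (0, 0, -1)):
                n = (z + dz, y + dy, x + dx)
                if (0 <= n[0] < Z and 0 <= n[1] < Y and 0 <= n[2] < X
                        and active[n] and not comp[n]):
                    comp[n] = True
                    q.append(n)
        return comp
    ```

    Because the walk only follows face-adjacent active cells, a second, disconnected isosurface component elsewhere in the volume is never visited.
    
    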
    Download PDF (1660K)
  • Woong-Kee LOH, Heejune AHN
    Type: LETTER
    Subject area: Biological Engineering
    2011 Volume E94.D Issue 12 Pages 2557-2560
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The suffix tree is one of the most widely adopted indexes for genome sequence alignment. Although it supports very fast alignment, it has a couple of shortcomings, such as a very long construction time and a very large size. Loh et al. [7] proposed a suffix tree construction algorithm with dramatically improved performance; however, the size remains a challenging problem. We propose an algorithm that extends the one by Loh et al. to reduce the suffix tree size. In our experiments, our algorithm constructed a suffix tree of approximately 60% of the original size in almost the same time.
    Download PDF (405K)