IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E94.D, Issue 12
Showing 1-34 articles out of 34 articles from the selected issue
Special Section on Parallel and Distributed Computing and Networking
  • Shuichi ICHIKAWA
    2011 Volume E94.D Issue 12 Pages 2297
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Download PDF (53K)
  • Hideki MIWA, Ryutaro SUSUKITA, Hidetomo SHIBAMURA, Tomoya HIRAO, Jun M ...
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2298-2308
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In the near future, interconnection networks of massively parallel computer systems will connect more than a hundred thousand computing nodes. Performance evaluation of these interconnection networks can provide real insight to guide the development of efficient communication libraries. Evaluating such networks therefore requires simulation tools that model the networks in sufficient detail, offer a user-friendly interface for describing communication patterns, give users enough performance information, and complete simulations within a reasonable time. This paper introduces NSIM, a novel simulator for evaluating the performance of extreme-scale interconnection networks. The simulator implements a simplified simulation model so as to run faster without loss of accuracy. Unlike existing simulators, NSIM is built on an execution-driven simulation approach and provides an MPI-compatible programming interface. The simulator can therefore emulate parallel program execution and correctly simulate point-to-point and collective communications that change dynamically with network congestion. The experimental results in this paper show the simulator's accuracy to be sufficient by comparing it with a real machine. We also confirmed that the simulator can evaluate ultra-large-scale interconnection networks, consumes less memory, and runs faster than an existing simulator. This paper also introduces a simulation service built on a cloud environment: without installing NSIM, users can simulate interconnection networks with various configurations from a web browser.
    Download PDF (1545K)
  • Jung-Lok YU, Hee-Jung BYUN
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2309-2318
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Coscheduling has seen a resurgence of interest as an effective technique for enhancing the performance of parallel applications in multi-programmed clusters. However, existing coscheduling schemes do not adequately handle priority boost conflicts, leading to significantly degraded performance. To address this problem, in our previous study we devised a novel algorithm that reorders the scheduling sequence of conflicting processes based on the rescheduling latency of their correspondents in remote nodes. In this paper, we exhaustively explore the design issues and implementation details of our contention-aware coscheduling scheme on a Myrinet-based cluster system. We also analyze in practice the impact of various system parameters and job characteristics on the performance of all considered schemes on a heterogeneous Linux cluster, using a generic coscheduling framework. The results show that our approach outperforms existing schemes (by up to 36.6% in average job response time), reducing both the boost conflict ratio and overall message delay.
    Download PDF (1004K)
  • Junichi OHMURA, Takefumi MIYOSHI, Hidetsugu IRIE, Tsutomu YOSHINAGA
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2319-2327
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In this paper, we propose an approach to enhancing the performance of the Linpack benchmark on a GPU-accelerated PC cluster connected via relatively slow inter-node links. For one node with a quad-core Intel Xeon W3520 processor and an NVIDIA Tesla C1060 GPU card, we implement a CPU-GPU parallel double-precision general matrix-matrix multiplication (dgemm) operation and achieve a performance improvement of 34% over the GPU-only case and 64% over the CPU-only case. For an entire 16-node cluster, each node of which is the same as above and is connected by two gigabit Ethernet links, we use a computation-communication overlap scheme with GPU acceleration for the Linpack benchmark and achieve a performance improvement of 28% over the GPU-accelerated high-performance Linpack benchmark (HPL) without overlapping. In our solution, the main inter-node communication and the data transfer to GPU device memory are overlapped with the main computation task on the CPU cores. These overlaps exploit the multi-core processors found in almost all of today's high-performance computers: as well as dedicating a CPU core to communication tasks, we simultaneously use the other CPU cores and the GPU for computation tasks. To enable overlap between inter-node communication and computation, we eliminate their close dependence by breaking the main computation task into smaller tasks and rescheduling them. Based on this scheme, in which part of the CPU computation power is simultaneously used for tasks other than computation, we experimentally find the optimal computation ratio for the CPUs; this ratio differs from that of the single-node parallel dgemm operation.
    Download PDF (638K)
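A rough sketch of the task-splitting idea behind the overlap above (illustrative Python, not the authors' HPL code): the product is broken into independent column-block tiles, so each tile update could be scheduled between or alongside communication tasks. The tile width and the plain triple-loop kernel are arbitrary choices for the example.

```python
def matmul(a, b):
    """Plain triple-loop product for small lists-of-lists matrices."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for p in range(k):
            aip = a[i][p]
            for j in range(m):
                c[i][j] += aip * b[p][j]
    return c

def matmul_tiled(a, b, tile):
    """Same product, with B processed in column blocks of width `tile`.
    Each block update is an independent sub-task, which is what makes
    overlap with inter-node communication possible."""
    n, m = len(a), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for j0 in range(0, m, tile):
        j1 = min(j0 + tile, m)
        block = [row[j0:j1] for row in b]   # this sub-task's slice of B
        partial = matmul(a, block)          # could run while another core communicates
        for i in range(n):
            c[i][j0:j1] = partial[i]
    return c
```

Because every tile touches a disjoint slice of the result, the tiles can be reordered or interleaved with transfer tasks without changing the answer.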
  • Pulung WASKITO, Shinobu MIWA, Yasue MITSUKURA, Hironori NAKAJO
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2328-2337
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In off-line analysis, the demand for high-precision signal processing has introduced a method called Empirical Mode Decomposition (EMD), which is used for analyzing complex data sets. Unfortunately, EMD is highly compute-intensive. In this paper, we present a parallel implementation of EMD on a GPU. We propose a “partial+total” switching method to increase performance while preserving precision. We also focus on reducing the computational complexity of the above method from O(N) on a single CPU to O(N/P log(N)) on a GPU. Evaluation results show that our single-GPU implementation on a Tesla C2050 (Fermi architecture) achieves a 29.9x speedup in the partial case and an 11.8x speedup in the total case compared with a single Intel dual-core CPU.
    Download PDF (911K)
  • Sho ENDO, Jun SONODA, Motoyuki SATO, Takafumi AOKI
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2338-2344
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The finite-difference time-domain (FDTD) method has been accelerated on the Cell Broadband Engine (Cell B.E.). However, a problem has arisen: in large-scale analysis, speedup is limited by the bandwidth of the main memory. As described in this paper, we propose a novel algorithm and implement FDTD using it. We compared the novel algorithm with results obtained using region segmentation, demonstrating that the proposed algorithm yields shorter calculation times than region segmentation.
    Download PDF (405K)
  • Ling XU, Ryusuke EGAWA, Hiroyuki TAKIZAWA, Hiroaki KOBAYASHI
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2345-2352
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The social network model has been regarded as a promising mechanism for defending against Sybil attacks. This model assumes that honest peers and Sybil peers are connected by only a small number of attack edges, so detecting the attack edges plays a key role in restraining the power of Sybil peers. In this paper, an attack-resistant distributed algorithm, named Random walk and Social network model-based Clustering (RSC), is proposed to detect the attack edges. In RSC, peers disseminate random-walk packets to each other; for each edge, the number of times the packets pass that edge reflects its betweenness. RSC exploits the observation that the betweenness of attack edges is higher than that of non-attack edges, so the attack edges can be identified. To show the effectiveness of RSC, it is integrated into an existing social network model-based algorithm called SOHL. Simulations with real-world social network datasets show that RSC remarkably improves the performance of SOHL.
    Download PDF (2095K)
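As a toy illustration of the packet-dissemination step (not the paper's protocol; the graph, walk count, and walk length here are invented), the sketch below launches fixed-length random walks from every peer and counts edge traversals. RSC ranks edges by such counts; how well the counts separate attack edges depends on topology and walk parameters, which is what the paper studies.

```python
import random

def edge_traversal_counts(adj, walks=50, length=6, seed=1):
    """Launch `walks` random walks of `length` steps from every node and
    count how many times each undirected edge is crossed; RSC uses such
    counts as a betweenness estimate for flagging attack edges."""
    rng = random.Random(seed)
    counts = {}
    for start in adj:
        for _ in range(walks):
            node = start
            for _ in range(length):
                nxt = rng.choice(adj[node])
                edge = tuple(sorted((node, nxt)))
                counts[edge] = counts.get(edge, 0) + 1
                node = nxt
    return counts
```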
  • Chunghan LEE, Hirotake ABE, Toshio HIROTSU, Kyoji UMEMURA
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2353-2361
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Network testbeds have been used for network measurement and experiments. In such testbeds, resources such as CPU, memory, and I/O interfaces are shared and virtualized to maximize node utility for many users. A few studies have investigated the impact of virtualization on precise network measurement and on Internet traffic characteristics observed on virtualized testbeds. Although scheduling latency and heavy loads reportedly affect precise network measurement, no clear conditions or criteria have been established. Moreover, because the virtualization technology used in a provided testbed can hardly be replaced, empirical-statistical criteria and methods that identify anomalous cases for precise network experiments are required in userland. In this paper, we show that ‘oversize packet spacing’, which can be caused by CPU scheduling latency, is a major cause of throughput instability on a virtualized network testbed, even when no significant changes occur in well-known network metrics. These anomalies are peculiar to virtualized network environments. Our empirical-statistical analysis results accord with those of previous work. If network throughput is decreased by these anomalies, the measurement results should be reviewed carefully. Our empirical approach enables such anomalous cases to be identified, and we present CPU availability as an important criterion for estimating the anomalies.
    Download PDF (1101K)
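The ‘oversize packet spacing’ signal lends itself to a very simple userland filter. The sketch below is our illustration only, with an arbitrary threshold factor rather than the paper's empirical-statistical criterion: it flags inter-packet gaps far larger than the nominal spacing.

```python
def oversize_spacing_ratio(timestamps, nominal_gap, factor=3.0):
    """Return the fraction of inter-packet gaps exceeding `factor`
    times the nominal spacing -- the kind of gap that CPU scheduling
    latency injects into an otherwise smooth packet train."""
    gaps = [t2 - t1 for t1, t2 in zip(timestamps, timestamps[1:])]
    oversize = [g for g in gaps if g > factor * nominal_gap]
    return len(oversize) / len(gaps)
```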
  • Ryusuke UEDERA, Satoshi FUJITA
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2362-2369
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In this paper, we consider Peer-to-Peer Video-on-Demand (P2P VoD) systems based on the BitTorrent file-sharing protocol. Since the Rarest First policy adopted in the original BitTorrent protocol frequently fails to collect the pieces of a video file by their playback times, we need a new piece-selection rule designed specifically for P2P VoD. In the proposed scheme, we assume a media server that can upload any piece upon request and try to bound its load with two techniques. The first is to estimate which pieces are not held by any peer and prefetch them from the media server. The second is to switch the mode of each peer according to the estimated size of the P2P network. The performance of the proposed scheme is evaluated by simulation.
    Download PDF (911K)
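A minimal sketch of the two techniques above (our toy model; the piece ids, deadlines, and window size are invented for illustration): urgent missing pieces held by no peer become server prefetch candidates, while the rest are requested from the swarm in deadline order.

```python
def select_piece(have, deadlines, now, swarm_availability, window=5):
    """Pick the next piece to request from peers, and list urgent pieces
    that no peer holds (candidates to prefetch from the media server).
    `deadlines` maps piece id -> playback time; `swarm_availability`
    maps piece id -> number of peers holding it."""
    missing = [p for p in sorted(deadlines) if p not in have]
    urgent = [p for p in missing if deadlines[p] - now <= window]
    from_server = [p for p in urgent if swarm_availability.get(p, 0) == 0]
    from_peers = [p for p in urgent if swarm_availability.get(p, 0) > 0]
    next_piece = from_peers[0] if from_peers else None
    return next_piece, from_server
```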
  • Yao-Hung WU, Wei-Mei CHEN
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2370-2377
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Wireless sensor networks are composed of several sensor nodes that communicate via wireless technology. Locating the sensor nodes is a fundamental problem in developing applications for wireless sensor networks. In this paper, we introduce a distributed localization scheme, called the Rectangle Overlapping Approach (ROA), that uses a mobile beacon equipped with GPS and a directional antenna. Node locations are computed by simple operations that rely on the rotation angle and position of the mobile beacon. Simulation results show that the proposed scheme is very efficient and that node positions can be determined accurately when the beacon follows a random waypoint movement model.
    Download PDF (864K)
  • Md. Nazrul Islam MONDAL, Koji NAKANO, Yasuaki ITO
    Type: PAPER
    2011 Volume E94.D Issue 12 Pages 2378-2388
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Most FPGAs have Configurable Logic Blocks (CLBs) to implement combinational and sequential circuits, and block RAMs to implement Random Access Memories (RAMs) and Read-Only Memories (ROMs). Designing a circuit that minimizes the number of clock cycles is easy if asynchronous read operations can be used. However, most FPGAs support only synchronous read operations, not asynchronous ones. The main contribution of this paper is a practical approach to resolving this problem. We assume that a circuit using asynchronous ROMs, designed by a non-expert or quickly designed by an expert, is given. Our goal is to convert this circuit into an equivalent circuit with synchronous ROMs, which can then be embedded into FPGAs. We also discuss several techniques for decreasing the latency and increasing the clock frequency of the resulting circuits.
    Download PDF (477K)
  • Wan Yeon LEE, Hyogon KIM, Heejo LEE
    Type: LETTER
    2011 Volume E94.D Issue 12 Pages 2389-2392
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The proposed scheduling scheme minimizes the energy consumption of a real-time task on a multi-core processor with dynamic voltage and frequency scaling capability. The scheme allocates a pertinent number of cores to the task execution, deactivates unused cores, and assigns the lowest frequency that meets the deadline. For a periodic real-time task with consecutive real-time instances, the scheme prepares the minimum-energy solutions for all input cases offline and applies one of the prepared solutions to each real-time instance at runtime.
    Download PDF (308K)
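The core-allocation and frequency-assignment step can be pictured as a small search over (cores, frequency) pairs. This is only a sketch under a textbook DVFS model (active power cubic in frequency, linear speedup by default, idle cores switched off); the paper's actual task and energy models may differ.

```python
def min_energy_config(cycles, deadline, freqs, max_cores, speedup=lambda m: m):
    """Enumerate (cores, frequency) pairs, discard those missing the
    deadline, and return the (energy, cores, frequency) triple with the
    lowest modeled energy, or None if no pair meets the deadline."""
    best = None
    for m in range(1, max_cores + 1):
        for f in freqs:
            t = cycles / (f * speedup(m))   # parallel execution time
            if t > deadline:
                continue
            energy = m * (f ** 3) * t       # m active cores, power ~ f^3
            if best is None or energy < best[0]:
                best = (energy, m, f)
    return best
```

Precomputing this table for every input case offline and looking the result up per real-time instance mirrors the prepare-then-apply structure described in the abstract.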
Regular Section
  • Takeyuki TAMURA, Yang CONG, Tatsuya AKUTSU, Wai-Ki CHING
    Type: PAPER
    Subject area: Fundamentals of Information Systems
    2011 Volume E94.D Issue 12 Pages 2393-2399
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The impact degree is a measure of the robustness of a metabolic network against the deletion of one or more reactions. Although such a measure is useful for mining important enzymes/genes, it was previously defined only for networks without cycles. In this paper, we extend the impact degree to metabolic networks containing cycles and develop a simple algorithm to calculate it. Furthermore, we improve this algorithm to reduce the computation time of the impact degree under deletions of multiple reactions. We applied our method to the metabolic network of E. coli, including reference pathways, consisting of 3281 reaction nodes and 2444 compound nodes downloaded from the KEGG database, and calculated the distribution of the impact degree. Our computational experiments show that the improved algorithm is 18.4 times faster than the simple algorithm for deletion of reaction pairs and 11.4 times faster for deletion of reaction triplets. We also enumerate genes with high impact degrees for single and multiple reaction deletions.
    Download PDF (422K)
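The flavor of the impact-degree computation can be sketched as a fixpoint over reaction activity. This is our simplified semantics only (a reaction needs all its substrates; a compound is available if it is a source or has at least one active producer); the paper's treatment of cycles and its faster multiple-deletion algorithm are more involved.

```python
def impact_degree(reactions, sources, deleted):
    """Count reactions knocked out (deleted ones included) after
    propagating a deletion to a fixpoint.
    `reactions` maps name -> (substrates, products)."""
    active = {r for r in reactions if r not in deleted}
    changed = True
    while changed:
        available = set(sources)
        for r in active:
            available.update(reactions[r][1])   # products of running reactions
        still = {r for r in active if set(reactions[r][0]) <= available}
        changed = still != active
        active = still
    return len(reactions) - len(active)
```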
  • Hisayoshi KANO, Shingo YOSHIZAWA, Takashi GUNJI, Shougo OKAMOTO, Morio ...
    Type: PAPER
    Subject area: Computer System
    2011 Volume E94.D Issue 12 Pages 2400-2408
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The IEEE 802.11ac task group has announced the use of a wider channel that extends the channel bandwidth to more than 80MHz. We present an experimental platform consisting of a baseband unit and an RF unit in a 2×2 MIMO-OFDM system for the wider channel and report system performance results from a field experiment. The MIMO-OFDM transceiver in the baseband unit performs real-time MIMO detection and provides a maximum data rate of 600Mbps. OFDM tends to exhibit a high peak-to-average power ratio (PAPR) on wider channels, which distorts the power amplifier output in the RF unit. We have mitigated this non-linear distortion by optimizing the OFDM preamble and evaluated the performance through a simulation that integrates baseband processing and the RF unit. In the field experiment, our platform was tested for communication performance in a farm and in a passage environment.
    Download PDF (3724K)
  • Yong-Luo SHEN, Seok-Jae KIM, Sang-Woo SEO, Hyun-Goo LEE, Hyeong-Cheol ...
    Type: PAPER
    Subject area: Computer System
    2011 Volume E94.D Issue 12 Pages 2409-2417
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    This paper introduces a hardware engine for rendering two-dimensional vector graphics based on the OpenVG standard in portable devices. We focus on two design challenges posed by such rendering engines: the number of vertices needed to represent the images and the amount of memory used. Redundant vertices are eliminated using adaptive tessellation, in which redundancy is judged by a proposed cost-per-quality measure. A simplified edge-flag rendering algorithm and a scanline-based rendering scheme are adopted to reduce external memory access. The designed rendering engine occupies approximately 173K gates and satisfies the real-time requirements of many applications when implemented in a 0.18µm, 1.8V CMOS standard cell library. An FPGA prototype using a system-on-a-chip platform has been developed and tested.
    Download PDF (1989K)
  • Kazunori SAKAMOTO, Fuyuki ISHIKAWA, Hironori WASHIZAKI, Yoshiaki FUKAZ ...
    Type: PAPER
    Subject area: Software Engineering
    2011 Volume E94.D Issue 12 Pages 2418-2430
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Test coverage is an important indicator of whether software has been sufficiently tested. However, existing measurement tools for test coverage have several problems, such as their cost of development and maintenance, inconsistency, and inflexibility in measurement. We propose a consistent and flexible measurement framework for test coverage that we call the Open Code Coverage Framework (OCCF). It supports multiple programming languages by extracting their commonalities through an abstract syntax tree, which helps in developing test-coverage measurement tools for new programming languages. OCCF allows users to add programming-language support independently of the test-coverage criteria, and to add test-coverage-criterion support independently of the programming languages, so that measurements are consistent across languages. Moreover, OCCF provides two methods for making more flexible measurements: changing the measured ranges and elements using XPath, and adding user code. We implemented a sample tool for C, Java, and Python using OCCF that can measure four test-coverage criteria, and we confirmed that OCCF can also support C#, Ruby, JavaScript, and Lua. Moreover, in an experiment comparing OCCF with conventional non-framework-based tools, OCCF reduced the lines of code (LOC) required to implement test-coverage measurement tools by approximately 90% and the time to implement a new test-coverage criterion by over 80%.
    Download PDF (935K)
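For a feel of what a statement-coverage measurement boils down to, here is a minimal Python-only probe using the interpreter's trace hook. OCCF itself works on an abstract syntax tree shared across languages, so this illustrates only the measured quantity, not the framework; `branchy` is an invented example function.

```python
import sys

def measure_line_coverage(func, *args):
    """Run `func` and record which lines of its body execute, as
    offsets from the `def` line."""
    covered = set()
    code = func.__code__

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is code:
            covered.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return covered

def branchy(x):
    if x > 0:           # offset 1
        return "pos"    # offset 2
    return "neg"        # offset 3
```

Running `branchy` with a positive argument covers offsets 1 and 2 but not 3; a statement-coverage criterion asks whether the union over all tests covers every offset.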
  • Yuqing LAN, Mingxia KUANG, Wenbin ZHOU
    Type: PAPER
    Subject area: Software Engineering
    2011 Volume E94.D Issue 12 Pages 2431-2439
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    A Linux operating system release is composed of a large number of software packages with complex dependencies. Managing these dependency relationships is the foundation of building and maintaining a Linux release, and checking the integrity of the dependencies is the key to dependency management. The widespread adoption of Linux in many areas of information technology has drawn attention to the issues of how to check the integrity of the complex dependencies of Linux packages and how to manage a huge number of packages in a consistent and effective way. Linux distributions already provide tools for installing, removing, and upgrading the packages they are made of, and a number of tools handle these tasks on the client side. However, there is a lack of tools that help distribution editors maintain the integrity of Linux package dependencies on the server side. In this paper, we present a conflict-based method for checking the integrity of Linux package dependencies. From the perspective of conflicts, this method checks the integrity of package dependencies on the server side by removing the conflicts associated with the packages. Our contribution provides an effective and automatic way to support distribution editors in handling these issues. Experiments using this method have been very successful in checking the integrity of package dependencies in Linux software distributions.
    Download PDF (339K)
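A toy server-side integrity check in the spirit described above (our own simplification: packages are dicts of `deps` and `conflicts` name lists; real package metadata with versions and virtual provides is far richer):

```python
def check_integrity(packages):
    """Report dependencies that name no package in the repository, and
    dependency pairs where either side declares a conflict with the
    other. Returns (missing_deps, conflicting_pairs)."""
    missing, conflicting = [], []
    for name, meta in packages.items():
        for dep in meta.get("deps", []):
            if dep not in packages:
                missing.append((name, dep))
            elif name in packages[dep].get("conflicts", []) \
                    or dep in meta.get("conflicts", []):
                conflicting.append((name, dep))
    return missing, conflicting
```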
  • Shayma ALKOBAISI, Wan D. BAE, Sada NARAYANAPPA
    Type: PAPER
    Subject area: Data Engineering, Web Information Systems
    2011 Volume E94.D Issue 12 Pages 2440-2459
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The increase in advanced location-based services such as traffic coordination and management necessitates advanced models for tracking the positions of moving objects (MOs) such as vehicles. Because of processing limitations, it is impossible for MOs to update their locations continuously, which makes an MO's location between any two reported positions inherently uncertain. Efficiently managing and quantifying the uncertainty regions of MOs is needed in order to support different types of queries and to improve query response time. This challenging problem of modeling the uncertainty regions associated with MOs was recently addressed by researchers, resulting in models that range from linear ones, which require only a few properties of MOs as input, to non-linear ones, which can represent uncertainty regions more accurately by considering higher-degree input. This paper summarizes and discusses approaches to modeling the uncertainty regions associated with MOs. It further illustrates the need for appropriate approximations, especially for non-linear models, as the uncertainty regions become rather irregularly shaped and difficult to manage. Finally, we demonstrate through several experimental sets the advantage of non-linear models over linear models when the uncertainty regions of MOs are approximated by two different approximations: the Minimum Bounding Box (MBB) and the Tilted Minimum Bounding Box (TMBB).
    Download PDF (1537K)
  • HyunYong LEE, Akihiro NAKAO
    Type: PAPER
    Subject area: Information Network
    2011 Volume E94.D Issue 12 Pages 2460-2467
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In network operator-friendly P2P traffic control techniques such as P4P, peers are supposed to select their communication partners by following guidance issued by the network operator. The guidance thus has a significant impact on traffic control; however, a detailed performance study of the available guidances is missing. Most existing approaches focus on inter-domain traffic control and do not show in detail how they affect intra-domain traffic control. In this paper, we try to understand how the guidances affect intra- and inter-domain traffic control, with the aim of better guidance that improves traffic control. Through simulations, we reveal the following. Performance-based guidance, which reflects the network status, shows attractive results in distributing traffic over intra-domain links and in reducing the cross-domain traffic and the charging volume of inter-domain links, compared to distance-based guidance, which enforces simple localization. However, performance-based guidance has one limitation that can cause unstable traffic control. To overcome this limitation, we propose a peer-assisted measurement and traffic estimation approach, which we then verify through simulations.
    Download PDF (840K)
  • Morihiro HAYASHIDA, Tatsuya AKUTSU
    Type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2011 Volume E94.D Issue 12 Pages 2468-2478
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Several compression-based methods have been developed for measuring the similarity of biological sequences and structures such as DNA sequences, protein sequences, and tertiary structures. However, they are based on compression algorithms designed only for sequential data. Protein structures, for instance, can be represented by two-dimensional distance matrices, so image compression is expected to be useful for measuring the similarity of protein structures because image compression algorithms compress data both horizontally and vertically. This paper proposes a series of methods for measuring the similarity of protein structures. In these methods, an original protein structure is transformed into a distance matrix, which is regarded as a two-dimensional image, and the similarity of two protein structures is measured by a kind of compression ratio of the concatenated image. We employed several image compression algorithms: JPEG, GIF, PNG, IFS, and SPC. Since SPC often gave better results than the other image compression methods, and it is simple and easy to modify, we modified SPC to obtain MSPC. We applied the proposed methods to clustering of protein structures and performed Receiver Operating Characteristic (ROC) analysis. The results of computational experiments suggest that MSPC has the best performance among existing compression-based methods. We also present some theoretical results on the time complexity and Kolmogorov complexity of image compression-based protein structure comparison.
    Download PDF (895K)
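The "compression ratio of the concatenation" idea is usually formalized as the normalized compression distance (NCD). The sketch below uses zlib on byte strings purely to illustrate the measure itself; the paper instead applies image codecs (JPEG, GIF, PNG, IFS, SPC/MSPC) to distance-matrix images.

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: small when compressing the
    concatenation is barely harder than compressing the parts."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)
```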
  • Teruyoshi SASAYAMA, Tetsuo KOBAYASHI
    Type: PAPER
    Subject area: Human-computer Interaction
    2011 Volume E94.D Issue 12 Pages 2479-2486
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    We developed a novel movement-imagery-based brain-computer interface (BCI) for untrained subjects that does not employ machine learning techniques. The development of the BCI consisted of several steps. First, spline Laplacian analysis was performed. Next, time-frequency analysis was applied to determine the optimal frequency range and latencies of the electroencephalograms (EEGs). Finally, trials were classified as right or left based on β-band event-related synchronization, using the cumulative distribution function of pre-trigger EEG noise. To test the performance of the BCI, EEGs during the execution and imagination of right/left wrist-bending movements were measured at 63 locations over the entire scalp in eight healthy subjects. The highest classification accuracies were 84.4% and 77.8% for real movements and their imageries, respectively. This accuracy is significantly higher than that of previously reported machine-learning-based BCIs on movement imagery tasks (paired t-test, p < 0.05). Notably, the highest accuracy was achieved even though the subjects had never participated in movement imageries before.
    Download PDF (712K)
  • Katsutoshi UEAOKI, Kazunori IWATA, Nobuo SUEMATSU, Akira HAYASHI
    Type: PAPER
    Subject area: Pattern Recognition
    2011 Volume E94.D Issue 12 Pages 2487-2494
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    A two-dimensional shape is generally represented by line drawings or object contours in a digital image. Shapes can be divided into two types, ordered and unordered: an ordered shape is an ordered set of points, while an unordered shape is an unordered set. As a result, each type typically uses different attributes to define the local descriptors that represent the local distributions of points sampled from the shape. Throughout this paper, we focus on unordered shapes. Since most local descriptors of unordered shapes are not scale-invariant, the shapes in an image data set are usually made the same size through scale normalization before shape-matching procedures are applied. Shapes obtained through scale normalization are suitable for such descriptors if the original whole shapes are similar, but not if parts of each original shape are drawn at different scales. Thus, in this paper, we present a scale-invariant descriptor constructed from von Mises distributions to deal with such shapes. Since this descriptor has the merits of being both scale-invariant and a probability distribution, it does not require scale normalization and can employ an arbitrary measure on probability distributions when matching shape points. In experiments on shape matching and retrieval, we show the effectiveness of our descriptor compared to several conventional descriptors.
    Download PDF (700K)
  • Tomoko NARIAI, Kazuyo TANAKA
    Type: PAPER
    Subject area: Speech and Hearing
    2011 Volume E94.D Issue 12 Pages 2495-2502
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Certain irregularities often occur in the utterance of words or phrases in English spoken by native Japanese subjects, referred to in this article as Japanese English. Japanese English is linguistically presumed to reflect the phonetic characteristics of Japanese. We consider prosodic feature patterns to be one of the most common causes of irregularities in Japanese English, and Japanese English would have better prosodic patterns if its particular characteristics were modified. This study investigates prosodic differences between Japanese English and English speakers' English and presents the quantitative results of a statistical analysis of pitch. The analysis leads to rules for modifying Japanese English so that its pitch patterns are closer to those of English speakers. On the basis of these rules, the pitch patterns of test speech samples of Japanese English were modified and then re-synthesized. The modified speech was evaluated in a listening experiment by native English subjects. On average, the English subjects supported the proposed modification over the original speech by more than three to one, which provides practical verification of the validity of the rules. Additionally, the results suggest that irregularities of prominence remain in Japanese English sentences, which can be explained by the transfer of first-language prosodic characteristics onto second-language prosodic patterns.
    Download PDF (677K)
  • Omid DEHZANGI, Bin MA, Eng Siong CHNG, Haizhou LI
    Type: PAPER
    Subject area: Speech and Hearing
    2011 Volume E94.D Issue 12 Pages 2503-2512
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    This paper investigates a new method for fusing the scores generated by multiple classification sub-systems to further reduce the classification error rate in Spoken Language Recognition (SLR). In recent studies, a variety of effective classification algorithms have been developed for SLR. Hence, it has been common practice in the National Institute of Standards and Technology (NIST) Language Recognition Evaluations (LREs) to fuse the results from several classification sub-systems to boost the performance of SLR systems. In this work, we introduce a discriminative performance measure to optimize the fusion of the 7 language classifiers developed as IIR's submission to the 2009 NIST LRE. We present an Error-Corrective Fusion (ECF) method in which we iteratively learn the fusion weights so as to minimize the error rate of the fusion system. Experiments conducted on the 2009 NIST LRE corpus demonstrate a significant improvement over the individual sub-systems. A comparison study is also conducted to show the effectiveness of the ECF method.
    Download PDF (1044K)
  • Tsung-Han TSAI, Chung-Yuan LIN
    Type: PAPER
    Subject area: Image Recognition, Computer Vision
    2011 Volume E94.D Issue 12 Pages 2513-2522
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Emerging video surveillance technologies rely on foreground detection to achieve automatic event detection. Integrating foreground detection into a modern multi-camera surveillance system can significantly increase surveillance efficiency. However, foreground detection carries a high computational load and increases the cost of the surveillance system when a mass deployment of end cameras is needed. This paper proposes a DSP-based foreground detection algorithm. Our algorithm incorporates a temporal data correlation predictor (TDCP), which exploits the correlation of the data to reduce computation. Building on the DSP-oriented foreground detection, an adaptive frame rate control is developed as a low-cost solution for multi-camera surveillance systems. The adaptive frame rate control automatically detects the computational load of foreground detection across multiple video sources and adaptively tunes the TDCP to meet the real-time specification. Therefore, no additional hardware cost is required as the number of deployed cameras increases. Our method has been validated on a demonstration platform, achieving real-time CIF frame processing for a 16-camera surveillance system on a single DSP chip. Quantitative evaluation demonstrates that our solution provides a satisfactory detection rate while significantly reducing hardware cost.
    Download PDF (3694K)
  • Won-young CHUNG, Jae-won PARK, Seung-Woo LEE, Won Woo RO, Yong-surk LE ...
    Type: LETTER
    Subject area: Computer System
    2011 Volume E94.D Issue 12 Pages 2523-2527
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The message passing interface (MPI) broadcast communication commonly causes a severe performance bottleneck in multicore systems that use distributed memory. Thus, in this paper, we propose a novel algorithm and hardware structure for MPI broadcast communication to alleviate this bottleneck. The transmission order is set based on the state of each processing node in the multicore system, so the algorithm minimizes the performance degradation caused by conflicts. The proposed scoreboard MPI unit is evaluated by modeling it in SystemC and is implemented in Verilog HDL. The scoreboard MPI unit occupies less than 1.03% of the whole chip and improves performance by up to 75.48% with 16 processing nodes. Hence, with respect to low-cost design and scalability, the scoreboard MPI unit is particularly useful for increasing the overall performance of embedded MPSoCs.
    Download PDF (848K)
  • Seungjae BAEK, Heekwon PARK, Jongmoo CHOI
    Type: LETTER
    Subject area: Software System
    2011 Volume E94.D Issue 12 Pages 2528-2532
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In this paper, we propose three techniques that improve the performance of YAFFS (Yet Another Flash File System) while enhancing the reliability of the system. Specifically, we first propose managing metadata and user data separately on segregated blocks. This modification reduces both the mount time and the garbage collection time. Second, we tailor wear-leveling to the segregated metadata and user data blocks: worn-out blocks are swapped between the segregated groups, which wears blocks more evenly and increases the lifetime of the system. Finally, we devise an analytic model to predict the expected garbage collection time. By accurately predicting the garbage collection time, the system can perform garbage collection at opportune moments when the user's perceived performance is not negatively affected. Performance evaluation results based on real implementations show that our modifications enhance performance and reliability without incurring additional overhead. Specifically, YAFFS with our proposed techniques outperforms the original YAFFS by six times in mount speed and five times in benchmark performance, while reducing the average erase count of blocks by 14%.
    Download PDF (940K)
  • Yong-Jun YOU, Sung-Do CHI, Jae-Ick KIM
    Type: LETTER
    Subject area: Artificial Intelligence, Data Mining
    2011 Volume E94.D Issue 12 Pages 2533-2536
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In most existing warship combat simulation systems, the tactics of a warship are controlled by human operators. For this reason, the simulation results are limited by the capabilities of those operators. To address this, we employ a genetic algorithm to support an evolutionary simulation environment, in which tactical decisions by human operators are replaced by a human model with a rule-based chromosome representing tactics. A population of simulations is created, and hundreds of simulation runs proceed under the genetic algorithm without any human intervention until emergent tactics showing the best performance are found. Several simulation tests demonstrate the proposed techniques.
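    A minimal sketch of the evolutionary loop the letter describes, under strong simplifying assumptions: chromosomes are bit vectors of tactical rules, and the simulation-based performance measure is abstracted into a caller-supplied `fitness` function. This is a generic GA skeleton, not the authors' simulator.

    ```python
    import random

    def evolve(fitness, n_rules=8, pop_size=20, gens=30, seed=0):
        """Toy GA over rule-based chromosomes (one bit per tactical rule):
        keep the top half, breed the rest by one-point crossover, and
        apply occasional bit-flip mutation."""
        rng = random.Random(seed)
        pop = [[rng.randint(0, 1) for _ in range(n_rules)]
               for _ in range(pop_size)]
        for _ in range(gens):
            elite = sorted(pop, key=fitness, reverse=True)[:pop_size // 2]
            children = []
            while len(elite) + len(children) < pop_size:
                a, b = rng.sample(elite, 2)
                cut = rng.randrange(1, n_rules)
                child = a[:cut] + b[cut:]          # one-point crossover
                if rng.random() < 0.1:             # bit-flip mutation
                    child[rng.randrange(n_rules)] ^= 1
                children.append(child)
            pop = elite + children                 # elitist replacement
        return max(pop, key=fitness)
    ```

    Elitism guarantees the best chromosome found so far is never lost between generations.
    
    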
    Download PDF (1182K)
  • Yu Gwang JIN, Nam Soo KIM, Joon-Hyuk CHANG
    Type: LETTER
    Subject area: Speech and Hearing
    2011 Volume E94.D Issue 12 Pages 2537-2540
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    In this letter, we propose a novel speech enhancement algorithm based on data-driven residual gain estimation. The system consists of two stages. In the first stage, a conventional speech enhancement algorithm enhances the input signal while estimating several signal-to-noise ratio (SNR)-related parameters. In the second stage, a residual gain, estimated by a data-driven method, is applied to further enhance the signal. Experimental results show that the proposed algorithm outperforms both the conventional speech enhancement technique based on soft decision and the data-driven approach using an SNR grid look-up table.
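    The two-stage structure can be sketched as follows. The Wiener-style first-stage gain, the SNR quantization, and the table indexing are illustrative assumptions; the letter's residual table would be trained offline by the data-driven method, so here it is simply passed in.

    ```python
    import numpy as np

    def two_stage_enhance(spec, noise_psd, residual_table, n_bins=10):
        """Hypothetical two-stage gain: a conventional Wiener-style gain,
        followed by a data-driven residual gain looked up from a table
        indexed by the quantized a-priori SNR."""
        # Stage 1: conventional gain from an a-priori SNR estimate.
        snr = np.maximum(np.abs(spec) ** 2 / noise_psd - 1.0, 1e-3)
        g1 = snr / (1.0 + snr)
        # Stage 2: residual gain from a pre-trained lookup table,
        # indexed by SNR in dB quantized into n_bins cells over [-20, 20].
        idx = np.clip((10 * np.log10(snr) + 20) / 40 * n_bins,
                      0, n_bins - 1).astype(int)
        g2 = residual_table[idx]
        return spec * g1 * g2
    ```

    With a residual table of all ones, the second stage is a no-op and the output reduces to the first-stage enhancement alone.
    
    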
    Download PDF (174K)
  • Bei HE, Guijin WANG, Xinggang LIN, Chenbo SHI, Chunxiao LIU
    Type: LETTER
    Subject area: Image Processing and Video Processing
    2011 Volume E94.D Issue 12 Pages 2541-2544
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    This paper proposes a high-accuracy sub-pixel registration framework based on phase correlation for noisy images. First, we introduce a denoising module that adopts an edge-preserving filter. This strategy not only filters out the noise but also preserves most of the original image signal. A confidence-weighted optimization module is then proposed to fit the linear phase plane discriminatively and obtain the sub-pixel shifts. Experiments demonstrate the effectiveness of the combined modules and show improved accuracy and robustness against noise compared with other sub-pixel phase correlation methods in the Fourier domain.
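    The core idea of fitting the linear phase plane with confidence weights can be sketched as below. Using spectral magnitude as the confidence weight and restricting the fit to low frequencies (to sidestep phase wrapping) are assumptions of this sketch, not necessarily the paper's exact weighting or frequency selection.

    ```python
    import numpy as np

    def phase_plane_shift(a, b, radius=5):
        """Estimate the (dy, dx) translation taking image a to b by a
        confidence-weighted least-squares fit of the cross-power-spectrum
        phase plane, restricted to low frequencies to avoid phase wrapping."""
        N, M = a.shape
        R = np.fft.fft2(a) * np.conj(np.fft.fft2(b))   # cross-power spectrum
        u = np.broadcast_to(np.fft.fftfreq(N)[:, None], (N, M))  # cycles/pixel
        v = np.broadcast_to(np.fft.fftfreq(M)[None, :], (N, M))
        mask = (np.abs(u) * N <= radius) & (np.abs(v) * M <= radius)
        mask[0, 0] = False                     # drop the DC bin
        phi = np.angle(R)[mask]                # phase samples to fit
        w = np.abs(R)[mask]                    # spectral magnitude as confidence
        X = 2 * np.pi * np.stack([u[mask], v[mask]], axis=1)
        # weighted normal equations: shift = (X^T W X)^{-1} X^T W phi
        WX = X * w[:, None]
        dy, dx = np.linalg.solve(X.T @ WX, WX.T @ phi)
        return dy, dx
    ```

    For a pure translation, the cross-power-spectrum phase is exactly the plane 2π(u·dy + v·dx), so the weighted fit recovers sub-pixel shifts directly from its slope.
    
    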
    Download PDF (853K)
  • Xin HE, Huiyun JING, Qi HAN, Xiamu NIU
    Type: LETTER
    Subject area: Image Recognition, Computer Vision
    2011 Volume E94.D Issue 12 Pages 2545-2548
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    We propose a novel saliency detection model based on Bayes' theorem. The model integrates the two parts of Bayes' equation to measure saliency, each of which was considered separately in previous models. The proposed model measures saliency by computing a local kernel density estimate of features in the center-surround region and a global kernel density estimate of features at each pixel across the whole image. Under the proposed model, a saliency detection method is presented that extracts the DCT (Discrete Cosine Transform) magnitude of the local region around each pixel as the feature. Experiments show that the proposed model not only performs competitively on psychological patterns and better than current state-of-the-art models on human visual fixation data, but is also robust against signal uncertainty.
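    A toy sketch of the local-plus-global density idea, using a scalar per-pixel feature instead of the paper's DCT magnitudes, and a simple "rare both locally and globally is salient" combination rather than the paper's exact Bayesian formulation; all names and parameters are illustrative.

    ```python
    import numpy as np

    def kde(x, samples, bw=0.1):
        """Gaussian kernel density estimate of scalar x under 1-D samples."""
        z = (x - samples) / bw
        return np.exp(-0.5 * z ** 2).mean() / (bw * np.sqrt(2 * np.pi))

    def saliency_sketch(feat, win=3, bw=0.1):
        """Toy per-pixel saliency combining local (center-surround) and
        global feature rarity, each measured by kernel density estimation."""
        H, W = feat.shape
        glob = feat.ravel()
        sal = np.empty_like(feat)
        for i in range(H):
            for j in range(W):
                local = feat[max(0, i - win):i + win + 1,
                             max(0, j - win):j + win + 1].ravel()
                p_local = kde(feat[i, j], local, bw)    # density in the surround
                p_global = kde(feat[i, j], glob, bw)    # density over the image
                # a feature that is rare both locally and globally is salient
                sal[i, j] = 1.0 / (p_local * p_global + 1e-12)
        return sal
    ```

    On a flat image with a single outlier pixel, both density estimates are low at the outlier, so the sketch assigns it the highest saliency.
    
    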
    Download PDF (1147K)
  • Quan MIAO, Guijin WANG, Xinggang LIN
    Type: LETTER
    Subject area: Image Recognition, Computer Vision
    2011 Volume E94.D Issue 12 Pages 2549-2552
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    Object tracking is a major technique in image processing and computer vision, and tracking speed directly determines the quality of applications. This paper presents a parallel implementation of a recently proposed scale- and rotation-invariant on-line object tracking system. The implementation targets NVIDIA Graphics Processing Units (GPUs) using the Compute Unified Device Architecture (CUDA), following the single-instruction, multiple-thread model. Specifically, we analyze the original algorithm and propose a GPU-based parallel design, with emphasis on exploiting data parallelism and memory usage. In addition, we apply optimization techniques to maximize the utilization of the GPU and reduce data transfer time. Experimental results show that our GPU-based method running on a GTX480 graphics card achieves up to a 12X speed-up over an equivalent implementation on an Intel E8400 3.0GHz CPU, including I/O time.
    Download PDF (792K)
  • Bong-Soo SOHN
    Type: LETTER
    Subject area: Computer Graphics
    2011 Volume E94.D Issue 12 Pages 2553-2556
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    We describe an efficient algorithm that extracts a connected component of an isosurface, or a contour, from 3D rectilinear volume data. The efficiency of the algorithm derives from three factors: (i) working directly with rectilinear grids, (ii) parallel use of a multi-core CPU to extract the active cells (the cells containing the contour), and (iii) parallel use of a many-core GPU, via CUDA, to compute the geometry of the contour surface in each active cell. Experimental results show that our hybrid parallel implementation achieves up to a 20x speedup over existing methods on an ordinary PC. Coupled with the Contour Tree framework, our work is useful for quickly segmenting, displaying, and analyzing a feature of interest in 3D rectilinear volume data without being distracted by other features.
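    The two data-parallel phases can be illustrated sequentially as follows: an active-cell test over the rectilinear grid (the part the paper runs on the multi-core CPU) and a connected-component walk that keeps only the cells of the single contour through a seed (the per-cell geometry computed on the GPU is omitted). This is a NumPy/BFS stand-in, not the authors' parallel implementation.

    ```python
    from collections import deque
    import numpy as np

    def active_cells(vol, iso):
        """Flag cells whose 8 corner values straddle the isovalue,
        i.e. the cells that contain a piece of the isosurface."""
        Z, Y, X = vol.shape
        corners = [vol[dz:Z - 1 + dz, dy:Y - 1 + dy, dx:X - 1 + dx]
                   for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)]
        lo = np.minimum.reduce(corners)
        hi = np.maximum.reduce(corners)
        return (lo <= iso) & (iso <= hi)

    def contour_component(active, seed):
        """BFS over face-adjacent active cells: the set of cells holding
        the single connected contour through the seed cell."""
        comp = np.zeros_like(active)
        if not active[seed]:
            return comp
        comp[seed] = True
        q = deque([seed])
        Z, Y, X = active.shape
        while q:
            z, y, x = q.popleft()
            for dz, dy, dx in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                               (0, -1, 0), (0, 0, 1), (0, 0, -1)):
                n = (z + dz, y + dy, x + dx)
                if (0 <= n[0] < Z and 0 <= n[1] < Y and 0 <= n[2] < X
                        and active[n] and not comp[n]):
                    comp[n] = True
                    q.append(n)
        return comp
    ```

    Because the walk only follows face-adjacent active cells, a second, disconnected isosurface component elsewhere in the volume is never visited.
    
    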
    Download PDF (1660K)
  • Woong-Kee LOH, Heejune AHN
    Type: LETTER
    Subject area: Biological Engineering
    2011 Volume E94.D Issue 12 Pages 2557-2560
    Published: December 01, 2011
    Released: December 01, 2011
    JOURNALS FREE ACCESS
    The suffix tree is one of the most widely adopted indexes for genome sequence alignment. Although it supports very fast alignment, it has a couple of shortcomings, such as a very long construction time and a very large size. Loh et al. [7] proposed a suffix tree construction algorithm with dramatically improved performance; however, the size remains a challenging problem. We propose an algorithm that extends the one by Loh et al. to reduce the suffix tree size. In our experiments, our algorithm constructed a suffix tree of approximately 60% of the original size in almost the same time.
    Download PDF (405K)