IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E98.D , Issue 2
Showing 1-35 articles out of 35 articles from the selected issue
Special Section on Reconfigurable Systems
• Tetsuo HIRONAKA
2015 Volume E98.D Issue 2 Pages 219
Published: 2015
Released: February 01, 2015
JOURNALS FREE ACCESS
• Kei KINOSHITA, Yoshiki YAMAGUCHI, Daisuke TAKANO, Tomoyuki OKAMURA, Te ...
Type: PAPER
Subject area: Architecture
2015 Volume E98.D Issue 2 Pages 220-229
Published: 2015
Released: February 01, 2015
This paper seeks to improve the power-performance efficiency of embedded systems through dynamic reconfiguration. Programmable logic devices (PLDs) can optimize power consumption by means of partial and/or dynamic reconfiguration. The approach is non-exclusive: it can be combined with other power-reduction techniques, and is thus applicable to a wide range of systems. The power-performance improvement from dynamic reconfiguration was evaluated on an augmented reality system that translates Japanese into English: a wearable, mobile system with a head-mounted display (HMD). In the system, the computing core detects a Japanese word in an input video frame and outputs the translated term to the HMD. The system involves various image-processing stages, such as pattern recognition and object tracking, which run sequentially; it therefore does not need to hold all functions in the device simultaneously, and can instead instantiate each function by reconfiguration only when it is needed. In other words, dynamic reconfiguration lets this spatiotemporal module-based pipeline reduce its circuit area and power consumption compared with the naive approach. The approach achieved marked improvements: computational speed was unchanged while power consumption was reduced to around $\frac{1}{6}$.
• Yu PENG, Shouyi YIN, Leibo LIU, Shaojun WEI
Type: PAPER
Subject area: Architecture
2015 Volume E98.D Issue 2 Pages 230-242
Published: 2015
Released: February 01, 2015
Coarse-grained Reconfigurable Architecture (CGRA) is a promising mobile computing platform that provides both high performance and high energy efficiency. In an application, loop nests are usually mapped onto the CGRA for acceleration, so optimizing this mapping is an important goal in CGRA design. Moreover, since almost all mobile devices are battery-powered, reducing energy consumption is another primary concern in using CGRAs. This paper makes three contributions: a) proposing an energy consumption model for CGRA; b) formulating the loop-nest mapping problem so as to minimize battery charge loss; c) deriving an efficient heuristic algorithm called BPMap. Experimental results on most kernels of the benchmarks and on real-life applications show that our methods improve kernel performance and lower energy consumption.
• Bing XU, Shouyi YIN, Leibo LIU, Shaojun WEI
Type: PAPER
Subject area: Architecture
2015 Volume E98.D Issue 2 Pages 243-251
Published: 2015
Released: February 01, 2015
Coarse-Grained Reconfigurable Architectures (CGRAs) are a promising platform owing to their high performance and low cost. Researchers have developed efficient compilers that map compute-intensive applications onto CGRAs using modulo scheduling. To generate the loop kernel, every stage of the kernel is forced to have the same execution time, which is determined by the critical PE; non-critical PEs can therefore lower their supply voltage according to their slack time. Variable dual-VDD CGRAs incorporate this feature to reduce power consumption. Previous work mainly focuses on calculating a single global optimal VDDL with an overall optimization method, which does not fully exploit the flexibility of the architecture. In this paper, we instead adopt a variable optimal VDDL for each stage of the kernel according to its pattern, rather than a fixed, simulated global optimal VDDL. Experiments show that our proposed heuristic approach reduces power by 27.6% on average without degrading performance, and the compilation time remains acceptable.
• Motoki AMAGASAKI, Qian ZHAO, Masahiro IIDA, Morihiro KUGA, Toshinori S ...
Type: PAPER
Subject area: Architecture
2015 Volume E98.D Issue 2 Pages 252-261
Published: 2015
Released: February 01, 2015
In this paper, we propose fault-tolerant field-programmable gate array (FPGA) architectures and a design framework for them, targeting intellectual property (IP) cores in systems-on-chip (SoCs). Unlike discrete FPGAs, whose integration scale can be made relatively large, programmable IP cores must accommodate arrays of various sizes. The key features of our architectures are a regular tile structure, spare modules and bypass wires for fault avoidance, and a configuration mechanism for single-cycle reconfiguration. In addition, we provide a routing tool, EasyRouter, for the proposed architecture; it can handle the various array sizes of the developed programmable IP cores. In our evaluation, we compared the performance of conventional FPGAs with that of the proposed fault-tolerant FPGA architectures. On average, our architectures require less than 1.82 times the area and 1.11 times the delay of traditional island-style FPGAs, while offering higher fault tolerance.
• Hiroki NAKAHARA, Tsutomu SASAO, Munehiro MATSUURA, Hisashi IWAMOTO, Ya ...
Type: PAPER
Subject area: Architecture
2015 Volume E98.D Issue 2 Pages 262-271
Published: 2015
Released: February 01, 2015
In the era of IPv6, since the number of IPv6 addresses is increasing rapidly and the required speed is more than one giga lookup per second (GLPS), an area-efficient, high-speed IP lookup architecture is desired. This paper presents a parallel index generation unit (IGU) for a memory-based IPv6 lookup architecture. To reduce the memory size of the IGU, we use a linear transformation and a row-shift decomposition. A single-memory realization requires O(2^l log k) memory, where l denotes the prefix length, while the IGU realization requires O(kl) memory, where k denotes the number of prefixes. In IPv6 prefix lookup, since l is at most 64 and k is about 340 K, the IGU drastically reduces the memory size. Also, to reduce cost, we realize the parallel IGU using both on-chip and off-chip memories, and we show a design algorithm that fits the parallel IGU into given off-chip and on-chip memory capacities. The parallel IGU has a simple architecture and performs lookups through complete pipelines, with pipeline registers inserted in all paths. We loaded more than 340 K IPv6 pseudo-prefixes onto a Xilinx Virtex-6 FPGA with off-chip DDRII+ static RAMs (SRAMs). Its lookup speed is 1.100 GLPS, which is sufficient for a next-generation 400 Gbps link. In normalized area and lookup speed, our implementation outperforms existing FPGA implementations.
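The memory-size argument above can be illustrated with a back-of-envelope calculation. The sketch below compares the two bounds for the abstract's parameters (l = 64, k ≈ 340 K); it ignores constant factors and the IGU's decomposition details, so it is an order-of-magnitude comparison only:

```python
import math

def single_memory_bits(l, k):
    # A flat table addressed by an l-bit prefix stores one of k
    # prefix indices (ceil(log2 k) bits) in each of its 2^l words.
    return (2 ** l) * math.ceil(math.log2(k))

def igu_memory_bits(l, k):
    # The IGU realization stores on the order of one l-bit entry
    # per prefix: O(kl) bits.
    return k * l

l, k = 64, 340_000
print(single_memory_bits(l, k) / 8 / 2**40)  # flat table in terabytes: astronomically large
print(igu_memory_bits(l, k) / 8 / 2**20)     # IGU order of magnitude: a few megabytes
```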
• Hao XIAO, Ning WU, Fen GE, Guanyu ZHU, Lei ZHOU
Type: LETTER
Subject area: Architecture
2015 Volume E98.D Issue 2 Pages 272-275
Published: 2015
Released: February 01, 2015
This paper presents a synchronization mechanism that efficiently implements lock and barrier protocols in a decentralized manner through explicit message passing. In the proposed solution, a simple and efficient control mechanism supports queued synchronization without contention. Using state-of-the-art application-specific instruction-set processor (ASIP) technology, we embed the synchronization functionality into a baseline processor, giving the proposed mechanism ultra-low overhead. Experimental results show that the proposed synchronization achieves ultra-low latency and almost ideal scalability as the number of processors increases.
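The queued, message-passing style of locking described above can be modelled in software: a lock "home" serves requests in FIFO order, so waiters never spin or retry under contention. This is only an illustrative sketch in Python threads; the paper implements the mechanism as ASIP hardware instructions, and all names here are hypothetical:

```python
import threading, queue

class MessageLock:
    """Lock served by a home node: grant messages are sent in request order."""
    def __init__(self):
        self.requests = queue.Queue()          # FIFO queue of pending requests
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        while True:
            grant = self.requests.get()        # next requester in queue order
            grant.set()                        # send the grant message
            grant.released.wait()              # wait for the release message

    def acquire(self):
        grant = threading.Event()
        grant.released = threading.Event()
        self.requests.put(grant)               # send a lock-request message
        grant.wait()                           # block until granted
        return grant

counter = 0
lock = MessageLock()

def worker():
    global counter
    for _ in range(1000):
        token = lock.acquire()
        counter += 1                           # critical section
        token.released.set()                   # send the release message

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)  # 4000
```

Because the home node grants the lock to exactly one waiter at a time, the queued protocol gives mutual exclusion without any contended retries.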
• Rui SHI, Shouyi YIN, Leibo LIU, Qiongbing LIU, Shuang LIANG, Shaojun W ...
Type: PAPER
Subject area: Application
2015 Volume E98.D Issue 2 Pages 276-287
Published: 2015
Released: February 01, 2015
Video up-scaling is a hot topic in the TV display area; as an important branch of video up-scaling, the texture-based video up-scaling (TBVU) method shows great potential for hardware implementation. Coarse-grained Reconfigurable Architecture (CGRA) is a very promising processor: a parallel computing platform that offers high hardware performance, high software flexibility, and dynamic reconfiguration ability. In this paper we propose an implementation of TBVU on CGRA. We fully exploit the characteristics of TBVU and utilize several techniques to reduce memory I/O operations and total execution time. Experimental results show that our work greatly reduces I/O operations and execution time compared with the non-optimized version. We also compare our work with other platforms and find great advantages in execution time and resource utilization rate.
• Stewart DENHOLM, Hiroaki INOUE, Takashi TAKENAKA, Tobias BECKER, Wayne ...
Type: PAPER
Subject area: Application
2015 Volume E98.D Issue 2 Pages 288-297
Published: 2015
Released: February 01, 2015
Financial exchanges provide market data feeds to update their members about changes in the market. Feed messages are often used in time-critical automated trading applications, and two identical feeds (A and B feeds) are provided in order to reduce message loss. A key challenge is to support A/B line arbitration efficiently to compensate for missing packets, while offering flexibility for various operational modes such as prioritising for low latency or for high data reliability. This paper presents a reconfigurable acceleration approach for A/B arbitration operating at the network level, capable of supporting any messaging protocol. Two modes of operation are provided simultaneously: one prioritising low latency, and one prioritising high reliability with three dynamically configurable windowing methods. We also present a model for message feed processing latencies that is useful for evaluating scalability in future applications. We outline a new low latency, high throughput architecture and demonstrate a cycle-accurate testing framework to measure the actual latency of packets within the FPGA. We implement and compare the performance of the NASDAQ TotalView-ITCH, OPRA and ARCA market data feed protocols using a Xilinx Virtex-6 FPGA. For high reliability messages we achieve latencies of 42ns for TotalView-ITCH and 36.75ns for OPRA and ARCA. 6ns and 5.25ns are obtained for low latency messages. The most resource intensive protocol, TotalView-ITCH, is also implemented in a Xilinx Virtex-5 FPGA within a network interface card; it is used to validate our approach with real market data. We offer latencies 10 times lower than an FPGA-based commercial design and 4.1 times lower than the hardware-accelerated IBM PowerEN processor, with throughputs more than double the required 10Gbps line rate.
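The low-latency mode of A/B arbitration can be sketched as a sequence-number dedup: forward whichever copy of each message arrives first, on either line, and drop the late duplicate. The tuple format below is hypothetical, and the sketch omits the paper's reliability-mode windowing; the actual design operates on raw packets in FPGA logic:

```python
def arbitrate(feed_events):
    """feed_events: iterable of (line, seq_no, payload) in arrival order.

    Low-latency policy: the first copy of each sequence number wins;
    duplicates from the other line are dropped.
    """
    next_expected = 1
    delivered = []
    for line, seq, payload in feed_events:
        if seq >= next_expected:          # first arrival (possibly past a gap)
            delivered.append((seq, payload))
            next_expected = seq + 1
        # else: late duplicate from the slower line, drop it
    return delivered

events = [('A', 1, 'x'), ('B', 1, 'x'),   # B's copy of msg 1 is a duplicate
          ('B', 2, 'y'), ('A', 2, 'y'),   # B arrives first for msg 2
          ('A', 3, 'z')]                  # msg 3 lost on line B
print(arbitrate(events))  # [(1, 'x'), (2, 'y'), (3, 'z')]
```

The high-reliability mode described in the paper instead holds a message inside a window until its gap can be ruled out, trading latency for completeness.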
• Keisuke DOHI, Koji OKINA, Rie SOEJIMA, Yuichiro SHIBATA, Kiyoshi OGURI
Type: PAPER
Subject area: Application
2015 Volume E98.D Issue 2 Pages 298-308
Published: 2015
Released: February 01, 2015
In this paper, we discuss performance modeling of 3-D stencil computing on an FPGA accelerator with a high-level synthesis environment, aiming for efficient exploration of user-space design parameters. First, we analyze resource utilization and performance to formulate these relationships as mathematical models. Then, in order to evaluate our proposed models, we implement heat conduction simulations as a benchmark application, by using MaxCompiler, which is a high-level synthesis tool for FPGAs, and MaxGenFD, which is a domain specific framework of the MaxCompiler for finite-difference equation solvers. The experimental results with various settings of architectural design parameters show the best combination of design parameters for pipeline structure can be systematically found by using our models. The effects of changing arithmetic accuracy and using data stream compression are also discussed.
• Hitoshi UKAWA, Tetsu NARUMI
Type: LETTER
Subject area: Application
2015 Volume E98.D Issue 2 Pages 309-312
Published: 2015
Released: February 01, 2015
The fast multipole method (FMM) for N-body simulations is attracting much attention since it requires minimal communication between computing nodes. We implemented hardware pipelines specialized for the FMM on an FPGA device, the GRAPE-9. An N-body simulation with 1.6×10^7 particles ran 16 times faster than on a CPU. Moreover, the particle-to-particle stage of the FMM on the GRAPE-9 executed 2.5 times faster than on a GPU in a limited case.
Regular Section
• Jiang LI, Yusuke ATSUMARI, Hiromasa KUBO, Yuichi OGISHIMA, Satoru YOKO ...
Type: PAPER
Subject area: Computer System
2015 Volume E98.D Issue 2 Pages 313-324
Published: 2015
Released: February 01, 2015
A processing system with multiple field-programmable gate array (FPGA) cards is described. Each FPGA card interconnects through six I/O terminals (up, down, left, right, front, and back), so the communication network among FPGAs is scalable according to the user's design. When the system runs multi-dimensional applications, transmission efficiency among the FPGAs can be improved by adjusting the dimensionality and network topology to each application. We also provide a fast and flexible circuit configuration method for the FPGAs of a multi-dimensional FPGA array. To demonstrate the effectiveness of the proposed method, we assess the performance and power consumption of a circuit that solves 3-D Poisson equations using the finite difference method.
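The benchmark computation mentioned above, a 3-D Poisson solve by finite differences, reduces per node to repeated Jacobi relaxation sweeps over a grid. The following is a minimal single-node sketch of one such sweep on a tiny grid with zero boundary values; the paper distributes sweeps of this kind across the FPGA array's six-neighbour links, which this sketch does not model:

```python
def jacobi_sweep(u, f, h):
    """One Jacobi update of the interior of a 3-D grid for grad^2 u = f:
    u_new = (sum of 6 neighbours - h^2 * f) / 6, boundaries fixed."""
    n = len(u)
    new = [[[u[i][j][k] for k in range(n)] for j in range(n)] for i in range(n)]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            for k in range(1, n - 1):
                new[i][j][k] = (u[i+1][j][k] + u[i-1][j][k] +
                                u[i][j+1][k] + u[i][j-1][k] +
                                u[i][j][k+1] + u[i][j][k-1] -
                                h * h * f[i][j][k]) / 6.0
    return new

n, h = 6, 1.0
u = [[[0.0] * n for _ in range(n)] for _ in range(n)]   # zero initial guess / boundary
f = [[[-1.0] * n for _ in range(n)] for _ in range(n)]  # constant source term
for _ in range(50):
    u = jacobi_sweep(u, f, h)
print(u[3][3][3] > 0)  # interior potential rises toward the solution: True
```

Each interior update reads only the six face neighbours, which is exactly why a six-terminal (up/down/left/right/front/back) FPGA interconnect matches this workload.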
• Eunjong CHOI, Norihiro YOSHIDA, Yoshiki HIGO, Katsuro INOUE
Type: PAPER
Subject area: Software Engineering
2015 Volume E98.D Issue 2 Pages 325-333
Published: 2015
Released: February 01, 2015
Many approaches to detecting code clones have been proposed, based on different degrees of normalization (e.g., removal of white space, tokenization, and regularization of identifiers). Different degrees of normalization lead to different granularities of source code being detected as code clones. To investigate how normalization affects code clone detection, this study proposes six approaches that detect code clones after preprocessing the input source files with different degrees of normalization. More precisely, in the preprocessing step each normalization is applied to the input source files and equivalence class partitioning is then performed on the files. After that, code clones are detected from the set of representative files of each equivalence class using a token-based code clone detection tool named CCFinder. The proposed approaches fall into two categories: non-normalization and normalization. The former detects only identical files without any normalization, while the latter detects identical files under different degrees of normalization, such as removal of all lines containing macros. From a case study, we observed that our proposed approaches detect code clones faster than using CCFinder alone, and that the non-normalization approach is the fastest of the proposed approaches in many cases.
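The preprocessing idea can be sketched simply: normalize each file to a chosen degree, then partition files into equivalence classes by hashing the normalized text, so only one representative per class needs to go to the clone detector (CCFinder in the paper). The normalization levels below are simplified stand-ins for the paper's six variants:

```python
import re, hashlib
from collections import defaultdict

def normalize(src, level):
    if level == 0:                       # non-normalization: identical files only
        return src
    src = re.sub(r'[ \t]+', ' ', src)    # level 1: collapse white space
    if level >= 2:
        # level 2: regularize identifiers to a single token
        src = re.sub(r'\b[a-zA-Z_]\w*\b', 'ID', src)
    return src

def partition(files, level):
    """Group file names into equivalence classes of normalized content."""
    classes = defaultdict(list)
    for name, src in files.items():
        digest = hashlib.sha1(normalize(src, level).encode()).hexdigest()
        classes[digest].append(name)
    return [sorted(names) for names in classes.values()]

files = {'a.c': 'x = x + 1;', 'b.c': 'x  =  x + 1;', 'c.c': 'y = y + 1;'}
print(sorted(partition(files, 1)))  # [['a.c', 'b.c'], ['c.c']]
print(sorted(partition(files, 2)))  # [['a.c', 'b.c', 'c.c']]
```

A stronger normalization merges more files into each class, which is exactly how the approach trades detection granularity for speed.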
• Xi CHANG, Zhuo ZHANG, Yan LEI, Jianjun ZHAO
Type: PAPER
Subject area: Software Engineering
2015 Volume E98.D Issue 2 Pages 334-345
Published: 2015
Released: February 01, 2015
Concurrency bugs significantly affect system reliability. Although many efforts have been made to address this problem, many bugs still go undetected because of the complexity of concurrent programs. Compared with atomicity violations, order violations have often been neglected, so efficient and effective approaches to detecting them are urgently needed. This paper presents a bidirectional predictive trace analysis approach, BIPED, which detects order violations in parallel based on a recorded program execution. BIPED collects an expected-order execution trace into a layered bidirectional prediction model, which represents two types of expected-order data flows in the bottom layer and combines the lock sets with bidirectional order constraints in the upper layer. BIPED then recognizes two types of candidate violation intervals driven by the bottom-layer model and checks these intervals bidirectionally against the upper-layer constraint model. Consequently, concrete schedules can be generated to expose order violation bugs. Our experimental results show that BIPED effectively detects real order violation bugs, with analysis 2.3x-10.9x faster than state-of-the-art predictive dynamic analysis approaches and 1.24x-1.8x faster than hybrid-model-based static prediction approaches.
• Yusheng LI, Meina SONG, Haihong E
Type: PAPER
Subject area: Data Engineering, Web Information Systems
2015 Volume E98.D Issue 2 Pages 346-354
Published: 2015
Released: February 01, 2015
Social recommendation systems that make use of the user's social information have recently attracted considerable attention. These recommendation approaches partly solve cold-start and data sparsity problems and significantly improve the performance of recommendation systems. The essence of social recommendation methods is to utilize the user's explicit social connections to improve recommendation results. However, this information is not always available in real-world recommender systems. In this paper, a solution to this problem of explicit social information unavailability is proposed. The existing user-item rating matrix is used to compute implicit social information, and then an ISRec (implicit social recommendation algorithm) which integrates this implicit social information and the user-item rating matrix for social recommendation is introduced. Experimental results show that our method performs much better than state-of-the-art approaches; moreover, complexity analysis indicates that our approach can be applied to very large datasets because it scales linearly with respect to the number of observations in the matrices.
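The core idea above, deriving implicit social information from the rating matrix alone, can be sketched with a simple similarity computation: users with similar rating vectors stand in for explicit friends. This is only an illustrative stand-in; ISRec's actual construction and its integration with the rating matrix are defined in the paper, and the helper names here are hypothetical:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def implicit_neighbours(ratings, user, k=1):
    """Top-k most rating-similar users to `user`: an implicit 'friend' list."""
    scores = [(cosine(ratings[user], ratings[other]), other)
              for other in range(len(ratings)) if other != user]
    return [other for _, other in sorted(scores, reverse=True)[:k]]

ratings = [[5, 4, 0, 1],   # user 0
           [5, 5, 0, 1],   # user 1: tastes very like user 0
           [1, 0, 5, 4]]   # user 2: opposite tastes
print(implicit_neighbours(ratings, 0, k=1))  # [1]
```

Because this similarity pass touches each observed rating a bounded number of times, it is consistent with the linear-in-observations scaling the abstract claims.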
• Nagayoshi YAMASHITA, Masayuki NUMAO, Ryutaro ICHISE
Type: PAPER
Subject area: Artificial Intelligence, Data Mining
2015 Volume E98.D Issue 2 Pages 355-362
Published: 2015
Released: February 01, 2015
Because research trends are difficult to understand or predict, we propose methodologies for understanding and predicting research trends in the sciences, focusing on the structure of grants from the Japan Society for the Promotion of Science (JSPS), a Japanese funding agency. Grant applications are well suited to predicting research trends because they are research plans for the future, unlike papers, which report past research outcomes. We investigated research trends in science by focusing on the research histories identified in JSPS grant application data. We then proposed a model for predicting research trends, assuming that breakthrough research encourages researchers to move from their current research field to an entirely new one; by using breakthrough research, we aim for higher precision in the prediction results. In our experiments, we found that research fields in Informatics correlate well with actual scientific research trends. We also demonstrated that our prediction models are effective in actively interacting research areas, including Informatics and the Social Sciences.
• Zhuo JIANG, Junhao WEN, Jun ZENG, Yihao ZHANG, Xibin WANG, Sachio HIRO ...
Type: PAPER
Subject area: Artificial Intelligence, Data Mining
2015 Volume E98.D Issue 2 Pages 363-371
Published: 2015
Released: February 01, 2015
The success of heuristic search in AI planning largely depends on the design of the heuristic. At the same time, previous experience contains potential domain information that can assist the planning process. In this context, we study dynamic macro-based heuristic planning through action relationship analysis. We present an approach for analyzing action relationships and design an algorithm that learns macros from solved cases. We then propose a dynamic macro-based heuristic that reuses the macros where appropriate rather than immediately assigning them to domains. These ideas are incorporated into a working planning system called the Dynamic Macro-based Fast Forward planner. Finally, we evaluate our method in a series of experiments. Our method effectively optimizes planning, reducing plan length by an average of 10% relative to FF in a time-efficient manner; the improvement is especially marked when invoking an action is time-consuming.
• Dongkyu JEON, Wooju KIM
Type: PAPER
Subject area: Artificial Intelligence, Data Mining
2015 Volume E98.D Issue 2 Pages 372-380
Published: 2015
Released: February 01, 2015
In recent years, data mining of graph-structured data has grown significantly in importance, owing to its rapid increase in both scale and application areas. Many previous studies have investigated decision tree learning on Semantic Web-based linked data to uncover implicit knowledge. In the present paper, we propose a new random forest algorithm for linked data to overcome the underlying limitations of the decision tree algorithm, such as local optimal decisions and generalization error. Moreover, we designed a parallel processing environment for random forest learning to manage large-scale linked data and increase the efficiency of multiple tree generation. For this purpose, we modified the previous candidate-feature searching method of the decision tree algorithm for linked data to reduce the feature search space of random forest learning, and developed feature selection methods adjusted to linked data. Using a distributed index-based search engine, we designed a parallel random forest learning system for linked data that generates random forests in parallel, enabling users to generate multiple decision trees simultaneously from distributed stored linked data. To evaluate the proposed algorithm, we compared its classification accuracy with that of the single decision tree algorithm. The experimental results revealed that our random forest algorithm is more accurate than the single decision tree algorithm.
• Keiko TAGUCHI, Andrew FINCH, Seiichi YAMAMOTO, Eiichiro SUMITA
Type: PAPER
Subject area: Artificial Intelligence, Data Mining
2015 Volume E98.D Issue 2 Pages 381-393
Published: 2015
Released: February 01, 2015
In this article we present a novel corpus-based method for inducing romanization systems for languages through bilingual alignment of transliteration word pairs. First, the word pairs are aligned using a non-parametric Bayesian approach; then, for each grapheme sequence to be romanized, a particular romanization is selected according to a user-specified criterion. As far as we are aware, this paper is the only one to describe a method for automatically deriving complete romanization systems. Unlike existing human-derived romanization systems, the proposed method can discover induced romanization systems tailored for specific purposes, for example, for use in data mining or in efficient user input methods. Our experiments study the romanization of four quite different languages: Russian, Japanese, Hindi and Myanmar. The first two already have standard romanization systems in regular use, Hindi has a large number of diverse systems, and Myanmar has no standard system for romanization. We compare our induced romanization systems to the existing systems for Russian and Japanese, and find that the induced systems are almost identical for Russian and 69% identical for Japanese. We applied our approach to the task of transliteration mining, using Levenshtein distance as the romanization selection criterion. Our experiments show that the induced romanization system matched the performance of the human-created system for Russian, and offered substantially improved mining performance for Japanese. We provide an analysis of the mechanism by which our approach improves mining performance, and also analyse the differences in characteristics between the induced system for Japanese and the official Japanese Nihon-shiki system. To investigate the limits of our approach, we studied the romanization of Myanmar, a low-resource language with a large vocabulary of graphemes, and estimate the approximate corpus size required to effectively romanize the k most frequent graphemes in the language for all values of k up to 1800.
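The selection criterion used for transliteration mining above can be sketched directly: among aligned candidate romanizations of a grapheme sequence, pick the one closest to a reference spelling under Levenshtein distance. The candidate list here is hypothetical; in the paper, candidates come from the Bayesian bilingual alignments:

```python
def levenshtein(a, b):
    """Standard edit distance via the rolling-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def select_romanization(candidates, reference):
    """Pick the aligned romanization closest to the reference spelling."""
    return min(candidates, key=lambda c: levenshtein(c, reference))

# e.g. choosing between aligned candidates 'si' and 'shi' for a Japanese grapheme
print(select_romanization(['si', 'shi'], 'shi'))  # 'shi'
```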
• Jigisha N PATEL, Jerin JOSE, Suprava PATNAIK
Type: PAPER
Subject area: Image Processing and Video Processing
2015 Volume E98.D Issue 2 Pages 394-403
Published: 2015
Released: February 01, 2015
The concept of sparse representation has been gaining momentum in image processing applications, especially image compression, over the last decade. Sparse coding algorithms represent signals as a sparse linear combination of atoms of an overcomplete dictionary. Earlier work shows that sparse coding of images using learned dictionaries outperforms the JPEG standard for image compression. The conventional method of sparse-coding-based image compression, though successful, does not adapt the compression rate to the local characteristics of image blocks. Here, we propose a new framework in which the image is classified into three classes by measuring block activities, and each class is then sparse-coded using a dictionary learned specifically for that class. The K-SVD algorithm is used for dictionary learning. The sparse coefficients for each class are Huffman-encoded and combined into a single bit stream. The model imparts rate-distortion attributes to the compression, since a different constraint can be set for each class depending on its characteristics. We analyse and compare this model with the conventional model. The outcomes are encouraging, and the model paves the way for efficient sparse-representation-based image compression.
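The classification step can be sketched simply: measure each block's activity (variance is used here as a stand-in measure) and bin blocks into three classes, each of which would then be sparse-coded with its own learned dictionary and rate constraint. The activity measure and thresholds below are hypothetical, not the paper's:

```python
def block_activity(block):
    """Variance of a flattened pixel block, as a simple activity measure."""
    n = len(block)
    mean = sum(block) / n
    return sum((p - mean) ** 2 for p in block) / n

def classify_blocks(blocks, t_low=10.0, t_high=500.0):
    """Bin block indices into three activity classes (hypothetical thresholds)."""
    classes = {'smooth': [], 'texture': [], 'edge': []}
    for idx, block in enumerate(blocks):
        a = block_activity(block)
        if a < t_low:
            classes['smooth'].append(idx)
        elif a < t_high:
            classes['texture'].append(idx)
        else:
            classes['edge'].append(idx)
    return classes

blocks = [[128] * 16,        # flat block
          [120, 130] * 8,    # mild texture
          [0, 255] * 8]      # strong edge
print(classify_blocks(blocks))  # {'smooth': [0], 'texture': [1], 'edge': [2]}
```

Assigning a looser distortion constraint to smooth blocks and a tighter one to edge blocks is what gives the framework its per-class rate-distortion control.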
• Xiaoxiong XING, Yoshinori DOBASHI, Tsuyoshi YAMAMOTO, Yosuke KATSURA, ...
Type: PAPER
Subject area: Computer Graphics
2015 Volume E98.D Issue 2 Pages 404-411
Published: 2015
Released: February 01, 2015
We present an algorithm for efficient rendering of animated hair under a dynamic, low-frequency lighting environment. We use spherical harmonics (SH) to represent the environmental light. The transmittances between a point on a hair strand and the light sources are also represented by SH functions. Then, a convolution of SH functions and the scattering function of a hair strand is precomputed. This allows us to efficiently compute the intensity at a point on the hair. However, the computation of the transmittance is very time-consuming. We address this problem by using a voxel-based approach: the transmittance is computed by using a voxelized hair model. We further accelerate the computation by sampling the voxels. By using our method, we can render a hair model consisting of tens of thousands of hair strands at interactive frame rates.
• Haoyan GUO, Changyong GUO, Yuanzhi CHENG, Shinichi TAMURA
Type: PAPER
Subject area: Biological Engineering
2015 Volume E98.D Issue 2 Pages 412-428
Published: 2015
Released: February 01, 2015
To determine cartilage thickness from MR images, segmentation, that is, boundary detection, of the two adjacent thin structures (e.g., the femoral and acetabular cartilage in the hip joint) is needed first. Traditional techniques such as zero-crossings of the second derivatives are not suitable for detecting these boundaries: a theoretical simulation analysis reveals that the zero-crossing method yields considerable biases in boundary detection and thickness measurement of two adjacent thin structures in MR images. This paper studies the accurate detection of hip cartilage boundaries in the image plane, and a new method based on a model of the MR imaging process is proposed for this application. Based on this model, a hip cartilage boundary detection algorithm is developed, and the in-plane thickness is computed from the detected boundaries. To correct the image-plane thickness for overestimation due to oblique slicing, a three-dimensional (3-D) thickness computation approach is introduced. Experimental results show that the thickness measurements obtained by the new approach are more accurate than those obtained by existing thickness computation approaches.
• Chuzo IWAMOTO, Yuta MATSUI
Type: LETTER
Subject area: Fundamentals of Information Systems
2015 Volume E98.D Issue 2 Pages 429-432
Published: 2015
Released: February 01, 2015
Forty Thieves is a solitaire game with two 52-card decks. The object is to move all cards from ten tableau piles of four cards to eight foundations. Each foundation is built up by suit from ace to king of the same suit, and each tableau pile is built down by suit. You may move the top card from any tableau pile to a tableau or foundation pile, and from the stock to a foundation pile. We prove that the generalized version of Forty Thieves is NP-complete.
• Chen CHEN, Kai LU, Xiaoping WANG, Xu ZHOU, Zhendong WU
Type: LETTER
Subject area: Software System
2015 Volume E98.D Issue 2 Pages 433-436
Published: 2015
Released: February 01, 2015
Most existing deterministic multithreading systems are costly on pipeline parallel programs due to load imbalance. In this letter, we propose a Load-Balanced Deterministic Runtime (LBDR) for pipeline parallelism. LBDR deterministically takes some tokens from non-synchronization-intensive threads to synchronization-intensive threads. Experimental results show that LBDR outperforms the state-of-the-art design by an average of 22.5%.
• Keehang KWON, Kyunghwan PARK, Mi-Young PARK
Type: LETTER
Subject area: Software System
2015 Volume E98.D Issue 2 Pages 437-438
Published: 2015
Released: February 01, 2015
To represent interactive objects, we propose a choice-disjunctive declaration statement of the form $S \add R$, where S and R are (procedure or field) declaration statements within a class. This statement has the following semantics: when an object of the class is created, the user is requested to choose one of S and R. The statement is useful for representing interactive objects that require interaction with the user.
• Ki-Seong LEE, Chan-Gun LEE
Type: LETTER
Subject area: Software Engineering
2015 Volume E98.D Issue 2 Pages 439-443
Published: 2015
Released: February 01, 2015
Modularity is an effective way to evaluate the structural quality of evolutionary software, but there are many diverse ways to measure it. In this paper, we analyze and compare various modularity metrics that have been studied in different domains to assess their applicability to evolutionary software analysis. Through extensive experiments with artificial DSMs and open-source software, we find that the correlations among those metrics are generally high despite their differences. However, our experiments also show that certain metrics are more sensitive to particular modular factors, so modularity metrics must be chosen and applied comprehensively and with care.
• Woo-Lam KANG, Hyeon-Gyu KIM, Yoon-Joon LEE
Type: LETTER
Subject area: Data Engineering, Web Information Systems
2015 Volume E98.D Issue 2 Pages 444-447
Published: 2015
Released: February 01, 2015
JOURNALS FREE ACCESS
This paper presents a method to reduce I/O cost in MapReduce when online analytical processing (OLAP) queries are used for data analysis. The proposed method consists of two basic ideas. First, to reduce network transmission cost, mappers are organized to receive only the data necessary to perform a map task, not the entire set of input data. Second, to reduce storage consumption, only record IDs are stored for checkpointing, not the raw records. Experiments conducted with the TPC-H benchmark show that the proposed method is about 40% faster than Hive, the well-known data warehouse solution for MapReduce, while reducing the size of data stored for checkpointing to about 80%.
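The second idea, checkpointing record IDs instead of raw records, can be sketched as follows. This is a minimal illustration under the assumption that records can be re-fetched from the input store by ID on recovery; the class and method names are illustrative, not from the paper.

```python
class IdCheckpoint:
    """Sketch of ID-only checkpointing: persist record IDs, not records."""

    def __init__(self, source):
        self.source = source       # id -> raw record (e.g., the input store)
        self.pending_ids = []

    def checkpoint(self, record_id):
        # store only the (small) ID instead of the (large) raw record
        self.pending_ids.append(record_id)

    def recover(self):
        # on failure, re-materialize the raw records from the source
        return [self.source[i] for i in self.pending_ids]
```

The trade-off is that recovery must read the source again, in exchange for a much smaller checkpoint footprint.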
• Cheng ZHANG, Yuzhang GU, Zhengmin ZHANG, Yunlong ZHAN
Type: LETTER
Subject area: Pattern Recognition
2015 Volume E98.D Issue 2 Pages 448-452
Published: 2015
Released: February 01, 2015
JOURNALS FREE ACCESS
In this paper, we propose a face representation approach using multi-orientation Log-Gabor local binary patterns (MOLGLBP) for face recognition under facial expressions, illumination changes, and partial occlusions. Log-Gabor filters with different scales (frequencies) and orientations are applied to the Y, I, and Q channel images of the YIQ color space, respectively. Log-Gabor images of different orientations at the same scale are then combined into a multi-orientation Log-Gabor image (MOLGI), to which two LBP operators are applied. For face recognition, the histogram intersection metric is used to measure the similarity of faces. The proposed approach is evaluated on the CurtinFaces database, and experiments demonstrate that it is effective against two simultaneous variations: expression & illumination, and illumination & occlusion.
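The histogram intersection metric used for matching is a standard similarity measure: the sum of bin-wise minima of the two histograms. A minimal sketch:

```python
def histogram_intersection(h1, h2):
    """Histogram intersection similarity: sum of bin-wise minima.
    For L1-normalized histograms the result lies in [0, 1],
    with 1 meaning identical histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

For example, `histogram_intersection([0.5, 0.5], [0.25, 0.75])` gives 0.75, while fully disjoint histograms give 0.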
• Javad Rahimipour ANARAKI, Mahdi EFTEKHARI, Chang Wook AHN
Type: LETTER
Subject area: Pattern Recognition
2015 Volume E98.D Issue 2 Pages 453-456
Published: 2015
Released: February 01, 2015
JOURNALS FREE ACCESS
Feature selection (FS) addresses the problem of selecting a subset of information-rich features; Fuzzy-Rough QuickReduct (FRQR) is one of the most successful FS methods. This paper presents two variants of the FRQR algorithm that improve its performance: 1) combining the fuzzy-rough dependency degree with the correlation-based FS merit to resolve a dilemma in feature subset selection, and 2) hybridizing the newly proposed method with the threshold-based FRQR. The effectiveness of the proposed approaches is demonstrated on sixteen UCI datasets; smaller feature subsets and higher classification accuracies are achieved.
• Lifeng HE, Bin YAO, Xiao ZHAO, Yun YANG, Yuyan CHAO, Atsushi OHTA
Type: LETTER
Subject area: Pattern Recognition
2015 Volume E98.D Issue 2 Pages 457-461
Published: 2015
Released: February 01, 2015
JOURNALS FREE ACCESS
This paper proposes a graph-theory-based Euler number computing algorithm. Based on graph theory and an analysis of the mask's configuration, our algorithm calculates the Euler number of a binary image by counting four patterns of the mask. Unlike most conventional Euler number computing algorithms, it requires no processing of background pixels. Experimental results demonstrate that our algorithm is much more efficient than conventional Euler number computing algorithms.
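The abstract does not give the paper's specific mask patterns, but the general idea of computing the Euler number by counting local pixel patterns can be illustrated with the classical bit-quad formulation (Gray's method), which counts 2×2 patterns over the image. This is a sketch of that classical baseline, not the authors' algorithm:

```python
def euler_number_4(img):
    """Euler number of a binary image (4-connected foreground) via
    classical 2x2 bit-quad counting: E = (Q1 - Q3 + 2*Qd) / 4,
    where Q1/Q3 count quads with one/three foreground pixels and
    Qd counts diagonal quads."""
    h, w = len(img), len(img[0])
    # zero-pad so every foreground pixel is covered by four quads
    p = [[0] * (w + 2)]
    for row in img:
        p.append([0] + list(row) + [0])
    p.append([0] * (w + 2))
    q1 = q3 = qd = 0
    for y in range(h + 1):
        for x in range(w + 1):
            quad = (p[y][x], p[y][x + 1], p[y + 1][x], p[y + 1][x + 1])
            s = sum(quad)
            if s == 1:
                q1 += 1
            elif s == 3:
                q3 += 1
            elif s == 2 and quad in ((1, 0, 0, 1), (0, 1, 1, 0)):
                qd += 1
    return (q1 - q3 + 2 * qd) // 4
```

A single blob without holes yields 1, and a ring (one component, one hole) yields 0, matching the definition E = components − holes.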
• Jiangfeng YANG, Zheng MA
Type: LETTER
Subject area: Image Processing and Video Processing
2015 Volume E98.D Issue 2 Pages 462-466
Published: 2015
Released: February 01, 2015
JOURNALS FREE ACCESS
Recently, locality-constrained linear coding (LLC) has attracted much attention as a coding strategy, owing to its better reconstruction than sparse coding and vector quantization (VQ). However, LLC ignores the weight information of codewords during the coding stage and assumes that every selected base has the same credibility, even when their weights differ. To further improve the discriminative power of LLC codes, we propose a weighted LLC (WLLC) algorithm that takes codeword weight information into account. Experiments on the KTH and UCF datasets show that a recognition system based on WLLC achieves better performance than ones based on classical LLC and VQ, and outperforms recent systems.
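The abstract does not specify how codeword weights enter the coding step, so the following is only an illustrative sketch: standard approximated LLC (k-nearest codewords plus a sum-to-one constrained least-squares fit) with the resulting coefficients modulated by a hypothetical per-codeword weight vector. The function name and the weighting choice are assumptions, not the letter's exact formulation.

```python
import numpy as np

def weighted_llc_code(x, codebook, weights, k=5, lam=1e-4):
    """Sketch of weighted LLC coding for a descriptor x (D,) over a
    codebook (M, D), with per-codeword weights (M,)."""
    d = np.linalg.norm(codebook - x, axis=1)
    idx = np.argsort(d)[:k]               # locality: keep k nearest codewords
    B = codebook[idx]                     # (k, D) local bases
    z = B - x                             # shift bases to the origin
    C = z @ z.T + lam * np.eye(k)         # regularized local covariance
    c = np.linalg.solve(C, np.ones(k))    # analytical LLC solution
    c /= c.sum()                          # enforce sum-to-one constraint
    code = np.zeros(len(codebook))
    code[idx] = c * weights[idx]          # modulate by codeword weights
    return code
```

With unit weights this reduces to plain approximated LLC, so the weighting is a strict generalization in this sketch.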
• Mengmeng ZHANG, Yang ZHANG, Huihui BAI
Type: LETTER
Subject area: Image Processing and Video Processing
2015 Volume E98.D Issue 2 Pages 467-470
Published: 2015
Released: February 01, 2015
JOURNALS FREE ACCESS
The high efficiency video coding (HEVC) standard has significantly improved compression performance for many applications, including remote desktop and desktop sharing. Screen content video coding is widely used in applications with a high demand for real-time performance, yet HEVC introduces considerable computational complexity, so fast algorithms are needed to offset the limited computing power of HEVC encoders. In this study, a statistical analysis of several screen content sequences is first performed to account for the statistics of screen content, which differ greatly from those of natural images and videos. Second, a fast coding unit (CU) splitting method is proposed to reduce the computational complexity of HEVC intra coding, especially for screen content. In the proposed scheme, the CU size decision is made by checking the smoothness of the luminance values in every coding tree unit. Experiments demonstrate that under the HEVC range extension standard, the proposed scheme saves an average of 29% of computational complexity with a 0.9% Bjøntegaard Delta rate (BD-rate) increase compared with the HM13.0+RExt6.0 anchor for screen content sequences. For default HEVC, the proposed scheme reduces encoding time by an average of 38% with negligible loss of coding efficiency.
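The core of such a smoothness-based CU decision can be sketched as a toy rule: if the luminance inside a block is nearly flat (common in screen content), keep the large CU and skip further splitting; otherwise split and recurse. The smoothness measure (min-max range) and the threshold below are illustrative assumptions, not the paper's actual criterion.

```python
def decide_cu_split(luma_block, threshold=2):
    """Toy smoothness-based CU split decision: a flat luminance block
    keeps the large CU; a textured block is split further."""
    lo = min(min(row) for row in luma_block)
    hi = max(max(row) for row in luma_block)
    return "no_split" if hi - lo <= threshold else "split"
```

Skipping the split early avoids evaluating rate-distortion cost for the smaller CU sizes, which is where the encoding-time saving comes from.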
• Shujuan GAO, Insuk KIM, Seong Tae JHANG
Type: LETTER
Subject area: Image Recognition, Computer Vision
2015 Volume E98.D Issue 2 Pages 471-474
Published: 2015
Released: February 01, 2015
JOURNALS FREE ACCESS
Robust yet efficient techniques for detecting and tracking targets in infrared (IR) images are a significant component of automatic target recognition (ATR) systems. In our previous work, we proposed infrared target detection and tracking systems based on sparse representation and Bayesian probabilistic techniques, respectively. In this paper, we adopt the Naïve Bayes Nearest Neighbor (NBNN) classifier, an extremely simple and efficient algorithm that requires no training phase. State-of-the-art image classification techniques need a comprehensive learning and training step (e.g., Boosting, SVM); in contrast, non-parametric nearest-neighbor-based image classifiers require no training time and have other advantageous properties. Tracking results on infrared sequences demonstrate that our algorithm is robust to illumination changes and suitable for real-time tracking of a moving target, with good overall performance.
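The NBNN decision rule itself is simple enough to state in a few lines: for each descriptor of the query image, find its nearest descriptor within each class, sum those distances per class, and pick the class with the smallest total. There is no training phase, only nearest-neighbor lookups. A minimal sketch (brute-force distances, illustrative names):

```python
def nbnn_classify(query_descriptors, class_descriptors):
    """Naive Bayes Nearest Neighbor: sum, over query descriptors, the
    squared distance to the nearest descriptor of each class; the class
    with the smallest total wins."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    totals = {}
    for label, descs in class_descriptors.items():
        totals[label] = sum(
            min(sqdist(q, d) for d in descs) for q in query_descriptors
        )
    return min(totals, key=totals.get)
```

In practice the inner nearest-neighbor search is done with a k-d tree or similar index rather than brute force, but the decision rule is unchanged.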
• Inseong HWANG, Seungwoo JEON, Beobkeun CHO, Yoonsik CHOE
Type: LETTER
Subject area: Image Recognition, Computer Vision
2015 Volume E98.D Issue 2 Pages 475-478
Published: 2015
Released: February 01, 2015
JOURNALS FREE ACCESS
This paper proposes a novel image classification scheme for cloth pattern recognition. We propose a rotation- and scale-invariant delta-HOG (DHOG)-based descriptor, and an entire recognition process using random ferns with this descriptor, designed to be robust to pose and scale changes. The method considers the maximum orientation and various radii of a circular patch window for fast and efficient classification even when cloth patches are rotated or rescaled. It exhibits good performance in cloth pattern recognition experiments, finding more similar cloth patches than dense-SIFT in 20 out of 36 query tests. In addition, the proposed method is much faster than dense-SIFT in both training and testing; its time consumption is reduced by 57.7% in training and 41.4% in testing. The proposed method is therefore expected to contribute to real-time cloth search services that handle the vast numbers of cloth images posted on the Internet.