IEICE Transactions on Information and Systems

Special Section on Data Engineering and Information Management

FOREWORD

Shinsuke NAKAJIMA

2015Volume E98.DIssue 5 Pages 1000
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014DAF0001

JOURNAL FREE ACCESS

Download PDF (59K)
Cache-Conscious Data Access for DBMS in Multicore Environments

Fang XI, Takeshi MISHIMA, Haruo YOKOTA

Article type: PAPER
2015Volume E98.DIssue 5 Pages 1001-1012
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014DAP0004

JOURNAL FREE ACCESS

Show abstractHide abstract

In recent years, dramatic improvements have been made to computer hardware. In particular, the number of cores on a chip has been growing exponentially, enabling an ever-increasing number of processes to be executed in parallel. Having been originally developed for single-core processors, database (DB) management systems (DBMSs) running on multicore processors suffer from cache conflicts as the number of concurrently executing DB processes (DBPs) increases. Therefore, a cache-efficient solution for arranging the execution of concurrent DBPs on multicore platforms would be highly attractive for DBMSs. In this paper, we propose CARIC-DA, middleware for achieving higher performance in DBMSs on multicore processors, by reducing cache misses with a new cache-conscious dispatcher for concurrent queries. CARIC-DA logically range-partitions the dataset into multiple subsets. This enables different processor cores to access different subsets by ensuring that different DBPs are pinned to different cores and by dispatching queries to DBPs according to the data-partitioning information. In this way, CARIC-DA is expected to achieve better performance via a higher cache hit rate for the private cache of each core. It can also balance the loads between cores by changing the range of each subset. Note that CARIC-DA is pure middleware, meaning that it avoids any modification to existing operating systems (OSs) and DBMSs, thereby making it more practical. This is important because the source code for existing DBMSs is large and complex, making it very expensive to modify. We implemented a prototype that uses unmodified existing Linux and PostgreSQL environments, and evaluated the effectiveness of our proposal on three different multicore platforms. The performance evaluation against benchmarks revealed that CARIC-DA achieved improved cache hit rates and higher performance.

View full abstract

Download PDF (4169K)
Accordion: An Efficient Gear-Shifting for a Power-Proportional Distributed Data-Placement Method

Hieu Hanh LE, Satoshi HIKIDA, Haruo YOKOTA

Article type: PAPER
2015Volume E98.DIssue 5 Pages 1013-1026
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014DAP0007

JOURNAL FREE ACCESS

Show abstractHide abstract

Power-aware distributed file systems for efficient Big Data processing are increasingly moving towards power-proportional designs. However, current data placement methods for such systems have not given careful consideration to the effect of gear-shifting during operations. If the system wants to shift to a higher gear, it must reallocate the updated datasets that were modified in a lower gear when a subset of the nodes was inactive, but without disrupting the servicing of requests from clients. Inefficient gear-shifting that requires a large amount of data reallocation greatly degrades the system performance. To address this challenge, this paper proposes a data placement method known as Accordion, which uses data replication to arrange the data layout comprehensively and provide efficient gear-shifting. Compared with current methods, Accordion reduces the amount of data transferred, which significantly shortens the period required to reallocate the updated data during gear-shifting then able to improve the performance of the systems. The effect of this reduction is larger with higher gears, so Accordion is suitable for smooth gear-shifting in multigear systems. Moreover, the times when the active nodes serve the requests are well distributed, so Accordion is capable of higher scalability than existing methods based on the I/O throughput performance. Accordion does not require any strict constraint on the number of nodes in the system therefore our proposed method is expected to work well in practical environments. Extensive empirical experiments using actual machines with an Accordion prototype based on the Hadoop Distributed File System demonstrated that our proposed method significantly reduced the period required to transfer updated data, i.e., by 66% compared with an existing method.

View full abstract

Download PDF (1265K)
k-Dominant Skyline Query Computation in MapReduce Environment

Md. Anisuzzaman SIDDIQUE, Hao TIAN, Yasuhiko MORIMOTO

Article type: PAPER
2015Volume E98.DIssue 5 Pages 1027-1034
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014DAP0010

JOURNAL FREE ACCESS

Show abstractHide abstract

Filtering uninteresting data is important to utilize “big data”. Skyline query is popular technique to filter uninteresting data, in which it selects a set of objects that are not dominated by another from a given large database. However, a skyline query often retrieves too many objects to analyze intensively especially for high-dimensional dataset. To solve the problem, k-dominant skyline queries have been introduced. The size of databases sometimes become too large to compute in a centralized environment. Conventional algorithms for computing k-dominant skyline queries are not well suited for parallel and distributed environments, such as the MapReduce framework. In this paper, we consider an efficient parallel algorithm to process k-dominant skyline query in MapReduce framework. Extensive experiments demonstrate the scalability of proposed algorithm for synthetic big datasets under different settings of data distribution, dimensionality, and cardinality.

View full abstract

Download PDF (848K)
3D Objects Tracking by MapReduce GPGPU-Enhanced Particle Filter

Jieyun ZHOU, Xiaofeng LI, Haitao CHEN, Rutong CHEN, Masayuki NUMAO

Article type: PAPER
2015Volume E98.DIssue 5 Pages 1035-1044
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014DAP0020

JOURNAL FREE ACCESS

Show abstractHide abstract

Objects tracking methods have been wildly used in the field of video surveillance, motion monitoring, robotics and so on. Particle filter is one of the promising methods, but it is difficult to apply to real-time objects tracking because of its high computation cost. In order to reduce the processing cost without sacrificing the tracking quality, this paper proposes a new method for real-time 3D objects tracking, using parallelized particle filter algorithms by MapReduce architecture which is running on GPGPU. Our methods are as follows. First, we use a Kinect to get the 3D information of objects. Unlike the conventional 2D-based objects tracking, 3D objects tracking adds depth information. It can track not only from the x and y axis but also from the z axis, and the depth information can correct some errors in 2D objects tracking. Second, to solve the high computation cost problem, we use the MapReduce architecture on GPGPU to parallelize the particle filter algorithm. We implement the particle filter algorithms on GPU and evaluate the performance by actually running a program on CUDA5.5.

View full abstract

Download PDF (13699K)
A Linguistics-Driven Approach to Statistical Parsing for Low-Resourced Languages

Prachya BOONKWAN, Thepchai SUPNITHI

Article type: PAPER
2015Volume E98.DIssue 5 Pages 1045-1052
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014DAP0024

JOURNAL FREE ACCESS

Show abstractHide abstract

Developing a practical and accurate statistical parser for low-resourced languages is a hard problem, because it requires large-scale treebanks, which are expensive and labor-intensive to build from scratch. Unsupervised grammar induction theoretically offers a way to overcome this hurdle by learning hidden syntactic structures from raw text automatically. The accuracy of grammar induction is still impractically low because frequent collocations of non-linguistically associable units are commonly found, resulting in dependency attachment errors. We introduce a novel approach to building a statistical parser for low-resourced languages by using language parameters as a guide for grammar induction. The intuition of this paper is: most dependency attachment errors are frequently used word orders which can be captured by a small prescribed set of linguistic constraints, while the rest of the language can be learned statistically by grammar induction. We then show that covering the most frequent grammar rules via our language parameters has a strong impact on the parsing accuracy in 12 languages.

View full abstract

Download PDF (642K)

Regular Section

A Detection and Measurement Approach for Memory Leaked Objects in Java Programs

Qiao YU, Shujuan JIANG, Yingqi LIU

Article type: PAPER
Subject area: Software Engineering
2015Volume E98.DIssue 5 Pages 1053-1061
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014EDP7320

JOURNAL FREE ACCESS

Show abstractHide abstract

Memory leak occurs when useless objects cannot be released for a long time during program execution. Memory leaked objects may cause memory overflow, system performance degradation and even cause the system to crash when they become serious. This paper presents a dynamic approach for detecting and measuring memory leaked objects in Java programs. First, our approach tracks the program by JDI and records heap information to find out the potentially leaked objects. Second, we present memory leaking confidence to measure the influence of these objects on the program. Finally, we select three open-source programs to evaluate the efficiency of our approach. Furthermore, we choose ten programs from DaCapo 9.12 benchmark suite to reveal the time overhead of our approach. The experimental results show that our approach is able to detect and measure memory leaked objects efficiently.

View full abstract

Download PDF (774K)
A Similarity-Based Concepts Mapping Method between Ontologies

Jie LIU, Linlin QIN, Jing GAO, Aidong ZHANG

Article type: PAPER
Subject area: Artificial Intelligence, Data Mining
2015Volume E98.DIssue 5 Pages 1062-1072
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014EDP7188

JOURNAL FREE ACCESS

Show abstractHide abstract

Ontology mapping is important in many areas, such as information integration, semantic web and knowledge management. Thus the effectiveness of ontology mapping needs to be further studied. This paper puts forward a mapping method between different ontology concepts in the same field. Firstly, the algorithms of calculating four individual similarities (the similarities of concept name, property, instance and structure) between two concepts are proposed. The algorithm features of four individual similarities are as follows: a new WordNet-based method is used to compute semantic similarity between concept names; property similarity algorithm is used to form property similarity matrix between concepts, then the matrix will be processed into a numerical similarity; a new vector space model algorithm is proposed to compute the individual similarity of instance; structure parameters are added to structure similarity calculation, structure parameters include the number of properties, instances, sub-concepts, and the hierarchy depth of two concepts. Then similarity of each of ontology concept pairs is represented by a vector. Finally, Support Vector Machine (SVM) is used to accomplish mapping discovery by training and learning the similarity vectors. In this algorithm, Harmony and reliability are used as the weights of the four individual similarities, which increases the accuracy and reliability of the algorithm. Experiments achieve good results and the results show that the proposed method outperforms many other methods of similarity-based algorithms.

View full abstract

Download PDF (1398K)
Direct Density Ratio Estimation with Convolutional Neural Networks with Application in Outlier Detection

Hyunha NAM, Masashi SUGIYAMA

Article type: PAPER
Subject area: Artificial Intelligence, Data Mining
2015Volume E98.DIssue 5 Pages 1073-1079
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014EDP7335

JOURNAL FREE ACCESS

Show abstractHide abstract

Recently, the ratio of probability density functions was demonstrated to be useful in solving various machine learning tasks such as outlier detection, non-stationarity adaptation, feature selection, and clustering. The key idea of this density ratio approach is that the ratio is directly estimated so that difficult density estimation is avoided. So far, parametric and non-parametric direct density ratio estimators with various loss functions have been developed, and the kernel least-squares method was demonstrated to be highly useful both in terms of accuracy and computational efficiency. On the other hand, recent study in pattern recognition exhibited that deep architectures such as a convolutional neural network can significantly outperform kernel methods. In this paper, we propose to use the convolutional neural network in density ratio estimation, and experimentally show that the proposed method tends to outperform the kernel-based method in outlying image detection.

View full abstract

Download PDF (604K)
Robust Visual Tracking via Coupled Randomness

Chao ZHANG, Yo YAMAGATA, Takuya AKASHI

Article type: PAPER
Subject area: Image Recognition, Computer Vision
2015Volume E98.DIssue 5 Pages 1080-1088
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014EDP7210

JOURNAL FREE ACCESS

Show abstractHide abstract

Tracking algorithms for arbitrary objects are widely researched in the field of computer vision. At the beginning, an initialized bounding box is given as the input. After that, the algorithms are required to track the objective in the later frames on-the-fly. Tracking-by-detection is one of the main research branches of online tracking. However, there still exist two issues in order to improve the performance. 1) The limited processing time requires the model to extract low-dimensional and discriminative features from the training samples. 2) The model is required to be able to balance both the prior and new objectives' appearance information in order to maintain the relocation ability and avoid the drifting problem. In this paper, we propose a real-time tracking algorithm called coupled randomness tracking (CRT) which focuses on dealing with these two issues. One randomness represents random projection, and the other randomness represents online random forests (ORFs). In CRT, the gray-scale feature is compressed by a sparse measurement matrix, and ORFs are used to train the sample sequence online. During the training procedure, we introduce a tree discarding strategy which helps the ORFs to adapt fast appearance changes caused by illumination, occlusion, etc. Our method can constantly adapt to the objective's latest appearance changes while keeping the prior appearance information. The experimental results show that our algorithm performs robustly with many publicly available benchmark videos and outperforms several state-of-the-art algorithms. Additionally, our algorithm can be easily utilized into a parallel program.

View full abstract

Download PDF (1896K)
A Hybrid Topic Model for Multi-Document Summarization

JinAn XU, JiangMing LIU, Kenji ARAKI

Article type: PAPER
Subject area: Natural Language Processing
2015Volume E98.DIssue 5 Pages 1089-1094
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014EDP7229

JOURNAL FREE ACCESS

Show abstractHide abstract

Topic features are useful in improving text summarization. However, independency among topics is a strong restriction on most topic models, and alleviating this restriction can deeply capture text structure. This paper proposes a hybrid topic model to generate multi-document summaries using a combination of the Hidden Topic Markov Model (HTMM), the surface texture model and the topic transition model. Based on the topic transition model, regular topic transition probability is used during generating summary. This approach eliminates the topic independence assumption in the Latent Dirichlet Allocation (LDA) model. Meanwhile, the results of experiments show the advantage of the combination of the three kinds of models. This paper includes alleviating topic independency, and integrating surface texture and shallow semantic in documents to improve summarization. In short, this paper attempts to realize an advanced summarization system.

View full abstract

Download PDF (546K)
Noise Tolerant Heart Rate Extraction Algorithm Using Short-Term Autocorrelation for Wearable Healthcare Systems

Shintaro IZUMI, Masanao NAKANO, Ken YAMASHITA, Yozaburo NAKAI, Hiroshi ...

Article type: PAPER
Subject area: Biological Engineering
2015Volume E98.DIssue 5 Pages 1095-1103
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014EDP7161

JOURNAL FREE ACCESS

Show abstractHide abstract

This report describes a robust method of instantaneous heart rate (IHR) extraction from noisy electrocardiogram (ECG) signals. Generally, R-waves are extracted from ECG using a threshold to calculate the IHR from the interval of R-waves. However, noise increases the incidence of misdetection and false detection in wearable healthcare systems because the power consumption and electrode distance are limited to reduce the size and weight. To prevent incorrect detection, we propose a short-time autocorrelation (STAC) technique. The proposed method extracts the IHR by determining the search window shift length which maximizes the correlation coefficient between the template window and the search window. It uses the similarity of the QRS complex waveform beat-by-beat. Therefore, it has no threshold calculation process. Furthermore, it is robust against noisy environments. The proposed method was evaluated using MIT-BIH arrhythmia and noise stress test databases. Simulation results show that the proposed method achieves a state-of-the-art success rate of IHR extraction in a noise stress test using a muscle artifact and a motion artifact.

View full abstract

Download PDF (2609K)
On the Probability of Certificate Revocation in Combinatorial Certificate Management Schemes

Dae Hyun YUM

Article type: LETTER
Subject area: Information Network
2015Volume E98.DIssue 5 Pages 1104-1107
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2015EDL8012

JOURNAL FREE ACCESS

Show abstractHide abstract

To enhance the privacy of vehicle owners, combinatorial certificate management schemes assign each certificate to a large enough group of vehicles so that it will be difficult to link a certificate to any particular vehicle. When an innocent vehicle shares a certificate with a misbehaving vehicle and the certificate on the misbehaving vehicle has been revoked, the certificate on the innocent vehicle also becomes invalid and is said to be covered. When a group of misbehaving vehicles collectively share all the certificates assigned to an innocent vehicle and these certificates are revoked, the innocent vehicle is said to be covered. We point out that the previous analysis of the vehicle cover probability is not correct and then provide a new and exact analysis of the vehicle cover probability.

View full abstract

Download PDF (82K)
A Deduplication-Enabled P2P Protocol for VM Image Distribution

Choonhwa LEE, Sungho KIM, Eunsam KIM

Article type: LETTER
Subject area: Information Network
2015Volume E98.DIssue 5 Pages 1108-1111
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014EDL8180

JOURNAL FREE ACCESS

Show abstractHide abstract

This paper presents a novel peer-to-peer protocol to efficiently distribute virtual machine images in a datacenter. A primary idea of it is to improve the performance of peer-to-peer content delivery by employing deduplication to take advantage of similarity both among and within VM images in cloud datacenters. The efficacy of the proposed scheme is validated through an evaluation that demonstrates substantial performance gains.

View full abstract

Download PDF (671K)
Face Verification Based on the Age Progression Rules

Kai FANG, Shuoyan LIU

Article type: LETTER
Subject area: Image Recognition, Computer Vision
2015Volume E98.DIssue 5 Pages 1112-1115
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014EDL8178

JOURNAL FREE ACCESS

Show abstractHide abstract

Appearance changes conform to certain rules for a same person,while for different individuals the changes are uncontrolled. Hence, this paper studies the age progression rules to tackle face verification task. The age progression rules are discovered in the difference space of facial image pairs. For this, we first represent an image pair as a matrix whose elements are the difference of a set of visual words. Thereafter, the age progression rules are trained using Support Vector Machine (SVM) based on this matrix representation. Finally, we use these rules to accomplish the face verification tasks. The proposed approach is tested on the FGnet dataset and a collection of real-world images from identification card. The experimental results demonstrate the effectiveness of the proposed method for verification of identity.

View full abstract

Download PDF (1097K)
Discriminative Dictionary Learning with Low-Rank Error Model for Robust Crater Recognition

An LIU, Maoyin CHEN, Donghua ZHOU

Article type: LETTER
Subject area: Image Recognition, Computer Vision
2015Volume E98.DIssue 5 Pages 1116-1119
Published: May 01, 2015
Released on J-STAGE: May 01, 2015

DOIhttps://doi.org/10.1587/transinf.2014EDL8254

JOURNAL FREE ACCESS

Show abstractHide abstract

Robust crater recognition is a research focus on deep space exploration mission, and sparse representation methods can achieve desirable robustness and accuracy. Due to destruction and noise incurred by complex topography and varied illumination in planetary images, a robust crater recognition approach is proposed based on dictionary learning with a low-rank error correction model in a sparse representation framework. In this approach, all the training images are learned as a compact and discriminative dictionary. A low-rank error correction term is introduced into the dictionary learning to deal with gross error and corruption. Experimental results on crater images show that the proposed method achieves competitive performance in both recognition accuracy and efficiency.

View full abstract

Download PDF (471K)

Register with J-STAGE for free!