IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E95.D, Issue 12
Displaying 1-49 of 49 articles from this issue
Special Section on Parallel and Distributed Computing and Networking
• Hideharu AMANO
2012 Volume E95.D Issue 12 Pages 2749
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
• Liqiang ZHANG, Chao LI, Haoliang SUN, Changwen ZHENG, Pin LV
Article type: PAPER
Subject area: Parallel and Distributed Computing
2012 Volume E95.D Issue 12 Pages 2750-2758
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Because of the complicated composition of clouds and their disordered transformations, current methods cannot render clouds that fully match real scenes. Based on the physical characteristics of clouds, a physical cellular automata model of dynamic clouds is designed according to their intrinsic factors, describing the rules of hydro-movement, deposition, accumulation, and diffusion. A parallel computing architecture is then designed to compute the large-scale data set required for rendering dynamic clouds, and a GPU-based ray-casting algorithm is implemented to render the cloud volume data. Experiments show that the cloud rendering method based on the physical cellular automata model is very efficient and adequately exhibits the details of clouds.
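The cellular-automaton style of cloud simulation described above can be sketched in a few lines. The boolean rule set below follows the classic humidity/activation/cloud formulation often used for cloud cellular automata; the paper's own rules (hydro-movement, deposition, accumulation, diffusion) are richer, so treat this purely as an illustrative sketch.

```python
import numpy as np

def step(hum, act, cld):
    """One update of a boolean cloud cellular automaton.
    hum: humidity, act: activation ("phase transition"), cld: cloud presence."""
    # OR of the four lattice neighbours' activation (with wraparound)
    neigh = np.zeros_like(act)
    for shift, axis in ((1, 0), (-1, 0), (1, 1), (-1, 1)):
        neigh |= np.roll(act, shift, axis=axis)
    new_cld = cld | act           # an activated cell turns into cloud
    new_act = ~act & hum & neigh  # humid cells next to activity activate
    new_hum = hum & ~act          # humidity is consumed by activation
    return new_hum, new_act, new_cld
```

Each field would map naturally to a GPU texture or buffer, with one thread per cell, which is what makes this class of model attractive for parallel rendering pipelines.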
• Kazuya MATSUMOTO, Naohito NAKASATO, Stanislav G. SEDUKHIN
Article type: PAPER
Subject area: Parallel and Distributed Computing
2012 Volume E95.D Issue 12 Pages 2759-2768
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
This paper presents a blocked united algorithm for the all-pairs shortest paths (APSP) problem. This algorithm simultaneously computes both the shortest-path distance matrix and the shortest-path construction matrix for a graph. It is designed for a high-speed APSP solution on hybrid CPU-GPU systems. In our implementation, the two most compute-intensive parts of the algorithm are performed on the GPU. The first part solves the APSP sub-problem for a block of sub-matrices, and the other part is a matrix-matrix “multiplication” for the APSP problem. Moreover, the amount of data communication between CPU (host) memory and GPU memory is reduced by reusing blocks once sent to the GPU. When the problem size (the number of vertices in a graph) is large enough compared to the block size, our implementation of the blocked algorithm requires exchanging three blocks between CPU and GPU during a block computation on the GPU. We measured the performance of the algorithm implementation on two different CPU-GPU systems. A system containing an Intel Sandy Bridge CPU (Core i7 2600K) and an AMD Cayman GPU (Radeon HD 6970) achieves performance of up to 1.1 TFlop/s in single precision.
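The blocked structure the abstract refers to can be illustrated with a small NumPy sketch of blocked Floyd-Warshall (distance matrix only; the paper additionally maintains the path-construction matrix and offloads phases of this loop to the GPU). The (min, +) "multiplication" is the matrix-matrix operation mentioned above.

```python
import numpy as np

def minplus(C, A, B):
    """C = min(C, A (min,+)-times B), i.e. C[i,j] = min(C[i,j], min_k A[i,k]+B[k,j])."""
    return np.minimum(C, (A[:, :, None] + B[None, :, :]).min(axis=1))

def blocked_apsp(D, b):
    """Blocked Floyd-Warshall on an n-by-n distance matrix D; b divides n."""
    n = D.shape[0]
    for K in range(0, n, b):
        k = slice(K, K + b)
        # Phase 1 (dependent): solve APSP inside the diagonal block.
        for t in range(K, K + b):
            D[k, k] = np.minimum(D[k, k], D[k, t:t+1] + D[t:t+1, k])
        # Phase 2 (partially dependent): the row and column of blocks.
        for J in range(0, n, b):
            if J != K:
                j = slice(J, J + b)
                D[k, j] = minplus(D[k, j], D[k, k], D[k, j])
                D[j, k] = minplus(D[j, k], D[j, k], D[k, k])
        # Phase 3 (independent): all remaining blocks -- the GPU-friendly part.
        for I in range(0, n, b):
            for J in range(0, n, b):
                if I != K and J != K:
                    i, j = slice(I, I + b), slice(J, J + b)
                    D[i, j] = minplus(D[i, j], D[i, k], D[k, j])
    return D
```

Phase 3 dominates the work and has no intra-phase dependencies, which is why it maps so well to a GPU.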
• Arne KUTZNER, Pok-Son KIM, Won-Kwang PARK
Article type: PAPER
Subject area: Parallel and Distributed Computing
2012 Volume E95.D Issue 12 Pages 2769-2777
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
We propose a family of algorithms for efficient merging on contemporary GPUs, such that each algorithm requires $O(m\log(\frac{n}{m}+1))$ element comparisons, where m and n are the sizes of the input sequences with $m \leq n$. According to the lower bounds for merging, all proposed algorithms are asymptotically optimal with regard to the number of necessary comparisons. First we introduce a parallel algorithm that splits a merging problem of size $2^l$ into $2^i$ subproblems of size $2^{l-i}$, for some arbitrary i with $0 \leq i \leq l$. This algorithm represents a merger for $i=l$, but it is rather inefficient in this case. The efficiency is boosted by moving to a two-stage approach in which the splitting process stops at some predetermined level and transfers control to several block-mergers operating in parallel. We formally prove the asymptotic optimality of the splitting process and show that for symmetrically sized inputs our approach delivers runtimes up to 4 times faster than the thrust::merge function that is part of the Thrust library. To assess the value of our merging technique in the context of sorting, we construct and evaluate a MergeSort on top of it. In our benchmarks the resulting MergeSort clearly outperforms the MergeSort implementation provided by the Thrust library as well as Cederman's GPU-optimized variant of QuickSort.
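The splitting idea can be sketched on the CPU: a binary search finds, for any output position, the unique "co-rank" pair that cuts the two inputs into independent sub-merges. This is the standard merge-path construction; the paper's GPU block-mergers and its exact splitting scheme differ in detail.

```python
def split_point(A, B, d):
    """Smallest i with A[i] >= B[d-i-1]; merging A[:i] with B[:d-i]
    then yields exactly the first d outputs of merge(A, B)."""
    lo, hi = max(0, d - len(B)), min(d, len(A))
    while lo < hi:
        i = (lo + hi) // 2
        if A[i] < B[d - i - 1]:
            lo = i + 1
        else:
            hi = i
    return lo, d - lo

def parallel_merge(A, B, parts):
    """Cut the output into `parts` chunks that can be merged independently
    (sequentially here; each chunk would map to one GPU block)."""
    n = len(A) + len(B)
    cuts = [split_point(A, B, k * n // parts) for k in range(parts + 1)]
    out = []
    for (i0, j0), (i1, j1) in zip(cuts, cuts[1:]):
        out.extend(sorted(A[i0:i1] + B[j0:j1]))
    return out
```

Each `split_point` call costs $O(\log)$ comparisons, so the splitting stage is cheap relative to the merges it enables.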
• Mamoru OHARA, Takashi YAMAGUCHI
Article type: PAPER
Subject area: Parallel and Distributed Computing
2012 Volume E95.D Issue 12 Pages 2778-2786
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
In numerical simulations using massively parallel computers like GPGPU (General-Purpose computing on Graphics Processing Units), we often need to transfer computational results from external devices such as GPUs to the main memory or secondary storage of the host machine. Since the computational results are sometimes too large to hold, it is desirable to compress the data before storing it. In addition, considering the overhead of transferring data between device and host memories, it is preferable that compression be performed on the devices as part of the parallel computation. Traditional compression methods for floating-point numbers do not always show good parallelism. In this paper, we propose a new compression method for massively parallel simulations running on GPUs, in which we combine a few successive floating-point numbers and interleave them to improve compression efficiency. We also present numerical examples of compression ratio and throughput obtained from experimental implementations of the proposed method running on CPUs and GPUs.
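The "combine and interleave" idea can be illustrated with a byte-plane transposition: bytes of equal significance from successive float32 values are grouped together before entropy coding, which exposes the long runs present in the sign/exponent bytes of smooth data. This sketch uses zlib on the CPU purely for illustration; the paper's actual scheme and its GPU kernels differ.

```python
import zlib
import numpy as np

def compress_interleaved(a):
    """Compress a float32 array by grouping bytes of equal significance
    (byte-plane transposition) before entropy coding."""
    raw = a.astype('<f4').view(np.uint8).reshape(-1, 4)
    planes = raw.T.tobytes()          # all byte-0s, then all byte-1s, ...
    return zlib.compress(planes)

def decompress_interleaved(blob, n):
    """Invert the transposition and recover the original n float32 values."""
    planes = np.frombuffer(zlib.decompress(blob), dtype=np.uint8)
    raw = planes.reshape(4, n).T.copy()
    return raw.view('<f4').ravel()
```

On a monotonically growing signal the high-order byte planes are nearly constant and compress very well, while the mantissa planes remain close to incompressible.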
• Keisuke DOHI, Yuichiro SHIBATA, Kiyoshi OGURI, Takafumi FUJIMOTO
Article type: PAPER
Subject area: Parallel and Distributed Computing
2012 Volume E95.D Issue 12 Pages 2787-2795
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
In this paper, we propose and discuss efficient GPU implementation techniques for absorbing boundary conditions (ABCs) in a 3D finite-difference time-domain (FDTD) electromagnetic field simulation for antenna design. In view of the architectural nature of GPUs, the idea of a periodic boundary condition is introduced into the implementation of perfectly matched layers (PMLs), together with a transformation technique for PML equations on partial boundaries. We also present an efficient implementation method for a non-uniform grid. The evaluation results with a typical simulation model reveal that our proposed techniques almost double the simulation performance, eventually achieving 55.8% of the peak memory bandwidth of the target GPU.
• Chih-Min LO, Sun-Jen HUANG
Article type: PAPER
Subject area: Computer System and Services
2012 Volume E95.D Issue 12 Pages 2796-2809
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
The globalization of commerce has increased the importance of retrieving and updating complex and distributed information efficiently. Web services currently show the most promise for building distributed application systems, and model-driven architecture is a new approach to developing such applications. The expanding scale and complexity of enterprise information systems (EISs) under distributed computing environments have made sharing and exchanging data particularly challenging. Data services are applications tailored specifically for information-oriented tasks that deal with business service requirements, and they are heavily dependent on the distributed architecture of consumer data processing. Implementing a data service can eliminate inconsistency among various application systems in the exchange of data. This paper proposes a data-oriented model-driven development framework to deal with these issues, in which a platform independent model (PIM) is divided into a service model, a logic data model, and a service composition model. We also divide a platform specific model (PSM) into a physical data model and a data service model. In this development method, we define five meta-models and outline a set of rules governing the transformation from PIMs into PSMs. A code generator is also included to transform each PSM into application code. We include a case study to demonstrate the feasibility and merits of the proposed development framework.
• Yan CUI, Yu CHEN, Yuanchun SHI
Article type: PAPER
Subject area: Computer System and Services
2012 Volume E95.D Issue 12 Pages 2810-2820
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Multicore processor architectures have become ubiquitous in today's computing platforms, especially in parallel computing installations, thanks to their power and cost advantages. While the technology trend continues toward hundreds of cores on a chip in the foreseeable future, an urgent question posed to system designers as well as application users is whether applications can receive sufficient support from today's operating systems to scale to many cores. To this end, one needs to understand the strengths and weaknesses of their scalability support and to identify the major bottlenecks limiting scalability, if any. As open-source operating systems are of particular interest in the research and industry communities, in this paper we choose three operating systems (Linux, Solaris, and FreeBSD) and systematically evaluate and compare their scalability using a set of highly focused microbenchmarks, to gain a broad and detailed understanding of their scalability on an AMD 32-core system. We use system profiling tools and analyze kernel source code to find the root cause of each observed scalability bottleneck. Our results reveal that no single operating system among the three stands out on all system aspects, though some systems prevail on certain aspects. For example, Linux outperforms Solaris and FreeBSD significantly for file-descriptor- and process-intensive operations. For applications with intensive socket creation and deletion operations, Solaris leads FreeBSD, which scales better than Linux. With the help of performance tools and source code instrumentation and analysis, we find that the synchronization primitives protecting shared data structures in the kernels are the major bottleneck limiting system scalability.
• Tsung-Han LIN, Yuki KINEBUCHI, Tatsuo NAKAJIMA
Article type: PAPER
Subject area: Computer System and Services
2012 Volume E95.D Issue 12 Pages 2821-2832
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
In this paper, we propose a virtualization architecture for a multi-core embedded system that provides more system reliability and security while maintaining performance, without introducing additional special hardware support or implementing a complex protection mechanism in the virtualization layer. Embedded systems, especially consumer electronics, have often used virtualization. Virtualization is not a new technique, and it has various uses for both GPOSes (General Purpose Operating Systems) and RTOSes (Real Time Operating Systems). The surge of multi-core platforms in embedded systems also helps consolidate virtualization systems for better performance and lower power consumption. Embedded virtualization design usually follows one of two approaches. The first uses a traditional VMM, but this is too complicated for the embedded environment without additional special hardware support. The other approach uses a microkernel, which imposes a modular design. The guest systems, however, suffer considerable modifications in this approach, as the microkernel runs guest systems in user space. For some RTOSes and their applications originally running in kernel space, this second approach is more difficult to apply because their code uses many privileged instructions. To achieve better reliability while keeping the virtualization layer lightweight, this work uses a common hardware component adopted in multi-core embedded processors. In most embedded platforms, vendors provide additional on-chip local memory for each physical core, and these local memory areas are private to their cores. By taking advantage of this memory architecture, we can mitigate the above-mentioned problems at once. We choose to re-map the program of the virtualization layer, called SPUMONE, onto the local memory; SPUMONE runs all guest systems in kernel space.
This provides additional reliability and security for the entire system, because in a multi-core platform each SPUMONE instance is installed on a separate processor core, and the content of each SPUMONE is inaccessible to the others. This design differs from traditional virtualization layer designs, and we achieve this goal without adding overhead to the overall performance.
• Wei ZHANG, Li RUAN, Mingfa ZHU, Limin XIAO, Jiajun LIU, Xiaolan TANG, ...
Article type: PAPER
Subject area: Computer System and Services
2012 Volume E95.D Issue 12 Pages 2833-2843
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
In order to reduce cost and improve efficiency, many data centers adopt virtualization solutions. The advent of virtualization allows multiple virtual machines to be hosted on a single physical server. However, this poses new challenges for resource management. Web workloads, which are dominant in data centers, are known to vary dynamically with time. To meet applications' service level agreements (SLAs), allocating resources among virtual machines has become an important challenge in virtualized server environments, especially when dealing with fluctuating workloads and complex server applications. User experience is an important manifestation of the SLA and attracts increasing attention. In this paper, the SLA is defined by server-side response time. Traditional resource allocation based on resource utilization has some drawbacks. We argue that dynamic resource allocation based directly on real-time user experience is more reasonable and also of practical significance. To address the problem, we propose a system architecture that combines response time measurements and analysis of user experience for resource allocation. An optimization model is introduced to dynamically allocate the resources among virtual machines. When resources are insufficient, we provide service differentiation and first guarantee the resource requirements of applications with higher priorities. We evaluate our proposal using TPC-W and Webbench. The experimental results show that our system can judiciously allocate system resources, helping to stabilize applications' user experience and reducing the mean deviation of user experience from desired targets.
Article type: PAPER
Subject area: Network and Communication
2012 Volume E95.D Issue 12 Pages 2844-2851
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Network survivability is defined as the ability of a network to remain connected under failures and/or attacks. In this paper, we propose two stochastic models, a binomial model and a negative binomial model, to quantify network survivability, and compare them with the existing Poisson model. We give mathematical formulae for the approximate network survivability of the respective models and use them to carry out a sensitivity analysis on the model parameters. Through numerical examples, it is shown that network survivability can change drastically when the number of network nodes is relatively small under a severe attack mode called the black hole attack.
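The general shape of such survivability measures can be sketched by conditioning on the number of failed nodes: a failure-count distribution (binomial, negative binomial, or Poisson) is combined with the probability that the topology stays connected given k failures. The connectivity term below is a toy threshold stand-in, not the paper's formulae.

```python
import math

def binom_pmf(k, n, p):
    """P(K = k) for a binomial failure-count model."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(K = k) for the existing Poisson failure-count model."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def survivability(n, pmf, connected_given):
    """Approximate survivability = sum_k P(k nodes fail) * P(connected | k failures)."""
    return sum(pmf(k) * connected_given(k) for k in range(n + 1))
```

Swapping the `pmf` argument is all it takes to compare the three failure models under the same topology, which is the kind of sensitivity analysis the abstract describes.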
• Yutaka KAWAI, Adil HASAN, Go IWAI, Takashi SASAKI, Yoshiyuki WATASE
Article type: PAPER
Subject area: Network and Communication
2012 Volume E95.D Issue 12 Pages 2852-2859
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
In this paper we report on an approach inspired by Ant Colony Optimization (ACO) to provide a fault-tolerant and efficient means of transferring data in dynamic environments. We investigate the problem of distributing data between a client and server by using pheromone equations. Ants choose the best source of food by selecting the strongest pheromone trail leaving the nest. The pheromone decays over time and needs to be continually reinforced to define the optimum route in a dynamic environment. This resembles the dynamic environment of data distribution between clients and servers. Our approach uses readily available network and server information to construct a pheromone that determines the best server from which to download data. We demonstrate that the approach is self-optimizing and capable of adapting to dynamic changes in the environment.
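The pheromone mechanism can be sketched directly: each candidate server carries a trail value that decays every round and is reinforced in proportion to the observed transfer quality. The decay rate, reward scale, and roulette-wheel selection rule below are illustrative choices, not the paper's equations.

```python
import random

def update_pheromone(tau, server, reward, rho=0.1):
    """Evaporate every trail, then reinforce the server just used.
    `reward` could be an observed throughput; rho is the decay rate."""
    for s in tau:
        tau[s] *= 1 - rho
    tau[server] += reward

def choose_server(tau, rng):
    """Roulette-wheel selection: pick a server with probability
    proportional to its pheromone level."""
    r = rng.random() * sum(tau.values())
    for s, t in tau.items():
        r -= t
        if r <= 0:
            return s
    return s
```

Because evaporation never drives a trail to exactly zero, a degraded server is still probed occasionally, which is what lets the scheme adapt when conditions change.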
• Naoya MAKI, Takayuki NISHIO, Ryoichi SHINKUMA, Tatsuya MORI, Noriaki K ...
Article type: PAPER
Subject area: Network and Communication
2012 Volume E95.D Issue 12 Pages 2860-2869
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
In content services where people purchase and download large-volume contents, minimizing network traffic is crucial for the service provider and the network operator, since they want to lower the cost charged for bandwidth and the cost of network infrastructure, respectively. Traffic localization is an effective way of reducing network traffic: traffic is localized when a client can obtain the requested content files from a nearby altruistic client instead of from the source servers. The concept of the peer-assisted content distribution network (CDN) can reduce overall traffic with this mechanism and enables service providers to minimize traffic without deploying or borrowing distributed storage. To localize traffic effectively, content files that are likely to be requested by many clients should be cached locally. This paper presents a novel traffic engineering scheme for peer-assisted CDN models. Its key idea is to control the behavior of clients by using a content-oriented incentive mechanism. This approach enables us to optimize traffic flows by letting altruistic clients download the content files that are most likely to contribute to localizing traffic among clients. In order to let altruistic clients request the desired files, we combine content files while keeping the price equal to that of a single content file. This paper presents a solution for optimizing the selection of content files to be combined so that cross traffic in a network is minimized. We also give a model for analyzing the upper-bound performance, along with numerical results.
• Chunghan LEE, Hirotake ABE, Toshio HIROTSU, Kyoji UMEMURA
Article type: PAPER
Subject area: Network and Communication
2012 Volume E95.D Issue 12 Pages 2870-2878
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Predicting network throughput is important for network-aware applications. Network throughput depends on a number of factors, and many throughput prediction methods have been proposed. However, many of these methods suffer from the fact that the distribution of traffic fluctuations is unclear and that the scale and bandwidth of networks are rapidly increasing. Furthermore, virtual machines are used as platforms in many network research and service fields, and they can affect network measurement. A prediction method that uses pairs of differently sized connections has been proposed. This method, which we call the connection pair, features a small probe transfer over TCP that can be used to predict the throughput of a large data transfer. We focus on measurements, analyses, and modeling for precise prediction results. We first clarify that the actual throughput for the connection pair changes non-linearly and monotonically, with noise. Second, we built a previously proposed predictor using the same training data sets as our proposed method, and found it unsuitable for capturing the above characteristics. We propose a throughput prediction method based on the connection pair that uses ν-support vector regression with a polynomial kernel to handle prediction models represented as non-linear, continuous monotonic functions. The prediction results of our method are more accurate than those of the previous predictor. Moreover, under an unstable network state, the drop in accuracy is also smaller than that of the previous predictor.
• Koji SATO, Shiro SAKATA
Article type: PAPER
Subject area: Network and Communication
2012 Volume E95.D Issue 12 Pages 2879-2887
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
This paper proposes a distributed TDMA slot scheduling algorithm with power control, in which the slot allocation priority is controlled by distance measurement information. In the proposed scheme, Lamport's bakery algorithm for mutual exclusion is applied to prioritized slot allocation based on the distance measurement information between nodes, and a packet-based transmission power control scheme is combined with it. This aims at achieving a media access control method that can practically construct a local network by limiting its scope. The proposed scheme can be seen as a distance-measurement-oriented replacement for the DRAND algorithm in the Z-MAC scheme, and it contributes to efficient TDMA slot allocation.
• Tao BAN, Shanqing GUO, Masashi ETO, Daisuke INOUE, Koji NAKAO
Article type: PAPER
Subject area: Network and Communication
2012 Volume E95.D Issue 12 Pages 2888-2897
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Characterizing peer-to-peer (P2P) traffic is an essential step in developing workload models for capacity planning and cyber-threat countermeasures over P2P networks. In this paper, we present a classification scheme for characterizing P2P file-sharing hosts based on transport-layer statistical features. The proposed scheme is assessed in a virtualized environment that simulates a P2P-friendly cloud system. The system shows high accuracy in differentiating P2P file-sharing hosts from ordinary hosts. Its tunability regarding monitoring cost, system response time, and prediction accuracy is demonstrated by a series of experiments. A further study on feature selection is pursued to identify the discriminators that contribute most to the classification. Experimental results show that an equally accurate system can be obtained using only 3 of the 18 defined discriminators, which further reduces the monitoring cost and enhances the adaptability of the system.
• Dajiang LIU, Shouyi YIN, Chongyong YIN, Leibo LIU, Shaojun WEI
Article type: PAPER
Subject area: Computer Architecture
2012 Volume E95.D Issue 12 Pages 2898-2907
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Reconfigurable computing systems are a class of parallel architecture with the ability to compute in hardware to increase performance, while retaining much of the flexibility of a software solution. This architecture is particularly suitable for running regular, compute-intensive tasks; moreover, most compute-intensive tasks spend most of their running time in nested loops. The polyhedron model is a powerful tool for transforming such nested loops. In this paper, a number of issues are addressed towards the goal of optimizing affine loop nests for a reconfigurable cell array (RCA), such as an approach to making the best use of processing elements (PEs) while minimizing communication volume by loop transformation in the polyhedron model, determination of the tiling form by intra-statement dependence analysis, and determination of the tiling size from the tiling form and the RCA size. Experimental results on a number of kernels demonstrate the effectiveness of the developed mapping optimization approaches. Compared with a DFG-based optimization approach, the execution performance of 1-D Jacobi and matrix multiplication is improved by 28% and 48.47%, respectively. Lastly, the run-time complexity is acceptable for practical cases.
• Quanyou FENG, Huanzhong LI, Wenhua DOU
Article type: PAPER
Subject area: Computer Architecture
2012 Volume E95.D Issue 12 Pages 2908-2918
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
With the trend towards an increasing number of cores, for example 1000 cores, the interconnection network in manycore chips has become the critical bottleneck in providing communication infrastructure among on-chip cores as well as to off-chip memory. However, conventional on-chip mesh topologies do not scale up well, because remote cores are separated by too many hops due to the small-radix routers within these networks. Moreover, projected scaling of the electrical processor-memory network appears unlikely to meet the enormous demand for memory bandwidth while satisfying a stringent power budget. Fortunately, recent advances in 3D integration technology and silicon photonics have provided potential solutions to these challenges. In this paper, we propose a hybrid photonic burst-switched interconnection network for large-scale manycore processors. We embed an electrical low-diameter flattened butterfly into 3D stacking layers using integer linear programming, which results in a scalable low-latency network for inter-core packet exchange. Furthermore, we use photonic burst switching (PBS) for the processor-memory network. PBS is an adaptation of optical burst switching for chip-scale communication, which can significantly improve power efficiency by leveraging sub-wavelength, bandwidth-efficient optical switching. Using our physically accurate network-level simulation environment, we examined the system's feasibility and performance. Simulation results show that our hybrid network achieves up to 25% network latency reduction and up to 6-fold energy savings, compared to a conventional on-chip mesh network and an optical circuit-switched memory access scheme.
• Yasin OGE, Takefumi MIYOSHI, Hideyuki KAWASHIMA, Tsutomu YOSHINAGA
Article type: PAPER
Subject area: Computer Architecture
2012 Volume E95.D Issue 12 Pages 2919-2927
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
A novel design is proposed to implement highly parallel stream join operators on a field-programmable gate array (FPGA) by adapting the handshake join algorithm for hardware implementation. The proposed design is evaluated in terms of hardware resource usage, maximum clock frequency, and performance. Experimental results indicate that the proposed implementation can handle considerably high input rates, especially at low match rates. Results of simulations conducted to optimize the size of the buffers in the join and merge units give new intuition regarding static and adaptive buffer tuning in handshake join.
• Hidetsugu IRIE, Takefumi MIYOSHI, Goki HONJO, Kei HIRAKI, Tsutomu YOSH ...
Article type: PAPER
Subject area: Computer Architecture
2012 Volume E95.D Issue 12 Pages 2928-2938
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
One of the significant issues in processor architecture is overcoming memory latency. Prefetching can greatly improve cache performance, but it has the drawback of cache pollution unless its aggressiveness is properly set. Several techniques proposed for prefetcher throttling use accuracy as a metric, but their robustness is not sufficient because of variations in programs' working set sizes and cache capacities. In this study, we revisit prefetcher throttling from the viewpoint of data lifetime. Exploiting the characteristics of cache line reuse, we propose Cache-Convection-Control-based Prefetch Optimization Plus (CCCPO+), which enhances the feedback algorithm of our previous CCCPO. Evaluation results show that this novel approach achieved a 30% improvement over no prefetching in the geometric mean of the SPEC CPU 2006 benchmark suite with a 256KB LLC, 1.8% over the latest prefetcher throttling, and 0.5% over our previous CCCPO. Moreover, it shows superior stability compared to related work, at a lower hardware cost.
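For context, the accuracy-based throttling that the abstract contrasts against can be sketched as a simple feedback rule over hardware counters. The thresholds and inputs here are illustrative of that baseline class of mechanisms; CCCPO+ instead derives its feedback from cache-line reuse and lifetime statistics.

```python
def throttle(accuracy, late_fraction, degree, min_deg=1, max_deg=8):
    """Raise the prefetch degree when prefetches are accurate and timely,
    lower it when they are inaccurate or arrive too late
    (a generic accuracy-driven feedback throttle)."""
    if accuracy > 0.75 and late_fraction < 0.5:
        return min(max_deg, degree + 1)
    if accuracy < 0.40 or late_fraction > 0.5:
        return max(min_deg, degree - 1)
    return degree
```

The fixed thresholds are exactly what makes such schemes fragile across working-set sizes and cache capacities, which is the robustness problem the paper targets.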
• Bei HUANG, Kaidi YOU, Yun CHEN, Zhiyi YU, Xiaoyang ZENG
Article type: PAPER
Subject area: Computer Architecture
2012 Volume E95.D Issue 12 Pages 2939-2947
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Reed-Solomon (RS) codes are widely used in digital communication and storage systems. Unlike usual VLSI approaches, this paper presents a high-throughput, fully programmable Reed-Solomon decoder on a multi-core processor. The multi-core platform is a 2-dimensional mesh array of Single Instruction Multiple Data (SIMD) cores, and it is well suited for digital communication applications. By fully extracting the parallelizable operations of the RS decoding process, we propose multiple optimization techniques to improve system throughput, including task-level parallelism across cores, data-level parallelism on each SIMD core, minimized memory access, and route-length-minimized task mapping. For RS(255,239,8), experimental results show that our 12-core implementation achieves a throughput of 4.35Gbps, which is much better than several other published implementations. The results suggest that with our approach the throughput scales linearly with the number of cores.
• Kyong Hoon KIM, Guy Martin TCHAMGOUE, Yong-Kee JUN, Wan Yeon LEE
Article type: LETTER
2012 Volume E95.D Issue 12 Pages 2948-2951
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
In large-scale collaborative computing, users and resource providers organize various Virtual Organizations (VOs) to share resources and services. A VO may organize sub-VOs for the purpose of achieving its goal, which forms hierarchical VO environments. VO participants agree upon certain policies, such as resource sharing amounts or user access. In this letter, we provide an optimal resource sharing mechanism for hierarchical VO environments under resource sharing agreements. The proposed algorithm enhances resource utilization and reduces the mean response time for each user.
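The hierarchical sharing agreements can be pictured as fractions flowing down a VO tree, with each sub-VO receiving an agreed share of its parent's allocation. The tree encoding and the simple proportional rule below are illustrative only; the letter's contribution is an optimal mechanism under such agreements, not this naive split.

```python
def allocate(vo, capacity):
    """Recursively distribute capacity down a hierarchical-VO tree.
    Each node carries the fraction of its parent's resources it is
    entitled to under the sharing agreement."""
    amount = capacity * vo['share']
    result = {vo['name']: amount}
    for child in vo.get('children', []):
        result.update(allocate(child, amount))
    return result
```

A real mechanism would additionally rebalance unused shares between sibling VOs, which is where utilization and response-time gains come from.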
• Shotaro IWANAGA, Shinji FUKUMA, Shin-ichiro MORI
Article type: LETTER
2012 Volume E95.D Issue 12 Pages 2952-2953
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
In this paper, a hybrid parallel implementation of inverse matrix computation using the SMW (Sherman-Morrison-Woodbury) formula is proposed. By aggregating memory bandwidth in the hybrid parallel implementation, the bottleneck due to the memory bandwidth limitation in the authors' previous multicore implementation has been eliminated. A speedup of more than 8 times is achieved with a dual-core, 8-node implementation, which yields more than 20 simulation steps per second, or near real-time performance.
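The SMW identity underlying the computation is $(A + UCV)^{-1} = A^{-1} - A^{-1}U(C^{-1} + VA^{-1}U)^{-1}VA^{-1}$, which replaces an $n \times n$ inversion by a $k \times k$ one when the update has low rank $k$. A minimal serial NumPy sketch follows; the letter's contribution is the hybrid parallel implementation, not the formula itself.

```python
import numpy as np

def smw_inverse(A_inv, U, C, V):
    """Inverse of (A + U C V) given A^{-1}, via Sherman-Morrison-Woodbury.
    Only a k-by-k matrix is inverted, where k is the rank of the update."""
    inner = np.linalg.inv(np.linalg.inv(C) + V @ A_inv @ U)
    return A_inv - A_inv @ U @ inner @ V @ A_inv
```

When `A_inv` is already known (as in a simulation that updates the same base matrix every step), each step costs only matrix-vector-scale work plus the small inner inverse, which is what makes per-step real-time rates plausible.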
• Jiuling ZHANG, Beixing DENG, Xing LI, Xiao-lei ZHANG
Article type: LETTER
2012 Volume E95.D Issue 12 Pages 2954-2955
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Ranking the encrypted documents stored on secure cloud computing servers is becoming prominent with the expansion of encrypted data collections. In our work, order-preserving encryption is employed to pre-rank the encrypted documents. Paillier's additive homomorphic encryption is then used to re-rank a considerable number of the top pre-ranked documents.
• Changwoo MIN, Hyung Kook JUN, Won Tae KIM, Young Ik EOM
Article type: LETTER
2012 Volume E95.D Issue 12 Pages 2956-2957
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
A concurrent FIFO queue is a widely used fundamental data structure for parallelizing software. In this letter, we introduce a novel concurrent FIFO queue algorithm for multicore architectures. We achieve better scalability by reducing contention among concurrent threads, and improve performance by optimizing cache-line usage. Experimental results on a server with eight cores show that our algorithm outperforms state-of-the-art algorithms by a factor of two.
• Chee Yik KEONG, Poo Kuan HOONG, Choo-Yee TING
Article type: LETTER
2012 Volume E95.D Issue 12 Pages 2958-2959
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
In this paper, we propose an adaptive chunk scheduling scheme for mesh-based peer-to-peer live streaming systems, a hybrid of push and pull chunk delivery. The proposed rule-based push-pull scheduler simultaneously pulls video chunks from lower-latency peers to fill missing chunks and pushes video chunks adaptively for rapid delivery. Mesh-push is made possible by effectively exploiting the information exchanged through buffer maps. We performed a comparative simulation study against rarest-first push-pull and status-wise push-pull schedulers to evaluate the efficiency of the proposed algorithm. The performance evaluation shows better video continuity and lower source-to-end delay.
Regular Section
• Tatsuya AKUTSU, Yang ZHAO, Morihiro HAYASHIDA, Takeyuki TAMURA
Article type: PAPER
Subject area: Fundamentals of Information Systems
2012 Volume E95.D Issue 12 Pages 2960-2970
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
The Boolean network (BN) can be used to create discrete mathematical models of gene regulatory networks. In this paper, we consider three problems on BNs that are known to be NP-hard: detection of a singleton attractor, finding a control strategy that shifts a BN from a given initial state to the desired state, and control of attractors. We propose integer programming-based methods which solve these problems in a unified manner. Then, we present results of computational experiments which suggest that the proposed methods are useful for solving moderate size instances of these problems. We also show that control of attractors is $\Sigma_2^p$-hard, which suggests that control of attractors is harder than the other two problems.
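For intuition, a singleton attractor is simply a fixed point of the Boolean update map. A brute-force check on a made-up three-gene network shows the object being sought; the paper's integer-programming methods exist precisely to avoid this exponential enumeration:

```python
from itertools import product

# Hypothetical three-gene Boolean network: x_i(t+1) = f_i(x(t)).
funcs = [
    lambda x: x[1] and x[2],   # f_0
    lambda x: x[0] or x[2],    # f_1
    lambda x: x[0] or x[1],    # f_2
]

def singleton_attractors(funcs):
    """Enumerate all 2^n states and keep the fixed points."""
    n = len(funcs)
    found = []
    for state in product([0, 1], repeat=n):
        nxt = tuple(int(f(state)) for f in funcs)
        if nxt == state:       # fixed point of the update map
            found.append(state)
    return found

print(singleton_attractors(funcs))   # [(0, 0, 0), (1, 1, 1)]
```

An integer-programming formulation instead encodes each `f_i` as linear constraints over binary variables and asks a solver for a state with `x = f(x)`, which scales far beyond enumeration.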
• Joonseok PARK, Mikyeong MOON, Keunhyuk YEOM
Article type: PAPER
Subject area: Software Engineering
2012 Volume E95.D Issue 12 Pages 2971-2981
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Software product-line engineering is a successful reuse technology for component-based software development. Its main concept is to develop reusable core assets by analyzing commonality and variability, and then to develop new software by reusing these core assets. Recently, service-oriented architecture (SOA) has provided flexible reuse environments by reusing pre-developed components as service units; services are platform-independent and can be integrated into heterogeneous environments. The core asset of an SOA is the service. Therefore, we can increase the reusability of an SOA by combining it with the product-line concept. Research exists that combines SOA and product-lines with reusability in mind. However, current research does not consider the interaction between provider and consumer in SOA environments, and tends to focus on fragmentary aspects of product-line engineering, such as modeling and proposing variability in services. In this paper, we propose a mechanism named 2-Level SOA, including a supporting environment, which deploys and manages reusable services. In addition, by reusing and customizing these reusable services, we can develop and generate new services. Our approach provides a structure that maximizes the flexibility of SOA, develops services with systematic reuse in mind, and constructs service-oriented applications by reusing pre-developed reusable services. Therefore, it can increase both efficiency and productivity when developing service-oriented applications.
• Mohamed Ezzeldin A. BASHIR, Kwang Sun RYU, Unil YUN, Keun Ho RYU
Article type: PAPER
Subject area: Data Engineering, Web Information Systems
2012 Volume E95.D Issue 12 Pages 2982-2990
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Reliable detection of atrial fibrillation (AF) in electrocardiogram (ECG) monitoring systems is significant for early treatment and health risk reduction. Various ECG mining and analysis studies have addressed a wide variety of clinical and technical issues. However, there is still room for improvement, mostly in two areas. First, the morphological descriptors change over time, not only between different patients or patient clusters but also within the same patient. As a result, a model constructed from old training data is no longer valid and needs to be adjusted in order to identify new concepts. Second, determining the number and types of ECG parameters necessary for detecting AF arrhythmia with high quality poses considerable challenges in computational effort and time consumption. We propose a mixture technique that addresses these limitations. It combines an active learning method with an ECG parameter customization technique to achieve better AF arrhythmia detection in real-time applications. The proposed technique showed a sensitivity of 95.2%, a specificity of 99.6%, and an overall accuracy of 99.2%.
• Ming-Kung SUN, Michael CHANG, Hsiao-Ching LIN, Chi-Sung LAIH, Hui-Tang ...
Article type: PAPER
Subject area: Data Engineering, Web Information Systems
2012 Volume E95.D Issue 12 Pages 2991-3000
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Digital Rights Management (DRM) ensures that the usage of digital media adheres to the intentions of the copyright holder and prevents the unauthorized modification or distribution of media. Due to the widespread use of digital content, DRM has received a fair amount of attention and has seen implementation in many commercial models. Although many DRM schemes have been introduced in the literature, they still suffer from some security issues and may not guarantee adequate performance. In this paper, we propose a trust-distributed DRM model that improves realistic DRM environments and brings more functionality to users. We use the features of smart cards to offer the consumer an option of anonymity while continuing to protect the rights of the copyright holder and the financial interests of the media industry. We also classify the security criteria of DRM systems and show that our proposed smart-card based DRM scheme satisfies all of these criteria.
• Michiko INOUE, Akira TAKETANI, Tomokazu YONEDA, Hideo FUJIWARA
Article type: PAPER
Subject area: Dependable Computing
2012 Volume E95.D Issue 12 Pages 3001-3009
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Nano-scale VLSI design is facing the problem of increased test data volume. Small delay defects are becoming possible sources of test escapes, so high delay test quality, and therefore a greater volume of test data, is required. The increased test data volume requires more tester memory and test application time, both of which inflate test cost. Test pattern ordering gives a practical solution for reducing test cost: test patterns are ordered so that more defects are detected as early as possible. In this paper, we propose a test pattern ordering method based on SDQL (Statistical Delay Quality Level), a measure of delay test quality that considers small delay defects. Our method orders test patterns so that SDQL shrinks quickly, meaning more delay defects are detected as early as possible. It orders test patterns efficiently with minimal usage of time-consuming timing-aware fault simulation. Experimental results demonstrate that our method obtains a test pattern ordering within a reasonable time, and also suggest how to prepare test sets suitable as inputs for test pattern ordering.
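The general flavor of test pattern ordering can be sketched with a greedy coverage heuristic: at each step, pick the pattern that detects the most not-yet-detected faults, so detection accumulates as early as possible. This is a stand-in for illustration; the paper's method orders by SDQL reduction, not raw fault counts, and the data below is made up:

```python
def greedy_order(detects):
    """detects: {pattern_id: set of fault ids it detects} (hypothetical data)."""
    remaining = dict(detects)
    covered, order = set(), []
    while remaining:
        # pattern with the most newly detected faults goes next
        best = max(remaining, key=lambda p: len(remaining[p] - covered))
        order.append(best)
        covered |= remaining.pop(best)
    return order

patterns = {
    "t1": {1, 2, 3},
    "t2": {3, 4},
    "t3": {5},
    "t4": {1, 2, 3, 4},
}
print(greedy_order(patterns))   # ['t4', 't3', 't1', 't2']
```

Replacing the set-size key with each pattern's contribution to SDQL shrinkage would give the shape of the paper's ordering criterion.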
• Kam Swee NG, Hyung-Jeong YANG, Soo-Hyung KIM, Sun-Hee KIM
Article type: PAPER
Subject area: Artificial Intelligence, Data Mining
2012 Volume E95.D Issue 12 Pages 3010-3016
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
In this paper, we propose a novel incremental method for discovering latent variables from multivariate data with high efficiency. It integrates non-Gaussianity and an adaptive incremental model in an unsupervised way to extract informative features. Our proposed method discovers a small number of compact features from a very large number of features and can still achieve good predictive performance in EEG signals. The promising EEG signal classification results from our experiments prove that this approach can successfully extract important features. Our proposed method also has low memory requirements and computational costs.
• Keisuke OTAKI, Mahito SUGIYAMA, Akihiro YAMAMOTO
Article type: PAPER
Subject area: Artificial Intelligence, Data Mining
2012 Volume E95.D Issue 12 Pages 3017-3025
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
We present a privacy preserving method based on inserting dummy data into original data on the data structure called Zero-suppressed BDDs (ZDDs). Our task is distributed itemset mining, that is, frequent itemset mining from horizontally partitioned databases stored in distributed places called sites. We focus on the fundamental case in which there are two sites, each with a database managed by its owner. By dividing the process of distributed itemset mining into set union and set intersection, we show how to make these operations secure in the sense of indistinguishability of data, our criterion for privacy preservation based on the previously proposed criterion of p-indistinguishability. Our method conceals the original data in each operation by inserting dummy data, where ZDDs, BDD-based directed acyclic graphs, are adopted to represent sets of itemsets compactly and to implement the set operations in the distributed itemset mining process. As far as we know, this is the first technique that gives a concrete representation of sets of itemsets and an implementation of set operations for privacy preservation in distributed itemset mining. Our experiments show that the proposed method provides indistinguishability of the dummy data. Furthermore, we compare our method with Secure Multiparty Computation (SMC), one of the well-known techniques of secure computation.
• Masaji KATAGIRI, Minoru ETOH
Article type: PAPER
Subject area: Office Information Systems, e-Business Modeling
2012 Volume E95.D Issue 12 Pages 3026-3036
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
This paper presents an algorithmic approach to acquiring influencing relationships among users by discovering implicit influencing group structure from smartphone usage. The method assumes that a time series of users' application downloads and activations can be represented by individual inter-personal influence factors. To achieve better predictive performance and to avoid over-fitting, a latent feature model is employed. The method extracts the latent structures by monitoring cross-validated predictive performance on approximated influence matrices with reduced ranks, which are generated from an initial influence matrix obtained from a training set. The method adopts Nonnegative Matrix Factorization (NMF) to reduce the influence matrix dimension and thus extract the latent features. To validate and demonstrate its ability, about 160 university students voluntarily participated in a mobile application usage monitoring experiment. An empirical study on the collected data reveals that the influencing structure consists of six influencing groups with two types of mutual influence, i.e., intra-group and inter-group influence. The results also highlight the importance of sparseness control on NMF for discovering latent influencing groups. The obtained influencing structure provides better predictive performance than state-of-the-art collaborative filtering methods as well as conventional approaches such as user-based collaborative filtering and simple popularity-based ranking.
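Minimal NMF with Lee-Seung multiplicative updates shows the factorization used to expose latent influencing groups. This is a sketch on random stand-in data; the actual method adds rank selection by cross validation and sparseness control:

```python
import numpy as np

def nmf(X, k, iters=500, eps=1e-9):
    """Factor nonnegative X ~ W @ H with rank k (multiplicative updates)."""
    rng = np.random.default_rng(1)
    W = rng.random((X.shape[0], k))
    H = rng.random((k, X.shape[1]))
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # updates keep W, H nonnegative
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Stand-in user-to-user influence matrix (the paper's is learned from usage logs)
X = np.random.default_rng(0).random((20, 20))
W, H = nmf(X, k=6)   # six latent groups, matching the paper's finding
print(np.linalg.norm(X - W @ H) / np.linalg.norm(X))  # relative reconstruction error
```

Rows of `W` then indicate each user's affiliation strength with the latent groups, which is what the group-structure analysis reads off.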
• Chamidu ATUPELAGE, Hiroshi NAGAHASHI, Masahiro YAMAGUCHI, Tokiya ABE, ...
Article type: PAPER
Subject area: Pattern Recognition
2012 Volume E95.D Issue 12 Pages 3037-3045
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Histopathology is the microscopic anatomical study of body tissues and is widely used as a cancer diagnosis method. Generally, pathologists examine the structural deviation of cellular and sub-cellular components to diagnose the malignancy of body tissue. These judgments can be subjective, depending on the pathologist's skill and personal experience. Computational diagnosis tools may circumvent these limitations and improve the reliability of diagnosis decisions. This paper proposes a prostate image classification method that extracts textural behavior using multifractal analysis. Fractal geometry describes the complexity of self-similar structures as a non-integer exponent called the fractal dimension. Natural complex structures (or images) are not self-similar, so a single exponent (the fractal dimension) may not be adequate to describe their complexity. Multifractal analysis was introduced to describe complexity as a spectrum of fractal dimensions. Based on multifractal computation of digital images, we obtain two textural feature descriptors: i) local irregularity, α, and ii) global regularity, f(α). We exploit these multifractal feature descriptors with a texton-dictionary based classification model to discriminate cancerous from non-cancerous tissue in histopathology images of H&E stained prostate biopsy specimens. Moreover, we examine three other feature descriptors, the Gabor filter bank, the LM filter bank, and Haralick features, to benchmark the proposed method. Experimental results indicate that the proposed multifractal feature descriptor outperforms the others, achieving over 94% classification accuracy.
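The single-exponent baseline that multifractal analysis generalizes can be illustrated with a box-counting estimate of fractal dimension. This sketch uses a filled square, whose dimension should come out as exactly 2; the paper computes a full spectrum f(α) rather than one exponent:

```python
import numpy as np

def box_count_dim(mask, scales=(1, 2, 4, 8)):
    """Estimate fractal dimension of a binary mask by box counting."""
    h, w = mask.shape
    counts = []
    for s in scales:
        # tile the image with s x s boxes and count non-empty boxes
        boxes = mask[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s)
        counts.append(boxes.any(axis=(1, 3)).sum())
    # slope of log N(s) versus log(1/s) estimates the dimension
    logs = np.log(1.0 / np.array(scales))
    slope, _ = np.polyfit(logs, np.log(counts), 1)
    return slope

square = np.ones((64, 64), dtype=bool)     # a filled plane region
print(round(box_count_dim(square), 2))     # 2.0
```

Multifractal analysis replaces the single global count with moment-weighted counts, yielding a dimension f(α) for each local irregularity exponent α.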
• Junbo ZHANG, Fuping PAN, Bin DONG, Qingwei ZHAO, Yonghong YAN
Article type: PAPER
Subject area: Speech and Hearing
2012 Volume E95.D Issue 12 Pages 3046-3052
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
This paper presents our investigation into improving the performance of our previous automatic reading quality assessment system. The baseline system calculates the average Phone Log-Posterior Probability (PLPP) over all phones in the speech to be assessed and uses this average as the reading quality assessment feature. In this paper, we present three improvements. First, we cluster the triphones, calculate the average normalized PLPP for each cluster separately, and use these averages as a multi-dimensional assessment feature vector instead of the original one-dimensional feature. This method is simple but effective, reducing the relative score difference between machine scoring and manual scoring by 30.2%. Second, to assess reading rhythm, we train Gaussian Mixture Models (GMMs) that capture each triphone's relative duration under standard pronunciation. Using the GMMs, we calculate the probability that the relative duration of each phone conforms to standard pronunciation, and the average of these probabilities is added to the assessment feature vector as another dimension, which reduced the relative score difference between machine and manual scoring by a further 9.7%. Third, we detect Filled Pauses (FPs) by analyzing the formant curve and add the relative duration of FPs to the assessment feature vector as another dimension, which further reduced the relative score difference by 10.2%. Finally, when the features extracted by the three methods are used together, the relative score difference between machine and manual scoring decreased by 43.9% compared to the baseline system.
• Kwangki KIM, Minsoo HAHN, Jinsul KIM
Article type: PAPER
Subject area: Speech and Hearing
2012 Volume E95.D Issue 12 Pages 3053-3059
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
MPEG Spatial Audio Object Coding (SAOC) is a new audio coding standard that efficiently represents various audio objects as a down-mix signal and spatial parameters. MPEG SAOC is backward compatible with existing playback systems through the down-mix signal. If a mastering signal is used instead of the down-mix signal to provide CD-like sound quality, the output signal decoded with the mastering signal may be degraded by the difference between the down-mix and mastering signals. To use the mastering signal successfully in MPEG SAOC, this difference should be eliminated. As a simple approach, we propose mastering signal processing using a mastering down-mix gain (MDG), similar to the arbitrary down-mix gain of MPEG Surround. We also propose enhanced mastering signal processing using an MDG bias to reduce quantization errors of the MDG. Experimental results show that the proposed schemes improve the sound quality of the output signal decoded with the mastering signal. In particular, the enhanced method outperforms the simple method in terms of quantization error and sound quality.
• Shuang BAI, Tetsuya MATSUMOTO, Yoshinori TAKEUCHI, Hiroaki KUDO, Nobor ...
Article type: PAPER
Subject area: Image Recognition, Computer Vision
2012 Volume E95.D Issue 12 Pages 3060-3068
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Bag of visual words is a promising approach to object categorization. However, in this framework, ambiguity exists in patch encoding by visual words due to the information loss caused by vector quantization. In this paper, we propose to incorporate patch-level contextual information into bag of visual words to reduce this ambiguity. To achieve this goal, we construct a hierarchical codebook in which visual words in the upper hierarchy contain contextual information about visual words in the lower hierarchy. In the proposed method, we extract patches of different scales from each sample point, all described by the SIFT descriptor. We then build the hierarchical codebook, placing visual words created from coarse-scale patches in the upper hierarchy and visual words created from fine-scale patches in the lower hierarchy. At the same time, using the correspondence among the extracted patches, visual words in different hierarchies are associated with each other. We then design a method to assign patch pairs, whose patches are extracted from the same sample point, to the constructed codebook. Furthermore, to utilize image information effectively, we implement the proposed method with two sets of features extracted through different sampling strategies and fuse them using a probabilistic approach. Finally, we evaluate the proposed method on the Caltech-101 and Caltech-256 datasets. Experimental results demonstrate its effectiveness.
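The vector-quantization step whose information loss motivates the paper can be sketched as plain bag-of-visual-words encoding; the codebook and descriptors below are random stand-ins (a real pipeline would use 128-D SIFT descriptors and a k-means codebook):

```python
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.standard_normal((8, 16))       # 8 visual words, 16-D for brevity
descriptors = rng.standard_normal((100, 16))  # local descriptors from one image

# hard vector quantization: nearest visual word per descriptor --
# the lossy assignment that creates the ambiguity the paper addresses
d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
words = d2.argmin(axis=1)

# normalized histogram of word occurrences = the image's BoVW representation
hist = np.bincount(words, minlength=len(codebook)).astype(float)
hist /= hist.sum()
print(hist.shape, hist.sum())
```

The paper's hierarchical codebook augments each fine-scale assignment with the word of a coarser co-located patch, disambiguating descriptors that fall near word boundaries.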
• Norimichi UKITA, Kazuki MATSUDA
Article type: PAPER
Subject area: Image Recognition, Computer Vision
2012 Volume E95.D Issue 12 Pages 3069-3077
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
This paper proposes a method for reconstructing accurate 3D surface points. To this end, robust and dense reconstruction with Shape-from-Silhouettes (SfS) and accurate multiview stereo are integrated. Unlike the gradual shape shrinking and/or brute-force large-space search of existing space carving approaches, our method obtains 3D points by SfS and stereo independently, and then selects the correct ones from among them. The point selection is achieved in accordance with the spatial consistency and smoothness of 3D point coordinates and normals. The globally optimal points are selected by graph cuts. Experimental results with several subjects containing complex shapes demonstrate that our method outperforms existing approaches and our previous method.
• Wei ZHOU, Alireza AHRARY, Sei-ichiro KAMATA
Article type: PAPER
Subject area: Image Recognition, Computer Vision
2012 Volume E95.D Issue 12 Pages 3078-3087
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
In this paper, we propose Local Curvelet Binary Patterns (LCBP) and Learned Local Curvelet Patterns (LLCP) for representing the local features of facial images. The proposed methods are based on the Curvelet transform, which overcomes the weakness of traditional Gabor wavelets in higher dimensions and better captures the curve and hyperplane singularities of facial images. LCBP can be regarded as a combination of Curvelet features and the LBP operator, while LLCP designs several learned codebooks from patch sets constructed by sampling patches from Curvelet-filtered facial images. Each facial image is encoded into multiple pattern maps, and block-based histograms of these patterns are concatenated into a histogram sequence used as the face descriptor. During face representation, an input patch is encoded by a single pattern in LCBP but by multiple patterns in LLCP. Finally, we propose an effective classifier called Weighted Histogram Spatially-constrained Earth Mover's Distance (WHSEMD), which utilizes the discriminative power of different facial parts, the different patterns, and the spatial information of the face. Performance assessment in face recognition and gender estimation under different challenges shows that the proposed approaches are superior to traditional ones.
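The LBP operator that LCBP combines with Curvelet features thresholds each pixel's 3x3 neighborhood against its center and packs the results into an 8-bit code. A minimal sketch on a toy image (LCBP would apply this to Curvelet-filtered responses instead of raw pixels):

```python
import numpy as np

def lbp(img):
    """Basic 3x3 LBP: one 8-bit code per interior pixel."""
    # neighbor offsets in clockwise order starting at the top-left
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    center = img[1:-1, 1:-1]
    for bit, (dy, dx) in enumerate(offs):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # set this bit where the neighbor is >= the center pixel
        out |= (neigh >= center).astype(np.uint8) << bit
    return out

img = np.arange(25, dtype=np.uint8).reshape(5, 5)   # toy "image"
codes = lbp(img)
print(codes.shape)   # (3, 3) map of 8-bit LBP codes
```

Block-wise histograms of such code maps, concatenated across blocks, form the descriptor sequence the abstract describes.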
• Yong CHENG, Jiangchun REN, Zhiying WANG, Songzhu MEI, Jie ZHOU
Article type: LETTER
Subject area: Data Engineering, Web Information Systems
2012 Volume E95.D Issue 12 Pages 3088-3091
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
In this letter, we introduce a novel key distribution optimization scheme for CP-ABE based access control. The scheme integrates roles, role hierarchies, and object grouping to accelerate key distribution, while the CP-ABE encryption overhead is reduced by adopting a deterministic cryptographic function. Experiments show that our scheme obtains a noticeable improvement over the original one, especially when the number of objects is much greater than the number of users.
• Yuanwei JING, Yan WANG
Article type: LETTER
Subject area: Information Network
2012 Volume E95.D Issue 12 Pages 3092-3096
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Geographic routing uses the geographical location information provided by nodes to make routing decisions. However, nodes cannot obtain accurate location information due to measurement error. A new routing strategy using a maximum expected distance and angle (MEDA) algorithm is proposed to improve performance and increase the successful transmission rate. We first introduce the expected distance and angle, and then employ principal component analysis to construct the objective function for selecting the next-hop node. We compare the proposed algorithm with the maximum expectation within transmission range (MER) and greedy routing scheme (GRS) algorithms. Simulation results show that the proposed MEDA algorithm outperforms the MER and GRS algorithms with a higher successful transmission rate.
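A basic greedy geographic forwarding step can be sketched as picking the neighbor with the greatest progress toward the destination. This is a simplification for illustration: the actual MEDA objective combines expected distance and angle via principal component analysis, and the coordinates here are made up:

```python
import math

def progress(src, nbr, dst):
    """How much closer neighbor nbr is to the destination than src is."""
    return math.dist(src, dst) - math.dist(nbr, dst)

def next_hop(src, neighbors, dst):
    """Greedy forwarding: best-progress neighbor, or None in a void region."""
    best = max(neighbors, key=lambda n: progress(src, n, dst))
    return best if progress(src, best, dst) > 0 else None

src, dst = (0.0, 0.0), (10.0, 0.0)
neighbors = [(1.0, 2.0), (2.0, 0.5), (-1.0, 0.0)]
print(next_hop(src, neighbors, dst))   # (2.0, 0.5): greatest progress
```

Under location error, the measured positions above would be replaced by expectations over the error distribution, which is exactly where the "expected distance and angle" of MEDA enter.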
• Hwai-Tsu HU, Chu YU
Article type: LETTER
Subject area: Information Network
2012 Volume E95.D Issue 12 Pages 3097-3100
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
This study presents an adaptive quantization index modulation scheme applicable to small audio segments, which in turn allows the watermarking technique to withstand time-shifting and cropping attacks. The exploitation of auditory masking further ensures the robustness and imperceptibility of the embedded watermark. Experimental results confirm the efficacy of this scheme against common signal processing attacks.
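Basic quantization index modulation (QIM), the primitive the scheme adapts per audio segment, embeds a bit by quantizing a sample onto one of two interleaved lattices. A sketch with an illustrative step size (the study adapts the step using auditory masking):

```python
import numpy as np

def qim_embed(x, bit, delta=1.0):
    """Quantize x onto the bit-0 lattice (multiples of delta) or the
    bit-1 lattice (shifted by delta/2)."""
    offset = delta / 2 * bit
    return np.round((x - offset) / delta) * delta + offset

def qim_extract(y, delta=1.0):
    """Decide which of the two lattices each sample lies closer to."""
    d0 = np.abs(y - np.round(y / delta) * delta)
    d1 = np.abs(y - (np.round((y - delta / 2) / delta) * delta + delta / 2))
    return (d1 < d0).astype(int)

rng = np.random.default_rng(0)
x = rng.standard_normal(8)                    # stand-in audio samples
bits = np.array([0, 1, 1, 0, 1, 0, 0, 1])     # watermark bits
y = np.array([qim_embed(xi, b) for xi, b in zip(x, bits)])
assert (qim_extract(y) == bits).all()         # clean extraction
assert (qim_extract(y + 0.2) == bits).all()   # survives distortion < delta/4
```

Making `delta` track the local masking threshold keeps the quantization inaudible while preserving this robustness margin.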
• Guoqi LI
Article type: LETTER
Subject area: Dependable Computing
2012 Volume E95.D Issue 12 Pages 3101-3105
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
Software FMEA is valuable and practically used for the embedded software of safety-critical systems. In this paper, a novel method for software FMEA is presented based on co-analysis of a system model and a software model. The method is expected to detect the quantitative and dynamic effects of a targeted software failure. A typical application of the method is provided to illustrate the procedure and the applicable scenarios. In addition, a pattern is refined from the application for further reuse.
• Takanori AYANO, Joe SUZUKI
Article type: LETTER
Subject area: Artificial Intelligence, Data Mining
2012 Volume E95.D Issue 12 Pages 3106-3108
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
In this paper we consider the two-class classification problem with high-dimensional data. It is important to find a class of distributions such that we cannot expect good performance in classification for any classifier. In this paper, when two population variance-covariance matrices are different, we give a reasonable sufficient condition for distributions such that the misclassification rate converges to the worst value as the dimension of data tends to infinity for any classifier. Our results can give guidelines to decide whether or not an experiment is worth performing in many fields such as bioinformatics.
• Yue nan LI, Hao LUO
Article type: LETTER
Subject area: Image Processing and Video Processing
2012 Volume E95.D Issue 12 Pages 3109-3112
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
In this letter, the problem of feature quantization in robust hashing is studied from the perspective of approximate nearest neighbor (ANN). We model the features of perceptually identical media as ANNs in the feature set and show that ANN indexing can well meet the robustness and discrimination requirements of feature quantization. A feature quantization algorithm is then developed by exploiting the random-projection based ANN indexing. For performance study, the distortion tolerance and randomness of the quantizer are analytically derived. Experimental results demonstrate that the proposed work is superior to state-of-the-art quantizers, and its random nature can provide robust hashing with security against hash forgery.
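The random-projection ANN indexing underlying the quantizer can be sketched as sign-of-random-projection hashing: each hash bit is the sign of the feature's projection onto a random direction, so nearby feature vectors tend to collide. Dimensions and data below are stand-ins:

```python
import numpy as np

rng = np.random.default_rng(42)
R = rng.standard_normal((16, 64))   # 16 random projection directions

def quantize(feature):
    """16-bit binary hash: sign pattern of the random projections."""
    return (R @ feature > 0).astype(int)

f = rng.standard_normal(64)                    # feature of the original media
f_near = f + 0.01 * rng.standard_normal(64)    # perceptually identical copy
f_far = rng.standard_normal(64)                # unrelated media

h, h_near, h_far = quantize(f), quantize(f_near), quantize(f_far)
# typically few differing bits for the near copy, ~half for unrelated media
print((h != h_near).sum(), (h != h_far).sum())
```

Keeping `R` secret is also what gives the hash its randomness and resistance to forgery, matching the security analysis in the letter.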
• Chuanjun WANG, Li LI, Xuefeng BAI, Xiamu NIU
Article type: LETTER
Subject area: Image Recognition, Computer Vision
2012 Volume E95.D Issue 12 Pages 3113-3116
Published: December 01, 2012
Released on J-STAGE: December 01, 2012
JOURNAL FREE ACCESS
The accuracy of non-rigid 3D face recognition is highly influenced by the capability to model expression deformations. Given a training set of non-neutral and neutral 3D face scan pairs from the same subjects, a set of Fourier series coefficients is reconstructed for each face scan. The residues at each frequency of the Fourier series between finely aligned pairs contain the expression deformation patterns, and PCA is applied to learn these patterns. The proposed expression deformation model is then built from the eigenvectors with the largest eigenvalues from PCA. Recognition experiments are conducted on a 3D face database featuring a rich set of facial expression deformations, and the experimental results demonstrate the feasibility and merits of the proposed model.