IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E101.D , Issue 2
Showing 1-37 articles out of 37 articles from the selected issue
Special Section on Reconfigurable Systems
  • Minoru WATANABE
    2018 Volume E101.D Issue 2 Pages 277
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS
  • Motoki AMAGASAKI, Masato IKEBE, Qian ZHAO, Masahiro IIDA, Toshinori SU ...
    Type: PAPER
    Subject area: Device and Architecture
    2018 Volume E101.D Issue 2 Pages 278-287
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Three-dimensional (3D) field-programmable gate arrays (FPGAs) are expected to offer higher logic density as well as improved delay and power performance by utilizing 3D integrated circuit technology. However, because through-silicon vias (TSVs) for conventional 3D FPGA interlayer connections have a large area overhead, there is an inherent tradeoff between connectivity and small size. To find a balance between cost and performance, and to explore 3D FPGAs with realistic 3D integration processes, we propose two types of 3D FPGA and construct design tool sets for architecture exploration. In previous research, we created a TSV-free 3D FPGA with a face-down integration method; however, this was limited to two layers. In this paper, we discuss the face-up stacking of several face-down stacked FPGAs. To minimize the number of TSVs, we place them at the periphery of the FPGAs in the 4-layer 3D FPGA. According to our results, a 2-layer 3D FPGA has reasonable performance when the design is limited to two layers, but a 4-layer 3D FPGA is a better choice when area is emphasized.

  • Hoang-Gia VU, Shinya TAKAMAEDA-YAMAZAKI, Takashi NAKADA, Yasuhiko NAKA ...
    Type: PAPER
    Subject area: Device and Architecture
    2018 Volume E101.D Issue 2 Pages 288-302
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Modern FPGAs have been integrated into computing systems as accelerators for long-running applications. This integration puts more pressure on the fault tolerance of computing systems, and dependability becomes an essential requirement. As in the case of CPU-based systems, checkpoint/restart techniques are expected to improve the dependability of FPGA-based computing. Three issues arise in this situation: how to checkpoint and restart FPGAs, how well this checkpoint/restart model works with the checkpoint/restart model of the whole computing system, and how to build the model with a software tool. In this paper, we first present a new checkpoint/restart architecture along with a checkpointing mechanism on FPGAs. Second, we propose a method to capture consistent snapshots of the FPGA and the rest of the computing system. Third, we provide “fine-grained” management for checkpointing to reduce performance degradation. For the host CPU, we also provide a stack which includes API functions to manage checkpoint/restart procedures on FPGAs. Fourth, we present a Python-based tool to insert checkpointing infrastructure. Experimental results show that the checkpointing architecture causes less than 10% maximum clock frequency degradation, low checkpointing latencies, small memory footprints, and small increases in power consumption, while the LUT overhead varies from 17.98% (Dijkstra) to 160.67% (Matrix Multiplication).

  • Toshihiro KATASHITA, Masakazu HIOKI, Yohei HORI, Hanpei KOIKE
    Type: PAPER
    Subject area: Device and Architecture
    2018 Volume E101.D Issue 2 Pages 303-313
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Field-programmable gate array (FPGA) devices are applied for accelerating specific calculations and reducing power consumption in a wide range of areas. One of the challenges associated with FPGAs is reducing static power to improve their power efficiency. We propose a method involving fine-grained reconfiguration of body biases of logic and net resources to reduce the static power of FPGA devices. In addition, we develop an FPGA device called Flex Power FPGA with SOTB technology and demonstrate its power reduction function with a 32-bit counter circuit. In this paper, we describe the construction of an experimental platform to precisely evaluate power consumption and the maximum operating frequency of the device under various operating voltages and body biases with various practical circuits. Using this platform, we evaluate the Flex Power FPGA chip at operating voltages of 0.5-1.0 V and at body biases of 0.0-0.5 V. In the evaluation, we use a 32-bit adder, a 16-bit multiplier, and an SBOX circuit for AES cryptography. We operate the chip virtually with a uniform body bias voltage to drive all of the logic resources with the same threshold voltage. We demonstrate the advantage of the Flex Power FPGA by comparing its performance with that of non-reconfigurable biasing.

  • Hidenori GYOTEN, Masayuki HIROMOTO, Takashi SATO
    Type: PAPER
    Subject area: Device and Architecture
    2018 Volume E101.D Issue 2 Pages 314-323
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    An area-efficient FPGA-based annealing processor that is based on the Ising model is proposed. The proposed processor eliminates random number generators (RNGs) and temperature schedulers, which are key components of conventional annealing processors and occupy a large portion of the design. Instead, a shift-register-based spin flipping scheme keeps the Ising model from getting stuck in locally optimal solutions. An FPGA implementation and software-based evaluation on max-cut problems with a 2D-grid torus structure demonstrate that our annealing processor solves the problems 10 to 10^4 times faster than conventional optimization algorithms when obtaining solutions of equal accuracy.
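As a reader's aid for the benchmark named in this abstract, the following minimal Python sketch shows how a max-cut instance on a 2D-grid torus can be scored and how the cut value relates to an Ising energy. It assumes unit edge weights and spins of +1/-1. The function names are hypothetical, and the sketch does not reproduce the paper's shift-register-based flipping scheme.

```python
def torus_edges(rows, cols):
    """Return the undirected edges of a rows x cols grid with wrap-around."""
    edges = set()
    for r in range(rows):
        for c in range(cols):
            u = r * cols + c
            right = r * cols + (c + 1) % cols
            down = ((r + 1) % rows) * cols + c
            for v in (right, down):
                if u != v:  # skip self-loops on degenerate 1-wide grids
                    edges.add((min(u, v), max(u, v)))
    return sorted(edges)


def cut_value(spins, edges):
    """Count edges whose endpoints carry different spins (unit weights)."""
    return sum(1 for u, v in edges if spins[u] != spins[v])


def ising_energy(spins, edges):
    """Energy sum of s_u * s_v over edges; lower energy means a larger cut."""
    return sum(spins[u] * spins[v] for u, v in edges)


if __name__ == "__main__":
    edges = torus_edges(4, 4)
    checker = [1 if (r + c) % 2 == 0 else -1 for r in range(4) for c in range(4)]
    print(len(edges), cut_value(checker, edges), ising_energy(checker, edges))
```

With unit weights, each cut edge contributes -1 and each uncut edge +1 to the energy, so cut = (number of edges - energy) / 2. Minimizing this energy is therefore equivalent to maximizing the cut.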

  • Akira YAMAWAKI, Seiichi SERIKAWA
    Type: PAPER
    Subject area: Design Methodology and Platform
    2018 Volume E101.D Issue 2 Pages 324-334
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    This paper presents a method of describing image processing software in C for high-level synthesis (HLS) that takes function chaining into account to realize efficient hardware. Sophisticated image processing is typically built as a sequence of primitives represented as sub-functions, such as gray scaling, filtering, binarization, and thinning. Previous work has shown generic description methods for individual sub-functions that let HLS tools generate efficient hardware modules. However, few studies have focused on a systematic method of describing a single top function that consists of chained sub-functions. With the proposed method, any number of sub-functions can be chained while maintaining the pipeline structure. Thus, the image processing can achieve near-ideal performance of 1 pixel per clock even when the processing chain is long. In addition, the method implicitly eliminates the deadlock caused by a mismatch between the numbers of pushes and pops on the FIFO connecting the functions, and it handles the interpolation of border pixels. A case study on Canny edge detection, which includes a chain of several sub-functions, demonstrates that our proposal can easily realize the expected hardware described above. Experimental results on a ZYNQ FPGA show that our proposal can be converted to pipelined hardware of moderate size and achieves a performance gain of more than 70 times compared to software execution. Moreover, the C software program restructured according to our proposed method shows a small performance degradation of 8% compared with the pure C software in a comparative evaluation performed on the Cortex-A9 embedded processor in the ZYNQ FPGA. This indicates that the proposed description method can be used to establish a unified HLS-based image processing library that can be executed either on a CPU or as a hardware module for HW/SW co-design.

  • Qian ZHAO, Motoki AMAGASAKI, Masahiro IIDA, Morihiro KUGA, Toshinori S ...
    Type: PAPER
    Subject area: Design Methodology and Platform
    2018 Volume E101.D Issue 2 Pages 335-343
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Major cloud service providers, including Amazon and Microsoft, have started employing field-programmable gate arrays (FPGAs) to build high-performance and low-power-consumption cloud capability. However, utilizing an FPGA-enabled cloud is still challenging for two main reasons. First, the introduction of software and hardware co-design leads to high development complexity. Second, FPGA virtualization and accelerator scheduling techniques have not been fully researched for cluster deployment. In this paper, we propose an open-source FPGA-as-a-service (FaaS) platform, hCODE, to simplify the design, management, and deployment of FPGA accelerators at cluster scale. The proposed platform implements a Shell-and-IP design pattern and an open accelerator repository to reduce the design and management costs of FPGA projects. Efficient FPGA virtualization and accelerator scheduling techniques are proposed to deploy accelerators on the FPGA-enabled cluster easily. With the proposed hCODE, hardware designers and accelerator users can be organized on one platform to efficiently build an open-hardware ecosystem.

  • Shimpei SATO, Ryohei KOBAYASHI, Kenji KISE
    Type: PAPER
    Subject area: Design Methodology and Platform
    2018 Volume E101.D Issue 2 Pages 344-353
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    LSIs are generally designed in four stages: architectural design, logic design, circuit design, and physical design. In architectural design and logic design, designers describe their target hardware in RTL. However, they generally use different languages for each phase. Typically, a general-purpose programming language such as C or C++ is used for architectural design, and a hardware description language such as Verilog HDL or VHDL is used for logic design. This is a time-consuming way of designing hardware, and a more efficient design environment is required. In this paper, we propose a new hardware modeling and high-speed simulation environment for architectural design and logic design. Our environment allows hardware to be written and verified in one language. The environment consists of (1) a new hardware description language called ArchHDL, which enables hardware to be simulated faster than with Verilog HDL simulation, and (2) a source code translation tool from ArchHDL code to Verilog HDL code. ArchHDL is a new language for hardware RTL modeling based on C++. The key features of this language are that (1) designers describe a combinational circuit as a function and (2) the ArchHDL library realizes non-blocking assignment in C++. Using these features, designers are able to write hardware transparently from an abstract-level description to an RTL description in a Verilog HDL-like style. Source code in ArchHDL is converted to Verilog HDL code by the translation tool, and the result is used for synthesis targeting FPGAs or ASICs. To evaluate our environment, we implemented a practical many-core processor in ArchHDL and measured the simulation speed on an Intel CPU and an Intel Xeon Phi processor. Simulation with ArchHDL on the Intel CPU is about 4.5 times faster than simulation with Synopsys VCS. We also confirmed that RTL simulation with ArchHDL is efficiently parallelized on the Intel Xeon Phi processor. We converted the ArchHDL code to Verilog HDL code and estimated the hardware utilization on an FPGA. Implementing a 48-node many-core processor consumes 71% of the total resources of a Virtex-7 FPGA.
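The non-blocking assignment mentioned in this abstract can be illustrated with a small sketch. The Python below is only an illustration of the general two-phase semantics (stage every new value, then commit all of them at the clock edge). The `Register` class and `clock_edge` function are hypothetical names and are not the ArchHDL API, which is a C++ library.

```python
class Register:
    """A clocked register whose writes take effect only at the clock edge."""

    def __init__(self, value=0):
        self.value = value   # value visible to readers during this cycle
        self._next = value   # value staged for the next cycle

    def assign(self, value):
        """Stage a new value without changing what readers currently see."""
        self._next = value

    def commit(self):
        """Make the staged value visible."""
        self.value = self._next


def clock_edge(registers):
    """Commit all staged values together, as a clock edge would."""
    for reg in registers:
        reg.commit()


if __name__ == "__main__":
    a, b = Register(1), Register(2)
    # Both right-hand sides read the old values, so this swaps a and b.
    a.assign(b.value)
    b.assign(a.value)
    clock_edge([a, b])
    print(a.value, b.value)
```

Because every read during a cycle sees the pre-edge value, the order of the two `assign` calls does not matter, which is the property that distinguishes non-blocking from ordinary sequential assignment.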

  • Akira JINGUJI, Shimpei SATO, Hiroki NAKAHARA
    Type: PAPER
    Subject area: Emerging Applications
    2018 Volume E101.D Issue 2 Pages 354-362
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    A random forest (RF) is an ensemble machine learning algorithm used for classification and regression. It consists of multiple decision trees that are built from randomly sampled data. Compared with other machine learning algorithms, the RF is simple and offers fast learning and identification. It is widely used in various recognition systems. Because it requires an unbalanced trace through each tree and communication among all of the trees, the random forest is not well suited to SIMD architectures such as GPUs. Although FPGA-based accelerators have been proposed, such implementations were based on HDL design and thus required longer design time than software-based realizations. In previous work, we showed a high-level synthesis design of the RF that included a fully pipelined architecture and all-to-all communication. In this paper, to further reduce the amount of hardware, we use k-means clustering to share comparators of the branch nodes in the decision tree. We also develop the krange tool flow, which generates the bitstream from a small number of hyperparameters. Since the proposed tool flow is based on high-level synthesis design, we can obtain a high-performance RF with a shorter design time than with conventional HDL design. We implemented the RF on the Xilinx Inc. ZC702 evaluation board. Compared with the CPU (Intel Xeon (R) E5607 Processor) and GPU (NVidia Geforce Titan) implementations, the FPGA realization was 8.4 times faster than the CPU and 62.8 times faster than the GPU. In terms of power consumption efficiency, the FPGA realization was 7.8 times better than the CPU and 385.9 times better than the GPU.

  • Takeshi OHKAWA, Kazushi YAMASHINA, Hitomi KIMURA, Kanemitsu OOTSU, Tak ...
    Type: PAPER
    Subject area: Emerging Applications
    2018 Volume E101.D Issue 2 Pages 363-375
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    A component-oriented FPGA design platform is proposed for robot system integration. FPGAs are known to be a power-efficient hardware platform, but the development cost of FPGA-based systems is currently too high to integrate them into robot systems. To solve this problem, we propose an FPGA component that allows FPGA devices to be easily integrated into robot systems based on the Robot Operating System (ROS). ROS-compliant FPGA components offer a seamless interface between the FPGA hardware and software running on the CPU. Two experiments were conducted using the proposed components. For the first experiment, the results show that the execution time of an FPGA component for image processing was 1.7 times faster than that of the original software-based component and was 2.51 times more power efficient than an ordinary PC processor, despite substantial communication overhead. The second experiment showed that an FPGA component for sensor fusion was able to process multiple sensor inputs efficiently and with very low latency via parallel processing.

  • Tomoya FUJII, Shimpei SATO, Hiroki NAKAHARA
    Type: PAPER
    Subject area: Emerging Applications
    2018 Volume E101.D Issue 2 Pages 376-386
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    A pre-trained deep convolutional neural network (CNN) for an embedded system requires high speed and low power consumption. The front part of the CNN consists of convolutional layers, while the rear part consists of fully connected layers. In the convolutional layers, the multiply-accumulate operation is the bottleneck, while in the fully connected layers, memory access is the bottleneck. The binarized CNN has been proposed to realize many multiply-accumulate circuits on the FPGA, so that the convolutional layers can operate at high speed. However, even when binarization is applied to the fully connected layers, the amount of memory remains a bottleneck. In this paper, we propose a neuron pruning technique that eliminates most of the weight memory, and we apply it to the fully connected layers of the binarized CNN. In that case, since the weight memory is realized by on-chip memory on the FPGA, high-speed memory access is achieved. To further reduce the memory size, we retrain the CNN after neuron pruning. We also propose a sequential-input parallel-output circuit for the binarized fully connected layer and a streaming circuit for the binarized 2D convolutional layer. The experimental results showed that, for the fully connected layers of the VGG-11 CNN, neuron pruning reduced the number of neurons by 39.8% while maintaining 99% of the baseline accuracy. We implemented the neuron-pruned CNN on the Xilinx Inc. Zynq Zedboard. Compared with the ARM Cortex-A57, it was 1773.0 times faster, dissipated 3.1 times less power, and its performance per power efficiency was 5781.3 times better. Compared with the Maxwell GPU, it was 11.1 times faster, dissipated 7.7 times less power, and its performance per power efficiency was 84.1 times better. Thus, the binarized CNN on the FPGA is suitable for embedded systems.

Regular Section
  • Haiyan HUANG, Chenxi LI
    Type: PAPER
    Subject area: Fundamentals of Information Systems
    2018 Volume E101.D Issue 2 Pages 387-395
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Because people differ in their linguistic preferences, and in order to determine the consensus state when using Computing with Words (CWW) to support consensus decision making, this paper first proposes an interval composite scale based 2-tuple linguistic model, which realizes translation from words to interval numerical values and retranslation from interval numerical values to words. Second, this paper proposes an interval composite scale based personalized individual semantics model (ICS-PISM), which can provide different linguistic representation models for different decision-makers. Finally, this paper proposes a consensus decision making model with ICS-PISM, which includes a semantic translation and retranslation phase in the decision process and determines the consensus state of the whole decision process. The proposed models take full account of the fact that human language contains vague expressions and that real-world preferences are usually uncertain, and they provide efficient computation models to support consensus decision making.

  • Huimin CAI, Eryun LIU, Hongxia LIU, Shulong WANG
    Type: PAPER
    Subject area: Software System
    2018 Volume E101.D Issue 2 Pages 396-404
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    A real-time road-direction point detection model that can adapt to complex environments is developed based on a convolutional neural network architecture. First, the concept of a road-direction point is defined for both single roads and crossroads. For a single road, the predicted road-direction point can serve as a guiding point for a self-driving vehicle to go ahead. At a crossroad, multiple road-direction points can be detected, which helps the vehicle choose among the possible directions. In addition, the model can classify different types of road surface, covering both paved and unpaved roads. This information helps a self-driving vehicle speed up or slow down according to road conditions. Finally, the performance of this model is evaluated on different platforms, including the Jetson TX1. The processing speed reaches 12 FPS on this portable embedded system, so the model provides an effective and economical solution for road-direction estimation in autonomous navigation applications.

  • Chunyan HOU, Jinsong WANG, Chen CHEN
    Type: PAPER
    Subject area: Software Engineering
    2018 Volume E101.D Issue 2 Pages 405-414
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    System scenarios derived from system design specifications play an important role in the reliability engineering of component-based software systems. Several scenario-based approaches have been proposed to predict the reliability of a system at design time, but most of them adopt a flat construction of scenarios, which does not conform to software design specifications and is prone to the state space explosion problem in large systems. This paper identifies various challenges related to scenario modeling at the early design stages based on software architecture specification. A novel scenario-based reliability modeling and prediction approach is introduced. The approach adopts hierarchical scenario specification to model software reliability, which avoids state space explosion and reduces computational complexity. Finally, an evaluation experiment shows the potential of the approach.

  • Eita FUJISHIMA, Kenji NAKASHIMA, Saneyasu YAMAGUCHI
    Type: PAPER
    Subject area: Data Engineering, Web Information Systems
    2018 Volume E101.D Issue 2 Pages 415-427
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Hadoop is a popular open-source MapReduce implementation. In jobs such as TeraSort, in which a huge volume of output files from all relevant Map tasks is transmitted to Reduce tasks, the Reduce tasks are the bottleneck and are I/O bound because they process many large output files. In most cases, including TeraSort, the intermediate data, which include the output files of the Map tasks, are large and accessed sequentially. To improve the performance of these jobs, it is important to increase sequential access performance. In this paper, we propose methods for improving the performance of the Reduce tasks of such jobs based on two observations: these files are accessed sequentially on an HDD, and each zone of an HDD has different sequential I/O performance. The proposed methods control where intermediate data are stored by modifying the block bitmap of the filesystem, which manages the utilization (free or used) of blocks on an HDD. In addition, we propose a striping layout for applying these methods in virtualized environments that use image files. We then present a performance evaluation of the proposed methods and demonstrate that they improve Hadoop application performance.

  • Toru NAKASHIKA
    Type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2018 Volume E101.D Issue 2 Pages 428-436
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Two different types of representations, such as an image and its manually assigned labels, generally have complex and strong relationships to each other. In this paper, we represent such deep relationships between two different types of visible variables using an energy-based probabilistic model, called a deep relational model (DRM), to improve prediction accuracy. A DRM stacks several layers from one visible layer to another visible layer, sandwiching several hidden layers between them. As with restricted Boltzmann machines (RBMs) and deep Boltzmann machines (DBMs), all connections (weights) between two adjacent layers are undirected. During maximum likelihood (ML)-based training, the network attempts to capture the latent complex relationships between two visible variables with its deep architecture. Unlike deep neural networks (DNNs), 1) the DRM is a fully generative model that allows us to generate one visible variable given the other, and 2) its parameters can be optimized in a probabilistic manner. The DRM can also be fine-tuned using DNNs, as with deep belief net (DBN) or DBM pre-training. This paper presents experiments conducted to evaluate the performance of a DRM in image recognition and generation tasks using the MNIST data set. In the image recognition experiments, we observed that the DRM outperformed DNNs even without fine-tuning. In the image generation experiments, the images generated from the DRM were much more realistic than those from the other generative models.

  • Chengxiang YIN, Hongjun ZHANG, Rui ZHANG, Zilin ZENG, Xiuli QI, Yuntia ...
    Type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2018 Volume E101.D Issue 2 Pages 437-446
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    The main idea of filter methods in feature selection is to construct a feature-assessing criterion and search for the feature subset that optimizes the criterion. The primary principle in designing such a criterion is to capture the relevance between the feature subset and the class as precisely as possible. Computing the relevance directly is difficult because the computational complexity increases as the feature subset grows. As a result, researchers adopt approximate strategies to measure relevance. Though these strategies have worked well in some applications, they suffer from three problems: the parameter determination problem, the neglect of feature interaction information, and the overestimation of some features. We propose a new feature selection algorithm that, based on the computation of partitions, can compute the mutual information between the feature subset and the class directly without increasing the computational complexity. In light of the specific properties of mutual information and partitions, we propose a pruning rule and a stopping criterion to accelerate the search. To evaluate the effectiveness of the proposed algorithm, we compare it with five other algorithms in terms of the number of selected features and the classification accuracies of three classifiers. The results on six synthetic datasets show that our algorithm performs well in capturing interaction information. The results on thirteen real-world datasets show that our algorithm selects smaller yet better feature subsets.
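To make the idea of computing mutual information from partitions concrete, here is a minimal Python sketch. It assumes discrete features and uses plain empirical frequencies. The function name and data layout are hypothetical, and the sketch omits the paper's pruning rule and stopping criterion.

```python
from collections import Counter
from math import log2


def mutual_information(rows, labels, subset):
    """Estimate I(S; C) in bits, where S is the joint value of the
    features whose column indices are listed in `subset`."""
    n = len(labels)
    # Partition the samples by the joint value of the selected features.
    keys = [tuple(row[i] for i in subset) for row in rows]
    block_counts = Counter(keys)
    class_counts = Counter(labels)
    joint_counts = Counter(zip(keys, labels))
    mi = 0.0
    for (key, label), n_joint in joint_counts.items():
        # p(s, c) / (p(s) * p(c)) written with raw counts.
        ratio = n_joint * n / (block_counts[key] * class_counts[label])
        mi += (n_joint / n) * log2(ratio)
    return mi


if __name__ == "__main__":
    # XOR labels: neither feature is informative alone, but the pair is.
    rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
    labels = [0, 1, 1, 0]
    print(mutual_information(rows, labels, [0]))
    print(mutual_information(rows, labels, [0, 1]))
```

The XOR data in the demo is a standard case of feature interaction: scoring features one at a time would miss it, whereas scoring the subset as a whole does not.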

  • Yu YAN, Kohei HARA, Takenobu KAZUMA, Yasuhiro HISADA, Aiguo HE
    Type: PAPER
    Subject area: Educational Technology
    2018 Volume E101.D Issue 2 Pages 447-454
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Studies have shown that program visualization (PV) is effective for supporting students' programming exercises and self-study. However, very few instructors actively use PV tools in programming lectures. This article discusses the impediments instructors face when bringing PV tools into lecture classrooms and proposes a C programming classroom instruction support tool based on program visualization, PROVIT-CI (PROgram VIsualization Tool for Classroom Instruction). PROVIT-CI has been used continuously and actively by instructors at the authors' university to enhance their lectures since 2015. An evaluation of its application in an introductory C programming course shows that PROVIT-CI is effective and helpful for instructors' classroom use.

  • Tetsuya WATANABE, Hirotsugu KAGA, Shota SHINKAI
    Type: PAPER
    Subject area: Rehabilitation Engineering and Assistive Technology
    2018 Volume E101.D Issue 2 Pages 455-461
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Many text entry methods are available on touch interface devices used with a screen reader, and blind smartphone users and their supporters are eager to know which one is the easiest to learn and the fastest. Thus, we compared the text entry speeds and error counts for four combinations of software keyboards and character-selecting gestures over a period of five days. The split-tap gesture on the Japanese numeric keypad was found to be the fastest across the five days even though this text entry method produced the most errors. The two entry methods on the QWERTY keyboard were slower than the two entry methods on the numeric keypad. This difference in text entry speed was explained by the differences in key pointing and tapping times and their repetition counts among the methods.

  • Nobukatsu HOJO, Yusuke IJIMA, Hideyuki MIZUNO
    Type: PAPER
    Subject area: Speech and Hearing
    2018 Volume E101.D Issue 2 Pages 462-472
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Deep neural network (DNN)-based speech synthesis can produce more natural synthesized speech than conventional HMM-based speech synthesis. However, it has not been shown whether the synthesized speech quality can be improved by utilizing a multi-speaker speech corpus. To address this problem, this paper proposes DNN-based speech synthesis using speaker codes as a method to improve the performance of the conventional speaker-dependent DNN-based method. In order to model speaker variation in the DNN, the augmented feature (speaker codes) is fed to the hidden layer(s) of the conventional DNN. This paper investigates the effectiveness of introducing speaker codes to DNN acoustic models for speech synthesis for two tasks: multi-speaker modeling and speaker adaptation. For the multi-speaker modeling task, the proposed method trains the connection weights of the whole DNN using a multi-speaker speech corpus. When performing multi-speaker synthesis, the speaker code corresponding to the selected target speaker is fed to the DNN to generate the speaker's voice. When performing speaker adaptation, a set of connection weights of the multi-speaker model is re-estimated to generate a new target speaker's voice. We investigated the relationship between the prediction performance and the architecture of the DNNs through objective measurements. Objective evaluation experiments revealed that the proposed model outperformed conventional methods (HMMs, speaker-dependent DNNs, and multi-speaker DNNs based on a shared hidden layer structure). Subjective evaluation results showed that the proposed model again outperformed the conventional methods (HMMs, speaker-dependent DNNs), especially when using a small number of target speaker utterances.

  • Kyeongmin JEONG, Kwangyeon CHOI, Donghwan KIM, Byung Cheol SONG
    Type: PAPER
    Subject area: Image Processing and Video Processing
    2018 Volume E101.D Issue 2 Pages 473-480
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    An advanced driver assistance system (ADAS) can recognize traffic signals, vehicles, pedestrians, and other objects all around the vehicle. However, because the ADAS relies on images taken in an outdoor environment, it is susceptible to ambient weather such as fog. Preprocessing such as de-fogging and de-hazing is therefore required to prevent degradation of object recognition performance due to decreased visibility. However, if such a fog removal technique is applied in an environment where there is little or no fog, the visual quality may deteriorate due to excessive contrast enhancement. Moreover, in foggy road environments, typical fog removal algorithms suffer from color distortion. In this paper, we propose a temporal filter-based fog detection algorithm to selectively apply a de-fogging method only in the presence of fog. We also propose a method to avoid color distortion by detecting the sky region and applying different methods to the sky region and the non-sky region. Experimental results show that, on actual images, the proposed algorithm achieves an average fog detection accuracy of more than 97% and improves the subjective image quality of existing de-fogging algorithms. In addition, the proposed algorithm has a very fast computation time of less than 0.1 ms per frame.

    Download PDF (1584K)
  • Yoshiki ITO, Takahiro OGAWA, Miki HASEYAMA
    Type: PAPER
    Subject area: Image Processing and Video Processing
    2018 Volume E101.D Issue 2 Pages 481-490
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    A method for accurate estimation of personalized video preference using multiple users' viewing behavior is presented in this paper. The proposed method uses three kinds of features: features of a video, a user's viewing behavior, and evaluation scores for the video given by a target user. First, the proposed method applies Supervised Multiview Spectral Embedding (SMSE) to obtain lower-dimensional video features suitable for the subsequent correlation analysis. Next, supervised Multi-View Canonical Correlation Analysis (sMVCCA) is applied to integrate the three kinds of features. This yields optimal projections for obtaining new visual features, called “canonical video features,” that reflect the target user's individual preference for a video. Furthermore, our method uses not only the target user's viewing behavior but also other users' viewing behavior to obtain the optimal canonical video features of the target user. This approach is the main contribution of this paper. Finally, Support Vector Ordinal Regression with Implicit Constraints (SVORIM) is trained on the integrated canonical video features. Consequently, the target user's preference for a video can be estimated by using the trained SVORIM. Experimental results show the effectiveness of our method.

    Download PDF (615K)
  • Yung-Yao CHEN, Yi-Cheng ZHANG
    Type: PAPER
    Subject area: Image Recognition, Computer Vision
    2018 Volume E101.D Issue 2 Pages 491-503
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Tracking-by-detection methods treat the tracking task as a continuous detection problem applied over video frames. Modern tracking-by-detection trackers have online learning ability; the update stage is essential because it determines how to modify the classifier inherent in a tracker. However, most trackers search for the target within a fixed region centered at the previous object position; thus, they lack spatiotemporal consistency. This becomes a problem when the tracker detects an incorrect object during short-term occlusion. In addition, the scale of the bounding box that contains the target object is usually assumed not to change. This assumption is unrealistic for long-term tracking, where the scale of the target varies as the distance between the target and the camera changes. The accumulation of errors arising from these shortcomings leads to the drift problem, i.e., drifting away from the target object. To resolve this problem, we present a drift-free, online learning-based tracking-by-detection method using a single static camera. We improve the latent structured support vector machine (SVM) tracker by designing a more robust tracker update step that incorporates two Kalman filter modules: the first predicts an adaptive search region in consideration of the object motion; the second adjusts the scale of the bounding box by accounting for the background model. We propose a hierarchical search strategy that combines Bhattacharyya coefficient similarity analysis and Kalman predictors. This strategy helps overcome occlusion and increases tracking efficiency. We thoroughly evaluate this work using publicly available videos. Experimental results show that the proposed method outperforms state-of-the-art trackers.

    Download PDF (3599K)
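    As an illustration of the similarity measure named in the abstract above, the following is a minimal sketch of the Bhattacharyya coefficient between two histograms. It is not the authors' implementation; the function name and the choice to normalize raw bin counts are assumptions made for this example.

```python
import math


def bhattacharyya_coefficient(p, q):
    """Return BC(p, q) = sum_i sqrt(p_i * q_i) for two histograms.

    Assumption for this sketch: p and q are sequences of non-negative
    bin counts of equal length; each is normalized to sum to 1 first.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    return sum(
        math.sqrt((a / total_p) * (b / total_q)) for a, b in zip(p, q)
    )
```

    The coefficient is 1 for identical normalized histograms and 0 for histograms with no overlapping bins, so a tracker could use it to score how closely a candidate region resembles the target model.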
  • Lishuang LI, Xinyu HE, Jieqiong ZHENG, Degen HUANG, Fuji REN
    Type: PAPER
    Subject area: Natural Language Processing
    2018 Volume E101.D Issue 2 Pages 504-511
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Protein-Protein Interaction Extraction (PPIE) from biomedical literature is an important task in biomedical text mining and has achieved great success on public datasets. However, in real-world applications, existing PPI extraction methods are limited by the labeling effort they require. Therefore, transfer learning is applied to reduce the cost of manual labeling. Current transfer learning methods suffer from negative transfer and lower performance. To tackle this problem, an improved TrAdaBoost algorithm is proposed in which relative distribution is introduced to initialize the weights of TrAdaBoost, overcoming the negative transfer caused by domain differences. To further improve the performance of transfer learning, an approach combining active learning with the improved TrAdaBoost is presented. Experimental results on publicly available PPI corpora show that our method outperforms TrAdaBoost and SVM when labeled data is insufficient. On document classification corpora, the proposed approaches also achieve better final performance than TrAdaBoost and TPTSVM, which verifies the effectiveness of our methods.

    Download PDF (551K)
  • Seung-Hoon NA, Young-Kil KIM
    Type: PAPER
    Subject area: Natural Language Processing
    2018 Volume E101.D Issue 2 Pages 512-522
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    In this paper, we propose a novel phrase-based model for Korean morphological analysis by considering a phrase as the basic processing unit, which generalizes all the other existing processing units. The impetus for using phrases this way is largely motivated by the success of phrase-based statistical machine translation (SMT), which convincingly shows that the larger the processing unit, the better the performance. Experimental results using the SEJONG dataset show that the proposed phrase-based models outperform the morpheme-based models used as baselines. In particular, when combined with the conditional random field (CRF) model, our model leads to statistically significant improvements over the state-of-the-art CRF method.

    Download PDF (383K)
  • XueTing LIM, Kenjiro SUGIMOTO, Sei-ichiro KAMATA
    Type: PAPER
    Subject area: Biological Engineering
    2018 Volume E101.D Issue 2 Pages 523-530
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Seed detection, sometimes known as nuclei detection, is a prerequisite step of nuclei segmentation, which plays a critical role in quantitative cell analysis. A detection result is considered accurate if each detected seed lies in only one nucleus and is close to the nucleus center. In previous work, voting methods have been employed to detect nucleus centers by extracting nucleus saliency features. However, these methods still risk false seeding, especially for images with heterogeneous intensity. To overcome these drawbacks, a novel detection method called secant normal voting is proposed. Secant normal voting achieves good performance with the proposed skipping range. The skipping range avoids over-segmentation by preventing false seeding in occlusion regions. Nucleus centers are obtained by mean-shift clustering of the clouds of voting points. In the experiments, we show that the proposed method outperforms the comparison methods, achieving high detection accuracy without sacrificing computational efficiency.

    Download PDF (1312K)
  • Jin-Taek SEONG
    Type: LETTER
    Subject area: Fundamentals of Information Systems
    2018 Volume E101.D Issue 2 Pages 531-534
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    In this paper, we develop a recovery algorithm for sparse signals in a compressed sensing (CS) framework over finite fields. A basic framework of CS for discrete signals rather than continuous signals is established, from the linear measurement step to the reconstruction. Given a predetermined prior distribution of a sparse signal, we reconstruct the signal using a message passing algorithm and evaluate the performance through simulation. We compare our simulation results with the theoretical bounds obtained from probability analysis.

    Download PDF (161K)
  • Jiajun ZHOU, Bo LIU, Lu DENG, Yaofeng CHEN, Zhefeng XIAO
    Type: LETTER
    Subject area: Fundamentals of Information Systems
    2018 Volume E101.D Issue 2 Pages 535-538
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Graph sampling is an effective method to sample a representative subgraph from a large-scale network. Recent research has shown that several classical sampling methods can produce graph samples, but these samples do not closely match the distribution of graph properties in the original graph. Moreover, the validation of these sampling methods and the scale of a good graph sample have not been examined on weighted graphs. In this paper, we propose the weighted graph sampling problem. We consider the proper size of a good graph sample, propose novel methods to verify the effectiveness of sampling, and test several algorithms on real datasets. Most notably, we obtain new practical results that shed new light on weighted graph sampling. We find that weighted random walk performs best among the algorithms tested and that a graph sample of 20% is sufficient for weighted graph sampling.

    Download PDF (265K)
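    The weighted random walk mentioned in the abstract above can be sketched as follows. This is an illustrative assumption rather than the Letter's algorithm: the graph representation (a dict mapping each node to a dict of neighbor weights), the restart rule for dead ends, the step cap, and all names are hypothetical.

```python
import random


def weighted_random_walk_sample(graph, fraction=0.2, seed=0, max_steps=100_000):
    """Collect about `fraction` of the nodes by walking the graph.

    At each step the next node is drawn from the current node's
    neighbors with probability proportional to the edge weight.
    Assumptions of this sketch: `graph` maps node -> {neighbor: weight},
    weights are positive, and a dead end triggers a uniform restart.
    """
    rng = random.Random(seed)
    nodes = sorted(graph)
    target = max(1, round(fraction * len(nodes)))
    current = rng.choice(nodes)
    visited = {current}
    for _ in range(max_steps):  # step cap guarantees termination
        if len(visited) >= target:
            break
        neighbors = graph[current]
        if not neighbors:
            current = rng.choice(nodes)  # restart when stuck
        else:
            names = list(neighbors)
            weights = [neighbors[n] for n in names]
            current = rng.choices(names, weights=weights, k=1)[0]
        visited.add(current)
    return visited


# Toy weighted ring of 10 nodes; clockwise edges are twice as heavy.
ring = {i: {(i - 1) % 10: 1.0, (i + 1) % 10: 2.0} for i in range(10)}
sample = weighted_random_walk_sample(ring, fraction=0.2)
```

    The default `fraction=0.2` mirrors the 20% sample size reported in the abstract; the induced subgraph on the returned node set would then be compared with the original graph.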
  • Yun-Feng XING, Xiao CHEN, Ming-Xiang GUAN, Zhe-Ming LU
    Type: LETTER
    Subject area: Fundamentals of Information Systems
    2018 Volume E101.D Issue 2 Pages 539-542
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Considering that the traditional local-world evolving network model cannot fully reflect the characteristics of real-world power grids, this Letter proposes a new evolving model based on geographical location clusters. The proposed model takes into account the geographical locations and degree values of nodes, and the growth process is in line with the characteristics of the power grid. Comparison with real-world power grids shows that the proposed model can simulate the degree distribution of China's power grids when the number of nodes is small. When the number of nodes exceeds 800, our model can simulate the degree distribution of the USA western power grid. In addition, the average distances and clustering coefficients of the proposed model are close to those of real-world power grids. All these properties confirm the validity and rationality of our model.

    Download PDF (501K)
  • Hyun KWON, Yongchul KIM, Hyunsoo YOON, Daeseon CHOI
    Type: LETTER
    Subject area: Information Network
    2018 Volume E101.D Issue 2 Pages 543-546
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    We propose new CAPTCHA image generation systems that use generative adversarial network (GAN) techniques to strengthen resistance against CAPTCHA solvers. CAPTCHA images are widely used on the web today to verify whether a user is human. We introduce two different systems for generating CAPTCHA images, namely, the distance GAN (D-GAN) and the composite GAN (C-GAN). The D-GAN adds distance values to the original CAPTCHA images to generate new ones, and the C-GAN generates a CAPTCHA image by composing multiple source images. To evaluate the performance of the proposed schemes, we used the CAPTCHA breaker software as the CAPTCHA solver. We then compared the resistance of the original source images and the generated CAPTCHA images against the CAPTCHA solver. The results show that the proposed schemes improve the resistance to the CAPTCHA solver by over 67.1% and 89.8% depending on the system.

    Download PDF (1190K)
  • Joyce Jiyoung WHANG, Yunseob SHIN
    Type: LETTER
    Subject area: Artificial Intelligence, Data Mining
    2018 Volume E101.D Issue 2 Pages 547-551
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    In social and information network analysis, ranking has been considered to be one of the most fundamental and important tasks where the goal is to rank the nodes of a given graph according to their importance. For example, the PageRank and the HITS algorithms are well-known ranking methods. While these traditional ranking methods focus only on the structure of the entire network, we propose to incorporate a local view into node ranking by exploiting the clustering structure of real-world networks. We develop localized ranking mechanisms by partitioning the graphs into a set of tightly-knit groups and extracting each of the groups where the localized ranking is computed. Experimental results show that our localized ranking methods rank the nodes quite differently from the traditional global ranking methods, which indicates that our methods provide new insights and meaningful viewpoints for network analysis.

    Download PDF (2371K)
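    The idea of ranking within clusters, as described in the abstract above, can be sketched with plain PageRank run on each induced subgraph. This is a hedged illustration, not the Letter's method: the clusters are assumed to be given (the partitioning step is not shown), and the function names, damping factor, and iteration count are choices made for this example.

```python
def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank on a directed graph.

    `adj` maps each node to a list of nodes it links to; the rank of
    dangling nodes (no out-links) is spread uniformly over all nodes.
    """
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        dangling = sum(rank[v] for v in nodes if not adj[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for u in adj[v]:
                new[u] += damping * rank[v] / len(adj[v])
        rank = new
    return rank


def localized_pagerank(adj, clusters):
    """Rank nodes inside each cluster, ignoring edges that leave it.

    Assumption: `clusters` is a precomputed partition of the nodes.
    """
    result = {}
    for cluster in clusters:
        members = set(cluster)
        sub = {v: [u for u in adj[v] if u in members] for v in cluster}
        result.update(pagerank(sub))
    return result


# Toy graph: two groups joined by the single edge c -> d.
adj = {"a": ["b", "c"], "b": ["a"], "c": ["a", "d"], "d": ["e"], "e": ["d"]}
local = localized_pagerank(adj, [["a", "b", "c"], ["d", "e"]])
```

    Each cluster's scores sum to 1, so a node's localized score expresses its importance relative to its own group rather than to the entire network.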
  • Yangyu FAN, Rui DU, Jianshu WANG
    Type: LETTER
    Subject area: Pattern Recognition
    2018 Volume E101.D Issue 2 Pages 552-555
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    Identification of urban road targets using radar systems is usually heavily dependent on the aspect angle between the target velocity and line of sight of the radar. To improve the performance of the classification result when the target is in a cross range position relative to the radar, a method based on range micro Doppler signature is proposed in this paper. Joint time-frequency analysis is applied in every range cell to extract the time Doppler signature. The spectrograms from all of the target range cells are combined to form the range micro Doppler signature to allow further identification. Experiments were conducted to investigate the performance of the proposed method, and the results proved the effectiveness of the method presented.

    Download PDF (664K)
  • JianFeng WU, HuiBin QIN, YongZhu HUA, LingYan FAN
    Type: LETTER
    Subject area: Speech and Hearing
    2018 Volume E101.D Issue 2 Pages 556-559
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    In this paper, a novel method for pitch estimation and voicing classification is proposed using a spectrum reconstructed from Mel-frequency cepstral coefficients (MFCC). The proposed algorithm reconstructs the spectrum from MFCC using the Moore-Penrose pseudo-inverse of the Mel-scale weighting functions. The reconstructed spectrum is compressed and filtered in log-frequency. Pitch estimation is achieved by modeling the joint density of the pitch frequency and the filtered spectrum with a Gaussian Mixture Model (GMM). Voicing classification is also achieved with a GMM-based model, and the test results show that over 99% of frames can be correctly classified. The pitch estimation results demonstrate that the proposed GMM-based pitch estimator has high accuracy, with a relative error of 6.68% on the TIMIT database.

    Download PDF (365K)
  • Jinhua WANG, Weiqiang WANG, Guangmei XU, Hongzhe LIU
    Type: LETTER
    Subject area: Image Recognition, Computer Vision
    2018 Volume E101.D Issue 2 Pages 560-563
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    In this paper, we describe the direct learning of an end-to-end mapping between under-/over-exposed images and well-exposed images. The mapping is represented as a deep convolutional neural network (CNN) that takes multiple-exposure images as input and outputs a high-quality image. Our CNN has a lightweight structure, yet gives state-of-the-art fusion quality. Furthermore, we know that for a given pixel, the influence of the surrounding pixels gradually increases as the distance decreases. If the only pixels considered are those in the convolution kernel neighborhood, the final result will be affected. To overcome this problem, the size of the convolution kernel is often increased. However, this also increases the complexity of the network (too many parameters) and the training time. In this paper, we present a method in which a number of sub-images of the source image are obtained using the same CNN model, providing more neighborhood information for the convolution operation. Experimental results demonstrate that the proposed method achieves better performance in terms of both objective evaluation and visual quality.

    Download PDF (651K)
  • Jingjie YAN, Bojie YAN, Ruiyu LIANG, Guanming LU, Haibo LI, Shipeng XI ...
    Type: LETTER
    Subject area: Image Recognition, Computer Vision
    2018 Volume E101.D Issue 2 Pages 564-567
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS

    In this paper, we present a novel regression-based robust locality preserving projections (RRLPP) method to effectively deal with noise and occlusion in facial expression recognition. Similar to the robust principal component analysis (RPCA) and robust regression (RR) approaches, the basic idea of RRLPP is to introduce a low-rank term and a sparse term for the facial expression image sample matrix, simultaneously overcoming the shortcomings of the locality preserving projections (LPP) method and enhancing the robustness of facial expression recognition. Unlike those approaches, however, RRLPP is a nonlinear robust subspace method that can effectively describe the local structure of facial expression images. Test results on the Multi-PIE facial expression database indicate that the RRLPP method can effectively eliminate noise and occlusion in facial expression images, while achieving a better or comparable facial expression recognition rate compared to non-robust and robust subspace methods.

    Download PDF (170K)
  • SungIk CHO, JungHyun HAN
    Type: LETTER
    Subject area: Computer Graphics
    2018 Volume E101.D Issue 2 Pages 568-571
    Published: February 01, 2018
    Released: February 01, 2018
    JOURNALS FREE ACCESS
    Supplementary material

    This paper proposes a painterly morphing algorithm for mobile smart devices, where each frame in the morphing sequence looks like an oil-painted picture with brush strokes. It can be presented, for example, during the transition between the main screen and a specific application screen. For this, a novel dissimilarity function and acceleration data structures are developed. The experimental results show that the algorithm produces visually stunning effects at interactive rates.

    Download PDF (2437K)
Errata