IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E103.D, Issue 12
Displaying 1-38 of 38 articles from this issue
Special Section on Parallel, Distributed, and Reconfigurable Computing, and Networking
  • Fukuhito OOSHITA
    2020 Volume E103.D Issue 12 Pages 2411
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS
    Download PDF (113K)
  • Lucas Saad Nogueira NUNES, Jacir Luiz BORDIM, Yasuaki ITO, Koji NAKANO
    Article type: PAPER
    Subject area: Fundamentals of Information Systems
    2020 Volume E103.D Issue 12 Pages 2412-2420
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    The volume of digital information is growing at an extremely fast pace, which in turn exacerbates the need for efficient mechanisms to find the presence of a pattern in an input text or a set of input strings. Combining the processing power of the Graphics Processing Unit (GPU) with matching algorithms is a natural alternative to speed up the string-matching process. This work proposes a Parallel Rabin-Karp implementation (PRK) that incorporates a fast parallel prefix-sums algorithm to maximize parallelization and accelerate the matching verification. Given an input text T of length n and p patterns of length m, the proposed implementation finds all occurrences of the patterns in T in O(m+q+n/τ+nm/q) time, where q is a sufficiently large prime number and τ is the available number of threads. Sequential and parallel versions of PRK have been implemented. Experiments were executed on p≥1 patterns of length m = 10, 20, and 30 characters, compared against a text string of length n=2^27. The results show that the parallel implementation of the PRK algorithm on an NVIDIA V100 GPU provides speedups surpassing 372 times over the sequential implementation and 12.59 times over an OpenMP implementation running on a multi-core server with 128 threads. Compared to another prominent GPU implementation, the PRK implementation attained speedups surpassing 37 times.
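
    The hashing idea behind Rabin-Karp can be illustrated with a minimal sequential sketch (not the authors' GPU code): a rolling hash modulo a prime q turns each window comparison into O(1) work, with explicit verification to rule out hash collisions. The base 256 and the modulus below are illustrative assumptions.

```python
# Sequential Rabin-Karp sketch; q is the prime modulus from the abstract,
# base 256 (one byte per character) is an assumption for illustration.
def rabin_karp(text, pattern, q=1000003, base=256):
    n, m = len(text), len(pattern)
    if m > n:
        return []
    h = pow(base, m - 1, q)              # weight of the window's leading character
    p_hash = t_hash = 0
    for i in range(m):                   # hashes of the pattern and the first window
        p_hash = (base * p_hash + ord(pattern[i])) % q
        t_hash = (base * t_hash + ord(text[i])) % q
    matches = []
    for s in range(n - m + 1):
        # a hash match is only a candidate; verify to rule out collisions
        if p_hash == t_hash and text[s:s + m] == pattern:
            matches.append(s)
        if s < n - m:                    # roll the window by one character
            t_hash = (base * (t_hash - ord(text[s]) * h) + ord(text[s + m])) % q
    return matches
```

    In the paper's parallel version, each thread handles a slice of windows and a prefix-sums pass compacts the match positions; that machinery is omitted here.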

    Download PDF (472K)
  • Jingcheng SHEN, Fumihiko INO, Albert FARRÉS, Mauricio HANZICH
    Article type: PAPER
    Subject area: Fundamentals of Information Systems
    2020 Volume E103.D Issue 12 Pages 2421-2434
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Graphics processing units (GPUs) are highly efficient architectures for parallel stencil code; however, the small device (i.e., GPU) memory capacity (several tens of GBs) necessitates out-of-core computation to process excess data. Great programming effort is needed to manually implement efficient out-of-core stencil code. To relieve such programming burdens, directive-based frameworks such as the pipelined accelerator (PACC) have emerged; however, they usually lack specific optimizations to reduce data transfer. In this paper, we extend PACC with two data-centric optimizations to address data transfer problems. The first is a direct-mapping scheme that eliminates the host (i.e., CPU) buffers that intermediate between the original data and device buffers. The second is a region-sharing scheme that significantly reduces host-to-device data transfer. The extended PACC was applied to an acoustic wave propagator, automatically extending the original serial code 2.3-fold in length to obtain the out-of-core code. Experimental results revealed that on a Tesla V100 GPU, the generated code ran 41.0, 22.1, and 3.6 times as fast as implementations based on Open Multi-Processing (OpenMP), Unified Memory, and the previous PACC, respectively. The generated code also proved useful with small datasets that fit in the device capacity, running 1.3 times as fast as an in-core implementation.
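
    The out-of-core pattern can be sketched in miniature: an array assumed too large for device memory is streamed in chunks, each extended by a halo (ghost) region so tile boundaries match the in-core result. This is an illustrative sketch only, not PACC-generated code; `smooth` is a stand-in 3-point stencil kernel.

```python
# Out-of-core stencil sketch: process `data` in tiles of `chunk` elements,
# each padded by `halo` neighbors, mimicking host<->device streaming.
def smooth(tile):
    # stand-in 3-point stencil: average interior points, copy the boundary
    return [tile[i] if i in (0, len(tile) - 1)
            else (tile[i - 1] + tile[i] + tile[i + 1]) / 3
            for i in range(len(tile))]

def stencil_chunked(data, chunk, halo, kernel):
    out = list(data)
    n = len(data)
    for start in range(0, n, chunk):
        lo, end = max(0, start - halo), min(n, start + chunk)
        hi = min(n, start + chunk + halo)
        tile = data[lo:hi]               # "host-to-device transfer" of one tile
        res = kernel(tile)               # "device" computes on the tile
        out[start:end] = res[start - lo:start - lo + (end - start)]
    return out
```

    With halo at least the stencil radius, the chunked result equals the in-core result; the direct-mapping and region-sharing optimizations of the paper reduce how much of each tile must actually move.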

    Download PDF (1752K)
  • Ke CUI, Michihiro KOIBUCHI
    Article type: PAPER
    Subject area: Fundamentals of Information Systems
    2020 Volume E103.D Issue 12 Pages 2435-2443
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Random network topologies have been proposed as low-latency networks for parallel computers. Although multicast is a common collective-communication operation, multicast algorithms, each of which consists of a large number of unicasts, are not well optimized for random network topologies. In this study, we first apply a two-opt algorithm to build efficient multicasts on random network topologies. The two-opt algorithm creates a carefully ordered list of visiting nodes that minimizes the total path hops or the total possible contention counts of the unicasts that form the target multicast. We then extend the two-opt algorithm to other collective-communication operations, e.g., allreduce and allgather. SimGrid discrete-event simulation results show that the two-opt multicast outperforms that of a typical MPI implementation by up to 22% in the execution time of an MPI program that repeats the MPI_Bcast function. The two-opt allreduce and allgather operations also improve the execution time by up to 15% and 14%, respectively, compared with those used in typical MPI implementations.
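
    A minimal two-opt sketch over a visiting order is shown below (illustrative only): segments of the order are reversed whenever that lowers the total cost. The paper's cost is path hops or contention counts on a random topology; here `dist` is an abstract per-edge cost supplied by the caller.

```python
# Two-opt improvement of a node visiting order (first node kept fixed).
def path_cost(order, dist):
    return sum(dist(a, b) for a, b in zip(order, order[1:]))

def two_opt(order, dist):
    order = list(order)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(order) - 1):
            for j in range(i + 1, len(order)):
                # reverse the segment order[i..j] and keep it if cheaper
                new = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
                if path_cost(new, dist) < path_cost(order, dist):
                    order, improved = new, True
    return order
```

    On a real topology, `dist` would return the hop count of the shortest path between two switches (or a contention estimate), which is where the random-topology structure enters.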

    Download PDF (1776K)
  • Chenxu WANG, Yutong LU, Zhiguang CHEN, Junnan LI
    Article type: PAPER
    Subject area: Fundamentals of Information Systems
    2020 Volume E103.D Issue 12 Pages 2444-2456
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Training deep learning (DL) networks is a computationally intensive process; as a result, training time can become so long that it impedes the development of DL. High-performance computing clusters, especially supercomputers, are equipped with a large amount of computing resources, storage resources, and efficient interconnects, and can therefore train DL networks better and faster. In this paper, we propose a method to train DL networks in a distributed manner with high efficiency. First, we propose a hierarchical synchronous Stochastic Gradient Descent (SGD) strategy, which makes full use of hardware resources and greatly increases computational efficiency. Second, we present a two-level parameter synchronization scheme that reduces communication overhead by transmitting the parameters of the first-level models through shared memory. Third, we optimize parallel I/O by making each reader read data as contiguously as possible to avoid the high overhead of discontinuous data reading. Finally, we integrate the LARS algorithm into our system. The experimental results demonstrate that our approach has substantial performance advantages over unoptimized methods. Compared with the native distributed strategy, our hierarchical synchronous SGD strategy (HSGD) can increase computing efficiency by about 20 times.
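
    The two-level idea can be sketched as plain gradient averaging (illustrative; the paper's implementation uses shared memory within a node and the interconnect across nodes, not this toy code):

```python
# Hierarchical gradient averaging sketch: average within each node first,
# then average the per-node results across nodes.
def average(grads):
    # element-wise mean of a list of equal-length gradient vectors
    return [sum(g) / len(g) for g in zip(*grads)]

def hierarchical_average(grads_per_node):
    # level 1: average the workers' gradients inside each node
    local = [average(node) for node in grads_per_node]
    # level 2: average the per-node gradients across nodes
    return average(local)
```

    With equal worker counts per node, the two-level average equals the flat all-worker average, so convergence behavior is unchanged while inter-node traffic shrinks to one gradient per node.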

    Download PDF (1657K)
  • Yuxi SUN, Hideharu AMANO
    Article type: PAPER
    Subject area: Computer System
    2020 Volume E103.D Issue 12 Pages 2457-2462
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Recurrent neural networks (RNNs) have proven effective for sequence-based tasks thanks to their capability to process temporal information. In real-world systems, deep RNNs are widely used to solve complicated tasks such as large-scale speech recognition and machine translation. However, the implementation of deep RNNs on traditional hardware platforms is inefficient due to long-range temporal dependences and irregular computation patterns within RNNs. This inefficiency manifests itself in the proportional increase in the latency of RNN inference with respect to the number of layers of deep RNNs on CPUs and GPUs. Previous work has focused mostly on optimizing and accelerating individual RNN cells. To make deep RNN inference fast and efficient, we propose an accelerator based on a multi-FPGA platform called Flow-in-Cloud (FiC). In this work, we show that the parallelism provided by the multi-FPGA system can be exploited to scale up the inference of deep RNNs by partitioning a large model onto several FPGAs, so that the latency stays close to constant as the number of RNN layers increases. For single-layer and four-layer RNNs, our implementation achieves 31x and 61x speedups, respectively, compared with an Intel CPU.

    Download PDF (4045K)
  • Masayuki SHIMODA, Youki SADA, Ryosuke KURAMOCHI, Shimpei SATO, Hiroki ...
    Article type: PAPER
    Subject area: Computer System
    2020 Volume E103.D Issue 12 Pages 2463-2470
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    In the realization of convolutional neural networks (CNNs) on resource-constrained embedded hardware, the memory footprint of the weights is one of the primary problems. Pruning techniques are often used to reduce the number of weights. However, the distribution of the nonzero weights is highly skewed, which makes it difficult to exploit the underlying parallelism. To address this problem, we present SENTEI*, filter-wise pruning with distillation, to realize a hardware-aware network architecture with comparable accuracy. The filter-wise pruning eliminates weights such that each filter has the same number of nonzero weights, and retraining with distillation retains the accuracy. Further, we develop a zero-weight-skipping inter-layer pipelined accelerator on an FPGA. The equalization enables inter-filter parallelism, where a processing block for a layer executes filters concurrently with a straightforward architecture. Our evaluation on semantic-segmentation tasks indicates that the resulting mIoU decreased by only 0.4 points. Additionally, the speedup and power efficiency of our FPGA implementation were 33.2× and 87.9× higher than those of a mobile GPU. Therefore, our technique realizes a hardware-aware network with comparable accuracy.
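
    Filter-wise pruning can be sketched in a few lines (illustrative; the paper additionally retrains with distillation to recover accuracy): every filter keeps exactly its k largest-magnitude weights, so all filters end up with the same nonzero count and can be scheduled uniformly in hardware.

```python
# Keep the k largest-magnitude weights of one filter, zeroing the rest.
def prune_filter(filter_weights, k):
    keep = set(sorted(range(len(filter_weights)),
                      key=lambda i: abs(filter_weights[i]), reverse=True)[:k])
    return [w if i in keep else 0.0 for i, w in enumerate(filter_weights)]

# Apply the same budget k to every filter of a layer, equalizing sparsity.
def prune_layer(filters, k):
    return [prune_filter(f, k) for f in filters]
```

    The equal per-filter budget is the hardware-aware part: a processing block can execute all filters in lockstep because each contributes the same number of multiply-accumulates.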

    Download PDF (1771K)
  • Ryuta KAWANO, Ryota YASUDO, Hiroki MATSUTANI, Michihiro KOIBUCHI, Hide ...
    Article type: PAPER
    Subject area: Computer System
    2020 Volume E103.D Issue 12 Pages 2471-2479
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Network throughput has become an important issue for big-data analysis on Warehouse-Scale Computing (WSC) systems. It has been reported that randomly connected inter-switch networks can increase network throughput. For irregular networks, a multi-path routing method called k-shortest path routing is conventionally utilized. However, it cannot efficiently exploit longer-than-shortest paths that could serve as detours to avoid bottlenecks. In this work, a novel routing method called k-optimized path routing is proposed to achieve high throughput on irregular networks. We introduce a heuristic that selects detour paths avoiding bottlenecks in the network, improving the average-case network throughput. Network-simulation results show that the proposed k-optimized path routing improves the saturation throughput by up to 18.2% compared with conventional k-shortest path routing. Moreover, it reduces the computation time required for optimization to as little as 1/2760 of that of our previously proposed method.

    Download PDF (1064K)
  • Yao HU, Michihiro KOIBUCHI
    Article type: PAPER
    Subject area: Computer System
    2020 Volume E103.D Issue 12 Pages 2480-2493
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Due to recent technological progress in big-data processing, many applications exhibit irregular or unpredictable communication patterns among compute nodes in high-performance computing (HPC) systems. Traditional communication infrastructures, e.g., torus or fat-tree interconnection networks, may not match these newly emerging applications well. There are already many communication-efficient application mapping algorithms for these typical non-random network topologies, which use nearby compute nodes to reduce network distances. However, for the unpredictable communication patterns above, it is difficult to map applications efficiently onto non-random network topologies. In this context, we recommend random network topologies as the communication infrastructure; they have drawn increasing attention for HPC interconnects because of their small diameter and average shortest path length (ASPL). We make a comparative study of application mapping performance on non-random and random network topologies. We propose using topology embedding metrics, i.e., diameter and ASPL, and list several diameter/ASPL-based application mapping algorithms to compare their job scheduling performance, assuming that the communication pattern of each application is unpredictable to the computing system. Evaluation with a large compound application workload shows that, compared with non-random topologies, random topologies can reduce the average turnaround time by up to 39.3% with a random connected mapping method and by up to 72.1% with a diameter/ASPL-based mapping algorithm. Moreover, compared with the baseline topology mapping method, the proposed diameter/ASPL-based topology mapping strategy can reduce the makespan by up to 48.0% and the average turnaround time by up to 78.1%, and improve system utilization by up to 1.9x over random topologies.
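
    The two embedding metrics are standard graph quantities and can be sketched directly (illustrative; the paper applies them to much larger interconnect graphs): run a BFS from every node, then take the maximum distance (diameter) and the mean over all ordered pairs (ASPL).

```python
from collections import deque

# All-pairs BFS distances on an unweighted adjacency-list graph.
def bfs_dists(adj, src):
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

# Diameter and average shortest path length (ASPL) of a connected graph.
def diameter_aspl(adj):
    total, count, diam = 0, 0, 0
    for s in adj:
        for t, d in bfs_dists(adj, s).items():
            if t != s:
                total, count, diam = total + d, count + 1, max(diam, d)
    return diam, total / count
```

    Random topologies tend to score low on both metrics for a given degree, which is the basis of the mapping algorithms compared in the paper.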

    Download PDF (1801K)
  • Hiromu MIYAZAKI, Takuto KANAMORI, Md Ashraful ISLAM, Kenji KISE
    Article type: PAPER
    Subject area: Computer System
    2020 Volume E103.D Issue 12 Pages 2494-2503
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    RISC-V is a RISC-based, open, and royalty-free instruction set architecture that has been developed since 2010 and can be used for cost-effective soft processors on FPGAs. The basic 32-bit integer instruction set of RISC-V is defined as RV32I, which is sufficient to support an operating system environment and suits embedded systems. In this paper, we propose an optimized RV32I soft processor named RVCoreP that adopts five-stage pipelining. Three effective methods are applied to the processor to improve the operating frequency: instruction fetch unit optimization, ALU optimization, and data memory optimization. We implement RVCoreP in Verilog HDL and verify its behavior using Verilog simulation and an actual Xilinx Artix-7 FPGA board. We evaluate IPC (instructions per cycle), operating frequency, hardware resource utilization, and processor performance. The evaluation results show that RVCoreP achieves a 30.0% performance improvement compared with VexRiscv, a high-performance open-source RV32I processor selected from related work.

    Download PDF (992K)
  • Elsayed A. ELSAYED, Kenji KISE
    Article type: PAPER
    Subject area: Computer System
    2020 Volume E103.D Issue 12 Pages 2504-2517
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Data sorting is an important operation in computer science. It is extensively used in applications such as databases and searching. While high-performance sorting accelerators are in demand, it is very important to pay attention to the hardware resources used by such high-performance sorters. In this paper, we propose three FPGA-based architectures that accelerate sorting using the merge sorting algorithm. We call our proposals WMS: Wide Merge Sorter, EHMS: Efficient Hardware Merge Sorter, and EHMSP: Efficient Hardware Merge Sorter Plus. We target the Virtex UltraScale FPGA device. Evaluation results show that our proposed merge sorters are both high-performance and cost-effective: while using many fewer hardware resources, they achieve higher performance than the state-of-the-art. For instance, when 256 sorted records are produced per cycle, implementation results for the proposed EHMS show a significant reduction in the required numbers of flip-flops (FFs) and look-up tables (LUTs), to about 66% and 79% of the state-of-the-art merge sorter, respectively. Moreover, while requiring fewer hardware resources, EHMS achieves about 1.4x higher throughput than the state-of-the-art merge sorter. For the same number of produced records, the proposed WMS also achieves about 1.6x throughput improvement over the state-of-the-art while requiring about 81% of the FFs and 76% of the LUTs needed by the state-of-the-art sorter.

    Download PDF (1873K)
  • Shougo INOUE, Satoshi FUJITA
    Article type: PAPER
    Subject area: Computer System
    2020 Volume E103.D Issue 12 Pages 2518-2524
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    In this paper, we consider the collaborative editing of two-dimensional (2D) data such as handwritten letters and illustrations. In contrast to the editing of 1D data, which is generally realized by combinations of insertions and deletions of characters, the overriding of strokes can have a specific meaning when editing 2D data. In other words, the appearance of the resulting picture depends on the order in which strokes are reflected onto the shared canvas, in addition to the absolute coordinates of the strokes. We propose a Peer-to-Peer (P2P) collaborative drawing system consisting of several nodes with replica canvases, in which consistency among the replica canvases is maintained through the data channel of WebRTC. The system supports three editing modes concerning the reflection order of strokes generated by different users. Experimental results indicate that the proposed system realizes a short latency of around 120 ms, half that of a cloud-based system implemented with Firebase Realtime Database. In addition, it realizes smooth drawing of pictures on remote canvases with a refresh rate of 12 fps.

    Download PDF (443K)
  • SeokHwan KONG, Saikia DIPJYOTI, JaiYong LEE
    Article type: LETTER
    Subject area: Computer System
    2020 Volume E103.D Issue 12 Pages 2525-2527
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    With the spread of smart cities through 5G and the development of IoT devices, the number of services requiring firm assurance of high capacity and ultra-low-delay quality in various forms is increasing. However, the continuous growth of large data makes it difficult for a centralized cloud to ensure quality of service. To address this, a variety of distributed application architectures, such as MEC (Mobile/Multi-access Edge Computing), have been investigated. However, vendor-dependent MEC technology based on VNFs (Virtual Network Functions) has performance and scalability issues when deploying a variety of 5G-based services. This paper proposes PRISM-MECR, an SDN (Software Defined Network) based hardware-accelerated MEC router using a P4 [3] programmable chip, to improve forwarding performance while minimizing the load on the host CPU cores in charge of forwarding.

    Download PDF (489K)
  • Yasuhiro NAKAHARA, Masato KIYAMA, Motoki AMAGASAKI, Masahiro IIDA
    Article type: LETTER
    Subject area: Computer System
    2020 Volume E103.D Issue 12 Pages 2528-2529
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Quantization is an important technique for implementing convolutional neural networks on edge devices. Quantization often requires retraining, but retraining cannot always be applied because of issues such as cost or privacy. In such cases, it is important to know the numerical precision required to maintain accuracy. We accurately simulate calculations on hardware and precisely measure the relationship between accuracy and numerical precision.
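
    One common way to simulate reduced precision in software, sketched below, is fixed-point rounding: values are scaled by 2^f, rounded, and scaled back, where f is the number of fractional bits. This is an illustrative stand-in, not the authors' hardware-accurate simulator.

```python
# Simulate fixed-point quantization with `frac_bits` fractional bits.
def quantize(x, frac_bits):
    scale = 1 << frac_bits              # 2 ** frac_bits
    return round(x * scale) / scale

# Worst-case quantization error over a set of weights, one probe of the
# precision/accuracy trade-off.
def max_error(weights, frac_bits):
    return max(abs(w - quantize(w, frac_bits)) for w in weights)
```

    Sweeping `frac_bits` and re-evaluating the network's accuracy at each setting yields the precision/accuracy curve the letter is concerned with.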

    Download PDF (238K)
Special Section on Software Agent and Its Applications
  • Yuichi SEI
    2020 Volume E103.D Issue 12 Pages 2530
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS
    Download PDF (128K)
  • Yuta HOSOKAWA, Katsuhide FUJITA
    Article type: PAPER
    2020 Volume E103.D Issue 12 Pages 2531-2539
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    In recent years, agreement technologies have garnered interest in the field of multi-agent systems. Automated negotiation is one of the agreement technologies, in which agents negotiate with each other to reach an agreement that resolves conflicts between their preferences. Although most agents keep their own preferences private, it is necessary to estimate the opponent's preferences to obtain a better agreement. Therefore, opponent modeling is one of the most important elements of an automated negotiation strategy. Frequency models are widely used for opponent modeling because of their robustness against various types of strategy and their ease of implementation. However, existing frequency models do not consider the opponent's proposal speed or the transition of offers. This study proposes a novel frequency model that considers the opponent's behavior using two main elements: an offer ratio and a weighting function. The offer ratio stabilizes the model against changes in the opponent's offering speed, whereas the weighting function takes the opponent's concessions into account. Two experiments show that the proposed model is more accurate than other frequency models. Additionally, we find that an agent with the proposed model achieves significantly higher utility values in negotiations.
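
    A bare-bones frequency model can be sketched as follows (illustrative; the paper's contribution is the offer ratio and time-based weighting layered on top of this baseline): values the opponent offers often are assumed to be the values it prefers.

```python
from collections import Counter

# Estimate the opponent's per-issue value utilities from its offer history.
# offers: list of dicts mapping issue name -> offered value.
def estimate_value_utilities(offers):
    model = {}
    for issue in offers[0]:
        counts = Counter(o[issue] for o in offers)
        top = max(counts.values())
        # normalize so the most frequent value gets utility 1.0
        model[issue] = {v: c / top for v, c in counts.items()}
    return model
```

    The paper's offer ratio corrects these counts for how fast the opponent proposes, and its weighting function discounts late, concessionary offers; both refinements are omitted here.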

    Download PDF (1772K)
  • Ryohei KAWATA, Katsuhide FUJITA
    Article type: PAPER
    2020 Volume E103.D Issue 12 Pages 2540-2548
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Multi-time negotiation, which repeats negotiations many times under the same conditions, is an important class of automated negotiation. We propose a meta-strategy that selects an agent's individual negotiation strategy for multi-time negotiation. Because the performance of negotiating agents depends on situational parameters, such as the negotiation domain and the opponent, a suitable and effective individual strategy should be selected according to the negotiation situation. However, most existing agents negotiate based on only one negotiation policy: one bidding strategy, one acceptance strategy, and one opponent-modeling method. Although such agents negotiate effectively in most situations, they do not work well in particular situations, and their utilities decrease. The proposed meta-strategy provides an effective negotiation strategy for the situation at the beginning of the negotiation. We model the meta-strategy as a multi-armed bandit problem that regards an individual negotiation strategy as a slot machine and the agent's utility as a reward. We implement the meta-strategy as negotiating agents that use existing effective agents as the individual strategies. The experimental results demonstrate the effectiveness of our meta-strategy under various negotiation conditions. Additionally, the results indicate that the individual utilities of negotiating agents are influenced by the opponent's strategy, the opponent's profile, and the agent's own profile.
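
    The bandit formulation can be sketched with UCB1 (an illustrative choice; the abstract does not name the bandit algorithm used): each arm is an individual negotiation strategy, and the reward after each negotiation is the utility obtained.

```python
import math

# UCB1 bandit: arm = individual negotiation strategy, reward = utility.
class UCB1:
    def __init__(self, n_arms):
        self.counts = [0] * n_arms       # plays per arm
        self.values = [0.0] * n_arms     # running mean reward per arm

    def select(self):
        for a, c in enumerate(self.counts):
            if c == 0:                   # play every arm once first
                return a
        t = sum(self.counts)
        # exploit the mean plus an exploration bonus that shrinks with plays
        return max(range(len(self.counts)),
                   key=lambda a: self.values[a]
                   + math.sqrt(2 * math.log(t) / self.counts[a]))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

    Over repeated negotiations under the same conditions, the bandit converges on the individual strategy best suited to the current domain and opponent.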

    Download PDF (1581K)
Regular Section
  • Xiaoxuan GUO, Renxi GONG, Haibo BAO, Zhenkun LU
    Article type: PAPER
    Subject area: Fundamentals of Information Systems
    2020 Volume E103.D Issue 12 Pages 2549-2558
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    It is well known that large-scale integration of wind power into the power system affects the economic and environmental objectives of power generation scheduling, and also brings new challenges to traditional deterministic power generation scheduling because of the intermittency and randomness of wind power. To deal with these problems, a multiobjective optimization dispatch method for wind-thermal power systems is proposed. The method can be described as follows. First, a multiobjective interval power generation scheduling model of a wind-thermal power system is established by describing the wind speed at the wind farm as an interval variable and choosing the minimization of the fuel cost and the pollution-gas emission cost of the thermal power units as the objective functions. Then, the optimistic and pessimistic Pareto frontiers of the multiobjective interval power generation schedule are obtained by using an improved normal boundary intersection (NBI) method combined with a bilevel optimization method to solve the model. Finally, optimistic and pessimistic compromise solutions are determined by a distance evaluation method. Calculation results on a 16-unit 174-bus system show that the proposed method obtains uniform optimistic and pessimistic Pareto frontiers and that the impact of wind-speed interval uncertainty on the economic and environmental indicators can be quantified. In addition, it is verified that the Pareto frontier of the actual scenario lies between the optimistic and pessimistic Pareto frontiers, and the influence of different wind-power penetration levels on the optimistic and pessimistic Pareto frontiers is analyzed.

    Download PDF (1675K)
  • Ho-Young KIM, Seong-Won LEE
    Article type: PAPER
    Subject area: Software System
    2020 Volume E103.D Issue 12 Pages 2559-2567
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    In an Internet of Things (IoT) system using an energy-harvesting device and a secondary (2nd) battery, power management that disregards the age of the 2nd battery shortens the lifespan of the entire system. In this paper, we propose a scheme that extends the lifetime of an energy-harvesting-based IoT system equipped with a lithium 2nd battery. The proposed scheme includes several policies: using a supercapacitor as the primary energy storage, limiting the charging level according to the predicted harvested energy, swinging the energy level around the minimum-stress state-of-charge (SOC) level, and delaying the charge start time. Experiments with natural solar-energy measurements based on a battery-aging approximation model show that the proposed method can extend the operational lifetime of an existing IoT system from less than one and a half years to more than four years.

    Download PDF (1336K)
  • Akkharawoot TAKHOM, Sasiporn USANAVASIN, Thepchai SUPNITHI, Prachya BO ...
    Article type: PAPER
    Subject area: Software Engineering
    2020 Volume E103.D Issue 12 Pages 2568-2577
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    An ontology describes the concepts and relations in a specific domain of knowledge and is important for knowledge representation and knowledge sharing. In the past few years, several tools have been introduced for ontology modeling and editing. Designing and developing an ontology is a challenging task whose challenges are quite similar to those of software development, as it requires many collaborative activities from many stakeholders (e.g., domain experts, knowledge engineers, application users) throughout the development cycle. Most existing tools do not provide collaborative features that help stakeholders work together effectively. In addition, there is a lack of standard-process adoption for ontology development tasks. Thus, in this work, we incorporated the ontology development process into the Scrum process, which is used as a process standard in software engineering. Based on Scrum, we can perform standard agile development of ontologies, which can shorten the development cycle and respond to changes better and faster. To support this idea, we propose a Scrum Ontology Development Framework, an online collaborative framework for agile ontology design and development. Each Scrum-based ontology development process is supported by different services in our framework, aiming to promote collaborative activities among different roles of stakeholders. In addition to services such as visual ontology modeling and editing, we provide three more important features: 1) concept/relation misunderstanding diagnosis, 2) cross-domain concept detection, and 3) concept classification. All these features allow stakeholders to share their understanding and collaboratively discuss how to improve the quality of domain ontologies through community consensus.

    Download PDF (3154K)
  • Ying JI, Yu WANG, Jien KATO, Kensaku MORI
    Article type: PAPER
    Subject area: Data Engineering, Web Information Systems
    2020 Volume E103.D Issue 12 Pages 2578-2589
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    With the rapid development of multimedia, violent video can easily be accessed in games, movies, websites, and so on. Identifying violent videos and rating their violence extent is of great importance to media filtering and child protection. Many previous studies address only violence scene detection and violent action recognition, and the violence rating problem remains unsolved. In this paper, we present a novel video-level rating prediction method to estimate violence extent automatically. It has two main characteristics: (1) a two-stream network is fine-tuned to construct effective representations of violent videos; (2) a violence rating prediction machine is designed to learn the strength relationships among different videos. Furthermore, we present a novel violent video dataset with a total of 1,930 human-involved violent videos designed for violence rating analysis. Each video is annotated with 6 fine-grained objective attributes that are considered closely related to violence extent. The ground truth of the violence rating is obtained by the pairwise comparison method. The dataset is evaluated for both stability and convergence. Experimental results on this dataset demonstrate the effectiveness of our method compared with state-of-the-art classification methods.

    Download PDF (3525K)
  • Hayato YAMAKI, Hiroaki NISHI, Shinobu MIWA, Hiroki HONDA
    Article type: PAPER
    Subject area: Information Network
    2020 Volume E103.D Issue 12 Pages 2590-2599
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    We propose a technique to reduce compulsory misses of the packet processing cache (PPC), which largely affect both the throughput and energy of core routers. Rather than prefetching data, our technique, called response prediction cache (RPC), speculatively stores predicted data in the PPC without additional accesses to the low-throughput and power-consuming memory (i.e., TCAM). RPC predicts the data related to a response flow at the arrival of the corresponding request flow, based on the request-response model of Internet communications. Our experimental results with 11 real-network traces show that RPC can reduce the PPC miss rate by 13.4% upstream and 47.6% downstream on average, assuming a three-layer PPC. Moreover, we extend RPC to adaptive RPC (A-RPC), which selects whether to use RPC in each direction within a core router for further reduction in PPC misses. Finally, we show that A-RPC achieves 1.38x table-lookup throughput with 74% of the energy consumption per packet compared with a conventional PPC.
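
    The speculative insertion can be sketched abstractly (illustrative only; the real PPC caches full table-lookup results keyed on header fields, and the prediction logic is more involved): when a request flow's entry is installed, an entry for the reversed 5-tuple, i.e., the expected response flow, is installed alongside it, so the response's first packet hits the cache instead of the TCAM.

```python
# Response prediction sketch keyed on a simplified (src, sport, dst, dport)
# flow tuple. `cache` stands in for the packet processing cache.
cache = {}

def reverse_flow(flow):
    src, sport, dst, dport = flow
    return (dst, dport, src, sport)     # predicted response flow

def on_request(flow, result, predicted_result):
    cache[flow] = result
    # speculatively install the response-flow entry; setdefault avoids
    # clobbering an entry already learned from real traffic
    cache.setdefault(reverse_flow(flow), predicted_result)
```

    The prediction avoids a TCAM access entirely when it is correct; a wrong prediction is no worse than the compulsory miss that would have occurred anyway.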

    Download PDF (1379K)
  • Sanghun CHOI, Yichen AN, Iwao SASASE
    Article type: PAPER
    Subject area: Information Network
    2020 Volume E103.D Issue 12 Pages 2600-2610
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Flooding DDoS attacks are a serious problem these days. To detect them, both survival approaches and mitigation approaches have been investigated. Since the survival approach places a burden on the victims, the mitigation approach is mainly studied. As mitigation approaches, conventional schemes using Bloom filters, machine learning, and pattern analysis have been investigated to detect flooding DDoS attacks. However, those schemes cannot ensure high accuracy (ACC), a high true positive rate (TPR), and a low false positive rate (FPR) at the same time. In addition, their data size and calculation time are large. Moreover, their performance degrades under fluctuating attack packets per second (pps). To detect flooding DDoS attacks effectively, we propose a lightweight detection scheme using a Bloom filter. To ensure high accuracy, a high true positive rate, and a low false positive rate, the dec-all (decrement-all) operation and the checkpoint in the Bloom filter are flexibly adjusted according to the fluctuating pps. Since we consider only the IP address, all kinds of flooding attacks can be detected without a blacklist or whitelist. Moreover, recognizing an attack involves little computational complexity. Computer simulations with the datasets show that our scheme achieves an accuracy of 97.5%, with a true positive rate of 97.8% and a false positive rate of 6.3%. The data size for processing is as small as 280 bytes. Furthermore, our scheme can detect a flooding DDoS attack within 11.1 sec of calculation time.
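    The counting-filter idea behind such a scheme can be sketched roughly as follows. This is an illustrative toy, not the paper's implementation: the class name, sizes, hash construction, and threshold are all assumptions; only the "count per source IP, then periodically decrement every counter (dec-all) so that only sustained high-rate sources stay above a threshold" mechanism follows the abstract.

    ```python
    import hashlib

    class CountingBloomFilter:
        """Counting Bloom filter over source IPs (illustrative sketch)."""

        def __init__(self, size=1024, num_hashes=3):
            self.size = size
            self.num_hashes = num_hashes
            self.counters = [0] * size

        def _indexes(self, item):
            # Derive num_hashes indexes from one digest (double-hashing style).
            digest = hashlib.sha256(item.encode()).digest()
            h1 = int.from_bytes(digest[:8], "big")
            h2 = int.from_bytes(digest[8:16], "big")
            return [(h1 + i * h2) % self.size for i in range(self.num_hashes)]

        def add(self, item):
            for i in self._indexes(item):
                self.counters[i] += 1

        def count(self, item):
            # Estimated count = minimum over the item's counters.
            return min(self.counters[i] for i in self._indexes(item))

        def dec_all(self, amount=1):
            # "Decrement-all": decay every counter at a checkpoint so only
            # sustained high-rate sources stay above the detection threshold.
            self.counters = [max(0, c - amount) for c in self.counters]

    bf = CountingBloomFilter()
    for _ in range(100):              # a flooding source sends many packets
        bf.add("10.0.0.1")
    bf.add("192.168.0.7")             # a benign source sends one packet
    bf.dec_all(5)                     # checkpoint: decay all counters
    print(bf.count("10.0.0.1") > 50)  # True: sustained flood stays above threshold
    print(bf.count("192.168.0.7"))    # almost surely 0: benign traffic decays away
    ```

    In the paper, the decrement amount and checkpoint interval are adapted to the observed pps; here they are fixed constants for brevity.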

    Download PDF (1724K)
  • Haitao XIE, Qingtao FAN, Qian XIAO
    Article type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2020 Volume E103.D Issue 12 Pages 2611-2619
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Nowadays, recommender systems (RS) keep drawing attention from academia, and collaborative filtering (CF) is the most successful technique for building RS. To overcome the inherent limitation of CF, known as data sparsity, various solutions have been proposed to incorporate additional social information, such as trust networks, into the recommendation process. However, existing methods struggle with multi-source data integration (i.e., the fusion of social information and ratings), which is the basis for the similarity calculation of user preferences. To this end, we propose a social collaborative filtering method based on novel trust metrics. Firstly, we use graph convolutional networks (GCNs) to learn the associations between social information and user ratings while considering the underlying social network structures. Secondly, we measure the direct-trust values between neighbors by representing multi-source data as user ratings on popular items, and then calculate the indirect-trust values based on trust propagation. Thirdly, we employ all trust values to create a social regularization term in user-item rating matrix factorization in order to avoid overfitting. Experiments on real datasets show that our approach outperforms other state-of-the-art methods in using multi-source data to alleviate data sparsity.
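    The direct/indirect trust distinction can be illustrated with a minimal sketch: direct trust is derived from rating agreement on co-rated items, and indirect trust is propagated multiplicatively through an intermediate neighbor. The similarity formula, the toy data, and the one-hop propagation rule are illustrative assumptions, not the paper's GCN-based metrics.

    ```python
    # Toy user-item ratings; alice and carol share no items, so their
    # trust must be propagated through bob.
    ratings = {
        "alice": {"i1": 5, "i2": 3},
        "bob":   {"i1": 4, "i2": 3},
        "carol": {"i2": 3, "i3": 5},
    }

    def direct_trust(u, v):
        """Trust in [0, 1] from mean absolute rating difference on co-rated items."""
        common = set(ratings[u]) & set(ratings[v])
        if not common:
            return 0.0
        diff = sum(abs(ratings[u][i] - ratings[v][i]) for i in common)
        return 1.0 / (1.0 + diff / len(common))

    def indirect_trust(u, w, users):
        # One-hop propagation: trust decays multiplicatively along the path,
        # and the most trustworthy intermediary is kept.
        return max(direct_trust(u, v) * direct_trust(v, w)
                   for v in users if v not in (u, w))

    print(round(direct_trust("alice", "bob"), 3))            # 0.667
    print(round(indirect_trust("alice", "carol", ratings), 3))  # 0.667 (via bob)
    ```

    In the paper, both kinds of trust values then weight a social regularization term inside matrix factorization; the sketch stops at the trust computation itself.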

    Download PDF (1864K)
  • Kazuki SESHIMO, Akira OTA, Daichi NISHIO, Satoshi YAMANE
    Article type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2020 Volume E103.D Issue 12 Pages 2620-2631
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    In recent years, the use of big data has attracted more attention, and many techniques for data analysis have been proposed. Big data analysis is difficult, however, because such data varies greatly in its regularity. Heterogeneous mixture machine learning is one algorithm for analyzing such data efficiently. In this study, we propose online heterogeneous learning based on an online EM algorithm. Experiments show that this algorithm has higher learning accuracy than that of a conventional method and is practical. The online learning approach will make this algorithm useful in the field of data analysis.
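    The general shape of an online EM algorithm can be sketched for the simplest case, a 1-D two-component Gaussian mixture: each incoming sample triggers an E-step (responsibilities) and a stochastic update of running sufficient statistics with a decaying step size. This is a generic stepwise-EM sketch under assumed initialization and step-size schedule, not the paper's heterogeneous-mixture algorithm.

    ```python
    import math
    import random

    random.seed(0)
    # Stream of samples from two well-separated Gaussians.
    data = [random.gauss(0.0, 1.0) for _ in range(2000)] + \
           [random.gauss(5.0, 1.0) for _ in range(2000)]
    random.shuffle(data)

    w = [0.5, 0.5]            # mixture weights
    mu = [-1.0, 1.0]          # means (rough initial guess)
    var = [1.0, 1.0]          # variances
    # Running sufficient statistics, consistent with the initial parameters.
    s0 = [0.5, 0.5]; s1 = [-0.5, 0.5]; s2 = [1.0, 1.0]

    def pdf(x, m, v):
        return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

    for t, x in enumerate(data, start=1):
        eta = (t + 2) ** -0.7                     # decaying step size
        p = [w[k] * pdf(x, mu[k], var[k]) + 1e-12 for k in range(2)]
        z = sum(p)
        r = [pk / z for pk in p]                  # E-step: responsibilities
        for k in range(2):                        # online M-step via statistics
            s0[k] = (1 - eta) * s0[k] + eta * r[k]
            s1[k] = (1 - eta) * s1[k] + eta * r[k] * x
            s2[k] = (1 - eta) * s2[k] + eta * r[k] * x * x
            w[k] = s0[k]
            mu[k] = s1[k] / s0[k]
            var[k] = max(s2[k] / s0[k] - mu[k] ** 2, 1e-3)

    print(sorted(round(m, 1) for m in mu))  # means should land near the true 0 and 5
    ```

    The point of the online variant is that parameters are usable after every sample, without revisiting the stream, which is what makes it attractive for big-data settings.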

    Download PDF (1014K)
  • Hiroyuki OKUDA, Nobuto SUGIE, Tatsuya SUZUKI, Kentaro HARAGUCHI, Zibo ...
    Article type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2020 Volume E103.D Issue 12 Pages 2632-2642
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Path planning and motion control are fundamental components for realizing safe and reliable autonomous driving. The separation of roles between these two components, however, is somewhat obscure because of the strong mathematical interaction between them, which often results in redundant computation in the implementation. One attractive idea for overcoming this redundancy is simultaneous path planning and motion control (SPPMC) based on a model predictive control framework. SPPMC finds the optimal control input considering not only the vehicle dynamics but also various constraints that reflect physical limitations, safety requirements, and so on, to achieve the goal of a given behavior. When driving in real traffic environments, decision making also interacts strongly with planning and control. This is emphasized even more when several tasks are switched depending on context to realize higher-level tasks. This paper presents a basic idea for integrating decision making, path planning, and motion control that can be executed in real time. In particular, lane-changing behavior, together with the decision of its initiation, is selected as the target task. The proposed idea is based on nonlinear model predictive control with appropriate switching of its cost function and constraints. As a result, the decision to initiate, plan, and control the lane-changing behavior is achieved by solving a single optimization problem under several constraints, such as safety. The validity of the proposed method is tested using a vehicle simulator.

    Download PDF (1731K)
  • Liyang ZHANG, Hiroyuki SUZUKI, Akio KOYAMA
    Article type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2020 Volume E103.D Issue 12 Pages 2643-2648
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    In recent years, with growing health awareness, people have paid more and more attention to eating proper meals. Existing research has shown that a proper meal can help prevent lifestyle diseases such as diabetes. In this research, by attaching sensors to tableware, information during a meal can be captured; after processing and analysis, meal information such as the time and sequence of the meal can be obtained. This paper introduces how to use supervised learning and multi-instance learning to handle meal information, and a detailed comparison is made. Three supervised learning algorithms and two multi-instance learning algorithms are used in the experiment. The experimental results show that although the supervised learning algorithms achieve good F-scores, the multi-instance learning algorithms achieve better results not only in accuracy but also in F-score.

    Download PDF (1033K)
  • Kazunori IWATA
    Article type: PAPER
    Subject area: Pattern Recognition
    2020 Volume E103.D Issue 12 Pages 2649-2658
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    The nearest neighbor method is a simple and flexible scheme for the classification of data points in a vector space. It predicts a class label of an unseen data point using a majority rule for the labels of known data points inside a neighborhood of the unseen data point. Because it sometimes achieves good performance even for complicated problems, several derivatives of it have been studied. Among them, the discriminant adaptive nearest neighbor method is particularly worth revisiting to demonstrate its application. The main idea of this method is to adjust the neighbor metric of an unseen data point to the set of known data points before label prediction. It often improves the prediction, provided the neighbor metric is adjusted well. For statistical shape analysis, shape classification attracts attention because it is a vital topic in shape analysis. However, because a shape is generally expressed as a matrix, it is non-trivial to apply the discriminant adaptive nearest neighbor method to shape classification. Thus, in this study, we develop the discriminant adaptive nearest neighbor method to make it slightly more useful in shape classification. To achieve this development, a mixture model and optimization algorithm for shape clustering are incorporated into the method. Furthermore, we describe several helpful techniques for the initial guess of the model parameters in the optimization algorithm. Using several shape datasets, we demonstrated that our method is successful for shape classification.
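    The majority rule at the core of the nearest neighbor method can be shown in a few lines. This sketch uses a fixed Euclidean metric on plain vectors; the discriminant adaptive variant discussed in the paper would additionally reshape this metric around each query point, and extending it to matrix-valued shapes is exactly the non-trivial step the paper addresses. All names and data here are illustrative.

    ```python
    from collections import Counter

    def knn_predict(train, query, k=3):
        """Nearest neighbor majority vote with a fixed Euclidean metric."""
        # Squared Euclidean distance: monotone in distance, so the ranking
        # of neighbors is identical and the square root can be skipped.
        dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
        neighbors = sorted(train, key=lambda p: dist(p[0], query))[:k]
        votes = Counter(label for _, label in neighbors)
        return votes.most_common(1)[0][0]

    train = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"),
             ((1.0, 1.0), "B"), ((0.9, 1.1), "B"), ((1.2, 0.8), "B")]
    print(knn_predict(train, (0.2, 0.1)))   # A
    print(knn_predict(train, (1.0, 0.9)))   # B
    ```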

    Download PDF (638K)
  • Noriyuki MATSUNAGA, Yamato OHTANI, Tatsuya HIRAHARA
    Article type: PAPER
    Subject area: Speech and Hearing
    2020 Volume E103.D Issue 12 Pages 2659-2672
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Deep neural network (DNN)-based speech synthesis became popular in recent years and is expected to soon be widely used in embedded devices and environments with limited computing resources. The key aim of these systems in poor computing environments is to reduce the computational cost of generating speech parameter sequences while maintaining voice quality. However, reducing computational costs is challenging for the two primary conventional DNN-based methods used for modeling speech parameter sequences. In feed-forward neural networks (FFNNs) with maximum likelihood parameter generation (MLPG), the MLPG reconstructs the temporal structure of the speech parameter sequences ignored by FFNNs, but requires additional computational cost proportional to the sequence length. In recurrent neural networks, the recursive structure allows for the generation of speech parameter sequences while considering temporal structure without the MLPG, but increases the computational cost compared to FFNNs. We propose a new approach that enables DNNs to acquire parameters capturing temporal structure by backpropagating the errors of multiple attributes of the temporal sequence via the loss function. This method enables FFNNs to generate speech parameter sequences while considering their temporal structure without the MLPG. We generated the fundamental frequency sequence and the mel-cepstrum sequence with our proposed method and conventional methods, and then synthesized and subjectively evaluated speech from these sequences. The proposed method enables even FFNNs that work on a frame-by-frame basis to generate speech parameter sequences that account for temporal structure, and to generate sequences perceptually superior to those from the conventional methods.

    Download PDF (1246K)
  • Junya KOGUCHI, Shinnosuke TAKAMICHI, Masanori MORISE, Hiroshi SARUWATA ...
    Article type: PAPER
    Subject area: Speech and Hearing
    2020 Volume E103.D Issue 12 Pages 2673-2681
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    We propose a speech analysis-synthesis and deep neural network (DNN)-based text-to-speech (TTS) synthesis framework using Gaussian mixture model (GMM)-based approximation of full-band spectral envelopes. GMMs have excellent properties as acoustic features in statistical parametric speech synthesis. Each Gaussian function of a GMM fits a local resonance of the spectrum. The GMM retains the fine spectral envelope and achieves high controllability of its structure. However, since conventional speech analysis methods (i.e., GMM parameter estimation) have been formulated for narrow-band speech, they degrade the quality of synthetic speech. Moreover, a DNN-based TTS synthesis method using GMM-based approximation has not been formulated, despite its excellent expressive ability. Therefore, we employ peak-picking-based initialization for full-band speech analysis to provide better starting points for the iterative estimation of the GMM parameters. We introduce not only the prediction error of the GMM parameters but also the reconstruction error of the spectral envelopes as objective criteria for training the DNN, and we propose a multi-task learning method that minimizes both errors simultaneously. We also propose a post-filter based on variance scaling of the GMM to enhance the synthetic speech. Experimental results from evaluating our framework indicated that 1) the initialization method of our framework outperformed the conventional one in the quality of analysis-synthesized speech; 2) introducing the reconstruction error in DNN training significantly improved the synthetic speech; and 3) our variance-scaling-based post-filter further improved the synthetic speech.

    Download PDF (1185K)
  • Tomohiro TAKAHASHI, Katsumi KONISHI, Kazunori URUMA, Toshihiro FURUKAW ...
    Article type: PAPER
    Subject area: Image Processing and Video Processing
    2020 Volume E103.D Issue 12 Pages 2682-2692
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    This paper proposes an image inpainting algorithm based on multiple linear models and matrix rank minimization. Several inpainting algorithms have been previously proposed based on the assumption that an image can be modeled using autoregressive (AR) models. However, these algorithms perform poorly when applied to natural photographs because they assume that an image is modeled by a position-invariant linear model with a fixed model order. In order to improve inpainting quality, this work introduces a multiple AR model and proposes an image inpainting algorithm based on multiple matrix rank minimization with sparse regularization. In doing so, a practical algorithm is provided based on the iterative partial matrix shrinkage algorithm, with numerical examples showing the effectiveness of the proposed algorithm.

    Download PDF (6779K)
  • Kangbo SUN, Jie ZHU
    Article type: PAPER
    Subject area: Image Recognition, Computer Vision
    2020 Volume E103.D Issue 12 Pages 2693-2700
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    The location and feature representation of an object's parts play key roles in fine-grained visual recognition. To improve the final recognition accuracy without any bounding box/part annotations, many studies adopt object location networks that propose bounding boxes/part regions using only category labels, and then crop the images into partial images to help the classification network make the final decision. In our work, to propose more informative partial images and effectively extract discriminative features from the original and partial images, we propose a two-stage approach that fuses original features and partial features by evaluating and ranking the information content of partial images. Experimental results show that our proposed approach achieves excellent performance on two benchmark datasets, which demonstrates its effectiveness.

    Download PDF (1011K)
  • Manabu OKAWA
    Article type: PAPER
    Subject area: Image Recognition, Computer Vision
    2020 Volume E103.D Issue 12 Pages 2701-2708
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    In this paper, we propose a novel single-template strategy based on a mean template set and locally/globally weighted dynamic time warping (LG-DTW) to improve the performance of online signature verification. Specifically, in the enrollment phase, we implement a time series averaging method, Euclidean barycenter-based DTW barycenter averaging, to obtain a mean template set that accounts for intra-user variability among reference samples. Then, we acquire a local weighting estimate by considering a local stability sequence obtained by analyzing multiple matching points of an optimal match between the mean template and reference sets. Thereafter, we derive a global weighting estimate based on the variable importance estimated by gradient boosting. Finally, in the verification phase, we apply both local and global weighting methods to acquire a discriminative LG-DTW distance between the mean template set and a query sample. Experimental results obtained on the public SVC2004 Task2 and MCYT-100 signature datasets confirm the effectiveness of the proposed method for online signature verification.
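    The DTW distance underlying this method can be sketched in its classic unweighted form; the paper's contribution is the local/global weighting layered on top of this recurrence, which the sketch omits. The function name and toy sequences are illustrative.

    ```python
    def dtw_distance(a, b):
        """Classic dynamic time warping distance between two 1-D sequences."""
        inf = float("inf")
        n, m = len(a), len(b)
        # d[i][j] = cost of best alignment of a[:i] with b[:j]
        d = [[inf] * (m + 1) for _ in range(n + 1)]
        d[0][0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(a[i - 1] - b[j - 1])
                d[i][j] = cost + min(d[i - 1][j],      # a[i-1] repeats
                                     d[i][j - 1],      # b[j-1] repeats
                                     d[i - 1][j - 1])  # one-to-one match
        return d[n][m]

    # Warping absorbs the repeated sample, so the distance is zero:
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))   # 0.0
    print(dtw_distance([1, 2, 3], [1, 2, 4]))      # 1.0
    ```

    A weighted variant would scale `cost` per matching point (local stability) and per feature channel (global importance) before the minimization, which is the role of the LG weights in the paper.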

    Download PDF (1666K)
  • Shi QIU, German M. DANIEL, Katsuro INOUE
    Article type: LETTER
    Subject area: Software Engineering
    2020 Volume E103.D Issue 12 Pages 2709-2712
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    For Free and Open Source Software (FOSS), identifying the copyright notices is important. However, both the collaborative manner of FOSS project development and the large number of source files increase its difficulty. In this paper, we aim at automatically identifying the copyright notices in source files based on machine learning techniques. The evaluation experiment shows that our method outperforms FOSSology, the only existing method based on regular expression.

    Download PDF (70K)
  • Guangyuan LIU, Daokun CHEN
    Article type: LETTER
    Subject area: Information Network
    2020 Volume E103.D Issue 12 Pages 2713-2716
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    How to restore a virtual network after a substrate network failure (e.g., a link cut) is one of the key challenges of network virtualization. Traditional virtual network recovery (VNR) methods are mostly based on centralized control. However, if multiple virtual networks fail at the same time, their recovery processes are usually queued according to a specific priority, which may increase the average waiting time of users. In this letter, we study a distributed virtual network recovery (DVNR) method to improve virtual network recovery efficiency. We establish an exclusive virtual machine (VM) for each virtual network and process the recovery requests of multiple virtual networks in parallel. Simulation results show that the proposed DVNR method can obtain a recovery success rate close to that of the centralized VNR method while yielding ~70% less average recovery time.

    Download PDF (330K)
  • Mengmeng LI, Xiaoguang REN, Yanzhen WANG, Wei QIN, Yi LIU
    Article type: LETTER
    Subject area: Artificial Intelligence, Data Mining
    2020 Volume E103.D Issue 12 Pages 2717-2720
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Feature selection is important for learning algorithms, and it is still an open problem. The antlion optimizer is an excellent nature-inspired method, but it does not work well for feature selection. This paper proposes a hybrid approach called the Ant-Antlion Optimizer, which combines the smart trapping behavior of the antlion optimizer with the powerful search movement of ant colony optimization. A mutation operator is also adopted to strengthen exploration ability. Comprehensive experiments on binary classification problems show that the proposed algorithm is superior to other state-of-the-art methods on four performance indicators.

    Download PDF (80K)
  • Farzin MATIN, Yoosoo JEONG, Hanhoon PARK
    Article type: LETTER
    Subject area: Image Processing and Video Processing
    2020 Volume E103.D Issue 12 Pages 2721-2724
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Multiscale retinex is one of the most popular image enhancement methods. However, its control parameters, such as the Gaussian kernel sizes, gain, and offset, must be tuned carefully according to the image contents. In this letter, we propose a new method that optimizes the parameters using particle swarm optimization and a multi-objective function. The method iteratively evaluates the visual quality (i.e., brightness, contrast, and colorfulness) of the enhanced image using the multi-objective function while subtly adjusting the parameters. Experimental results show that the proposed method achieves better image quality, both qualitatively and quantitatively, than other image enhancement methods.
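    The multiscale retinex transform itself, whose parameters the letter optimizes, is a weighted sum over scales of log(I) minus log of a Gaussian-blurred I. A minimal 1-D sketch with hand-picked scales and weights (the very parameters the letter tunes automatically) looks like this; the +1.0 offset inside the logs is an illustrative guard against log(0), not part of the letter's formulation.

    ```python
    import math

    def gaussian_kernel(sigma, radius):
        k = [math.exp(-(x * x) / (2 * sigma * sigma))
             for x in range(-radius, radius + 1)]
        s = sum(k)
        return [v / s for v in k]

    def blur(signal, kernel):
        """1-D convolution with edge clamping."""
        r = len(kernel) // 2
        out = []
        for i in range(len(signal)):
            acc = 0.0
            for j, w in enumerate(kernel):
                idx = min(max(i + j - r, 0), len(signal) - 1)
                acc += w * signal[idx]
            out.append(acc)
        return out

    def multiscale_retinex(signal, sigmas=(2.0, 8.0), weights=(0.5, 0.5)):
        # MSR(x) = sum_s w_s * (log I(x) - log (G_s * I)(x))
        out = [0.0] * len(signal)
        for sigma, w in zip(sigmas, weights):
            smooth = blur(signal, gaussian_kernel(sigma, int(3 * sigma)))
            for i, (v, s) in enumerate(zip(signal, smooth)):
                out[i] += w * (math.log(v + 1.0) - math.log(s + 1.0))
        return out

    # A dim-to-bright step edge: retinex amplifies local contrast at the edge.
    signal = [10.0] * 20 + [200.0] * 20
    enhanced = multiscale_retinex(signal)
    print(enhanced[19] < 0 < enhanced[20])   # True
    ```

    Brightness, contrast, and colorfulness of the enhanced output are exactly what the letter's multi-objective function scores while the swarm adjusts `sigmas`, gain, and offset.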

    Download PDF (2684K)
  • Yubo LIU, Yangting LAI, Jianyong CHEN, Lingyu LIANG, Qiaoming DENG
    Article type: LETTER
    Subject area: Computer Graphics
    2020 Volume E103.D Issue 12 Pages 2725-2729
    Published: December 01, 2020
    Released on J-STAGE: December 01, 2020
    JOURNAL FREE ACCESS

    Computer-aided design (CAD) technology is widely used for architectural design, but current CAD tools still require high-level design specifications from humans. It would be significant to construct an intelligent CAD system allowing automatic architectural layout parsing (AutoALP), which generates candidate designs or predicts architectural attributes without much user intervention. To tackle these problems, many learning-based methods have been proposed, and benchmark datasets have become one of the essential elements of data-driven AutoALP. This paper proposes a new dataset called SCUT-AutoALP for multi-paradigm applications. It contains two subsets: 1) Subset-I, for floor plan design, containing 300 residential floor plan images with layout, boundary, and attribute labels; 2) Subset-II, for urban plan design, containing 302 campus plan images with layout, boundary, and attribute labels. We analyzed the samples and labels statistically, and evaluated SCUT-AutoALP on different layout parsing tasks for floor plans/urban plans based on conditional generative adversarial network (cGAN) models. The results verify the effectiveness and indicate potential applications of SCUT-AutoALP. The dataset is available at https://github.com/designfuturelab702/SCUT-AutoALP-Database-Release.

    Download PDF (2150K)