IEICE Transactions on Information and Systems

Special Section on Picture Coding and Image Media Processing

FOREWORD

Toshiaki FUJII

2020 Volume E103.D Issue 10 Pages 2035
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020PCF0001

JOURNAL FREE ACCESS

Local Riesz Pyramid for Faster Phase-Based Video Magnification

Shoichiro TAKEDA, Megumi ISOGAI, Shinya SHIMIZU, Hideaki KIMATA

Article type: PAPER
2020 Volume E103.D Issue 10 Pages 2036-2046
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020PCP0006

JOURNAL FREE ACCESS

Phase-based video magnification methods can magnify and reveal subtle motion changes invisible to the naked eye. In these methods, each image frame in a video is decomposed into an image pyramid, and subtle motion changes are then detected as local phase changes with arbitrary orientations at each pixel and each pyramid level. One problem with this process is a long computational time to calculate the local phase changes, which makes high-speed processing of video magnification difficult. Recently, a decomposition technique called the Riesz pyramid has been proposed that detects only local phase changes in the dominant orientation. This technique can remove the arbitrariness of orientations and lower the over-completeness, thus achieving high-speed processing. However, as the resolution of input video increases, a large amount of data must be processed, requiring a long computational time. In this paper, we focus on the correlation of local phase changes between adjacent pyramid levels and present a novel decomposition technique called the local Riesz pyramid that enables faster phase-based video magnification by automatically processing the minimum number of sufficient local image areas at several pyramid levels. Through this minimum pyramid processing, our proposed phase-based video magnification method using the local Riesz pyramid achieves good magnification results within a short computational time.

Algorithm-Hardware Co-Design of Real-Time Edge Detection for Deep-Space Autonomous Optical Navigation

Hao XIAO, Yanming FAN, Fen GE, Zhang ZHANG, Xin CHENG

Article type: PAPER
2020 Volume E103.D Issue 10 Pages 2047-2058
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020PCP0002

JOURNAL FREE ACCESS

Optical navigation (OPNAV) is the use of on-board imaging data to provide a direct measurement of the image coordinates of the target as navigation information. Among the optical observables in deep space, the edge of the celestial body is an important feature that can be utilized for locating the planet centroid. However, traditional edge detection algorithms such as the Canny algorithm cannot be applied directly to OPNAV because of the noise edges caused by surface markings. Moreover, because of the constrained on-board computation and energy capacity, light-weight image-processing algorithms with lower computational complexity are desirable for real-time processing. Thus, to extract the edge of the celestial body quickly and accurately from high-resolution satellite imagery, this paper presents an algorithm-hardware co-design of real-time edge detection for OPNAV. First, a light-weight edge detection algorithm is proposed to efficiently detect the edge of the celestial body while suppressing the noise edges caused by surface markings. Then, we present an FPGA implementation of the proposed algorithm with optimized real-time performance and resource efficiency. Experimental results show that, compared with traditional edge detection algorithms, the proposed algorithm enables more accurate celestial body edge detection while simplifying the hardware implementation.

An MMT-Based Hierarchical Transmission Module for 4K/120fps Temporally Scalable Video

Yasuhiro MOCHIDA, Takayuki NAKACHI, Takahiro YAMAGUCHI

Article type: PAPER
2020 Volume E103.D Issue 10 Pages 2059-2066
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020PCP0008

JOURNAL FREE ACCESS

High frame rate (HFR) video is attracting strong interest since it is considered as a next step toward providing Ultra-High Definition video service. For instance, the Association of Radio Industries and Businesses (ARIB) standard, the latest broadcasting standard in Japan, defines a 120 fps broadcasting format. The standard stipulates temporally scalable coding and hierarchical transmission by MPEG Media Transport (MMT), in which the base layer and the enhancement layer are transmitted over different paths for flexible distribution. We have developed the first ever MMT transmitter/receiver module for 4K/120fps temporally scalable video. The module is equipped with a newly proposed encapsulation method of temporally scalable bitstreams with correct boundaries. It is also designed to be tolerant to severe network constraints, including packet loss, arrival timing offset, and delay jitter. We conducted a hierarchical transmission experiment for 4K/120fps temporally scalable video. The experiment demonstrated that the MMT module was successfully fabricated and capable of dealing with severe network constraints. Consequently, the module has excellent potential as a means to support HFR video distribution in various network situations.

HDR Imaging Based on Image Interpolation and Motion Blur Suppression in Multiple-Exposure-Time Image Sensor

Masahito SHIMAMOTO, Yusuke KAMEDA, Takayuki HAMAMOTO

Article type: LETTER
2020 Volume E103.D Issue 10 Pages 2067-2071
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020PCL0001

JOURNAL FREE ACCESS

We aim at HDR imaging with simple processing while preventing spatial resolution degradation in a multiple-exposure-time image sensor, where the exposure time is controlled for each pixel. The contributions are an image interpolation method based on motion area detection and a pixel-adaptive weighting method based on overexposure and motion blur detection.

Regular Section

Construction of an Efficient Divided/Distributed Neural Network Model Using Edge Computing

Ryuta SHINGAI, Yuria HIRAGA, Hisakazu FUKUOKA, Takamasa MITANI, Takash ...

Article type: PAPER
Subject area: Fundamentals of Information Systems
2020 Volume E103.D Issue 10 Pages 2072-2082
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2019EDP7326

JOURNAL FREE ACCESS

Modern deep learning has significantly improved performance and has been used in a wide variety of applications. Since the amount of computation required for neural network inference is large, inference is performed not at the data acquisition location, such as a surveillance camera, but on a server with abundant computing power installed in a data center. Edge computing is attracting considerable attention as a way to solve this problem. However, edge computing can provide only limited computation resources. Therefore, we assumed a divided/distributed neural network model that uses both the edge device and the server. By processing part of the convolution layers on the edge, the amount of communication becomes smaller than that of the sensor data. In this paper, we evaluated AlexNet and eight other models in the distributed environment and estimated FPS values with Wi-Fi, 3G, and 5G communication. To reduce communication costs, we also introduced a compression process before communication. This compression may degrade the object recognition accuracy. As necessary conditions, we set FPS to 30 or higher and object recognition accuracy to 69.7% or higher. This value is determined based on that of an approximation model that binarizes the activations of the neural network. We constructed performance and energy models to find the optimal configuration that consumes minimum energy while satisfying the necessary conditions. Through a comprehensive evaluation, we found the optimal configurations of all nine models. For small models, such as AlexNet, processing the entire model on the edge was the best. On the other hand, for huge models, such as VGG16, processing the entire model on the server was the best. For medium-size models, the distributed models were good candidates. We confirmed that our model found the most energy-efficient configuration while satisfying the FPS and accuracy requirements, and that the distributed models reduced energy consumption by up to 48.6%, and by 6.6% on average. We also found that HEVC compression is important before transferring the input data or the feature data between the distributed inference processes.
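
The selection step described in this abstract (minimum energy subject to an FPS floor of 30 and an accuracy floor of 69.7%) can be sketched as follows. This is only an illustration, not the authors' performance or energy model: the class `SplitConfig`, its fields, the additive latency formula, and every number a caller passes in are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SplitConfig:
    """One hypothetical way of dividing a network between edge and server."""
    name: str
    edge_seconds: float    # assumed per-frame compute time on the edge device
    payload_bytes: float   # assumed per-frame data sent from edge to server
    server_seconds: float  # assumed per-frame compute time on the server
    energy_joules: float   # assumed per-frame energy of this configuration
    accuracy: float        # assumed recognition accuracy after compression


def estimated_fps(cfg: SplitConfig, bandwidth_bps: float) -> float:
    """Assume the three stages run one after another for each frame."""
    transfer_seconds = cfg.payload_bytes * 8.0 / bandwidth_bps
    return 1.0 / (cfg.edge_seconds + transfer_seconds + cfg.server_seconds)


def best_config(
    configs: List[SplitConfig],
    bandwidth_bps: float,
    min_fps: float = 30.0,        # FPS condition quoted in the abstract
    min_accuracy: float = 0.697,  # accuracy condition quoted in the abstract
) -> Optional[SplitConfig]:
    """Return the lowest-energy configuration that meets both conditions."""
    feasible = [
        c for c in configs
        if estimated_fps(c, bandwidth_bps) >= min_fps and c.accuracy >= min_accuracy
    ]
    return min(feasible, key=lambda c: c.energy_joules) if feasible else None
```

Under these assumptions, a slower link makes configurations with a large payload infeasible, which is why the preferred split can change with the communication technology.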

Job-Aware File-Storage Optimization for Improved Hadoop I/O Performance

Makoto NAKAGAMI, Jose A.B. FORTES, Saneyasu YAMAGUCHI

Article type: PAPER
Subject area: Software System
2020 Volume E103.D Issue 10 Pages 2083-2093
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2019EDP7337

JOURNAL FREE ACCESS

Hadoop is a popular data-analytics platform based on Google's MapReduce programming model. Hard-disk drives (HDDs) are generally used in big-data analysis, and the effectiveness of the Hadoop platform can be optimized by enhancing its I/O performance. HDD performance varies depending on whether the data are stored in the inner or outer disk zones. This paper proposes a method that utilizes the knowledge of job characteristics to realize efficient data storage in HDDs, which in turn, helps improve Hadoop performance. Per the proposed method, job files that need to be frequently accessed are stored in outer disk tracks which are capable of facilitating sequential-access speeds that are higher than those provided by inner tracks. Thus, the proposed method stores temporary and permanent files in the outer and inner zones, respectively, thereby facilitating fast access to frequently required data. Results of performance evaluation demonstrate that the proposed method improves Hadoop performance by 15.4% when compared to normal cases when file placement is not used. Additionally, the proposed method outperforms a previously proposed placement approach by 11.1%.

Empirical Evaluation of Mimic Software Project Data Sets for Software Effort Estimation

Maohua GAN, Zeynep YÜCEL, Akito MONDEN, Kentaro SASAKI

Article type: PAPER
Subject area: Software Engineering
2020 Volume E103.D Issue 10 Pages 2094-2103
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2019EDP7150

JOURNAL FREE ACCESS

To conduct empirical research on industry software development, it is necessary to obtain data of real software projects from industry. However, only a few such industry data sets are publicly available, and unfortunately, most of them are very old. In addition, most of today's software companies cannot make their data open, because software development involves many stakeholders, and thus its data confidentiality must be strongly preserved. To that end, this study proposes a method for artificially generating a “mimic” software project data set, whose characteristics (such as average, standard deviation and correlation coefficients) are very similar to those of a given confidential data set. Instead of using the original (confidential) data set, researchers are expected to use the mimic data set to produce results similar to those from the original data set. The proposed method uses the Box-Muller transform for generating normally distributed random numbers, and exponential transformation and number reordering for data mimicry. To evaluate the efficacy of the proposed method, effort estimation is considered as a potential application domain for employing mimic data. Estimation models are built from 8 reference data sets and their corresponding mimic data. Our experiments confirmed that models built from mimic data sets show effort estimation performance similar to that of the models built from the original data sets, which indicates the capability of the proposed method in generating representative samples.
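
The Box-Muller transform mentioned in this abstract turns two uniform random numbers into two independent standard-normal numbers. The following is a minimal sketch of that one step only; the exponential transformation and reordering steps of the proposed method are not shown, and the function names and the seed handling are assumptions for illustration.

```python
import math
import random
from typing import List, Tuple


def box_muller(rng: random.Random) -> Tuple[float, float]:
    """Return two independent standard-normal samples from two uniform samples."""
    # rng.random() lies in [0, 1); 1 - u lies in (0, 1], so the logarithm is finite.
    u1 = 1.0 - rng.random()
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def normal_samples(n: int, mean: float, std: float, seed: int = 0) -> List[float]:
    """Draw n samples with the given mean and standard deviation."""
    rng = random.Random(seed)
    samples: List[float] = []
    while len(samples) < n:
        z0, z1 = box_muller(rng)
        # Shift and scale the standard-normal values to the requested moments.
        samples.extend((mean + std * z0, mean + std * z1))
    return samples[:n]
```

A caller could, for example, pass the average and standard deviation of one column of a confidential data set to obtain synthetic values with similar first and second moments.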

Proposing High-Smart Approach for Content Authentication and Tampering Detection of Arabic Text Transmitted via Internet

Fahd N. AL-WESABI

Article type: PAPER
Subject area: Information Network
2020 Volume E103.D Issue 10 Pages 2104-2112
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020EDP7011

JOURNAL FREE ACCESS

The security and reliability of Arabic text exchanged via the Internet have become a challenging area for the research community. Arabic text is highly vulnerable to modification by malicious attacks, and it is easy to alter diacritics such as Fat-ha, Kasra and Damma, which represent the syntax of the Arabic language and whose alteration can change the meaning. In this paper, a Hybrid of Natural Language Processing and Zero-Watermarking Approach (HNLPZWA) is proposed for the content authentication and tampering detection of Arabic text. The HNLPZWA approach embeds and detects the watermark logically, without altering the original text document to embed a watermark key. A fifth-level order word mechanism based on a hidden Markov model is integrated with digital zero-watermarking techniques to improve on the tampering detection accuracy of approaches in the previous literature. The fifth-level order Markov model is used as a natural language processing technique to analyze the Arabic text. Moreover, it extracts features of the interrelationship between contexts of the text, utilizes the extracted features as watermark information, and later validates them against the attacked Arabic text to detect any tampering that has occurred. HNLPZWA has been implemented using PHP with the VS Code IDE. The tampering detection accuracy of HNLPZWA is demonstrated with experiments using four datasets of varying lengths under insertion, reorder and deletion attacks at multiple random locations in the experimental datasets. The experimental results show that HNLPZWA is more sensitive to all kinds of tampering attacks and achieves a high level of tampering detection accuracy.

Real-Time Detection of Global Cyberthreat Based on Darknet by Estimating Anomalous Synchronization Using Graphical Lasso

Chansu HAN, Jumpei SHIMAMURA, Takeshi TAKAHASHI, Daisuke INOUE, Jun'ic ...

Article type: PAPER
Subject area: Information Network
2020 Volume E103.D Issue 10 Pages 2113-2124
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020EDP7076

JOURNAL FREE ACCESS

With the rapid evolution and increase of cyberthreats in recent years, it is necessary to detect and understand them promptly and precisely to reduce their impact. A darknet, which is an unused IP address space, has a high signal-to-noise ratio, so the global tendency of malicious traffic in cyberspace is easier to understand from it than from other observation networks. In this paper, we aim to capture global cyberthreats in real time. Since multiple hosts infected with similar malware tend to behave similarly, we propose a system that estimates the degree of synchronization from the patterns of packet transmission times among the source hosts observed in the darknet per unit time and detects anomalies in real time. In our evaluation, we run a proof-of-concept implementation of the proposed engine to demonstrate its feasibility and effectiveness, and we detect cyberthreats with an accuracy of 97.14%. This work is the first practical trial that detects cyberthreats, including new types and variants, from in-the-wild darknet traffic in real time, and it quantitatively evaluates the result.

Complete Double Node Upset Tolerant Latch Using C-Element

Yuta YAMAMOTO, Kazuteru NAMBA

Article type: PAPER
Subject area: Dependable Computing
2020 Volume E103.D Issue 10 Pages 2125-2132
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020EDP7103

JOURNAL FREE ACCESS

The recent development of semiconductor technology has led to downsized, large-scaled and low-power VLSI systems. However, the incidence of soft errors has increased. Soft errors are temporary events caused by striking of α-rays and high energy neutron radiation. Since the scale of VLSI has become smaller in recent development, it is necessary to consider the occurrence of not only single node upset (SNU) but also double node upset (DNU). The existing High-performance, Low-cost, and DNU Tolerant Latch design (HLDTL) does not completely tolerate DNU. This paper presents a new design of a DNU tolerant latch to resolve this issue by adding some transistors to the HLDTL latch.

Optimal Rejuvenation Policies for Non-Markovian Availability Models with Aperiodic Checkpointing

Junjun ZHENG, Hiroyuki OKAMURA, Tadashi DOHI

Article type: PAPER
Subject area: Dependable Computing
2020 Volume E103.D Issue 10 Pages 2133-2142
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2019EDP7321

JOURNAL FREE ACCESS

In this paper, we present non-Markovian availability models for capturing the dynamics of an operational software system that undergoes aperiodic time-based software rejuvenation and checkpointing. Two availability models with rejuvenation are considered, which differ in the procedure that follows the completion of the rollback recovery operation. We further investigate whether there exists an optimal rejuvenation schedule that maximizes the steady-state system availability. The availability is derived by means of the phase expansion technique, because the resulting models are not trivial stochastic models such as semi-Markov processes or Markov regenerative processes and are therefore hard to solve with common approaches such as the Laplace-Stieltjes transform and embedded Markov chain techniques. Numerical experiments are conducted to determine the optimal rejuvenation trigger timing that maximizes the steady-state system availability for each availability model, and to compare the two models.

Towards Interpretable Reinforcement Learning with State Abstraction Driven by External Knowledge

Nicolas BOUGIE, Ryutaro ICHISE

Article type: PAPER
Subject area: Artificial Intelligence, Data Mining
2020 Volume E103.D Issue 10 Pages 2143-2153
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2019EDP7170

JOURNAL FREE ACCESS

Advances in deep reinforcement learning have demonstrated its effectiveness in a wide variety of domains. Deep neural networks are capable of approximating value functions and policies in complex environments. However, deep neural networks have a number of drawbacks. Lack of interpretability limits their usability in many safety-critical real-world scenarios. Moreover, they rely on huge amounts of data to learn efficiently. This may be acceptable in simulated tasks, but restricts their use in many real-world applications. Finally, their generalization capability, that is, the ability to determine that a situation is similar to one encountered previously, is low. We present a method to combine external knowledge and interpretable reinforcement learning. We derive a rule-based variant of the Sarsa(λ) algorithm, which we call Sarsa-rb(λ), that augments data with prior knowledge and exploits similarities among states. We demonstrate that our approach leverages small amounts of prior knowledge to significantly accelerate learning in multiple domains such as trading and visual navigation. The resulting agent provides substantial gains in training speed and performance over deep Q-learning (DQN) and deep deterministic policy gradients (DDPG), and improves stability over proximal policy optimization (PPO).
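
For readers unfamiliar with the base algorithm named in this abstract, the following is a minimal sketch of the standard tabular Sarsa(λ) update with accumulating eligibility traces. It is not the authors' rule-based Sarsa-rb(λ); the function name `sarsa_lambda_step`, the dictionary-based tables, and the default hyperparameters are assumptions made for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Tuple

StateAction = Tuple[Hashable, Hashable]
Transition = Tuple[Hashable, Hashable, float, Hashable, Hashable]


def sarsa_lambda_step(
    q: DefaultDict[StateAction, float],
    traces: DefaultDict[StateAction, float],
    transition: Transition,
    alpha: float = 0.1,
    gamma: float = 0.99,
    lam: float = 0.9,
    terminal: bool = False,
) -> float:
    """Apply one Sarsa(lambda) update in place and return the TD error."""
    state, action, reward, next_state, next_action = transition
    # TD error: r + gamma * Q(s', a') - Q(s, a); no bootstrap at the end of an episode.
    bootstrap = 0.0 if terminal else gamma * q[(next_state, next_action)]
    td_error = reward + bootstrap - q[(state, action)]
    # Accumulating trace: mark the state-action pair that was just visited.
    traces[(state, action)] += 1.0
    # Every traced pair receives a share of the TD error, then its trace decays.
    for key in list(traces):
        q[key] += alpha * td_error * traces[key]
        traces[key] *= gamma * lam
    return td_error


# Usage: both tables start empty and default to 0.0; reset `traces` per episode.
q_table: DefaultDict[StateAction, float] = defaultdict(float)
trace_table: DefaultDict[StateAction, float] = defaultdict(float)
sarsa_lambda_step(q_table, trace_table, ("s0", "right", 0.0, "s1", "right"))
```

The eligibility traces are what let a reward received late in an episode update the values of state-action pairs visited earlier.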

Completion of Missing Labels for Multi-Label Annotation by a Unified Graph Laplacian Regularization

Jonathan MOJOO, Yu ZHAO, Muthu Subash KAVITHA, Junichi MIYAO, Takio KU ...

Article type: PAPER
Subject area: Artificial Intelligence, Data Mining
2020 Volume E103.D Issue 10 Pages 2154-2161
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2019EDP7318

JOURNAL FREE ACCESS

The task of image annotation is becoming enormously important for efficient image retrieval from the web and other large databases. However, the large amount of semantic information and the complex dependency among labels on an image make the task challenging. Hence, determining the semantic similarity between multiple labels on an image is useful for understanding any incomplete label assignment for image retrieval. This work proposes a novel method to solve the problem of multi-label image annotation by unifying two different types of Laplacian regularization terms in a deep convolutional neural network (CNN) for robust annotation performance. The unified Laplacian regularization model is implemented to address the missing labels efficiently by generating the contextual similarity between labels both internally and externally through their semantic similarities, which is the main contribution of this study. Specifically, we generate similarity matrices between labels internally by using Hayashi's quantification method-type III and externally by using the word2vec method. The similarity matrices generated by the two methods are then combined as a Laplacian regularization term, which is used in the new objective function of the deep CNN. The regularization term implemented in this study is able to address the multi-label annotation problem, enabling a more effectively trained neural network. Experimental results on public benchmark datasets reveal that the proposed unified regularization model with a deep CNN produces significantly better results than the baseline CNN without regularization and other state-of-the-art methods for predicting missing labels.
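
A graph Laplacian regularization term penalizes prediction vectors that assign very different scores to labels the similarity matrix marks as related. The sketch below shows the generic penalty f^T L f with L = D - W for one score vector; the convex blend in `combine_similarities` is a hypothetical stand-in, since the abstract does not say how the two matrices are combined, and all function names are assumptions.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def combine_similarities(internal: Matrix, external: Matrix, weight: float = 0.5) -> List[List[float]]:
    """Blend two label-similarity matrices (hypothetical combination rule)."""
    n = len(internal)
    return [
        [weight * internal[i][j] + (1.0 - weight) * external[i][j] for j in range(n)]
        for i in range(n)
    ]


def graph_laplacian(similarity: Matrix) -> List[List[float]]:
    """Return the unnormalized Laplacian L = D - W of a symmetric matrix W."""
    n = len(similarity)
    laplacian = [[-float(similarity[i][j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        laplacian[i][i] += float(sum(similarity[i]))  # add the degree on the diagonal
    return laplacian


def laplacian_penalty(scores: Sequence[float], similarity: Matrix) -> float:
    """Return f^T L f, which equals 0.5 * sum_ij W_ij * (f_i - f_j)^2 for symmetric W."""
    lap = graph_laplacian(similarity)
    n = len(scores)
    return sum(scores[i] * lap[i][j] * scores[j] for i in range(n) for j in range(n))
```

In a training loop, a penalty of this kind would typically be added to the classification loss with a tunable coefficient.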

Asymmetric Learning for Stereo Matching Cost Computation

Zhongjian MA, Dongzhen HUANG, Baoqing LI, Xiaobing YUAN

Article type: PAPER
Subject area: Artificial Intelligence, Data Mining
2020 Volume E103.D Issue 10 Pages 2162-2167
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020EDP7002

JOURNAL FREE ACCESS

Current stereo matching methods benefit a lot from the precise stereo estimation with Convolutional Neural Networks (CNNs). Nevertheless, patch-based siamese networks rely on the implicit assumption of constant depth within a window, which does not hold for slanted surfaces. Existing methods for handling slanted patches focus on post-processing. In contrast, we propose a novel module for matching cost networks to overcome this bias. Slanted objects appear horizontally stretched between stereo pairs, suggesting that the feature extraction in the horizontal direction should be different from that in the vertical direction. To tackle this distortion, we utilize asymmetric convolutions in our proposed module. Experimental results show that the proposed module in matching cost networks can achieve higher accuracy with fewer parameters compared to conventional methods.
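
One common reading of "asymmetric convolution" is a kernel whose height and width differ, such as 1×3 or 3×1, so that horizontal and vertical structure are filtered separately. The sketch below illustrates that general idea on a toy image and is not the module proposed in the paper; the helper names and the example kernels are assumptions.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[float]]


def conv2d_valid(image: Grid, kernel: Grid) -> List[List[float]]:
    """Cross-correlate an image with a kernel (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def kernel_params(height: int, width: int, in_channels: int = 1, out_channels: int = 1) -> int:
    """Number of weights in one convolution layer, ignoring biases."""
    return height * width * in_channels * out_channels


# A toy image with a vertical edge: intensity changes only along the horizontal axis.
edge_image = [[0, 0, 1, 1] for _ in range(3)]
horizontal = conv2d_valid(edge_image, [[-1, 0, 1]])     # 1x3 kernel responds to the edge
vertical = conv2d_valid(edge_image, [[-1], [0], [1]])   # 3x1 kernel sees no change
```

A 1×3 plus a 3×1 kernel use 6 weights per channel pair against 9 for a single 3×3 kernel, which is one way such a design can reduce the parameter count.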

Improving Pointing Direction Estimation by Considering Hand- and Ocular-Dominance

Tomohiro MASHITA, Koichi SHINTANI, Kiyoshi KIYOKAWA

Article type: PAPER
Subject area: Human-computer Interaction
2020 Volume E103.D Issue 10 Pages 2168-2177
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2019EDP7320

JOURNAL FREE ACCESS

This paper introduces a user study on the effects of hand and ocular dominance on pointing gestures. The result of this study is applicable to designing new gesture interfaces that are close to a user's cognition, intuitive, and easy to use. The user study investigates the relationship between a participant's dominances and pointing gestures. Four participant groups (right-handed right-eye dominant, right-handed left-eye dominant, left-handed right-eye dominant and left-handed left-eye dominant) were prepared, and participants were asked to point at targets on a screen with their left and right hands. The pointing errors of the different participant groups are calculated and compared. The result of this user study shows that using the dominant eye produces better results than using the non-dominant eye, and that the accuracy increases when the targets are located on the same side as the dominant eye. Based on these properties, a method to find the dominant eye for pointing gestures is proposed. This method can find the dominant eye of an individual with more than 90% accuracy.

Joint Multi-Patch and Multi-Task CNNs for Robust Face Recognition

Yanfei LIU, Junhua CHEN, Yu QIU

Article type: PAPER
Subject area: Pattern Recognition
2020 Volume E103.D Issue 10 Pages 2178-2187
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020EDP7059

JOURNAL FREE ACCESS

In this paper, we present a joint multi-patch and multi-task convolutional neural networks (JMM-CNNs) framework to learn a more descriptive and robust face representation for face recognition. In the proposed JMM-CNNs, a set of multi-patch CNNs and a feature fusion network are constructed to learn and fuse global and local facial features. A multi-task learning algorithm, comprising a face recognition task and a pose estimation task, is then applied to the fused feature to obtain a pose-invariant face representation for the face recognition task. To further enhance the pose insensitivity of the learned face representation, we also introduce a similarity regularization term on the features of the two tasks to form a regularization loss. Moreover, a simple but effective patch sampling strategy is applied to give the JMM-CNNs an end-to-end network architecture. Experiments on the Multi-PIE dataset demonstrate the effectiveness of the proposed method, and we achieve competitive performance compared with state-of-the-art methods on Labeled Faces in the Wild (LFW), YouTube Faces (YTF) and the MegaFace Challenge.

Single Stage Vehicle Logo Detector Based on Multi-Scale Prediction

Junxing ZHANG, Shuo YANG, Chunjuan BO, Huimin LU

Article type: PAPER
Subject area: Pattern Recognition
2020 Volume E103.D Issue 10 Pages 2188-2198
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020EDP7088

JOURNAL FREE ACCESS

Vehicle logo detection technology is one of the research directions in the application of intelligent transportation systems. It is an important extension of detection technology based on license plates and motorcycle types. A vehicle logo is characterized by uniqueness, conspicuousness, and diversity. Therefore, thorough research is important in theory and application. Although there are some related works on object detection, most of them cannot achieve real-time detection for different scenes. Meanwhile, some single-stage real-time detection methods perform poorly in detecting small objects. To address the scarcity of training samples, this paper improves detection by constructing a vehicle logo data set (VLD-45-S) and by using multi-stage pre-training, multi-scale prediction, feature fusion between deeper and shallower layers, dimension clustering of the bounding boxes, and multi-scale detection training. While maintaining speed, this work improves the detection precision for vehicle logos. The generalization of the detection model and its anti-interference capability in real scenes are improved by data enrichment. Experimental results show that the accuracy and speed of the detection algorithm are improved for small objects.

Efficient Salient Object Detection Model with Dilated Convolutional Networks

Fei GUO, Yuan YANG, Yong GAO, Ningmei YU

Article type: PAPER
Subject area: Image Recognition, Computer Vision
2020 Volume E103.D Issue 10 Pages 2199-2207
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2019EDP7284

JOURNAL FREE ACCESS

The introduction of Fully Convolutional Networks (FCNs) has brought record progress in salient object detection models. However, in order to retain the input resolution, deconvolutional networks with unpooling are applied on top of FCNs. This increases the computation and the network model size in the segmentation task. In addition, most deep-learning-based methods completely discard saliency prior knowledge, which has been shown to be effective. Therefore, an efficient salient object detection method based on deep learning is proposed in our work. In this model, dilated convolutions are exploited in the networks to produce high-resolution output without pooling and without adding deconvolutional networks. In this way, the number of parameters and the depth of the network are decreased sharply compared with traditional FCNs. Furthermore, a manifold ranking model is explored for saliency refinement to maintain spatial consistency and preserve contours. Experimental results verify that the performance of our method is superior to that of other state-of-the-art methods. Meanwhile, the proposed model has a smaller model size and the fastest processing speed, which makes it more suitable for wearable processing systems.
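
A dilated convolution samples its input with gaps between kernel taps, which widens the receptive field without pooling or extra weights. The following one-dimensional sketch illustrates the general operation only, not the network in the paper; the function names and the example signal are assumptions.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float], dilation: int = 1) -> List[float]:
    """Cross-correlate a signal with a kernel whose taps are `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation  # distance between the first and last tap
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of stacked stride-1 layers that share one kernel size."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


ramp = list(range(8))
dense = dilated_conv1d(ramp, [1, 0, -1], dilation=1)   # taps at offsets 0, 1, 2
sparse = dilated_conv1d(ramp, [1, 0, -1], dilation=2)  # taps at offsets 0, 2, 4
```

With kernel size 3, three layers with dilations 1, 2 and 4 cover 15 input samples, whereas three undilated layers cover 7, using the same number of weights.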

A Visual Inspection System for Accurate Positioning of Railway Fastener

Jianwei LIU, Hongli LIU, Xuefeng NI, Ziji MA, Chao WANG, Xun SHAO

Article type: PAPER
Subject area: Image Recognition, Computer Vision
2020 Volume E103.D Issue 10 Pages 2208-2215
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020EDP7097

JOURNAL FREE ACCESS

Automatic disassembly of railway fasteners is of great significance for improving the efficiency of replacing rails. The accurate positioning of fasteners is the key factor in realizing automatic disassembly. However, most of the existing literature focuses on fastener region positioning, and the literature on accurate positioning of fasteners is scarce. Therefore, this paper constructs a visual inspection system for accurate positioning of fasteners (VISP). First, VISP acquires a railway image with the image acquisition subsystem, and the subimage of the fastener is then obtained by a coarse-to-fine method. Subsequently, accurate positioning of the fastener is completed in three steps: contrast enhancement, binarization, and spike region extraction. The validity and robustness of VISP were verified by extensive experiments. The results show that VISP has competitive performance for accurate positioning of fasteners. The single positioning time is about 260 ms, and the average positioning accuracy is above 90%. Thus, it is of theoretical interest and has potential for industrial application.

Sentence-Embedding and Similarity via Hybrid Bidirectional-LSTM and CNN Utilizing Weighted-Pooling Attention

Degen HUANG, Anil AHMED, Syed Yasser ARAFAT, Khawaja Iftekhar RASHID, ...

Article type: PAPER
Subject area: Natural Language Processing
2020 Volume E103.D Issue 10 Pages 2216-2227
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2018EDP7410

JOURNAL FREE ACCESS

Neural networks have received considerable attention in sentence similarity measuring systems due to their efficiency in dealing with semantic composition. However, existing neural network methods are not sufficiently effective in capturing the most significant semantic information buried in an input. To address this problem, a novel weighted-pooling attention layer is proposed to retain the most remarkable attention vector. It has already been established that long short-term memory and convolutional neural networks have a strong ability to accumulate enriched patterns of whole-sentence semantic representation. First, a sentence representation is generated by employing a siamese structure based on bidirectional long short-term memory and a convolutional neural network. Subsequently, a weighted-pooling attention layer is applied to obtain an attention vector. Finally, the attention vector pair information is leveraged to calculate the sentence similarity score. Combining bidirectional long short-term memory with a convolutional neural network yields a model with enhanced information extraction and learning capacity. Experiments show that the proposed method outperforms state-of-the-art approaches on datasets for two tasks, namely semantic relatedness and Microsoft Research paraphrase identification. The new model improves the learning capability and also boosts the similarity accuracy.

New Word Detection Using BiLSTM+CRF Model with Features

Jianyong DUAN, Zheng TAN, Mei ZHANG, Hao WANG

Article type: PAPER
Subject area: Natural Language Processing
2020 Volume E103.D Issue 10 Pages 2228-2236
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2019EDP7330

JOURNAL FREE ACCESS

With the widespread popularity of social platforms, an increasing number of new words are appearing. Such new words make NLP tasks such as word segmentation more challenging, so new word detection is an important and difficult task in NLP. This paper aims to extract new words using a BiLSTM+CRF model augmented with several hand-selected features: word length, part of speech (POS), contextual entropy, and degree of word coagulation. Compared with traditional new word detection methods, our method can use both the features extracted by the model and the features we select to find new words. Experimental results demonstrate that our model performs better than the benchmark models.
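
Of the features listed above, contextual entropy is the easiest to show in a few lines. The sketch below uses the common definition (Shannon entropy of the characters seen immediately to the left and right of a candidate string); the abstract does not give the paper's exact formula, so this definition, the function names, and the toy corpus are assumptions.

```python
import math
from collections import Counter


def neighbor_entropy(neighbors):
    """Shannon entropy (in bits) of a list of neighboring characters."""
    counts = Counter(neighbors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def contextual_entropy(corpus, candidate):
    """Return (left entropy, right entropy) of `candidate` within `corpus`."""
    left, right = [], []
    start = corpus.find(candidate)
    while start != -1:
        end = start + len(candidate)
        if start > 0:
            left.append(corpus[start - 1])
        if end < len(corpus):
            right.append(corpus[end])
        start = corpus.find(candidate, start + 1)
    return neighbor_entropy(left), neighbor_entropy(right)


# Toy corpus: "XY" appears in three different contexts, "X" is always followed by "Y".
corpus = "aXYbcXYdeXYf"
print(contextual_entropy(corpus, "XY"))  # varied neighbors on both sides
print(contextual_entropy(corpus, "X"))   # right neighbor never varies
```

A string whose neighbors vary on both sides behaves like a free-standing word, while a string with zero entropy on one side is more likely a fragment of a longer word.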

Cross-Project Defect Prediction via Semi-Supervised Discriminative Feature Learning

Danlei XING, Fei WU, Ying SUN, Xiao-Yuan JING

Article type: LETTER
Subject area: Software Engineering
2020 Volume E103.D Issue 10 Pages 2237-2240
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020EDL8044

JOURNAL FREE ACCESS

Cross-project defect prediction (CPDP) is a feasible solution for building an accurate prediction model without enough historical data. Although existing CPDP methods that use only labeled data to build the prediction model achieve good results, there is still much room to improve prediction performance. In this paper, we propose a Semi-Supervised Discriminative Feature Learning (SSDFL) approach for CPDP. SSDFL first maps source and target data into a common space by using a fully-connected neural network to mine potential similarities between source and target data. Next, we reduce the differences in both marginal distributions and conditional distributions between the mapped source and target data. We also introduce discriminative feature learning to make full use of label information, so that instances from the same class are close to each other and instances from different classes are distant from each other. Extensive experiments are conducted on 10 projects from the AEEEM and NASA datasets, and the experimental results indicate that our approach obtains better prediction performance than the baselines.
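
The idea that same-class instances should be close and different-class instances distant can be written as a pairwise loss. The abstract does not state the loss SSDFL actually uses, so the margin-based contrastive form below, the margin value, and the function names are assumptions used only to make the idea concrete.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def discriminative_loss(features, labels, margin=1.0):
    """Mean pairwise loss: pull same-class pairs together, push others apart.

    Same-class pairs pay their squared distance; different-class pairs pay
    only if they are closer than `margin` (assumed contrastive form).
    """
    total, pairs = 0.0, 0
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            d = euclidean(features[i], features[j])
            if labels[i] == labels[j]:
                total += d ** 2
            else:
                total += max(0.0, margin - d) ** 2
            pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical mapped features: two tight clusters.
feats = [[0.0, 0.0], [0.0, 0.0], [5.0, 5.0], [5.0, 5.0]]
well_separated = discriminative_loss(feats, [0, 0, 1, 1])  # labels match the clusters
mixed_up = discriminative_loss(feats, [0, 1, 0, 1])        # labels cut across the clusters
print(well_separated, mixed_up)
```

The loss is zero when the labels agree with the cluster structure and grows when same-class instances sit far apart or different-class instances coincide.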

Robust Transferable Subspace Learning for Cross-Corpus Facial Expression Recognition

Dongliang CHEN, Peng SONG, Wenjing ZHANG, Weijian ZHANG, Bingui XU, Xu ...

Article type: LETTER
Subject area: Pattern Recognition
2020 Volume E103.D Issue 10 Pages 2241-2245
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020EDL8074

JOURNAL FREE ACCESS

In this letter, we propose a novel robust transferable subspace learning (RTSL) method for cross-corpus facial expression recognition. On the one hand, we present a novel distance metric algorithm that jointly considers local and global distance distribution measures to reduce the cross-corpus mismatch. On the other hand, we design a label guidance strategy to improve the discriminative ability of the subspace. As a result, RTSL is much more robust on the cross-corpus recognition problem than traditional transfer learning methods. We conduct extensive experiments on several facial expression corpora to evaluate the recognition performance of RTSL. The results demonstrate the superiority of the proposed method over several state-of-the-art methods.

Superpixel Based Hierarchical Segmentation for Color Image

Chong WU, Le ZHANG, Houwang ZHANG, Hong YAN

Article type: LETTER
Subject area: Image Processing and Video Processing
2020 Volume E103.D Issue 10 Pages 2246-2249
Published: October 01, 2020
Released on J-STAGE: October 01, 2020

DOI: https://doi.org/10.1587/transinf.2020EDL8025

JOURNAL FREE ACCESS

In this letter, we propose a hierarchical segmentation (HS) method for color images that maintains segmentation accuracy while also achieving good speed. HS first adopts fuzzy simple linear iterative clustering (Fuzzy SLIC) to obtain an over-segmentation result. Then, HS uses fast fuzzy C-means clustering (FFCM) to produce a rough segmentation result based on superpixels. Finally, HS applies non-iterative K-means clustering using a priority queue (KPQ) to refine the segmentation result. In the validation experiments, we tested our method and compared it with state-of-the-art image segmentation methods on the Berkeley (BSD500) benchmark under different types of noise. The experimental results show that our method outperforms state-of-the-art techniques in terms of accuracy, speed, and robustness.
