IEICE Transactions on Information and Systems

Special Section on Enriched Multimedia — Multimedia Security and Forensics —

FOREWORD

Masaki KAWAMURA

2021Volume E104.DIssue 1 Pages 1
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MUF0001

JOURNAL FREE ACCESS

Download PDF (59K)
Preventing Fake Information Generation Against Media Clone Attacks

Noboru BABAGUCHI, Isao ECHIZEN, Junichi YAMAGISHI, Naoko NITTA, Yuta N ...

Article type: INVITED PAPER
2021Volume E104.DIssue 1 Pages 2-11
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MUI0001

JOURNAL FREE ACCESS

Show abstractHide abstract

Fake media has been spreading due to remarkable advances in media processing and machine leaning technologies, causing serious problems in society. We are conducting a research project called Media Clone aimed at developing methods for protecting people from fake but skillfully fabricated replicas of real media called media clones. Such media can be created from fake information about a specific person. Our goal is to develop a trusted communication system that can defend against attacks of media clones. This paper describes some research results of the Media Clone project, in particular, various methods for protecting personal information against generating fake information. We focus on 1) fake information generation in the physical world, 2) anonymization and abstraction in the cyber world, and 3) modeling of media clone attacks.

View full abstract

Download PDF (2910K)
Generation and Detection of Media Clones

Isao ECHIZEN, Noboru BABAGUCHI, Junichi YAMAGISHI, Naoko NITTA, Yuta N ...

Article type: INVITED PAPER
2021Volume E104.DIssue 1 Pages 12-23
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MUI0002

JOURNAL FREE ACCESS

Show abstractHide abstract

With the spread of high-performance sensors and social network services (SNS) and the remarkable advances in machine learning technologies, fake media such as fake videos, spoofed voices, and fake reviews that are generated using high-quality learning data and are very close to the real thing are causing serious social problems. We launched a research project, the Media Clone (MC) project, to protect receivers of replicas of real media called media clones (MCs) skillfully fabricated by means of media processing technologies. Our aim is to achieve a communication system that can defend against MC attacks and help ensure safe and reliable communication. This paper describes the results of research in two of the five themes in the MC project: 1) verification of the capability of generating various types of media clones such as audio, visual, and text derived from fake information and 2) realization of a protection shield for media clones' attacks by recognizing them.

View full abstract

Download PDF (2909K)
Target-Oriented Deformation of Visual-Semantic Embedding Space

Takashi MATSUBARA

Article type: PAPER
2021Volume E104.DIssue 1 Pages 24-33
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MUP0003

JOURNAL FREE ACCESS

Show abstractHide abstract

Multimodal embedding is a crucial research topic for cross-modal understanding, data mining, and translation. Many studies have attempted to extract representations from given entities and align them in a shared embedding space. However, because entities in different modalities exhibit different abstraction levels and modality-specific information, it is insufficient to embed related entities close to each other. In this study, we propose the Target-Oriented Deformation Network (TOD-Net), a novel module that continuously deforms the embedding space into a new space under a given condition, thereby providing conditional similarities between entities. Unlike methods based on cross-modal attention applied to words and cropped images, TOD-Net is a post-process applied to the embedding space learned by existing embedding systems and improves their performances of retrieval. In particular, when combined with cutting-edge models, TOD-Net gains the state-of-the-art image-caption retrieval model associated with the MS COCO and Flickr30k datasets. Qualitative analysis reveals that TOD-Net successfully emphasizes entity-specific concepts and retrieves diverse targets via handling higher levels of diversity than existing models.

View full abstract

Download PDF (5762K)
Digital Watermarking Method for Printed Matters Using Deep Learning for Detecting Watermarked Areas

Hiroyuki IMAGAWA, Motoi IWATA, Koichi KISE

Article type: PAPER
2021Volume E104.DIssue 1 Pages 34-42
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MUP0004

JOURNAL FREE ACCESS

Show abstractHide abstract

There are some technologies like QR codes to obtain digital information from printed matters. Digital watermarking is one of such techniques. Compared with other techniques, digital watermarking is suitable for adding information to images without spoiling their design. For such purposes, digital watermarking methods for printed matters using detection markers or image registration techniques for detecting watermarked areas are proposed. However, the detection markers themselves can damage the appearance such that the advantages of digital watermarking, which do not lose design, are not fully utilized. On the other hand, methods using image registration techniques are not able to work for non-registered images. In this paper, we propose a novel digital watermarking method using deep learning for the detection of watermarked areas instead of using detection markers or image registration. The proposed method introduces a semantic segmentation based on deep learning model for detecting watermarked areas from printed matters. We prepare two datasets for training the deep learning model. One is constituted of geometrically transformed non-watermarked and watermarked images. The number of images in this dataset is relatively large because the images can be generated based on image processing. This dataset is used for pre-training. The other is obtained from actually taken photographs including non-watermarked or watermarked printed matters. The number of this dataset is relatively small because taking the photographs requires a lot of effort and time. However, the existence of pre-training allows a fewer training images. This dataset is used for fine-tuning to improve robustness for print-cam attacks. In the experiments, we investigated the performance of our method by implementing it on smartphones. The experimental results show that our method can carry 96 bits of information with watermarked printed matters.

View full abstract

Download PDF (10991K)
A Scheme of Reversible Data Hiding for the Encryption-Then-Compression System

Masaaki FUJIYOSHI, Ruifeng LI, Hitoshi KIYA

Article type: PAPER
2021Volume E104.DIssue 1 Pages 43-50
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MUP0006

JOURNAL FREE ACCESS

Show abstractHide abstract

This paper proposes an encryption-then-compression (EtC) system-friendly data hiding scheme for images, where an EtC system compresses images after they are encrypted. The EtC system divides an image into non-overlapping blocks and applies four block-based processes independently and randomly to the image for visual encryption of the image. The proposed scheme hides data to a plain, i.e., unencrypted image and the scheme can take hidden data out from the image encrypted by the EtC system. Furthermore, the scheme serves reversible data hiding, so it can perfectly recover the unmarked image from the marked image whereas the scheme once distorts unmarked image for hiding data to the image. The proposed scheme copes with the three of four processes in the EtC system, namely, block permutation, rotation/flipping of blocks, and inverting brightness in blocks, whereas the conventional schemes for the system do not cope with the last one. In addition, these conventional schemes have to identify the encrypted image so that image-dependent side information can be used to extract embedded data and to restore the unmarked image, but the proposed scheme does not need such identification. Moreover, whereas the data hiding process must know the block size of encryption in conventional schemes, the proposed scheme needs no prior knowledge of the block size for encryption. Experimental results show the effectiveness of the proposed scheme.

View full abstract

Download PDF (1612K)
Salient Chromagram Extraction Based on Trend Removal for Cover Song Identification

Jin S. SEO

Article type: LETTER
2021Volume E104.DIssue 1 Pages 51-54
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MUL0002

JOURNAL FREE ACCESS

Show abstractHide abstract

This paper proposes a salient chromagram by removing local trend to improve cover song identification accuracy. The proposed salient chromagram emphasizes tonal contents of music, which are well-preserved between an original song and its cover version, while reducing the effects of timber difference. We apply the proposed salient chromagram to the sequence-alignment based cover song identification. Experiments on two cover song datasets confirm that the proposed salient chromagram improves the cover song identification accuracy.

View full abstract

Download PDF (217K)

Special Section on Empirical Software Engineering

FOREWORD

Hideaki HATA

2021Volume E104.DIssue 1 Pages 55
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MPF0001

JOURNAL FREE ACCESS

Download PDF (142K)
Measurement of Enterprise Smart Business Performance on a Smart Business Management

Chui Young YOON

Article type: PAPER
2021Volume E104.DIssue 1 Pages 56-62
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MPP0002

JOURNAL FREE ACCESS

Show abstractHide abstract

Smart business management has been built to efficiently carry out enterprise business activities and improve its business outcomes in a global business circumstance. Firms have applied their smart business to their business activities in order to enhance the smart business results. The outcome of an enterprise's smart business fulfillment has to be managed and measured to effectively establish and control the smart business environment based on its business plan and business departments. In this circumstance, we need the measurement framework that can reasonably gauge a firm's smart business output in order to control and advance its smart business ability. This research presents a measurement instrument for an enterprise smart business performance in terms of a general smart business outcome. The developed measurement scale is verified on its validity and reliability through factor analysis and reliability analysis based on previous literature. This study presents an 11-item measurement tool that can reasonably gauge a firm smart business performance in both of finance and non-finance perspective.

View full abstract

Download PDF (941K)
Mitigation of Flash Crowd in Web Services By Providing Feedback Information to Users

Harumasa TADA, Masayuki MURATA, Masaki AIDA

Article type: PAPER
2021Volume E104.DIssue 1 Pages 63-75
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MPP0003

JOURNAL FREE ACCESS

Show abstractHide abstract

The term “flash crowd” describes a situation in which a large number of users access a Web service simultaneously. Flash crowds, in particular, constitute a critical problem in e-commerce applications because of the potential for enormous economic damage as well as difficulty in management. Flash crowds can become more serious depending on users' behavior. When a flash crowd occurs, the delay in server response may cause users to retransmit their requests, thereby adding to the server load. In the present paper, we propose to use the psychological factors of the users for flash crowd mitigation. We aim to analyze changes in the user behavior by presenting feedback information. To evaluate the proposed method, we performed subject experiments and stress tests. Subject experiments showed that, by providing feedback information, the average number of request retransmissions decreased from 1.33 to 0.09, and the subjects that abandoned the service decreased from 81% to 0%. This confirmed that feedback information is effective in influencing user behavior in terms of abandonment and retransmission of requests. Stress tests showed that the average number of retransmissions decreased by 41%, and the proportion of abandonments decreased by 30%. These results revealed that the presentation of feedback information could mitigate the damage caused by flash crowds in real websites, although the effect is limited. The proposed method can be used in conjunction with conventional methods to handle flash crowds.

View full abstract

Download PDF (516K)
Analysis of Work Efficiency and Quality of Software Maintenance Using Cross-Company Dataset

Masateru TSUNODA, Akito MONDEN, Kenichi MATSUMOTO, Sawako OHIWA, Tomok ...

Article type: PAPER
2021Volume E104.DIssue 1 Pages 76-90
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MPP0004

JOURNAL FREE ACCESS

Show abstractHide abstract

Software maintenance is an important activity in the software lifecycle. Software maintenance does not only mean removing faults found after software release. Software needs extensions or modifications of its functions owing to changes in the business environment and software maintenance also refers to them. To help users and service suppliers benchmark work efficiency for software maintenance, and to clarify the relationships between software quality, work efficiency, and unit cost of staff, we used a dataset that includes 134 data points collected by the Economic Research Association in 2012, and analyzed the factors that affected the work efficiency of software maintenance. In the analysis, using a multiple regression model, we clarified the relationships between work efficiency and programming language and productivity factors. To analyze the influence to the quality, relationships of fault ratio was analyzed using correlation coefficients. The programming language and productivity factors affect work efficiency. Higher work efficiency and higher unit cost of staff do not affect the quality of software maintenance.

View full abstract

Download PDF (3202K)
Influence of Outliers on Estimation Accuracy of Software Development Effort

Kenichi ONO, Masateru TSUNODA, Akito MONDEN, Kenichi MATSUMOTO

Article type: PAPER
2021Volume E104.DIssue 1 Pages 91-105
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MPP0005

JOURNAL FREE ACCESS

Show abstractHide abstract

When applying estimation methods, the issue of outliers is inevitable. The extent of their influence has not been clarified, though several studies have evaluated outlier elimination methods. It is unclear whether we should always be sensitive to outliers, whether outliers should always be removed before estimation, and what amount of precaution is required for collecting project data. Therefore, the goal of this study is to illustrate a guideline that suggests how sensitively we should handle outliers. In the analysis, we experimentally add outliers to three datasets, to analyze their influence. We modified the percentage of outliers, their extent (e.g., we varied the actual effort from 100 to 200 person-hours when the extent was 100%), the variables including outliers (e.g., adding outliers to function points or effort), and the locations of outliers in a dataset. Next, the effort was estimated using these datasets. We used multiple linear regression analysis and analogy based estimation to estimate the development effort. The experimental results indicate that the influence of outliers on the estimation accuracy is non-trivial when the extent or percentage of outliers is considerable (i.e., 100% and 20%, respectively). In contrast, their influence is negligible when the extent and percentage are small (i.e., 50% and 10%, respectively). Moreover, in some cases, the linear regression analysis was less affected by outliers than analogy based estimation.

View full abstract

Download PDF (2645K)
What are the Features of Good Discussions for Shortening Bug Fixing Time?

Yuki NOYORI, Hironori WASHIZAKI, Yoshiaki FUKAZAWA, Hideyuki KANUKA, K ...

Article type: PAPER
2021Volume E104.DIssue 1 Pages 106-116
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MPP0007

JOURNAL FREE ACCESS

Show abstractHide abstract

Resource limitations require that bugs be resolved efficiently. The bug modification process uses bug reports, which are generated from service user reports. Developers read these reports and fix bugs. Developers discuss bugs by posting comments directly in bug reports. Although several studies have investigated the initial report in bug reports, few have researched the comments. Our research focuses on bug reports. Currently, everyone is free to comment, but the bug fixing time may be affected by how to comment. Herein we investigate the topic of comments in bug reports. Mixed topics do not affect the bug fixing time. However, the bug fixing time tends to be shorter when the discussion length of the phenomenon is short.

View full abstract

Download PDF (3194K)
Quantitative Evaluation of Software Component Behavior Discovery Approach

Cong LIU

Article type: LETTER
2021Volume E104.DIssue 1 Pages 117-120
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MPL0001

JOURNAL FREE ACCESS

Show abstractHide abstract

During the execution of software systems, their execution data can be recorded. By fully exploiting these data, software practitioners can discover behavioral models describing the actual execution of the underlying software system. The recorded unstructured software execution data may be too complex, spanning over several days, etc. Applying existing discovery techniques results in spaghetti-like models with no clear structure and no valuable information for comprehension. Starting from the observation that a software system is composed of a set of logical components, Liu et al. propose to decompose the software behavior discovery problem into smaller independent ones by discovering a behavioral model per component in [1]. However, the effectiveness of the proposed approach is not fully evaluated and compared with existing approaches. In this paper, we evaluate the quality (in terms of understandability/complexity) of discovered component behavior models in a quantitative manner. Based on evaluation, we show that this approach can reduce the complexity of the discovered model and gives a better understanding.

View full abstract

Download PDF (682K)
Relationship between Code Reading Speed and Programmers' Age

Yukasa MURAKAMI, Masateru TSUNODA, Masahide NAKAMURA

Article type: LETTER
2021Volume E104.DIssue 1 Pages 121-125
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020MPL0002

JOURNAL FREE ACCESS

Show abstractHide abstract

According to the aging society, it is getting more important for software industry to secure human resources including senior developers. To enhance the performance of senior developers, we should clarify the strengths and weaknesses of senior developers, and based on that, we should reconsider software engineering education and development support tools. To a greater or lesser extent, many cognitive abilities would be affected by aging, and we focus on the human memory as one of such abilities. We performed preliminary analysis based on the assumption. In the preliminary experiment, we prepared programs in which the influence of human memory performance (i.e., the number of variables remembered in the short-term memory) on reading speed is different, and measured time for subjects to understand the programs. As a result, we observed that the code reading speed of senior subjects was slow, when they read programs in which the influence of human memory performance is larger.

View full abstract

Download PDF (605K)

Regular Section

Native Build System for Unity Builds with Sophisticated Bundle Strategies

Takafumi KUBOTA, Kenji KONO

Article type: PAPER
Subject area: Software Engineering
2021Volume E104.DIssue 1 Pages 126-137
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020EDP7105

JOURNAL FREE ACCESS

Show abstractHide abstract

Build systems are essential tools for developing large software projects. Traditionally, build systems have been designed for high incremental-build performance. However, the longer build times of recent large C++ projects have imposed a requirement on build systems: i.e., unity builds. Unity builds are a build technique for speeding up sequential compilation of many source files by bundling multiple source files into one. Unity builds lead to a significant reduction in build time through removal of redundant parsing of shared header files. However, unity builds have a negative effect on incremental builds because each compiler task gets larger. Our previous study reported existing unity builds overlook many better bundle configurations that improve unity-build performance without increasing the incremental-build time. Motivated by the problem, we present a novel build system for better performance in unity builds. Our build system aims to achieve competitive unity-build performance in full builds with mitigating the negative effect on incremental builds. To accomplish this goal, our build system uses sophisticated bundle strategies developed on the basis of hints extracted from the preprocessed code of each source file. Thanks to the strategies, our build system finds better bundle configurations that improve both of the full-build performance and the incremental-build performance in unity builds. For example, in comparison with the state-of-the-art unity builds of WebKit, our build system improves build performance by 9% in full builds, by 39% in incremental builds, and by 23% in continuous builds that include both types of the builds.

View full abstract

Download PDF (1429K)
AdaLSH: Adaptive LSH for Solving c-Approximate Maximum Inner Product Search Problem

Kejing LU, Mineichi KUDO

Article type: PAPER
Subject area: Data Engineering, Web Information Systems
2021Volume E104.DIssue 1 Pages 138-145
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020EDP7132

JOURNAL FREE ACCESS

Show abstractHide abstract

Maximum inner product search (MIPS) problem has gained much attention in a wide range of applications. In order to overcome the curse of dimensionality in high-dimensional spaces, most of existing methods first transform the MIPS problem into another approximate nearest neighbor search (ANNS) problem and then solve it by Locality Sensitive Hashing (LSH). However, due to the error incurred by the transmission and incomprehensive search strategies, these methods suffer from low precision and have loose probability guarantees. In this paper, we propose a novel search method named Adaptive-LSH (AdaLSH) to solve MIPS problem more efficiently and more precisely. AdaLSH examines objects in the descending order of both norms and (the probably correctly estimated) cosine angles with a query object in support of LSH with extendable windows. Such extendable windows bring not only efficiency in searching but also the probability guarantee of finding exact or approximate MIP objects. AdaLSH gives a better probability guarantee of success than those in conventional algorithms, bringing less running times on various datasets compared with them. In addition, AdaLSH can even support exact MIPS with probability guarantee.

View full abstract

Download PDF (523K)
Fuzzy Output Support Vector Machine Based Incident Ticket Classification

Libo YANG

Article type: PAPER
Subject area: Artificial Intelligence, Data Mining
2021Volume E104.DIssue 1 Pages 146-151
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020EDP7044

JOURNAL FREE ACCESS

Show abstractHide abstract

Incident ticket classification plays an important role in the complex system maintenance. However, low classification accuracy will result in high maintenance costs. To solve this issue, this paper proposes a fuzzy output support vector machine (FOSVM) based incident ticket classification approach, which can be implemented in the context of both two-class SVMs and multi-class SVMs such as one-versus-one and one-versus-rest. Our purpose is to solve the unclassifiable regions of multi-class SVMs to output reliable and robust results by more fine-grained analysis. Experiments on both benchmark data sets and real-world ticket data demonstrate that our method has better performance than commonly used multi-class SVM and fuzzy SVM methods.

View full abstract

Download PDF (567K)
Integration of Experts' and Beginners' Machine Operation Experiences to Obtain a Detailed Task Model

Longfei CHEN, Yuichi NAKAMURA, Kazuaki KONDO, Dima DAMEN, Walterio MAY ...

Article type: PAPER
Subject area: Human-computer Interaction
2021Volume E104.DIssue 1 Pages 152-161
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2019EDP7180

JOURNAL FREE ACCESS

Show abstractHide abstract

We propose a novel framework for integrating beginners' machine operational experiences with those of experts' to obtain a detailed task model. Beginners can provide valuable information for operation guidance and task design; for example, from the operations that are easy or difficult for them, the mistakes they make, and the strategy they tend to choose. However, beginners' experiences often vary widely and are difficult to integrate directly. Thus, we consider an operational experience as a sequence of hand-machine interactions at hotspots. Then, a few experts' experiences and a sufficient number of beginners' experiences are unified using two aggregation steps that align and integrate sequences of interactions. We applied our method to more than 40 experiences of a sewing task. The results demonstrate good potential for modeling and obtaining important properties of the task.

View full abstract

Download PDF (3385K)
Presenting Walking Route for VR Zombie

Nobuchika SAKATA, Kohei KANAMORI, Tomu TOMINAGA, Yoshinori HIJIKATA, K ...

Article type: PAPER
Subject area: Human-computer Interaction
2021Volume E104.DIssue 1 Pages 162-173
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020EDP7084

JOURNAL FREE ACCESS

Show abstractHide abstract

The aim of this study is to calculate optimal walking routes in real space for users partaking in immersive virtual reality (VR) games without compromising their immersion. To this end, we propose a navigation system to automatically determine the route to be taken by a VR user to avoid collisions with surrounding obstacles. The proposed method is evaluated by simulating a real environment. It is verified to be capable of calculating and displaying walking routes to safely guide users to their destinations without compromising their VR immersion. In addition, while walking in real space while experiencing VR content, users can choose between 6-DoF (six degrees of freedom) and 3-DoF (three degrees of freedom). However, we expect users to prefer 3-DoF conditions, as they tend to walk longer while using VR content. In dynamic situations, when two pedestrians are added to a designated computer-generated real environment, it is necessary to calculate the walking route using moving body prediction and display the moving body in virtual space to preserve immersion.

View full abstract

Download PDF (3093K)
Rethinking the Rotation Invariance of Local Convolutional Features for Content-Based Image Retrieval

Longjiao ZHAO, Yu WANG, Jien KATO

Article type: PAPER
Subject area: Image Processing and Video Processing
2021Volume E104.DIssue 1 Pages 174-182
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020EDP7017

JOURNAL FREE ACCESS

Show abstractHide abstract

Recently, local features computed using convolutional neural networks (CNNs) show good performance to image retrieval. The local convolutional features obtained by the CNNs (LC features) are designed to be translation invariant, however, they are inherently sensitive to rotation perturbations. This leads to miss-judgements in retrieval tasks. In this work, our objective is to enhance the robustness of LC features against image rotation. To do this, we conduct a thorough experimental evaluation of three candidate anti-rotation strategies (in-model data augmentation, in-model feature augmentation, and post-model feature augmentation), over two kinds of rotation attack (dataset attack and query attack). In the training procedure, we implement a data augmentation protocol and network augmentation method. In the test procedure, we develop a local transformed convolutional (LTC) feature extraction method, and evaluate it over different network configurations. We end up a series of good practices with steady quantitative supports, which lead to the best strategy for computing LC features with high rotation invariance in image retrieval.

View full abstract

Download PDF (4015K)
Multi-Category Image Super-Resolution with Convolutional Neural Network and Multi-Task Learning

Kazuya URAZOE, Nobutaka KUROKI, Yu KATO, Shinya OHTANI, Tetsuya HIROSE ...

Article type: PAPER
Subject area: Image Processing and Video Processing
2021Volume E104.DIssue 1 Pages 183-193
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020EDP7054

JOURNAL FREE ACCESS

Show abstractHide abstract

This paper presents an image super-resolution technique using a convolutional neural network (CNN) and multi-task learning for multiple image categories. The image categories include natural, manga, and text images. Their features differ from each other. However, several CNNs for super-resolution are trained with a single category. If the input image category is different from that of the training images, the performance of super-resolution is degraded. There are two possible solutions to manage multi-categories with conventional CNNs. The first involves the preparation of the CNNs for every category. This solution, however, requires a category classifier to select an appropriate CNN. The second is to learn all categories with a single CNN. In this solution, the CNN cannot optimize its internal behavior for each category. Therefore, this paper presents a super-resolution CNN architecture for multiple image categories. The proposed CNN has two parallel outputs for a high-resolution image and a category label. The main CNN for the high-resolution image is a normal three convolutional layer-architecture, and the sub neural network for the category label is branched out from its middle layer and consists of two fully-connected layers. This architecture can simultaneously learn the high-resolution image and its category using multi-task learning. The category information is used for optimizing the super-resolution. In an applied setting, the proposed CNN can automatically estimate the input image category and change the internal behavior. Experimental results of 2× image magnification have shown that the average peak signal-to-noise ratio for the proposed method is approximately 0.22 dB higher than that for the conventional super-resolution with no difference in processing time and parameters. We have ensured that the proposed method is useful when the input image category is varying.

View full abstract

Download PDF (4102K)
REAP: A Method for Pruning Convolutional Neural Networks with Performance Preservation

Koji KAMMA, Toshikazu WADA

Article type: PAPER
Subject area: Biocybernetics, Neurocomputing
2021Volume E104.DIssue 1 Pages 194-202
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020EDP7049

JOURNAL FREE ACCESS

Show abstractHide abstract

This paper presents a pruning method, Reconstruction Error Aware Pruning (REAP), to reduce the redundancy of convolutional neural network models for accelerating their inference. In REAP, we have the following steps: 1) Prune the channels whose outputs are redundant and can be reconstructed from the outputs of other channels in each convolutional layer; 2) Update the weights of the remaining channels by least squares method so as to compensate the error caused by pruning. This is how we compress and accelerate the models that are initially large and slow with little degradation. The ability of REAP to maintain the model performances saves us lots of time and labors for retraining the pruned models. The challenge of REAP is the computational cost for selecting the channels to be pruned. For selecting the channels, we need to solve a huge number of least squares problems. We have developed an efficient algorithm based on biorthogonal system to obtain the solutions of those least squares problems. In the experiments, we show that REAP can conduct pruning with smaller sacrifice of the model performances than several existing methods including the previously state-of-the-art one.

View full abstract

Download PDF (697K)
A Novel Robust Carrier Activation Selection Scheme for OFDM-IM System with Power Allocation

Gui-geng LU, Hai-bin WAN, Tuan-fa QIN, Shu-ping DANG, Zheng-qiang WANG

Article type: LETTER
Subject area: Fundamentals of Information Systems
2021Volume E104.DIssue 1 Pages 203-207
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020EDL8052

JOURNAL FREE ACCESS

Show abstractHide abstract

In this paper, we investigate the subcarriers combination selection and the subcarriers activation of OFDM-IM system. Firstly, we propose an algorithm to solve the problem of subcarriers combination selection based on the transmission rate and diversity gain. Secondly, we ropose a more concise algorithm to solve the problem of power allocation and carrier combination activation probability under this combination to improve system capacity. Finally, we verify the robustness of the algorithm and the superiority of the system scheme in the block error rate (BLER) and system capacity by numerical results.

View full abstract

Download PDF (1306K)
An Empirical Evaluation of Coverage Criteria for FBD Simulation Using Mutation Analysis

Dong-Ah LEE, Eui-Sub KIM, Junbeom YOO

Article type: LETTER
Subject area: Software Engineering
2021Volume E104.DIssue 1 Pages 208-211
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020EDL8090

JOURNAL FREE ACCESS

Show abstractHide abstract

Two structural coverage criteria, toggle coverage and modified condition/decision coverage, for FBD (Function Block Diagram) simulation are proposed in the previous study. This paper empirically evaluates how effective the coverage criteria are to detect faults in an FBD program using the mutation analysis.

View full abstract

Download PDF (219K)
Practical Video Authentication Scheme to Analyze Software Characteristics

Wan Yeon LEE

Article type: LETTER
Subject area: Data Engineering, Web Information Systems
2021Volume E104.DIssue 1 Pages 212-215
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020EDL8081

JOURNAL FREE ACCESS

Show abstractHide abstract

We propose a video authentication scheme to verify whether a given video file is recorded by a camera device or touched by a video editing tool. The proposed scheme prepares software characteristics of camera devices and video editing tools in advance, and compares them with the metadata of the given video file. Through practical implementation, we show that the proposed scheme has benefits of fast analysis time, high accuracy and full automation.

View full abstract

Download PDF (261K)
A Novel Multi-Knowledge Distillation Approach

Lianqiang LI, Kangbo SUN, Jie ZHU

Article type: LETTER
Subject area: Artificial Intelligence, Data Mining
2021Volume E104.DIssue 1 Pages 216-219
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020EDL8080

JOURNAL FREE ACCESS

Show abstractHide abstract

Knowledge distillation approaches can transfer information from a large network (teacher network) to a small network (student network) to compress and accelerate deep neural networks. This paper proposes a novel knowledge distillation approach called multi-knowledge distillation (MKD). MKD consists of two stages. In the first stage, it employs autoencoders to learn compact and precise representations of the feature maps (FM) from the teacher network and the student network, these representations can be treated as the essential of the FM, i.e., EFM. In the second stage, MKD utilizes multiple kinds of knowledge, i.e., the magnitude of individual sample's EFM and the similarity relationships among several samples' EFM to enhance the generalization ability of the student network. Compared with previous approaches that employ FM or the handcrafted features from FM, the EFM learned from autoencoders can be transferred more efficiently and reliably. Furthermore, the rich information provided by the multiple kinds of knowledge guarantees the student network to mimic the teacher network as closely as possible. Experimental results also show that MKD is superior to the-state-of-arts.

View full abstract

Download PDF (486K)
Spatio-Temporal Self-Attention Weighted VLAD Neural Network for Action Recognition

Shilei CHENG, Mei XIE, Zheng MA, Siqi LI, Song GU, Feng YANG

Article type: LETTER
Subject area: Biocybernetics, Neurocomputing
2021Volume E104.DIssue 1 Pages 220-224
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2020EDL0002

JOURNAL FREE ACCESS

Show abstractHide abstract

As characterizing videos simultaneously from spatial and temporal cues have been shown crucial for video processing, with the shortage of temporal information of soft assignment, the vector of locally aggregated descriptor (VLAD) should be considered as a suboptimal framework for learning the spatio-temporal video representation. With the development of attention mechanisms in natural language processing, in this work, we present a novel model with VLAD following spatio-temporal self-attention operations, named spatio-temporal self-attention weighted VLAD (ST-SAWVLAD). In particular, sequential convolutional feature maps extracted from two modalities i.e., RGB and Flow are receptively fed into the self-attention module to learn soft spatio-temporal assignments parameters, which enabling aggregate not only detailed spatial information but also fine motion information from successive video frames. In experiments, we evaluate ST-SAWVLAD by using competitive action recognition datasets, UCF101 and HMDB51, the results shcoutstanding performance. The source code is available at:https://github.com/badstones/st-sawvlad.

View full abstract

Download PDF (543K)

Errata

Erratum: A Multiobjective Optimization Dispatch Method of Wind-Thermal Power System [IEICE Transactions on Information and Systems Vol.E103.D (2020), No.12 pp.2549-2558]

Xiaoxuan GUO, Renxi GONG, Haibo BAO, Zhenkun LU

2021Volume E104.DIssue 1 Pages 225_e1
Published: January 01, 2021
Released on J-STAGE: January 01, 2021

DOIhttps://doi.org/10.1587/transinf.2021EDe0001

JOURNAL FREE ACCESS

Download PDF (24K)

Register with J-STAGE for free!