IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E103.D, Issue 1
Special Section on Enriched Multimedia - Application of Multimedia Technology and Its Security -
  • Keiichi IWAMURA
    2020 Volume E103.D Issue 1 Pages 1
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS
    Download PDF (68K)
  • KokSheik WONG, ChuanSheng CHAN, AprilPyone MAUNGMAUNG
    Article type: INVITED PAPER
    2020 Volume E103.D Issue 1 Pages 2-10
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    With the massive use of video in every aspect of our daily lives, managing videos is crucial and demanding. The rich literature on data embedding has proven its viability in managing as well as enriching videos and other multimedia content, but conventional methods are designed to operate in the media/compression layer. In this work, the synchronization between the audio-video and subtitle tracks within an MP4 format container is manipulated to insert data. Specifically, the data are derived from the statistics of the audio samples and video frames, and they serve as the authentication data for verification purposes. When needed, the inserted authentication data can be extracted and compared against the information computed from the received audio samples and video frames. The proposed method is lightweight because simple statistics, i.e., ‘0’ and ‘1’ at the bit stream level, are treated as the authentication data. Furthermore, unlike conventional MP4 container format-based data insertion techniques, the bit stream size remains unchanged before and after data insertion using the proposed method. The proposed authentication method can also be deployed jointly with any existing authentication technique for audio/video as long as these media can be multiplexed into a single bit stream and contained within an MP4 container. Experiments are carried out to verify the basic functionality of the proposed technique as an authentication method.

    Download PDF (1426K)
  • Mariko FUJII, Tomoharu SHIBUYA
    Article type: PAPER
    2020 Volume E103.D Issue 1 Pages 11-24
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    (k,n)-visual secret sharing scheme ((k,n)-VSSS) is a method to divide a secret image into n images called shares that enable us to restore the original image by only stacking at least k of them without any complicated computations. In this paper, we consider (2,2)-VSSS to share two secret images at the same time only by two shares, and investigate the methods to improve the quality of decoded images. More precisely, we consider (2,2)-VSSS in which the first secret image is decoded by stacking those two shares in the usual way, while the second one is done by stacking those two shares in the way that one of them is used reversibly. Since the shares must have some subpixels that inconsistently correspond to pixels of the secret images, the decoded pixels do not agree with the corresponding pixels of the secret images, which causes serious degradation of the quality of decoded images. To reduce such degradation, we propose several methods to construct shares that utilize 8-neighbor Laplacian filter and halftoning. Then we show that the proposed methods can effectively improve the quality of decoded images. Moreover, we demonstrate that the proposed methods can be naturally extended to (2,2)-VSSS for RGB images.

    Download PDF (3634K)
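As background for the abstract above, the following is a minimal sketch of the classic single-secret (2,2)-VSSS with two subpixels per pixel, where stacking two shares is modelled as a pixel-wise OR. It is not the two-secret construction described in the paper (no reversed stacking, Laplacian filter, or halftoning); the names `make_shares`, `stack`, and `black_counts` are hypothetical helpers introduced only for illustration.

```python
import random

# Two complementary subpixel patterns; 1 = black subpixel, 0 = transparent.
PATTERNS = [(0, 1), (1, 0)]

def make_shares(secret, rng):
    """Split a binary image (0 = white, 1 = black) into two shares.

    Each secret pixel expands to two subpixels per share. For a white
    pixel both shares get the same pattern; for a black pixel the second
    share gets the complementary pattern.
    """
    share1, share2 = [], []
    for row in secret:
        r1, r2 = [], []
        for pixel in row:
            p = rng.choice(PATTERNS)
            q = p if pixel == 0 else (1 - p[0], 1 - p[1])
            r1.extend(p)
            r2.extend(q)
        share1.append(r1)
        share2.append(r2)
    return share1, share2

def stack(s1, s2):
    """Stack two transparencies: a subpixel is black if either one is black."""
    return [[a | b for a, b in zip(r1, r2)] for r1, r2 in zip(s1, s2)]

def black_counts(img):
    """Number of black subpixels for each original pixel position."""
    return [[row[2 * i] + row[2 * i + 1] for i in range(len(row) // 2)]
            for row in img]

rng = random.Random(1)
secret = [[0, 1], [1, 0]]
share1, share2 = make_shares(secret, rng)
stacked = stack(share1, share2)
```

In this sketch a white secret pixel decodes to one black subpixel out of two and a black pixel to two out of two, while each share on its own always shows exactly one black subpixel per pixel and therefore reveals nothing about the secret.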
  • Kenta IIDA, Hitoshi KIYA
    Article type: PAPER
    2020 Volume E103.D Issue 1 Pages 25-32
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    We propose an image identification scheme for double-compressed encrypted JPEG images that aims to identify encrypted JPEG images that are generated from an original JPEG image. To store images without any visual sensitive information on photo sharing services, encrypted JPEG images are generated by using a block-scrambling-based encryption method that has been proposed for Encryption-then-Compression systems with JPEG compression. In addition, feature vectors robust against JPEG compression are extracted from encrypted JPEG images. The use of the image encryption and feature vectors allows us to identify encrypted images recompressed multiple times. Moreover, the proposed scheme is designed to identify images re-encrypted with different keys. The results of a simulation show that the identification performance of the scheme is high even when images are recompressed and re-encrypted.

    Download PDF (2336K)
  • Ippei HAMAMOTO, Masaki KAWAMURA
    Article type: PAPER
    2020 Volume E103.D Issue 1 Pages 33-41
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    We have developed a digital watermarking method that uses neural networks to learn embedding and extraction processes that are robust against rotation and JPEG compression. The proposed neural networks consist of a stego-image generator, a watermark extractor, a stego-image discriminator, and an attack simulator. The attack simulator consists of a rotation layer and an additive noise layer, which simulate the rotation attack and the JPEG compression attack, respectively. The stego-image generator can learn embedding that is robust against these attacks, and the watermark extractor can extract watermarks without rotation synchronization. The quality of the stego-images can be improved by using the stego-image discriminator, which is a type of adversarial network. We evaluated the robustness of the watermarks and the image quality and found that, using the proposed method, high-quality stego-images could be generated and the neural networks could be trained to embed and extract watermarks that are robust against rotation and JPEG compression attacks. We also showed that the robustness and image quality can be adjusted by changing the noise strength in the noise layer.

    Download PDF (1878K)
  • Ryota KAMINISHI, Haruna MIYAMOTO, Sayaka SHIOTA, Hitoshi KIYA
    Article type: PAPER
    2020 Volume E103.D Issue 1 Pages 42-49
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    This study evaluates the effects of some non-learning blind bandwidth extension (BWE) methods on state-of-the-art automatic speaker verification (ASV) systems. Recently, a non-linear bandwidth extension (N-BWE) method has been proposed as a blind, non-learning, and light-weight BWE approach. Other non-learning BWEs have also been developed in recent years. For ASV evaluations, most data available to train ASV systems is narrowband (NB) telephone speech. Meanwhile, wideband (WB) data have been used to train the state-of-the-art ASV systems, such as i-vector, d-vector, and x-vector. This can cause sampling rate mismatches when all datasets are used. In this paper, we investigate the influence of sampling rate mismatches in the x-vector-based ASV systems and how non-learning BWE methods perform against them. The results showed that the N-BWE method improved the equal error rate (EER) on ASV systems based on the x-vector when the mismatches were present. We researched the relationship between objective measurements and EERs. Consequently, the N-BWE method produced the lowest EERs on both ASV systems and obtained the lower RMS-LSD value and the higher STOI score.

    Download PDF (1212K)
  • Takayuki NAKACHI, Yukihiro BANDOH, Hitoshi KIYA
    Article type: PAPER
    2020 Volume E103.D Issue 1 Pages 50-58
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    In this paper, we propose secure dictionary learning based on a random unitary transform for sparse representation. Currently, edge cloud computing is spreading to many application fields including services that use sparse coding. This situation raises many new privacy concerns. Edge cloud computing poses several serious issues for end users, such as unauthorized use and leak of data, and privacy failures. The proposed scheme provides practical MOD and K-SVD dictionary learning algorithms that allow computation on encrypted signals. We prove, theoretically, that the proposal has exactly the same dictionary learning estimation performance as the non-encrypted variant of MOD and K-SVD algorithms. We apply it to secure image modeling based on an image patch model. Finally, we demonstrate its performance on synthetic data and a secure image modeling application for natural images.

    Download PDF (2300K)
  • Jin S. SEO
    Article type: LETTER
    2020 Volume E103.D Issue 1 Pages 59-62
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    To enhance cover song identification accuracy on a large music archive, a song-level feature summarization method using a multi-scale representation is proposed. The chroma n-grams are extracted at multiple scales to cope with both global and local tempo changes. We derive an index from the extracted n-grams by clustering to reduce storage and computation for DB search. Experiments on widely used music datasets confirmed that the proposed method achieves state-of-the-art accuracy while reducing the cost of cover song search.

    Download PDF (236K)
  • Reiya NAMIKAWA, Masashi UNOKI
    Article type: LETTER
    2020 Volume E103.D Issue 1 Pages 63-66
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    We propose a method of non-blind speech watermarking based on direct spread spectrum (DSS) using a linear prediction scheme to solve sound distortion due to spread spectrum. Results of evaluation simulations revealed that the proposed method had much lower sound-quality distortion than the DSS method while having almost the same bit error ratios (BERs) against various attacks as the DSS method.

    Download PDF (830K)
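To make the baseline in the abstract above concrete, here is a minimal sketch of plain spread-spectrum embedding with non-blind detection, assuming one pseudo-noise (PN) chip sequence shared by embedder and detector. It omits the linear prediction scheme that the letter proposes, and the function names, the `alpha` strength, and the frame layout are all assumptions made for illustration.

```python
import random

def pn_sequence(length, seed):
    """Pseudo-noise sequence of +1/-1 chips derived from a shared seed."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]

def embed(host, bits, pn, alpha=0.01):
    """Add one PN-modulated frame per bit to the host signal."""
    n = len(pn)
    marked = list(host)
    for i, bit in enumerate(bits):
        sign = 1 if bit else -1
        for j in range(n):
            marked[i * n + j] += alpha * sign * pn[j]
    return marked

def extract_non_blind(marked, host, n_bits, pn):
    """Non-blind detection: subtract the host, then correlate with the PN."""
    n = len(pn)
    bits = []
    for i in range(n_bits):
        corr = sum((marked[i * n + j] - host[i * n + j]) * pn[j]
                   for j in range(n))
        bits.append(1 if corr > 0 else 0)
    return bits

# Toy usage: a random stand-in "speech" signal carrying five bits.
pn = pn_sequence(64, seed=42)
payload = [1, 0, 1, 1, 0]
host_rng = random.Random(7)
host = [host_rng.uniform(-1.0, 1.0) for _ in range(len(payload) * len(pn))]
marked = embed(host, payload, pn)
recovered = extract_non_blind(marked, host, len(payload), pn)
```

Because the detector has access to the host signal, the correlation in each frame reduces to `alpha` times the frame length with the sign of the embedded bit, which is why the sketch recovers the payload exactly when no attack is applied.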
  • Huyen T. T. TRAN, Trang H. HOANG, Phu N. MINH, Nam PHAM NGOC, Truong C ...
    Article type: LETTER
    2020 Volume E103.D Issue 1 Pages 67-70
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    Thanks to their ability to bring immersive experiences to users, Virtual Reality (VR) technologies have been gaining popularity in recent years. A key component in VR systems is omnidirectional content, which can provide 360-degree views of scenes. However, at a given time, only a portion of the full omnidirectional content, called the viewport, is displayed according to the user's current viewing direction. In this work, we first develop Weighted-Viewport PSNR (W-VPSNR), an objective quality metric for quality assessment of omnidirectional content. The proposed metric takes into account the foveation feature of the human visual system. Then, we build a subjective database consisting of 72 stimuli with spatially varying viewport quality. Using this database, an evaluation of the proposed metric and four conventional metrics is conducted. Experimental results show that the W-VPSNR metric correlates well with the mean opinion scores (MOS) and outperforms the conventional metrics. It is also found that the conventional metrics do not perform well for omnidirectional content.

    Download PDF (258K)
  • Sufen ZHAO, Rong PENG, Meng ZHANG, Liansheng TAN
    Article type: PAPER
    Subject area: Fundamentals of Information Systems
    2020 Volume E103.D Issue 1 Pages 71-84
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    Recommending collaborators to scholars in academic social networks is of great importance, as it can lead to more scientific research results. To address the problem of data sparsity in co-author recommendation in academic social networks, a novel recommendation algorithm named HeteroRWR (Heterogeneous Random Walk with Restart) is proposed. Unlike the basic Random Walk with Restart (RWR) model, which only walks in homogeneous networks, HeteroRWR performs multiple random walks in a heterogeneous network that integrates a citation network and a co-authorship network to mine the k most valuable co-authors for target users. By introducing the citation network, the HeteroRWR algorithm can find more suitable candidate authors when the co-authorship network is extremely sparse. The recommended candidates not only have high topic similarity with target users, but also have good community centrality. Analyses of the convergence and time efficiency of the proposed approach are presented. Extensive experiments have been conducted on the DBLP and CiteSeerX datasets. Experimental results demonstrate that HeteroRWR outperforms state-of-the-art baseline methods in terms of precision and recall rate, even when an incomplete citation dataset is incorporated.

    Download PDF (1086K)
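The basic RWR model mentioned in the abstract above can be sketched as a power iteration on a small homogeneous graph. This is only the baseline, not HeteroRWR: the heterogeneous citation/co-authorship network is not modelled, and the toy graph, the `restart` value, and the function name are assumptions chosen for illustration.

```python
def random_walk_with_restart(adj, seed, restart=0.15, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that restarts at `seed`.

    adj maps each node to its list of neighbours (undirected graph with
    no isolated nodes, so probability mass is conserved at every step).
    """
    nodes = sorted(adj)
    p = {v: 1.0 if v == seed else 0.0 for v in nodes}
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to the seed.
        nxt = {v: (restart if v == seed else 0.0) for v in nodes}
        # Otherwise it moves to a uniformly chosen neighbour.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Toy co-authorship graph (hypothetical authors a..d).
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
scores = random_walk_with_restart(graph, "a")
# Rank every other node by proximity to the target author "a".
ranked = sorted((v for v in graph if v != "a"), key=scores.get, reverse=True)
top_k = ranked[:2]
```

Taking the first k entries of `ranked` gives a top-k candidate list in the spirit of the recommendation task; on this toy graph the well-connected node `c` ranks ahead of `b`, and the peripheral node `d` ranks last.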
  • Shanshan JIAO, Zhisong PAN, Yutian CHEN, Yunbo LI
    Article type: PAPER
    Subject area: Fundamentals of Information Systems
    2020 Volume E103.D Issue 1 Pages 85-92
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    As one of the most popular intelligent optimization algorithms, Simulated Annealing (SA) faces two key problems, the generation of perturbation solutions and the control strategy of the outer loop (cooling schedule). In this paper, we introduce the Gaussian Cloud model to solve both problems and propose a novel cloud annealing algorithm. Its basic idea is to use the Gaussian Cloud model with decreasing numerical character He (Hyper-entropy) to generate new solutions in the inner loop, while He essentially indicates a heuristic control strategy to combine global random search of the outer loop and local tuning search of the inner loop. Experimental results in function optimization problems (i.e. single-peak, multi-peak and high dimensional functions) show that, compared with the simple SA algorithm, the proposed cloud annealing algorithm will lead to significant improvement on convergence and the average value of obtained solutions is usually closer to the optimal solution.

    Download PDF (587K)
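For orientation, the two loops named in the abstract above appear in a plain simulated annealing routine as follows. This sketch uses an ordinary Gaussian perturbation whose width is tied to the temperature; it does not implement the Gaussian Cloud model or the hyper-entropy He, and the cooling rate, step counts, and function name are assumptions.

```python
import math
import random

def simulated_annealing(f, x0, t0=1.0, cooling=0.95,
                        outer_steps=100, inner_steps=20, seed=0):
    """Minimise a one-dimensional function f starting from x0."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    t = t0
    for _ in range(outer_steps):        # outer loop: cooling schedule
        sigma = max(t, 1e-3)            # assumption: step width follows temperature
        for _ in range(inner_steps):    # inner loop: perturb, then accept or reject
            cand = x + rng.gauss(0.0, sigma)
            fc = f(cand)
            # Metropolis rule: always accept improvements, sometimes accept worse.
            if fc <= fx or rng.random() < math.exp((fx - fc) / t):
                x, fx = cand, fc
                if fx < best_f:
                    best_x, best_f = x, fx
        t *= cooling
    return best_x, best_f

def sphere(x):
    """Single-peak test function with its minimum at x = 0."""
    return x * x

start = 5.0
best_x, best_f = simulated_annealing(sphere, start)
```

The routine tracks the best solution seen so far, so the returned value can never be worse than the starting point; how close it gets to the optimum depends on the schedule parameters, which are not tuned here.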
  • Keiichi KANEKO
    Article type: PAPER
    Subject area: Fundamentals of Information Systems
    2020 Volume E103.D Issue 1 Pages 93-100
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    In this paper, we extend the notion of bijective connection graphs to introduce directed bijective connection graphs. We propose algorithms that solve the node-to-set node-disjoint paths problem and the node-to-node node-disjoint paths problem in a directed bijective connection graph. The time complexities of the algorithms are both O(n^4), and the maximum path lengths are both 2n-1.

    Download PDF (781K)
  • Ryuta KAWANO, Ryota YASUDO, Hiroki MATSUTANI, Michihiro KOIBUCHI, Hide ...
    Article type: PAPER
    Subject area: Computer System
    2020 Volume E103.D Issue 1 Pages 101-110
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    Recently proposed irregular networks can reduce the latency for both on-chip and off-chip systems with a large number of computing nodes and thus can improve the performance of parallel applications. However, these networks usually suffer from deadlocks in routing packets when using a naive minimal path routing algorithm. To solve this problem, we focus on a recently proposed theory that generalizes the turn model to maintain network performance while guaranteeing deadlock freedom. Applying its theorems to arbitrary topologies, including fully irregular networks, remains a challenge. In this paper, we advance the theorems to completely general ones. Moreover, we provide a feasible implementation of a deadlock-free routing method based on our advanced theorem. Experimental results show that the routing method based on our proposed theorem can improve the network throughput by up to 138% compared to a conventional deterministic minimal routing method. Moreover, when utilized as the escape path in Duato's protocol, it can improve the throughput by up to 26.3% compared with the conventional up*/down* routing.

    Download PDF (1654K)
  • Takashi YOKOTA, Kanemitsu OOTSU, Takeshi OHKAWA
    Article type: PAPER
    Subject area: Computer System
    2020 Volume E103.D Issue 1 Pages 111-129
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    Inter-node communication is essential in parallel computation. The performance of parallel processing depends on the efficiency of both computation and communication; thus, the communication cost is not negligible. A parallel application program involves a logical communication structure that is determined by the interchange of data between computation nodes. Sometimes the logical communication structure does not match that of the real parallel machine, and this mismatch results in large communication costs. This paper addresses the node-mapping problem, which rearranges the logical positions of nodes so that the degree of mismatch is decreased. This paper assumes that parallel programs execute one or more collective communications that follow specific traffic patterns. An appropriate node mapping achieves high communication performance. This paper proposes a strong heuristic method for solving the node-mapping problem and adapts the method to a genetic algorithm. Evaluation results reveal that the proposed method achieves considerably high performance; it achieves 8.9 (4.9) times speed-up on average in single-(two-)traffic-pattern cases in 32×32 torus networks. Specifically, for some traffic patterns in small-scale networks, the proposed method finds theoretically optimized solutions. Furthermore, this paper discusses in depth various issues in the proposed method that employs a genetic algorithm, such as the population of genes, the number of generations, and traffic patterns. This paper also discusses applicability to large-scale systems for future practical use.

    Download PDF (2342K)
  • Shun IMAI, Akihiro INOKUCHI
    Article type: PAPER
    Subject area: Data Engineering, Web Information Systems
    2020 Volume E103.D Issue 1 Pages 130-141
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    This paper proposes a method for searching for graphs in the database which are contained as subgraphs by a given query. In the proposed method, the search index does not require any knowledge of the query set or the frequent subgraph patterns. In conventional techniques, enumerating and selecting frequent subgraph patterns is computationally expensive, and the distribution of the query set must be known in advance. Subsequent changes to the query set require the frequent patterns to be selected again and the index to be reconstructed. The proposed method overcomes these difficulties through graph coding, using a tree structured index that contains infrequent subgraph patterns in the shallow part of the tree. By traversing this code tree, we are able to rapidly determine whether multiple graphs in the database contain subgraphs that match the query, producing a powerful pruning or filtering effect. Furthermore, the filtering and verification steps of the graph search can be conducted concurrently, rather than requiring separate algorithms. As the proposed method does not require the frequent subgraph patterns and the query set, it is significantly faster than previous techniques; this independence from the query set also means that there is no need to reconstruct the search index when the query set changes. A series of experiments using a real-world dataset demonstrate the efficiency of the proposed method, achieving a search speed several orders of magnitude faster than the previous best.

    Download PDF (1211K)
  • Rei TAKAMI, Yasufumi TAKAMA
    Article type: PAPER
    Subject area: Human-computer Interaction
    2020 Volume E103.D Issue 1 Pages 142-151
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    This paper proposes a visual analytics (VA) interface for time-series data that addresses problems arising from the properties of such data: a collision between interaction and animation on the temporal aspect, a collision of interaction between the temporal and spatial aspects, and the trade-off among exploration accuracy, efficiency, and scalability between different visualization methods. To solve these problems, the proposed VA interface handles temporal and spatial changes uniformly. Trajectories show temporal changes spatially, and their direct manipulation enables users to examine the relationship among objects either at a certain time point or throughout the entire time range. The usefulness of the proposed interface is demonstrated through experiments.

    Download PDF (2751K)
  • Weiqing TONG, Haisheng LI, Guoyue CHEN
    Article type: PAPER
    Subject area: Pattern Recognition
    2020 Volume E103.D Issue 1 Pages 152-162
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    Blob detection is an important part of computer vision and a special case of region detection with important applications in image analysis. In this paper, the dilation operator of standard mathematical morphology is first extended to the order dilation operator of soft morphology, three soft morphological filters are designed using this operator, and a novel blob detection algorithm called SMBD is proposed on that basis. Both theoretically and experimentally, SMBD is shown to offer better noise robustness and blob shape detection than similar blob filters based on mathematical morphology, such as Quoit and N-Quoit. Additionally, SMBD is compared in different classes with LoG and DoH, the most commonly used blob detectors, and it also achieves very good results.

    Download PDF (3158K)
  • Jianyong DUAN, Yuwei WU, Mingli WU, Hao WANG
    Article type: PAPER
    Subject area: Natural Language Processing
    2020 Volume E103.D Issue 1 Pages 163-169
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    The similarity of words extracted from a rich text relation network is the main way to calculate semantic similarity. Complex relational information and text content in Wikipedia, Community Question Answering sites, and social networks provide an abundant corpus for semantic similarity calculation. However, most existing research has focused on only a single relationship. In this paper, we propose a semantic similarity calculation model that integrates multiple kinds of relational information and maps the multiple relationships to the same semantic space by learning a representing matrix and a semantic matrix, in order to improve the accuracy of semantic similarity calculation. In experiments, we confirm that the method, which integrates many kinds of relationships, improves the accuracy of semantic calculation compared with other semantic calculation methods.

    Download PDF (387K)
  • Sooyong JEONG, Ajay Kumar JHA, Youngsul SHIN, Woo Jin LEE
    Article type: LETTER
    Subject area: Software Engineering
    2020 Volume E103.D Issue 1 Pages 170-173
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    Embedded software developers assume the behavior of the environment when specifications are not available. However, developers may assume the behavior incorrectly, which may result in critical faults in the system. Therefore, it is important to detect the faults caused by incorrect assumptions. In this letter, we propose a log-based testing approach to detect the faults. First, we create a UML behavioral model to represent the assumed behavior of the environment, which is then transformed into a state model. Next, we extract the actual behavior of the environment from a log, which is then incorporated in the state model, resulting in a state model that represents both assumed and actual behaviors. Existing testing techniques based on the state model can be used to generate test cases from our state model to detect faults.

    Download PDF (1056K)
  • Shaojie ZHU, Lei ZHANG, Bailong LIU, Shumin CUI, Changxing SHAO, Yun L ...
    Article type: LETTER
    Subject area: Artificial Intelligence, Data Mining
    2020 Volume E103.D Issue 1 Pages 174-176
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    Multi-modal semantic trajectory prediction has become a new challenge due to the rapid growth of multi-modal semantic trajectories with text messages. Traditional RNN trajectory prediction methods have the following problems when processing multi-modal semantic trajectories. The distribution of multi-modal trajectory samples shifts gradually during training, which leads to difficult convergence and long training time. Moreover, each modal feature shifts in a different direction, which produces multiple distributions in the dataset. To solve these problems, MNERM (Mode Normalization Enhanced Recurrent Model) for multi-modal semantic trajectories is proposed. MNERM embeds multiple modal features together and combines them with an LSTM network to capture the long-term dependency of trajectories. In addition, it introduces a Mode Normalization mechanism to normalize samples with multiple means and variances, so that each normalized distribution falls into the active region of the activation function, improving prediction efficiency while greatly improving training speed. Experiments on a real dataset show that, compared with SERM, MNERM reduces the sensitivity to the learning rate, improves the training speed by 9.120 times, increases HR@1 by 0.03, and reduces the ADE by 120 meters.

    Download PDF (529K)
  • Huaizhe ZHOU, Haihe BA, Yongjun WANG, Tie HONG
    Article type: LETTER
    Subject area: Artificial Intelligence, Data Mining
    2020 Volume E103.D Issue 1 Pages 177-180
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    The arms race between offense and defense in the cloud impels the innovation of techniques for monitoring attacks and unauthorized activities. The promising technique of virtual machine introspection (VMI) becomes prevalent for its tamper-resistant capability. However, some elaborate exploitations are capable of invalidating VMI-based tools by breaking the assumption of a trusted guest kernel. To achieve a more reliable and robust introspection, we introduce a practical approach to monitor and detect attacks that attempt to subvert VMI in this paper. Our approach combines supervised machine learning and hardware architectural events to identify those malicious behaviors which are targeted at VMI techniques. To demonstrate the feasibility, we implement a prototype named HyperMon on the Xen hypervisor. The results of our evaluation show the effectiveness of HyperMon in detecting malicious behaviors with an average accuracy of 90.51% (AUC).

    Download PDF (249K)
  • Pengyu WANG, Hongqing ZHU, Ning CHEN
    Article type: LETTER
    Subject area: Image Processing and Video Processing
    2020 Volume E103.D Issue 1 Pages 181-185
    Published: January 01, 2020
    Released on J-STAGE: January 01, 2020
    JOURNAL FREE ACCESS

    A novel superpixel segmentation approach driven by a uniform mixture model with spatial constraints (UMMS) is proposed. Under this algorithm, each observation, i.e., each pixel, is first represented as a five-dimensional vector consisting of its colour in CIELAB space and its position information. We then define a new uniform distribution by adding pixel position, so that this distribution can describe each pixel in the input image. A weighted 1-norm is applied to the difference between pixels and the mean to control the compactness of superpixels. In addition, an effective parameter estimation scheme is introduced to reduce computational complexity. Specifically, the invariant prior probability and parameter range restrict the locality of superpixels, and the robust mean optimization technique ensures the accuracy of superpixel boundaries. Finally, each defined uniform distribution is associated with a superpixel, and the proposed UMMS successfully implements superpixel segmentation. Experiments on the BSDS500 dataset verify that UMMS outperforms most state-of-the-art approaches in terms of segmentation accuracy, regularity, and speed.

    Download PDF (4832K)