IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E103.D, Issue 2
Showing 1-33 of 33 articles in the selected issue
Special Section on Security, Privacy, Anonymity and Trust in Cyberspace Computing and Communications
  • Guojun WANG
    2020 Volume E103.D Issue 2 Pages 186-187
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access
  • Yudi ZHANG, Debiao HE, Xinyi HUANG, Ding WANG, Kim-Kwang Raymond CHOO, ...
    Article type: INVITED PAPER
    2020 Volume E103.D Issue 2 Pages 188-195
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Unlike black-box cryptography, an adversary in a white-box security model has full access to the implementation of the cryptographic algorithm. Thus, white-box implementation of cryptographic algorithms is more practical. Nevertheless, no white-box implementation of public key cryptography has appeared in recent years. In this paper, we propose the first white-box implementation of the identity-based signature scheme in the IEEE P1363 standard. Our main idea is to hide the private key in multiple lookup tables, so that the private key cannot be leaked while the algorithm executes in an untrusted environment. We prove its security in both black-box and white-box models. We also evaluate the performance of our white-box implementations in order to demonstrate their utility for real-world applications.

  • Wenjuan LI, Weizhi MENG, Zhiqiang LIU, Man-Ho AU
    Article type: INVITED PAPER
    2020 Volume E103.D Issue 2 Pages 196-203
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Software-Defined Networking (SDN) enables flexible deployment and innovation of new networking applications by decoupling and abstracting the control and data planes. It has radically changed the concept and way of building and managing networked systems, and reduced the barriers to entry for new players in the service markets. It is considered to be a promising solution providing the scale and versatility necessary for IoT. However, SDN may also face many challenges; e.g., the centralized control plane would be a single point of failure. With the advent of blockchain technology, blockchain-based SDN has become an emerging architecture for securing a distributed network environment. Motivated by this, in this work, we summarize the generic framework of blockchain-based SDN, discuss security challenges and relevant solutions, and provide insights on the future development in this field.

  • Vasileios KOULIARIDIS, Konstantia BARMPATSALOU, Georgios KAMBOURAKIS, ...
    Article type: INVITED PAPER
    2020 Volume E103.D Issue 2 Pages 204-211
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Modern mobile devices are equipped with a variety of tools and services, and handle increasing amounts of sensitive information. In line with this trend, the number of vulnerabilities exploited on mobile devices also grows daily, and popular mobile platforms such as Android and iOS undoubtedly represent an alluring target for malware writers. While researchers strive to find alternative detection approaches to fight mobile malware, recent reports show an alarming increase in mobile malware that exploits victims to generate revenue, climbing towards a billion-dollar industry. Current approaches to mobile malware analysis and detection cannot always keep up with future malware sophistication [2],[4]. The aim of this work is to provide a structured and comprehensive overview of the latest research on mobile malware detection techniques and pinpoint their benefits and limitations.

  • Yuya SENZAKI, Satsuya OHATA, Kanta MATSUURA
    Article type: PAPER
    Subject area: Reliability and Security of Computer Systems
    2020 Volume E103.D Issue 2 Pages 212-221
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Research on adversarial examples for machine learning has received much attention in recent years. Most previous approaches are white-box attacks; that is, the attacker needs to obtain the internal parameters of a target classifier beforehand to generate adversarial examples for it. This condition is hard to satisfy in practice. There is also research on black-box attacks, in which the attacker can obtain only partial information about target classifiers; however, it seems we can prevent these attacks, since they need to issue many suspicious queries to the target classifier. In this paper, we show that a naive defense strategy based on monitoring the number of queries will not suffice. More concretely, we propose to generate not pixel-wise but block-wise adversarial perturbations to reduce the number of queries. Our experiments show that such rough perturbations can confuse the target classifier. We succeed in reducing the number of queries needed to generate adversarial examples in most cases. Our simple method is an untargeted attack and may have low success rates compared to previous results of other black-box attacks, but needs fewer queries on average. Surprisingly, the minimum number of queries (one and three for the MNIST and CIFAR-10 datasets, respectively) is enough to generate adversarial examples in some cases. Moreover, based on these results, we propose a detailed classification of black-box attackers and discuss countermeasures against the above attacks.
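
To make the pixel-wise versus block-wise distinction concrete, here is a minimal sketch of a block-wise perturbation. It is not the authors' algorithm: the function name `blockwise_perturb`, the random-sign choice, the block size, and the `eps` magnitude are all assumptions made for illustration. The point it shows is only that one decision is made per block rather than per pixel, so the number of free choices drops from height x width to the number of blocks.

```python
import random


def blockwise_perturb(image, block, eps, rng=None):
    """Add one random +/-eps offset per block (hypothetical illustration).

    image: 2-D list of floats in [0, 1]; block: block edge length in pixels.
    """
    rng = rng or random.Random(0)
    h, w = len(image), len(image[0])
    rows = (h + block - 1) // block   # number of block rows (ceiling division)
    cols = (w + block - 1) // block   # number of block columns
    # One sign per block instead of one per pixel.
    signs = [[rng.choice((-1, 1)) for _ in range(cols)] for _ in range(rows)]
    return [
        [
            # Clip to the valid pixel range after adding the block's offset.
            min(1.0, max(0.0, image[y][x] + eps * signs[y // block][x // block]))
            for x in range(w)
        ]
        for y in range(h)
    ]


gray = [[0.5] * 4 for _ in range(4)]           # toy 4x4 image
perturbed = blockwise_perturb(gray, block=2, eps=0.1)
```

With a 4x4 image and 2x2 blocks, only four signs are chosen instead of sixteen, which is the kind of coarsening that can reduce the number of classifier queries a search needs.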

  • Huiyao ZHENG, Jian SHEN, Youngju CHO, Chunhua SU, Sangman MOH
    Article type: PAPER
    Subject area: Reliability and Security of Computer Systems
    2020 Volume E103.D Issue 2 Pages 222-229
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Cloud computing offers virtually unlimited computing and storage resources and provides many convenient services, for example, Internet and education services and intelligent transportation systems. With the rapid development of cloud computing, more and more people pay attention to reducing the cost of data management. Data sharing is an effective model for decreasing the cost that individuals or companies incur in dealing with data. However, existing data sharing schemes cannot reduce communication cost while ensuring the security of users. In this paper, an anonymous and traceable data sharing scheme is presented. The proposed scheme can protect the privacy of the user. In addition, the proposed scheme can trace a user who uploads irrelevant information. Security and performance analyses show that the data sharing scheme is secure and effective.

  • Qiuhua WANG, Mingyang KANG, Guohua WU, Yizhi REN, Chunhua SU
    Article type: PAPER
    Subject area: Network Security
    2020 Volume E103.D Issue 2 Pages 230-238
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Secret key generation based on channel characteristics is an effective physical-layer security method for 5G wireless networks. How to ensure a high key generation rate and high correlation of the secret key under active attack still needs to be addressed. In this paper, a new practical secret key generation scheme with high rate and correlation is proposed. In our proposed scheme, Alice and Bob transmit independent random sequences instead of known training sequences or probing signals; neither Alice nor Bob can decode these random sequences or estimate the channel. The users' random sequences, together with the channel effects, are used as a common random source to generate the secret key. With this solution, legitimate users are able to share secret keys of sufficient length and high security under active attack. We evaluate the proposed scheme through both analytic and simulation studies. The results show that our proposed scheme achieves a high key generation rate and key security, and is suitable for 5G wireless networks with resource-constrained devices.

  • Takuya WATANABE, Eitaro SHIOJI, Mitsuaki AKIYAMA, Keito SASAOKA, Takes ...
    Article type: PAPER
    Subject area: Network Security
    2020 Volume E103.D Issue 2 Pages 239-255
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    This paper presents a practical side-channel attack that identifies the social web service account of a visitor to an attacker's website. Our attack leverages the widely adopted user-blocking mechanism, abusing its inherent property that certain pages return different web content depending on whether a user is blocked by another user. Our key insight is that an account prepared by an attacker can hold an attacker-controllable binary state of blocking/non-blocking with respect to an arbitrary user on the same service; provided that the user is logged in to the service, this state can be retrieved as one-bit data through the conventional cross-site timing attack when a user visits the attacker's website. We generalize and refer to such a property as visibility control, which we consider the fundamental assumption of our attack. Building on this primitive, we show that an attacker with a set of controlled accounts can gain complete and flexible control over the data leaked through the side channel. Using this mechanism, we show that it is possible to design and implement a robust, large-scale user identification attack on a wide variety of social web services. To verify the feasibility of our attack, we perform an extensive empirical study using 16 popular social web services and demonstrate that at least 12 of these are vulnerable to our attack. Vulnerable services include not only popular social networking sites such as Twitter and Facebook, but also other types of web services that provide social features, e.g., eBay and Xbox Live. We also demonstrate that the attack can achieve nearly 100% accuracy and can finish within a sufficiently short time in a practical setting. We discuss the fundamental principles, practical aspects, and limitations of the attack as well as possible defenses. We have successfully addressed this attack by working collaboratively with service providers and browser vendors.

  • Na RUAN, Chunhua SU, Chi XIE
    Article type: PAPER
    Subject area: Network Security
    2020 Volume E103.D Issue 2 Pages 256-264
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    The requirements of safety, roadway capacity, and efficiency in vehicular networks keep the concept of vehicular platoons of continuing interest. For authentication in vehicular platoons, efficiency and cooperation are the two most important factors. Cooperative authentication is a way to recognize false identities and messages while saving resources. However, taking part in cooperative authentication makes a vehicle more vulnerable to privacy leakage, which is commonly achieved through location tracking. Moreover, vehicles consume their resources when cooperating with others during cooperative authentication. These two significant factors lead to selfish behavior, in which vehicles do not actively participate in cooperation. In this paper, an infinitely repeated game for cooperative authentication in vehicular platoons is proposed to help analyze the utility of all nodes and point out the weakness of the current collaborative authentication protocol. To deal with this weakness, we also devise an enhanced cooperative authentication protocol based on mechanisms that make it easier for vehicles to stay with the cooperative strategy rather than tend toward selfish behavior. Meanwhile, our protocol can defend against insider attacks.

  • Mitsuhiro HATADA, Tatsuya MORI
    Article type: PAPER
    Subject area: Network Security
    2020 Volume E103.D Issue 2 Pages 265-275
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    This work develops a system called CLAP that detects and classifies “potentially unwanted applications” (PUAs) such as adware or remote monitoring tools. Our approach leverages DNS queries made by apps. Using a large sample of Android apps from third-party marketplaces, we first reveal that DNS queries can provide useful information for detection and classification of PUAs. We then show that existing DNS blacklists are limited when performing these tasks. Finally, we demonstrate that the CLAP system performs with high accuracy.

  • Takuya WATANABE, Mitsuaki AKIYAMA, Fumihiro KANEI, Eitaro SHIOJI, Yuta ...
    Article type: PAPER
    Subject area: Network Security
    2020 Volume E103.D Issue 2 Pages 276-291
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    This paper reports a large-scale study that aims to understand how mobile application (app) vulnerabilities are associated with software libraries. We analyze both free and paid apps. Studying paid apps was quite meaningful because it helped us understand how differences in app development/maintenance affect the vulnerabilities associated with libraries. We analyzed 30k free and paid apps collected from the official Android marketplace. Our extensive analyses revealed that approximately 70%/50% of vulnerabilities of free/paid apps stem from software libraries, particularly from third-party libraries. Somewhat paradoxically, we found that more expensive/popular paid apps tend to have more vulnerabilities. This comes from the fact that more expensive/popular paid apps tend to have more functionality, i.e., more code and libraries, which increases the probability of vulnerabilities. Based on our findings, we provide suggestions to stakeholders of mobile app distribution ecosystems.

  • Hiroshi NOMAGUCHI, Chunhua SU, Atsuko MIYAJI
    Article type: PAPER
    Subject area: Cryptographic Techniques
    2020 Volume E103.D Issue 2 Pages 292-298
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    RFID-enabled applications are ubiquitous in our society and are becoming more and more important as IoT management rises. Meanwhile, concern about the security and privacy of RFID is also increasing. The pseudorandom number generator (PRNG) is one of the core primitives for implementing RFID security. Therefore, it is necessary to design and implement a secure and robust PRNG for current RFID tags. In this paper, we study the security of lightweight PRNGs for the EPC Gen2 RFID tag, which is an EPC Global standard. To this end, we analyzed and improved the existing research presented at IEEE TrustCom 2017 and proposed a model using external random numbers. However, because the previous model uses external random numbers, its speed depends on the generation speed of the external random numbers. To solve this problem, we developed a pseudorandom number generator that does not use external random numbers. This model consists of an LFSR, an NLFSR, and an SLFSR. Security is achieved by using nonlinear processing such as multiplication and logical multiplication over the Galois field. A period longer than the key length is achieved by effectively combining multiple LFSRs and similar components. We show that our proposed PRNG has good randomness and passes the NIST randomness test. We also show that it is resistant to identification attacks and GD attacks.

  • Tomoaki MIMOTO, Seira HIDANO, Shinsaku KIYOMOTO, Atsuko MIYAJI
    Article type: PAPER
    Subject area: Cryptographic Techniques
    2020 Volume E103.D Issue 2 Pages 299-308
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Time-sequence data is high dimensional and contains a lot of information, which can be utilized in various fields, such as insurance, finance, and advertising. Personal data, including time-sequence data, is converted to anonymized datasets, which need to strike a balance between privacy and utility. In this paper, we consider low-rank matrix factorization as an anonymization method and evaluate its efficiency. We convert time-sequence datasets to matrices and evaluate both privacy and utility. The record IDs in time-sequence data are changed at regular intervals to reduce re-identification risk. However, since individuals tend to behave in a similar fashion over periods of time, there remains a risk of record linkage even if record IDs are different. Hence, we evaluate the re-identification and linkage risks as privacy risks of time-sequence data. Our experimental results show that matrix factorization is a viable anonymization method and that it can achieve better utility than existing anonymization methods.

Regular Section
  • Jae Young HUR
    Article type: PAPER
    Subject area: Computer System
    2020 Volume E103.D Issue 2 Pages 309-320
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Conventional linear or tiled address maps can degrade performance and memory utilization when traffic patterns do not match the underlying address map. The address map is usually fixed at design time. Accordingly, it is difficult to adapt to given applications. Modern embedded systems usually include memory management units (MMUs). As a result, depending on virtual address patterns, the system can suffer from performance overheads due to page table walks. To alleviate this performance overhead, we propose to cluster and rearrange tiles to construct an MMU-aware configurable address map. To construct the clustered tiled map, a generic tile number remapping algorithm is presented. In the presented scheme, an address map is configured based on an adaptive dimensioning algorithm. Considering image processing applications, a design, an analysis, an implementation, and simulations are conducted. The results indicate that the proposed method can improve performance and memory utilization with moderate hardware costs.

  • Yudai SAKAMOTO, Shigeru YAMASHITA
    Article type: PAPER
    Subject area: Computer System
    2020 Volume E103.D Issue 2 Pages 321-328
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    In Stochastic Computing (SC), we need to generate many stochastic numbers (SNs). To generate one SN conventionally, we need a Stochastic Number Generator (SNG), which consists of a linear-feedback shift register (LFSR) and a comparator. When we calculate an arithmetic function by SC, we need to generate many SNs whose values are equal to the constant values used in the arithmetic function. As a consequence, the hardware overhead becomes huge. Accordingly, a method called GMCS (Generating Many Constant SNs from Few SNs) has been proposed to generate many constant SNs with low hardware overhead. However, if we simply use GMCS, the generated constant SNs are highly correlated with each other. This is a serious problem because high correlation among SNs causes a large error in computation. Therefore, in this paper, we propose efficient methods to generate constant SNs with reasonably low hardware overhead without increasing errors. To reduce the correlations of constant SNs generated by GMCS, we use a Register based Re-arrangement circuit using a Random bit stream duplicator (RRRD). RRRDs have little influence on the hardware overhead because an RRRD consists of three multiplexers (MUXs) and two 1-bit FFs. We also use a technique to share random number generators among several SNGs to reduce the hardware overhead. We provide experimental results that confirm that our proposed methods are in general very useful for reducing the hardware overhead of generating constant SNs without increasing errors.
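
The conventional SNG described above (an LFSR feeding a comparator) and the correlation problem can be illustrated with a small software model. This is a sketch under stated assumptions, not the circuit from the paper: the 8-bit width, the feedback polynomial x^8 + x^6 + x^5 + x^4 + 1, the `<=` comparator convention, and the names `lfsr8`, `sng`, and `value` are all chosen here for illustration. It does not model GMCS or the RRRD.

```python
from itertools import islice


def lfsr8(seed=1):
    """Yield states of an 8-bit Fibonacci LFSR (x^8 + x^6 + x^5 + x^4 + 1)."""
    state = seed & 0xFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        yield state
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
        state = (state >> 1) | (bit << 7)


def sng(k, seed=1, length=255):
    """Comparator-based SNG: emit 1 whenever the LFSR state is <= k."""
    return [1 if s <= k else 0 for s in islice(lfsr8(seed), length)]


def value(stream):
    """Value encoded by a unipolar stochastic number: the fraction of 1s."""
    return sum(stream) / len(stream)


a = sng(128)                 # encodes 128/255
b = sng(64)                  # same LFSR and seed as a: fully correlated
# An AND gate multiplies two SNs only if they are uncorrelated.
# With fully correlated inputs it computes the minimum instead.
and_same = [x & y for x, y in zip(a, b)]

# A different seed shifts the LFSR sequence; b_shifted still encodes 64/255,
# but its bits no longer line up with those of a.
b_shifted = sng(64, seed=0x5A)
and_shifted = [x & y for x, y in zip(a, b_shifted)]
```

Here `value(and_same)` equals `value(b)` rather than the product `value(a) * value(b)`, which is the kind of error that highly correlated SNs introduce. How close `value(and_shifted)` comes to the product depends on the seed, since a shifted copy of the same sequence reduces but does not remove correlation.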

  • Takashi NAKADA, Hiroyuki YANAGIHASHI, Kunimaro IMAI, Hiroshi UEKI, Tak ...
    Article type: PAPER
    Subject area: Software System
    2020 Volume E103.D Issue 2 Pages 329-338
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Near real-time periodic tasks, which are popular in multimedia streaming applications, have deadline periods that are longer than the input intervals thanks to buffering. For such applications, conventional frame-based scheduling methods cannot realize optimal scheduling due to their shortsighted deadline assumptions. To realize globally energy-efficient executions of these applications, we propose a novel task scheduling algorithm, which takes advantage of the long deadline period. We confirm that our approach can take advantage of the longer deadline period and reduce the average power consumption by up to 18%.

  • Apinporn METHAWACHANANONT, Marut BURANARACH, Pakaimart AMSURIYA, Sompo ...
    Article type: PAPER
    Subject area: Software Engineering
    2020 Volume E103.D Issue 2 Pages 339-347
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    A key driver of software business growth in developing countries is the survival of software small and medium-sized enterprises (SMEs). Quality of products is a critical factor that can indicate the future of the business by building customer confidence. Software development agencies need to be aware of meeting international standards in the software development process. In practice, consultants and assessors are usually employed as the primary solution, which can strain the budget of small businesses. Self-assessment tools for the software development process can potentially reduce the time and cost of formal assessment for software SMEs. However, the existing support methods and tools are largely insufficient in terms of process coverage and semi-automated evaluation. This paper proposes to apply a knowledge-based approach in the development of a self-assessment and gap analysis support system for the ISO/IEC 29110 standard. The approach has the advantage that insights from domain experts and the standard are captured in the knowledge base in the form of decision tables that can be flexibly managed. Our knowledge base is unique in that task lists and work products defined in the standard are broken down into task and work product characteristics, respectively. Their relations provide the links between Task List and Work Product, which improve users' understanding and influence self-assessment. A prototype support system was developed to assess the level of software development capability of the agencies based on the ISO/IEC 29110 standard. A preliminary evaluation study showed that the system can improve the performance of users who are inexperienced in applying the ISO/IEC 29110 standard in terms of task coverage and user's time and effort compared to the traditional self-assessment method.

  • Yutaro KASHIWA, Masao OHIRA
    Article type: PAPER
    Subject area: Software Engineering
    2020 Volume E103.D Issue 2 Pages 348-362
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    This paper proposes a release-aware bug triaging method that aims to increase the number of bugs that developers can fix by the next release date during open-source software development. A variety of methods have been proposed for recommending appropriate developers for particular bug-fixing tasks, but since these approaches only consider the developers' ability to fix the bug, they tend to assign many of the bugs to a small number of the project's developers. Since projects generally have a release schedule, even excellent developers cannot fix all the bugs that are assigned to them by the existing methods. The proposed method places an upper limit on the number of tasks that are assigned to each developer during a given period, in addition to considering the ability of developers. Our method regards the bug assignment problem as a multiple knapsack problem, finding the best combination of bugs and developers. The best combination is one that maximizes the efficiency of the project while meeting the constraint that it can only assign as many bugs as the developers can fix during a given period. We conduct a case study, applying our method to bug reports from Mozilla Firefox, Eclipse Platform and GNU compiler collection (GCC). We find that our method has the following properties: (1) it can prevent the bug-fixing load from being concentrated on a small number of developers; (2) compared with the existing methods, the proposed method can assign a more appropriate number of bugs that each developer can fix by the next release date; (3) it can reduce the time taken to fix bugs by 35%-41%, compared with manual bug triaging.
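
The multiple knapsack view of bug assignment can be sketched on a toy instance. This is not the authors' formulation or solver: the exhaustive search, the function name `assign`, the integer suitability scores, and the bug costs and developer budgets below are all hypothetical and only meant to show how a per-developer capacity changes the outcome.

```python
def assign(bugs, capacity, score):
    """Exhaustively solve a tiny multiple-knapsack instance.

    bugs:     dict bug -> cost (e.g. estimated days to fix)
    capacity: dict developer -> time budget until the next release
    score:    dict (bug, developer) -> integer suitability
    Returns (best_total_score, {bug: developer}).
    """
    names = list(bugs)

    def solve(i, remaining):
        if i == len(names):
            return 0, {}
        bug = names[i]
        best = solve(i + 1, remaining)          # option: leave bug unassigned
        for dev, left in remaining.items():
            if bugs[bug] <= left:               # respect the developer's budget
                updated = dict(remaining)
                updated[dev] = left - bugs[bug]
                total, plan = solve(i + 1, updated)
                total += score[(bug, dev)]
                if total > best[0]:
                    best = (total, {**plan, bug: dev})
        return best

    return solve(0, dict(capacity))


bugs = {"b1": 3, "b2": 3, "b3": 2}
capacity = {"alice": 5, "bob": 4}
score = {
    ("b1", "alice"): 9, ("b2", "alice"): 8, ("b3", "alice"): 7,
    ("b1", "bob"): 4, ("b2", "bob"): 5, ("b3", "bob"): 3,
}
total, plan = assign(bugs, capacity, score)
```

In this toy data the hypothetical developer "alice" is the best match for every bug, so an ability-only recommender would give her all three. With the capacity constraint, the search assigns b1 and b3 to alice and b2 to bob, for a total score of 21. Exhaustive search is exponential and only suitable for tiny instances like this one.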

  • Naoto SATO, Hironobu KURUMA, Yuichiroh NAKAGAWA, Hideto OGAWA
    Article type: PAPER
    Subject area: Dependable Computing
    2020 Volume E103.D Issue 2 Pages 363-378
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    As one type of machine-learning model, a “decision-tree ensemble model” (DTEM) is represented by a set of decision trees. A DTEM is mainly known to be valid for structured data; however, like other machine-learning models, it is difficult to train so that it returns the correct output value (called “prediction value”) for any input value (called “attribute value”). Accordingly, when a DTEM is used in a system that requires reliability, it is important to comprehensively detect attribute values that lead to malfunctions of the system (failures) during development and take appropriate countermeasures. One conceivable solution is to install an input filter that controls the input to the DTEM and to use separate software to process attribute values that may lead to failures. To develop the input filter, it is necessary to specify the filtering condition for the attribute values that lead to the malfunction of the system. In consideration of that necessity, we propose a method for formally verifying a DTEM and, according to the result of the verification, if an attribute value leading to a failure is found, extracting the range in which such an attribute value exists. The proposed method can comprehensively extract the range in which the attribute value leading to the failure exists; therefore, by creating an input filter based on that range, it is possible to prevent the failure. To demonstrate the feasibility of the proposed method, we performed a case study using a dataset of house prices. Through the case study, we also evaluated its scalability and showed that the number and depth of decision trees are important factors that determine the applicability of the proposed method.

  • Hiroya KATO, Shuichiro HARUTA, Iwao SASASE
    Article type: PAPER
    Subject area: Dependable Computing
    2020 Volume E103.D Issue 2 Pages 379-389
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Detecting Android malware is imperative. As a promising Android malware detection scheme, we focus on the scheme that leverages the differences in traffic patterns between benign apps and malware. Those differences can be captured even if the packets are encrypted. However, since such features are purely statistics-based, they cannot identify whether each traffic flow is malicious. Thus, it is necessary to design a scheme that is applicable to encrypted traffic data and supports identification of malicious traffic. In this paper, we propose an Android malware detection scheme based on the level of the SSL server certificate. Attackers tend to use an untrusted certificate to encrypt malicious payloads in many cases because passing a rigorous examination is required to get a trusted certificate. Thus, we utilize SSL server certificate based features for detection since their certificates tend to be untrusted. Furthermore, in order to obtain more exact features, we introduce required-permission-based weight values because malware inevitably requires permissions for its malicious actions. By computer simulation with a real dataset, we show that our scheme achieves an accuracy of 92.7%. The true positive rate and false positive rate are 5.6% higher and 3.2% lower than those of the previous scheme, respectively. Our scheme can cope with encrypted malicious payloads and 89 malware samples that are not detected by the previous scheme.

  • Chihiro WATANABE, Kaoru HIRAMATSU, Kunio KASHINO
    Article type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2020 Volume E103.D Issue 2 Pages 390-397
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Interpretability has become an important issue in the machine learning field, along with the success of layered neural networks in various practical tasks. Since a trained layered neural network consists of a complex nonlinear relationship between a large number of parameters, we have been unable to understand how such networks achieve input-output mappings with a given data set. In this paper, we propose the non-negative task matrix decomposition method, which applies non-negative matrix factorization to a trained layered neural network. This enables us to decompose the inference mechanism of a trained layered neural network into multiple principal tasks of input-output mapping, and reveal the roles of hidden units in terms of their contribution to each principal task.

  • Naoki KATO, Toshihiko YAMASAKI, Kiyoharu AIZAWA, Takemi OHAMA
    Article type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2020 Volume E103.D Issue 2 Pages 398-405
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    With the recent advances in e-commerce, it has become important to recommend not only mass-produced daily items, such as books, but also items that are not mass-produced. In this study, we present an algorithm for real estate recommendations. Automatic property recommendations are a highly difficult task because no identical properties exist in the world, occupied properties cannot be recommended, and users rent or buy properties only a few times in their lives. For the first step of property recommendation, we predict users' preferences for properties by combining content-based filtering and Multi-Layer Perceptron (MLP). In the MLP, we use not only attribute data of users and properties, but also deep features extracted from property floor plan images. As a result, we successfully predict users' preference with a Matthews Correlation Coefficient (MCC) of 0.166.
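
For readers unfamiliar with the reported metric, the Matthews Correlation Coefficient for a binary classifier can be computed from the confusion matrix as sketched below. The function name `mcc` and the convention of returning 0.0 when the denominator is zero are choices made here, not details taken from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: an undefined ratio (empty row or column) maps to 0.
    return numerator / denominator if denominator else 0.0


perfect = mcc(tp=50, tn=50, fp=0, fn=0)      # 1.0
chance = mcc(tp=25, tn=25, fp=25, fn=25)     # 0.0
```

On this scale a value such as 0.166 indicates a positive but weak correlation between predicted and actual preferences.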

  • Jeong-Uk BANG, Mu-Yeol CHOI, Sang-Hun KIM, Oh-Wook KWON
    Article type: PAPER
    Subject area: Speech and Hearing
    2020 Volume E103.D Issue 2 Pages 406-415
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    As deep learning-based speech recognition systems are spotlighted, the need for large-scale speech databases for acoustic model training is increasing. Broadcast data can be easily used for database construction, since it contains transcripts for the hearing impaired. However, the subtitle timestamps have not been used to extract speech data because they are often inaccurate due to the inherent characteristics of closed captioning. Thus, we propose to build a large-scale speech database from multi-genre broadcast data with inaccurate subtitle timestamps. The proposed method first extracts the most likely speech intervals by removing subtitle texts with low subtitle quality index, concatenating adjacent subtitle texts into a merged subtitle text, and adding a margin to the timestamp of the merged subtitle text. Next, a speech recognizer is used to extract a hypothesis text of a speech segment corresponding to the merged subtitle text, and then the hypothesis text obtained from the decoder is recursively aligned with the merged subtitle text. Finally, the speech database is constructed by selecting the sub-parts of the merged subtitle text that match the hypothesis text. Our method successfully refines a large amount of broadcast data with inaccurate subtitle timestamps, taking about half of the time compared with the previous methods. Consequently, our method is useful for broadcast data processing, where bulk speech data can be collected every hour.

  • Ming DAI, Zhiheng ZHOU, Tianlei WANG, Yongfan GUO
    Article type: PAPER
    Subject area: Image Processing and Video Processing
    2020 Volume E103.D Issue 2 Pages 416-423
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    In many real application scenarios of image segmentation problems involving limited and low-quality data, employing prior information can significantly improve the segmentation result. For example, the shape of the object is a common kind of prior information. In this paper, we introduce a new kind of prior information, which we call the prior distribution. On the basis of a nonparametric statistical active contour model, we propose a novel distribution prior model. Unlike the traditional shape prior model, our model is not sensitive to the shape of the object boundary. Using the intensity distribution of objects and backgrounds as prior information can simplify the process of establishing and solving the model. The idea behind constructing our energy function is as follows. During the contour curve convergence, while the distribution difference between the inside and outside of the active contour is maximized, the distribution difference between the inside/outside of the contour and the prior object/background is minimized. We present experimental results on a variety of synthetic and natural images. The experimental results demonstrate the potential of the proposed method: with the prior distribution information, both the segmentation quality and speed can be improved effectively.

  • Kazuki SAKAI, Ryuichiro HIGASHINAKA, Yuichiro YOSHIKAWA, Hiroshi ISHIG ...
    Article type: PAPER
    Subject area: Natural Language Processing
    2020 Volume E103.D Issue 2 Pages 424-434
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Argumentation is a process of reaching a consensus through premises and rebuttals. If an artificial dialogue system can perform argumentation, it can improve users' decisions and their ability to negotiate with others. Previously, researchers have studied argumentative dialogue systems built on a structured database of argumentation structure and evaluated the logical consistency of the dialogue. However, these systems could not change their responses based on the user's agreement or disagreement with their last utterance. Furthermore, the persuasiveness of the generated dialogue has not been evaluated. In this study, a method is proposed to generate persuasive arguments through a hierarchical argumentation structure that considers human agreement and disagreement. Persuasiveness is evaluated through a crowdsourcing platform wherein participants' written impressions of shown dialogue texts are scored via a third-person Likert scale evaluation. The proposed method was compared to a baseline method wherein argument response texts were generated without consideration of the user's agreement or disagreement. Experimental results suggest that the proposed method can generate a more persuasive dialogue than the baseline method. Further analysis implied that perceived persuasiveness was induced by evaluations of the behavior of the dialogue system, which was inherent in the hierarchical argumentation structure.

  • Andros TJANDRA, Sakriani SAKTI, Satoshi NAKAMURA
    Article type: PAPER
    Subject area: Music Information Processing
    2020 Volume E103.D Issue 2 Pages 435-449
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Recurrent Neural Networks (RNNs) have achieved state-of-the-art performance on various complex tasks involving temporal and sequential data. However, most of these RNNs require substantial computational power and a huge number of parameters for both the training and inference stages. Several tensor decomposition methods, including CANDECOMP/PARAFAC (CP), Tucker decomposition and Tensor Train (TT), are used to re-parameterize the Gated Recurrent Unit (GRU) RNN. First, we evaluate the performance of all tensor-based RNNs on sequence modeling tasks across a range of parameter counts. Based on our experimental results, TT-GRU achieved the best results across the range of parameter counts compared to the other decomposition methods. Next, we evaluate our proposed TT-GRU on a speech recognition task, compressing the bidirectional GRU layers inside the DeepSpeech2 architecture. Based on our experimental results, the proposed TT-format GRU is able to preserve performance while significantly reducing the number of GRU parameters compared to the uncompressed GRU.
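
    To make the parameter saving concrete, the sketch below counts the parameters of a dense weight matrix and of the same matrix stored in the usual TT-matrix format, where core k has shape r(k-1) x m(k) x n(k) x r(k) and the boundary ranks are 1. The mode sizes and ranks are illustrative assumptions, not values from the paper, and `dense_params` and `tt_params` are hypothetical helper names.

```python
from math import prod


def dense_params(in_modes, out_modes):
    """Parameters of a dense matrix of shape prod(in_modes) x prod(out_modes)."""
    return prod(in_modes) * prod(out_modes)


def tt_params(in_modes, out_modes, ranks):
    """Parameters of the TT-matrix cores: sum of r[k] * m[k] * n[k] * r[k+1]."""
    if len(in_modes) != len(out_modes) or len(ranks) != len(in_modes) + 1:
        raise ValueError("need one rank more than the number of modes")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary TT ranks must be 1")
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )


# Illustrative 256 x 256 weight matrix factorized as (4*8*8) x (4*8*8).
modes = (4, 8, 8)
print(dense_params(modes, modes))             # dense parameter count
print(tt_params(modes, modes, (1, 4, 4, 1)))  # TT parameter count
```

    The TT ranks control the trade-off: larger ranks recover more of the dense matrix's expressiveness at the cost of more parameters.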

  • Guizhong ZHANG, Baoxian WANG, Zhaobo YAN, Yiqiang LI, Huaizhi YANG
    Article type: LETTER
    Subject area: Artificial Intelligence, Data Mining
    2020 Volume E103.D Issue 2 Pages 450-453
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    In this work, we present a novel rust detection method based on one-class classification and L2 sparse representation (SR) with decision fusion. First, a new color contrast descriptor is proposed for extracting the rust features of steel structure images. Considering that the patterns of rust features are simpler than those of non-rust features, a one-class support vector machine (SVM) classifier and an L2 SR classifier are each designed with these rust image features. A multiplicative fusion rule is then used to combine the one-class SVM and L2 SR modules, thereby achieving more accurate rust detection results. In extensive experiments, the presented method offers better rust detection performance than other existing rust detectors.
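
    A multiplicative fusion rule can be sketched in a few lines. The sketch assumes that both classifiers emit a rust score already normalized to [0, 1]; the normalization, the threshold value and the names `fuse` and `is_rust` are assumptions for illustration, since the abstract does not give these details.

```python
def fuse(svm_score, sr_score):
    """Multiplicative fusion: the product is high only when both scores are high."""
    for score in (svm_score, sr_score):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores are assumed to be normalized to [0, 1]")
    return svm_score * sr_score


def is_rust(svm_score, sr_score, threshold=0.5):
    """Label a patch as rust when the fused score reaches a hypothetical threshold."""
    return fuse(svm_score, sr_score) >= threshold


print(is_rust(0.9, 0.8))  # both classifiers are confident
print(is_rust(0.9, 0.2))  # the classifiers disagree
```

    Compared with averaging, a product lets either classifier veto a detection, which tends to suppress false positives when the two modules make different kinds of errors.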

  • Joanna Kazzandra DUMAGPI, Woo-Young JUNG, Yong-Jin JEONG
    Article type: LETTER
    Subject area: Artificial Intelligence, Data Mining
    2020 Volume E103.D Issue 2 Pages 454-458
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Threat object recognition in X-ray security images is an important practical application of computer vision. However, research in this field has been limited by the lack of available datasets that mirror the practical setting for such applications. In this paper, we present a novel GAN-based anomaly detection (GBAD) approach as a solution to the extreme class-imbalance problem in multi-label classification. This method helps suppress the surge in false positives induced by training a CNN on a non-practical dataset. We evaluate our method on a large-scale X-ray image database to closely emulate practical scenarios in port security inspection systems. Experiments demonstrate an improvement over the existing algorithm.

  • Jiateng LIU, Wenming ZHENG, Yuan ZONG, Cheng LU, Chuangao TANG
    Article type: LETTER
    Subject area: Pattern Recognition
    2020 Volume E103.D Issue 2 Pages 459-463
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    In this letter, we propose a novel deep domain-adaptive convolutional neural network (DDACNN) model to handle the challenging cross-corpus speech emotion recognition (SER) problem. The framework of the DDACNN model consists of two components: a feature extraction model based on a deep convolutional neural network (DCNN) and a domain-adaptive (DA) layer added to the DCNN that uses the maximum mean discrepancy (MMD) criterion. We use labeled spectrograms from the source speech corpus combined with unlabeled spectrograms from the target speech corpus as the input of two classic DCNNs to extract the emotional features of speech, and train the model with a mixed loss that combines a cross-entropy loss and an MMD loss. Compared to other classic cross-corpus SER methods, the major advantage of the DDACNN model is that it can extract robust, time-frequency related speech features from spectrograms and narrow the discrepancy between the feature distributions of the source and target corpora to achieve better cross-corpus performance. In several cross-corpus SER experiments, our DDACNN achieved state-of-the-art performance on three public emotional speech corpora and was shown to handle the cross-corpus SER problem efficiently.
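
    The mixed loss can be illustrated with a minimal sketch. It assumes the simplest form of MMD, the linear-kernel (biased) estimate, which reduces to the squared distance between the mean source and mean target feature vectors; the abstract does not state which kernel or weighting the authors use, so the kernel choice, the weight `lam` and all function names here are assumptions.

```python
import math


def mean_vector(batch):
    """Column-wise mean of a batch given as a list of equal-length feature lists."""
    return [sum(col) / len(batch) for col in zip(*batch)]


def linear_mmd(source_feats, target_feats):
    """Squared MMD with a linear kernel: squared distance between batch means."""
    ms, mt = mean_vector(source_feats), mean_vector(target_feats)
    return sum((a - b) ** 2 for a, b in zip(ms, mt))


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])


def mixed_loss(source_probs, source_labels, source_feats, target_feats, lam=1.0):
    """Cross-entropy on labeled source data plus lam times MMD across corpora."""
    ce = sum(cross_entropy(p, y) for p, y in zip(source_probs, source_labels))
    ce /= len(source_labels)
    return ce + lam * linear_mmd(source_feats, target_feats)


source = [[0.0, 0.0], [2.0, 2.0]]
target = [[1.0, 1.0], [3.0, 3.0]]
print(linear_mmd(source, target))
```

    Only the source corpus contributes to the cross-entropy term, which is why the target spectrograms can remain unlabeled.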

  • Jichen YANG, Longting XU, Bo REN
    Article type: LETTER
    Subject area: Speech and Hearing
    2020 Volume E103.D Issue 2 Pages 464-468
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Within the framework of traditional power spectrum-based feature extraction, and in order to extract more discriminative information for playback attack detection, this paper proposes a feature that uses a deep neural network to describe the nonlinear relationship between the power spectrum and discriminative information, namely constant-Q deep coefficients (CQDC). It relies on the constant-Q transform, a deep neural network and the discrete cosine transform. The constant-Q transform converts the signal from the time domain into the frequency domain, because it is a long-term transform that can provide more frequency detail; the deep neural network extracts more discriminative information to distinguish playback speech from genuine speech; and the discrete cosine transform decorrelates the feature dimensions. The ASVspoof 2017 corpus version 2.0 is used to evaluate the performance of CQDC. The experimental results show that CQDC outperforms existing features based on the power spectrum obtained from the constant-Q transform, with equal error rate reductions ranging from 19.18% to 51.56%. In addition, we found that the discriminative information of CQDC is hidden in all frequency bins, which differs from commonly used features.

  • Yong-Uk YOON, Do-Hyeon PARK, Jae-Gon KIM
    Article type: LETTER
    Subject area: Image Processing and Video Processing
    2020 Volume E103.D Issue 2 Pages 469-471
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    The cross-component linear model (CCLM) has recently been adopted as a chroma intra-prediction tool in Versatile Video Coding (VVC), which is being developed as a new video coding standard. CCLM predicts chroma components from luma components through a linear model, based on the assumption of a linear correlation between the two components. The linear model is derived by linear regression from the reconstructed neighboring luma and chroma samples of the current coding block. A simplified linear modeling method recently adopted in the VVC test model (VTM) 3.0 significantly reduces the computational complexity of deriving the model parameters, at the cost of considerable coding loss. This letter proposes a linear modeling method that compensates for the coding loss of the simplified linear model. In the proposed method, the model parameters, which are derived quite roughly in the existing simplified linear model, are refined using a separate derivation method for each parameter. Experimental results show that, compared to VTM 3.0, the proposed method gives 0.08%, 0.52% and 0.55% Bjøntegaard-Delta (BD)-rate savings for the Y, Cb and Cr components, respectively, in the All-Intra (AI) configuration with a negligible increase in computational complexity.
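
    The linear model has the form pred_C = alpha * rec_L + beta. The sketch below derives alpha and beta in two ways: by least-squares regression over the neighboring samples, and by a straight line through the samples with the smallest and largest luma values. The second variant is only an assumed stand-in for a "simplified" derivation, since the abstract does not spell out the VTM 3.0 rule. The code uses floating point, whereas real codecs use integer arithmetic and downsampled luma, and all function names are hypothetical.

```python
def fit_least_squares(luma, chroma):
    """Least-squares fit of chroma = alpha * luma + beta over neighboring samples."""
    n = len(luma)
    mean_l, mean_c = sum(luma) / n, sum(chroma) / n
    var = sum((l - mean_l) ** 2 for l in luma)
    if var == 0:
        return 0.0, mean_c  # flat luma: fall back to the mean chroma value
    cov = sum((l - mean_l) * (c - mean_c) for l, c in zip(luma, chroma))
    alpha = cov / var
    return alpha, mean_c - alpha * mean_l


def fit_min_max(luma, chroma):
    """Assumed simplification: line through the min-luma and max-luma samples."""
    i_min = min(range(len(luma)), key=luma.__getitem__)
    i_max = max(range(len(luma)), key=luma.__getitem__)
    if luma[i_max] == luma[i_min]:
        return 0.0, float(chroma[i_min])
    alpha = (chroma[i_max] - chroma[i_min]) / (luma[i_max] - luma[i_min])
    return alpha, chroma[i_min] - alpha * luma[i_min]


def predict_chroma(rec_luma, alpha, beta):
    """Apply the linear model to the reconstructed luma samples of the block."""
    return [alpha * l + beta for l in rec_luma]


neighbor_luma, neighbor_chroma = [10, 20, 30, 40], [15, 25, 35, 45]
alpha, beta = fit_min_max(neighbor_luma, neighbor_chroma)
print(predict_chroma([12, 50], alpha, beta))
```

    The two-point fit avoids the sums and the division over all neighbors, which is where the complexity saving comes from, but it is more sensitive to outliers at the extremes than the regression.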

  • Dohyeon PARK, Jinho LEE, Jung-Won KANG, Jae-Gon KIM
    Article type: LETTER
    Subject area: Image Processing and Video Processing
    2020 Volume E103.D Issue 2 Pages 472-475
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    The emerging Versatile Video Coding (VVC) standard currently adopts Triangular Partitioning Mode (TPM) to enable more flexible inter prediction. Due to the motion search and motion storage required for TPM, the complexity of the encoder and decoder is significantly increased. This letter proposes two simplifications of TPM to reduce the complexity of the current design. The first reduces the number of motion vector combinations to be checked for the two partitions, which decreases encoding time by 4% with negligible BD-rate loss. The second removes the reference picture remapping process in the motion vector storage of TPM, which reduces the complexity of the encoder and decoder without a BD-rate change for the random-access configuration.

  • Chunhua QIAN, Mingyang LI, Yi REN
    Article type: LETTER
    Subject area: Image Recognition, Computer Vision
    2020 Volume E103.D Issue 2 Pages 476-479
    Published: 2020/02/01
    Released: 2020/02/01
    Journal, free access

    Tea sprout segmentation via machine vision is a core technology for automatic tea picking. This paper proposes a novel method for Tea Sprouts Segmentation based on an improved deep convolutional encoder-decoder Network (TS-SegNet). To increase segmentation accuracy and stability, the network is improved with a contrastive-center loss function and skip connections. As a result, intra-class compactness and inter-class separability are both exploited, and TS-SegNet can obtain more discriminative tea sprout features. The experimental results indicate that the proposed method yields good segmentation results, and the segmented tea sprouts almost coincide with the ground truth.
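
    How a contrastive-center loss couples intra-class compactness with inter-class separability can be sketched as follows. The sketch uses the commonly cited formulation, in which each feature's squared distance to its own class center is divided by the sum of its squared distances to the other centers plus a constant `delta`; whether TS-SegNet uses exactly this form is an assumption, and the function names are hypothetical.

```python
def sq_dist(x, c):
    """Squared Euclidean distance between a feature vector and a class center."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def contrastive_center_loss(features, labels, centers, delta=1.0):
    """0.5 * sum over samples of intra-class distance / (inter-class distances + delta)."""
    total = 0.0
    for x, y in zip(features, labels):
        intra = sq_dist(x, centers[y])
        inter = sum(sq_dist(x, c) for k, c in enumerate(centers) if k != y)
        total += intra / (inter + delta)  # delta keeps the denominator positive
    return 0.5 * total


centers = [[0.0, 0.0], [4.0, 0.0]]  # illustrative centers: sprout, background
near_own_center = contrastive_center_loss([[1.0, 0.0]], [0], centers)
near_other_center = contrastive_center_loss([[3.0, 0.0]], [0], centers)
print(near_own_center, near_other_center)
```

    The loss falls both when a feature moves toward its own center and when it moves away from the other centers, so minimizing it encourages tight, well-separated classes.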
