IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E107.D, Issue 10
Regular Section
  • Hongzhi XU, Binlian ZHANG
    Article type: PAPER
    Subject area: Fundamentals of Information Systems
    2024 Volume E107.D Issue 10 Pages 1285-1296
    Published: October 01, 2024
    Released on J-STAGE: October 01, 2024
    JOURNAL FREE ACCESS

    Reliability is an important figure of merit of a system, and it must be satisfied in safety-critical applications. This paper considers parallel applications on heterogeneous embedded systems and proposes a two-phase algorithm framework that minimizes energy consumption while satisfying an application's reliability requirement. The first phase performs the initial assignment, and the second phase either satisfies the reliability requirement or improves energy efficiency. Specifically, when the initial assignment cannot achieve the application's reliability requirement, an algorithm that enhances the reliability of tasks is used to satisfy it. Conversely, because the reliability of the initial assignment may exceed the requirement, an algorithm that reduces the execution frequency of tasks is designed to improve energy efficiency. The proposed algorithms are compared with existing algorithms on real parallel applications. Experimental results demonstrate that the proposed algorithms consume less energy while satisfying the applications' reliability requirements.

  • Haruhiko KAIYA, Shinpei OGATA, Shinpei HAYASHI
    Article type: PAPER
    Subject area: Software Engineering
    2024 Volume E107.D Issue 10 Pages 1297-1311
    Published: October 01, 2024
    Released on J-STAGE: October 01, 2024
    JOURNAL FREE ACCESS

    Before systems are introduced into an activity in business or daily life, analysts should carefully examine their effects. Methods for examining such effects are therefore required at the early stage of requirements analysis. In this study, we propose and evaluate an analysis method for this purpose that uses a modeling notation, called goal dependency modeling and analysis (GDMA). In an activity, an actor, such as a person or a system, expects a goal to be achieved, and that actor or another actor achieves the goal. We focus on such a goal and the two different roles played by the actors. GDMA mainly represents the dependencies between the roles of the two actors with respect to a goal. GDMA enables analysts to observe changes in actors, their expectations, and their abilities by using metrics, each of which is defined on the basis of the GDMA meta-model. GDMA therefore enables analysts to decide, both quantitatively and qualitatively, whether a change is good or bad for the people involved. We evaluate GDMA by modeling actual system introductions described in the literature and explaining the effects caused by these introductions. In addition, because CASE tools are crucial for performing GDMA efficiently and accurately, we develop tools for it by extending an existing UML modeling tool.

  • Rina TAGAMI, Hiroki KOBAYASHI, Shuichi AKIZUKI, Manabu HASHIMOTO
    Article type: PAPER
    Subject area: Pattern Recognition
    2024 Volume E107.D Issue 10 Pages 1312-1321
    Published: October 01, 2024
    Released on J-STAGE: October 01, 2024
    JOURNAL FREE ACCESS

    Due to the revitalization of the semiconductor industry and efforts toward labor saving and unmanned operation in the retail and food manufacturing industries, the objects to be recognized at production sites are increasingly diverse in color and design. Depending on the target object, the most reliable choice may be to process only color information, only intensity information, or a combination of both. However, there are few conventional methods for optimizing the color and intensity information to be used, and deep learning is too costly for production sites. In this paper, we optimize the combination of the color and intensity information of a small number of pixels used for matching within the framework of template matching, on the basis of the mutual relationship between the target object and surrounding objects. We propose a fast and reliable matching method that uses these few pixels. Pixels with a low pixel pattern frequency are selected from color and grayscale images of the target object, and from these, pixels that are highly discriminative from surrounding objects are carefully selected. Using both color and intensity information makes the method highly versatile with respect to object design. Using a small number of pixels that are not shared by the target and surrounding objects provides high robustness to the surrounding objects and enables fast matching. Experiments using real images confirmed that when 14 pixels are used for matching, the processing time is 6.3 ms and the recognition success rate is 99.7%. The proposed method also showed better positional accuracy than the comparison method, and the optimized pixels achieved a higher recognition success rate than the non-optimized pixels.

  • Yuka KO, Katsuhito SUDOH, Sakriani SAKTI, Satoshi NAKAMURA
    Article type: PAPER
    Subject area: Speech and Hearing
    2024 Volume E107.D Issue 10 Pages 1322-1331
    Published: October 01, 2024
    Released on J-STAGE: October 01, 2024
    JOURNAL FREE ACCESS

    End-to-end speech translation (ST) directly renders source language speech into the target language without the intermediate automatic speech recognition (ASR) output used in a cascade approach, and thus avoids error propagation from intermediate ASR results. Although recent attempts have applied multi-task learning with an auxiliary ASR task to improve ST performance, they use cross-entropy loss against one-hot references in the ASR task, so the trained ST models do not consider possible ASR confusion. In this study, we propose a novel multi-task learning framework for end-to-end ST that uses an ASR-based loss against posterior distributions obtained from a pre-trained ASR model, called ASR posterior-based loss (ASR-PBL). The ASR-PBL method, which enables an ST model to reflect possible ASR confusion among competing hypotheses with similar pronunciations, can be applied to one of the strong multi-task ST baseline models with Hybrid CTC/Attention ASR task loss. In our experiments on the Fisher Spanish-to-English corpus, the proposed method achieved better BLEU results than the baseline that used standard CE loss.
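
The contrast drawn in this abstract, between cross-entropy against a one-hot reference and a loss against an ASR posterior distribution, can be illustrated with a minimal sketch. The abstract does not give the exact ASR-PBL formulation, so the code below assumes a plain soft-target cross-entropy; the three-token vocabulary and all probability values are made up for illustration.

```python
import math

def cross_entropy(target, predicted):
    """Cross-entropy H(target, predicted) of two distributions over one vocabulary."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0.0)

# Hypothetical 3-token vocabulary in which tokens 0 and 1 sound alike.
one_hot_reference = [1.0, 0.0, 0.0]   # standard hard target
asr_posterior = [0.6, 0.35, 0.05]     # assumed soft target from a pre-trained ASR model
model_output = [0.5, 0.4, 0.1]        # assumed output of the model being trained

# Hard target: only the probability assigned to token 0 affects the loss.
hard_loss = cross_entropy(one_hot_reference, model_output)

# Soft target: probability assigned to the confusable token 1 also counts.
soft_loss = cross_entropy(asr_posterior, model_output)
```

With the soft target, a model that places some mass on the similar-sounding token is penalized less than it would be for placing the same mass on an unrelated token, which is one way confusion among competing hypotheses can enter the training signal.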

  • Wenxia BAO, An LIN, Hua HUANG, Xianjun YANG, Hemu CHEN
    Article type: PAPER
    Subject area: Image Recognition, Computer Vision
    2024 Volume E107.D Issue 10 Pages 1332-1341
    Published: October 01, 2024
    Released on J-STAGE: October 01, 2024
    JOURNAL FREE ACCESS

    Recent years have seen remarkable progress in human pose estimation. However, manual annotation of keypoints remains tedious and imprecise. To alleviate this problem, this paper proposes a novel method called Multi-Scale Contrastive Learning (MSCL). This method uses a siamese network structure with upper and lower branches that capture different views of the same image. Each branch uses a backbone network to extract image representations, employing multi-scale feature vectors to capture information. These feature vectors are then passed through an enhanced feature pyramid for fusion, producing more robust feature representations. The fused feature vectors are further encoded by mapping and prediction heads to predict the feature vector of the other view. Using negative cosine similarity between vectors as the loss function, the backbone network is pre-trained on a large-scale unlabeled dataset, enhancing its capacity to extract visual representations. Finally, transfer learning is performed on a small amount of labelled data for the pose estimation task. Experiments on the COCO dataset show significant improvements in Average Precision (AP) of 1.8%, 0.9%, and 1.2% with 1%, 5%, and 10% labelled data, respectively. In addition, the Percentage of Correct Keypoints (PCK) improves by 0.5% on MPII&AIC, outperforming mainstream contrastive learning methods.
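
The loss named in this abstract, negative cosine similarity between a predicted vector and the feature vector of the other view, can be sketched as follows. The symmetric averaging over both branches is an assumption borrowed from common siamese pre-training setups and is not stated in the abstract; the function and argument names are hypothetical.

```python
import math

def negative_cosine_similarity(prediction, target):
    """Return -cos(prediction, target); -1.0 means the vectors are perfectly aligned."""
    dot = sum(a * b for a, b in zip(prediction, target))
    norm_p = math.sqrt(sum(a * a for a in prediction))
    norm_t = math.sqrt(sum(b * b for b in target))
    return -dot / (norm_p * norm_t)

def symmetric_loss(pred_upper, feat_lower, pred_lower, feat_upper):
    """Assumed symmetric form: each branch predicts the other branch's feature vector."""
    return 0.5 * (negative_cosine_similarity(pred_upper, feat_lower)
                  + negative_cosine_similarity(pred_lower, feat_upper))

aligned = negative_cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = negative_cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposed = negative_cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```

Because the loss depends only on the angle between vectors, scaling either vector leaves it unchanged, so minimizing it pulls the two views toward the same direction in feature space.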

  • Jiakai LI, Jianyong DUAN, Hao WANG, Li HE, Qing ZHANG
    Article type: PAPER
    Subject area: Natural Language Processing
    2024 Volume E107.D Issue 10 Pages 1342-1352
    Published: October 01, 2024
    Released on J-STAGE: October 01, 2024
    JOURNAL FREE ACCESS

    Chinese spelling correction is a foundational task in natural language processing that aims to detect and correct spelling errors in text. Most Chinese spelling correction methods use multimodal information to model the relationship between incorrect and correct characters. However, because the features come from different sources, feature information is mismatched during fusion, causing the relative importance of the different modalities to be ignored, which in turn prevents the model from learning efficiently. To this end, this paper proposes a multimodal language model-based Chinese spelling corrector named MISpeller. The method, which uses ChineseBERT as the base model, comprehensively captures and fuses character semantic, phonetic, and graphic information in a single model without constructing additional neural networks, and realizes unequal fusion of multi-feature information. In addition, to solve the overcorrection problem, a replication mechanism is introduced, and the replication factor is used as a dynamic weight to efficiently fuse the multimodal information. The model can control the proportion of original characters and predicted characters according to the input text, and it can learn more specifically where errors occur. Experiments conducted on the SIGHAN benchmark show that the proposed model achieves state-of-the-art performance, improving the correction-level F1 score by an average of 4.36%, which validates the effectiveness of the model.
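
The idea of a replication factor that controls the proportion of original and predicted characters can be sketched as a convex blend of two distributions. This is an assumed, generic copy-mechanism formula, not the exact one used by MISpeller; `blend_with_copy`, the three-character vocabulary, and the probability values are hypothetical.

```python
def blend_with_copy(copy_factor, source_index, predicted_probs):
    """Mix a one-hot distribution on the source character with the model's prediction.

    final[i] = w * one_hot(source)[i] + (1 - w) * predicted[i], with w = copy_factor.
    """
    if not 0.0 <= copy_factor <= 1.0:
        raise ValueError("copy_factor must lie in [0, 1]")
    return [
        copy_factor * (1.0 if i == source_index else 0.0) + (1.0 - copy_factor) * p
        for i, p in enumerate(predicted_probs)
    ]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

predicted = [0.2, 0.7, 0.1]   # assumed model distribution; input character is index 0

kept = blend_with_copy(0.8, 0, predicted)       # high factor: trust the input character
corrected = blend_with_copy(0.1, 0, predicted)  # low factor: trust the prediction
```

A high factor keeps the original character even when the model prefers another one, which is how such a weight can limit overcorrection; a low factor lets the predicted character win.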

  • Yuxin HUANG, Yuanlin YANG, Enchang ZHU, Yin LIANG, Yantuan XIAN
    Article type: PAPER
    Subject area: Natural Language Processing
    2024 Volume E107.D Issue 10 Pages 1353-1361
    Published: October 01, 2024
    Released on J-STAGE: October 01, 2024
    JOURNAL FREE ACCESS

    Chinese-Vietnamese cross-lingual event retrieval aims to retrieve, from a set of Vietnamese sentences, the sentence describing the same event as a given Chinese query sentence. Existing mainstream cross-lingual event retrieval methods extract textual representations from query texts and calculate their similarity with the textual representations of candidates in the other language. However, these methods ignore the differences in event elements that arise in Chinese-Vietnamese cross-language retrieval. Consequently, sentences with similar meanings but different event elements may be incorrectly considered to describe the same event. To address this problem, we propose a cross-lingual retrieval method that integrates event elements. We introduce event elements as an additional supervisory signal and calculate the semantic similarity of the event elements in two sentences using an attention mechanism to determine their attention scores. This allows us to establish a one-to-one correspondence between event elements in the text. Additionally, we use a multilingual pre-trained language model fine-tuned with contrastive learning to obtain cross-language sentence representations, from which the semantic similarity of the sentence texts is calculated. By combining these two similarities, we obtain the final text similarity score. Experimental results demonstrate that the proposed method achieves higher retrieval accuracy than the baseline model.

  • Weizhi WANG, Lei XIA, Zhuo ZHANG, Xiankai MENG
    Article type: LETTER
    Subject area: Software Engineering
    2024 Volume E107.D Issue 10 Pages 1362-1366
    Published: October 01, 2024
    Released on J-STAGE: October 01, 2024
    JOURNAL FREE ACCESS

    Smart contracts, as a form of digital protocol, are computer programs designed for the automatic execution, control, and recording of contractual terms. They permit transactions to be conducted without the need for an intermediary. However, the economic nature of smart contracts makes their vulnerabilities a target for attacks, leading to significant losses. In this paper, we introduce HomoDec, a smart contract timestamp vulnerability detection technique based on code homogeneity. The core idea of this technique is to compare the code of the smart contract under test with known vulnerable smart contract code in a database, and to use their homogeneity to determine whether the tested code has a timestamp vulnerability. Specifically, HomoDec first explores how to vectorize smart contracts reasonably and efficiently, representing smart contract code as a high-dimensional vector containing features of code vulnerabilities. Subsequently, it investigates methods for determining the homogeneity between the code under test and the code in the vulnerability code base, enabling the detection of potential timestamp vulnerabilities in smart contract code.

  • Na XING, Lu LI, Ye ZHANG, Shiyi YANG
    Article type: LETTER
    Subject area: Information Network
    2024 Volume E107.D Issue 10 Pages 1367-1371
    Published: October 01, 2024
    Released on J-STAGE: October 01, 2024
    JOURNAL FREE ACCESS

    Unmanned aerial vehicle (UAV)-assisted systems have attracted a lot of attention due to their high probability of line-of-sight (LoS) connections and flexible deployment. In this paper, we aim to minimize the upload time required for the UAV to collect information from the sensor nodes in a disaster scenario by optimizing the deployment position of the UAV. To obtain the deployment solution quickly, a data-driven approach is proposed in which an optimization strategy acts as the expert. Considering that images capture spatial configurations well, we use a convolutional neural network (CNN) to learn how to place the UAV. The simulation results demonstrate the effectiveness and generalization of the proposed method. After training, our CNN can generate a UAV configuration faster than the general optimization-based algorithm.

  • Liu ZHANG, Zilong WANG, Jinyu LU
    Article type: LETTER
    Subject area: Information Network
    2024 Volume E107.D Issue 10 Pages 1372-1375
    Published: October 01, 2024
    Released on J-STAGE: October 01, 2024
    JOURNAL FREE ACCESS

    Based on the framework of a multi-stage key recovery attack for a large block cipher, 2- and 3-round differential-neural distinguishers were trained for AES using partial ciphertext bits. The study introduces the differential characteristics employed for the 2-round ciphertext pairs and explores the reasons behind the near 100% accuracy of the 2-round differential-neural distinguisher. Using the trained 2-round distinguisher, the 3-round subkey of AES is successfully recovered through multi-stage key guessing. Additionally, a complexity analysis of the attack is provided, validating the effectiveness of the proposed method.

  • Zhe WANG, Zhe-Ming LU, Hao LUO, Yang-Ming ZHENG
    Article type: LETTER
    Subject area: Artificial Intelligence, Data Mining
    2024 Volume E107.D Issue 10 Pages 1376-1379
    Published: October 01, 2024
    Released on J-STAGE: October 01, 2024
    JOURNAL FREE ACCESS

    To accurately extract tabular data, we propose a novel cell-based tabular data extraction model (TDEM). The key idea of TDEM is to use grayscale projection of row separation lines, coupled with table masks and column masks generated by a VGG-19 neural network, to segment each individual cell from the input image of the table. In this way, the text content of the table is extracted from each individual cell, which greatly improves the accuracy of table recognition.
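
The grayscale projection step mentioned in this abstract can be illustrated with a small sketch that locates horizontal separation lines from the mean gray level of each pixel row. It assumes dark ruling lines on a light background and a hypothetical threshold of 64; the table and column masks produced by the VGG-19 network are not modeled, and the function names are illustrative only.

```python
def row_projection(image):
    """Mean gray level of every pixel row (the horizontal projection profile)."""
    return [sum(row) / len(row) for row in image]

def find_separator_rows(image, dark_threshold=64):
    """Rows whose mean gray level is at or below the threshold are treated as ruling lines."""
    return [y for y, mean in enumerate(row_projection(image)) if mean <= dark_threshold]

def split_into_row_bands(image, dark_threshold=64):
    """Return half-open (start, end) pixel-row ranges lying between separator lines."""
    separators = set(find_separator_rows(image, dark_threshold))
    bands, start = [], None
    for y in range(len(image)):
        if y in separators:
            if start is not None:
                bands.append((start, y))
                start = None
        elif start is None:
            start = y
    if start is not None:
        bands.append((start, len(image)))
    return bands

# Toy 7x4 grayscale image: dark lines at rows 0, 3, and 6, white content in between.
DARK, WHITE = [0, 0, 0, 0], [255, 255, 255, 255]
toy_table = [DARK, WHITE, WHITE, DARK, WHITE, WHITE, DARK]

separator_rows = find_separator_rows(toy_table)
row_bands = split_into_row_bands(toy_table)
```

Intersecting each row band with a column mask would then yield one rectangle per cell, which is the granularity at which the abstract says text is extracted.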

  • Zheqing ZHANG, Hao ZHOU, Chuan LI, Weiwei JIANG
    Article type: LETTER
    Subject area: Image Processing and Video Processing
    2024 Volume E107.D Issue 10 Pages 1380-1384
    Published: October 01, 2024
    Released on J-STAGE: October 01, 2024
    JOURNAL FREE ACCESS

    Single-image dehazing is a challenging task in computer vision research. To address the limited representation capability of traditional convolutional neural networks and the high computational overhead of recent self-attention mechanisms, we propose image attention and design a single-image dehazing network based on it, called IAD-Net. The proposed image attention is a plug-and-play module capable of global modeling. IAD-Net is a parallel network structure that combines the global modeling ability of image attention with the local modeling ability of convolution, so that the network can learn both global and local features. The proposed network has excellent feature learning and feature expression ability, has low computational overhead, and also improves the detail information of hazy images. Experiments verify the effectiveness of the image attention module and the competitiveness of IAD-Net with state-of-the-art methods.
