Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 28, Issue 4
Displaying 1-23 of 23 articles from this issue
Preface
General Paper
  • Namgi Han, Hiroshi Noji, Katsuhiko Hayashi, Hiroya Takamura, Yusuke Mi ...
    2021 Volume 28 Issue 4 Pages 938-964
    Published: 2021
    Released on J-STAGE: December 15, 2021
    JOURNAL FREE ACCESS

    Recent studies have indicated that existing systems for simple factoid question answering over a knowledge base are not robust across different datasets. We evaluated the ability of a pretrained language model, BERT, to perform this task on four datasets: Free917, FreebaseQA, SimpleQuestions, and WebQSP. We found that, like other existing systems, the existing BERT-based system cannot solve them robustly. To investigate the reason for this problem, we employ a statistical method, partial least squares path modeling (PLSPM), with 24 BERT models and two probing task suites, SentEval and GLUE. Our results reveal that the existing BERT-based system tends to depend on the surface and syntactic features of each dataset, which degrades the generality and robustness of its performance. We also discuss the reasons for this phenomenon by considering the features of each dataset and the method used to evaluate the simple factoid question answering task.

    Download PDF (295K)
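
    The analysis described above relates probing-task scores for 24 BERT variants to downstream QA accuracy. The sketch below is only a loose illustration of that kind of analysis: it replaces the paper's PLSPM with block averages and an ordinary least-squares structural step, and every number in it is invented.

    # Minimal sketch of a path-style analysis relating probing scores to QA accuracy.
    # This is NOT the authors' PLSPM setup: composites are plain averages and the
    # structural step is ordinary least squares; all numbers below are invented.
    import numpy as np

    rng = np.random.default_rng(0)
    n_models = 24  # the paper analyzes 24 BERT variants

    # Hypothetical probing scores (SentEval-style surface / GLUE-style syntactic tasks)
    surface = rng.uniform(0.5, 0.9, size=(n_models, 3))   # 3 surface probing tasks
    syntax = rng.uniform(0.4, 0.8, size=(n_models, 4))    # 4 syntactic probing tasks
    qa_acc = 0.3 * surface.mean(1) + 0.5 * syntax.mean(1) + rng.normal(0, 0.02, n_models)

    # Outer model (crude stand-in for PLS weights): average each block into a latent score
    lv_surface = surface.mean(axis=1)
    lv_syntax = syntax.mean(axis=1)

    # Inner (structural) model: regress QA accuracy on the two latent scores
    X = np.column_stack([np.ones(n_models), lv_surface, lv_syntax])
    coef, *_ = np.linalg.lstsq(X, qa_acc, rcond=None)
    print("path coefficients (intercept, surface, syntax):", np.round(coef, 3))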
  • Van-Hien Tran, Van-Thuy Phi, Akihiko Kato, Hiroyuki Shindo, Taro Watan ...
    2021 Volume 28 Issue 4 Pages 965-994
    Published: 2021
    Released on J-STAGE: December 15, 2021
    JOURNAL FREE ACCESS

    The joint entity and relation extraction task detects entity pairs along with their relations to extract relational triplets. A recent study (Yu et al. 2020) proposed a novel decomposition strategy that splits the task into two interrelated subtasks: detection of the head-entity (HE) and identification of the corresponding tail-entity and relation (TER) for each extracted head-entity. However, this strategy suffers from two major problems. First, if the HE detection task fails to find a valid head-entity, the model will then miss all related triplets containing this head-entity in the head role. Second, as Yu et al. (2020) stated, their model cannot solve the entity pair overlap (EPO) problem. For a given head-entity, the TER extraction task predicts only a single relation between the head-entity and a tail-entity, even though this entity pair can hold multiple relations. To address these problems, we propose an improved decomposition strategy that considers each extracted entity in two roles (head and tail) and allows a model to predict multiple relations (if any) of an entity pair. In addition, a corresponding model framework is presented to deploy our new decomposition strategy. Experimental results showed that our approach significantly outperformed the previous approach of Yu et al. (2020) and achieved state-of-the-art performance on two benchmark datasets.

    Download PDF (395K)
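
    As a concrete illustration of the improved decomposition strategy, the toy sketch below considers each extracted entity in both the head and tail roles and allows an entity pair to receive multiple relations. The lookup-table "relation scorer" and the example entities are invented stand-ins for the paper's neural components.

    # Toy sketch: every extracted entity is tried in both head and tail roles, and an
    # entity pair may receive several relations (the EPO case). A hand-written lookup
    # replaces the neural relation scorer; entities and relations are invented examples.
    from itertools import permutations

    def extract_entities(sentence):
        # Placeholder for a span-extraction model.
        return ["Barack Obama", "Honolulu", "United States"]

    # Placeholder scorer: returns every relation passing a threshold for a pair,
    # which is what allows entity-pair-overlap (EPO) triplets.
    RELATIONS = {
        ("Barack Obama", "Honolulu"): ["born_in", "lived_in"],
        ("Honolulu", "United States"): ["located_in"],
    }

    def extract_triplets(sentence):
        entities = extract_entities(sentence)
        triplets = []
        for head, tail in permutations(entities, 2):     # each entity tried as head AND tail
            for rel in RELATIONS.get((head, tail), []):  # zero, one, or many relations
                triplets.append((head, rel, tail))
        return triplets

    print(extract_triplets("Barack Obama was born in Honolulu, United States."))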
  • Yu Tanaka, Yugo Murawaki, Daisuke Kawahara, Sadao Kurohashi
    2021 Volume 28 Issue 4 Pages 995-1033
    Published: 2021
    Released on J-STAGE: December 15, 2021
    JOURNAL FREE ACCESS

    Correcting typographical errors (typos) is important for mitigating errors in downstream natural language processing tasks. Although a large number of typo–correction pairs are required to develop typo correction systems, no such dataset is available for Japanese. Previous studies on building French and English typo datasets have exploited Wikipedia: to collect typos, they apply a spell checker to words changed during revisions. Because the lack of word delimiters in Japanese hinders the application of a spell checker, these methods cannot be applied directly to Japanese. In this study, we build a Japanese typo dataset from Wikipedia’s revision history. We address the aforementioned problem by combining character-based extraction rules with various filtering methods. We evaluate our construction method, with which we obtain over 700K typo–correction sentence pairs. Using the new dataset, we also build typo correction systems with a sequence-to-sequence pretrained model. As an auxiliary task for fine-tuning, we train the model to predict the readings of kanji, leading to higher accuracy in correcting erroneous kanji conversions. We also investigate the effect of pseudo training data. Finally, we demonstrate that our system achieves higher accuracy on the typo recognition task than other proofreading systems.

    Download PDF (636K)
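
    The construction method combines character-based extraction rules with filtering over Wikipedia revision pairs. The sketch below illustrates the general idea at the character level; its rules and thresholds are illustrative guesses, not the paper's actual filters.

    # Rough sketch: compare a sentence before and after a Wikipedia revision at the
    # character level (Japanese has no word delimiters) and keep the pair only if the
    # edit is small enough to look like a typo fix.
    import difflib

    def char_edits(before, after):
        sm = difflib.SequenceMatcher(None, before, after)
        return [op for op in sm.get_opcodes() if op[0] != "equal"]

    def is_typo_pair(before, after, max_span=2, max_edits=1):
        edits = char_edits(before, after)
        if not (1 <= len(edits) <= max_edits):
            return False
        # Keep only short, local character edits (a crude filter).
        return all(max(i2 - i1, j2 - j1) <= max_span for _, i1, i2, j1, j2 in edits)

    before = "東京は日本の首都てす。"   # typo: てす
    after = "東京は日本の首都です。"    # fix:  です
    print(is_typo_pair(before, after))  # True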
  • Ryota Nakao, Chenhui Chu, Sadao Kurohashi
    2021 Volume 28 Issue 4 Pages 1034-1052
    Published: 2021
    Released on J-STAGE: December 15, 2021
    JOURNAL FREE ACCESS

    In machine translation of spoken language, phenomena specific to speech are known to have a negative impact on translation accuracy. Therefore, in this study, as a preprocessing step for Japanese-English translation in our university lecture translation system, we improve translation accuracy by automatically converting spoken-style Japanese text into written-style text. First, we create a corpus consisting of Japanese transcriptions of university lectures, their conversions into written language, and the corresponding English texts. Next, we train spoken-to-written conversion models and Japanese-English translation models on this corpus. We show that spoken-to-written conversion of the Japanese side improves the accuracy of Japanese-English translation. In addition, we quantify which spoken-language phenomena affect translation accuracy and to what extent.

    Download PDF (459K)
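
    To make the preprocessing step concrete, the sketch below converts spoken-style Japanese into a more written style before handing it to a translation model. The filler list and the translate() stub are invented; the paper instead trains neural conversion and translation models on its new corpus.

    # Minimal illustration of spoken-to-written preprocessing before translation.
    import re

    FILLERS = ["えーと", "えー", "あのー", "まあ", "なんか"]

    def spoken_to_written(text):
        for f in FILLERS:
            text = text.replace(f, "")          # drop fillers typical of lecture speech
        text = re.sub(r"[、]{2,}", "、", text)   # tidy up leftover punctuation
        return text.strip("、 ")

    def translate(ja_text):
        # Placeholder for the Japanese-English translation model.
        return f"<translation of: {ja_text}>"

    spoken = "えーと、まあ、今日は、なんか、機械翻訳について話します。"
    print(translate(spoken_to_written(spoken)))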
  • Chenlong Hu, Yukun Feng, Hidetaka Kamigaito, Hiroya Takamura, Manabu O ...
    2021 Volume 28 Issue 4 Pages 1053-1088
    Published: 2021
    Released on J-STAGE: December 15, 2021
    JOURNAL FREE ACCESS

    This work presents multi-modal deep SVDD (mSVDD) for one-class text classification. By extending uni-modal SVDD to a multi-modal variant, we build mSVDD with multiple hyperspheres, which enables a much better description of the target one-class data. Additionally, the end-to-end architecture of mSVDD can jointly handle neural feature learning and one-class text learning. We also introduce a mechanism for incorporating negative supervision in the absence of real negative data, which can benefit one-class text models, including mSVDD. We conduct experiments on the Reuters, 20 Newsgroups, and TREC datasets, and the results demonstrate that mSVDD outperforms uni-modal SVDD and achieves further improvements when negative supervision is incorporated.

    Download PDF (587K)
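
    The central modeling idea is that multiple hyperspheres describe the target class. A compact sketch of such a loss is given below: each text embedding is pulled toward its nearest of K centers. This is a simplified objective under assumed shapes, not the authors' full mSVDD training setup.

    # Simplified multi-hypersphere loss: each sample uses its closest center.
    import torch

    def msvdd_loss(features, centers):
        """features: (batch, dim); centers: (K, dim). Mean squared distance to nearest center."""
        d2 = torch.cdist(features, centers).pow(2)   # (batch, K) squared distances
        return d2.min(dim=1).values.mean()           # nearest hypersphere per sample

    torch.manual_seed(0)
    encoder = torch.nn.Sequential(torch.nn.Linear(300, 128), torch.nn.ReLU(), torch.nn.Linear(128, 64))
    centers = torch.randn(4, 64)                     # K = 4 hyperspheres (illustrative)

    x = torch.randn(32, 300)                         # stand-in for neural text features
    loss = msvdd_loss(encoder(x), centers)
    loss.backward()
    print(float(loss))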
  • Yuya Sawada, Hiroki Teranishi, Yuji Matsumoto, Taro Watanabe
    2021 Volume 28 Issue 4 Pages 1089-1115
    Published: 2021
    Released on J-STAGE: December 15, 2021
    JOURNAL FREE ACCESS

    Named entity recognition is a fundamental task for detecting terminology in text such as scientific literature. Previous studies treat only entities in contiguous spans, although sentences may contain compound named entities with coordination, which existing recognizers cannot identify. To recognize such entities, we propose a pipeline method in which entities with coordination are first identified by an unsupervised method that utilizes a pretrained language model, and the individual embedded named entities are then normalized. In experiments, we demonstrate that our coordination identification method is comparable to the state-of-the-art supervised model on the GENIA treebank, and that our normalization method improves the performance of named entity recognition on the GENIA term annotation.

    Download PDF (680K)
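
    The sketch below illustrates only the normalization step: once a coordinated phrase such as "T and B cells" has been identified, it is expanded into the individual named entities it embeds. The naive string split stands in for the paper's unsupervised, pretrained-LM-based identification.

    # Toy normalization of a compound named entity with coordination.
    def normalize_coordination(conjuncts, shared_head):
        """['T', 'B'], 'cells' -> ['T cells', 'B cells']"""
        return [f"{c} {shared_head}" for c in conjuncts]

    def split_simple_coordination(phrase):
        # Very naive pattern "<A> and <B> <head>"; real identification uses a pretrained LM.
        left, right = phrase.split(" and ", 1)
        conjunct, _, head = right.partition(" ")
        return normalize_coordination([left, conjunct], head)

    print(split_simple_coordination("T and B cells"))   # ['T cells', 'B cells']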
  • Abdurrisyad Fikri, Hiroya Takamura, Manabu Okumura
    2021 Volume 28 Issue 4 Pages 1116-1140
    Published: 2021
    Released on J-STAGE: December 15, 2021
    JOURNAL FREE ACCESS

    The ability to capture the conversation context is a necessity to build a good conversation model. However, a good model must also provide interesting and diverse responses to mimic actual human conversations. Given that different people can respond differently to the same utterance, we believe that using user-specific attributes can be useful for a conversation task. In this study, we attempt to drive the style of generated responses to resemble the style of real people using user-specific information. Our experiments show that our method applies to both seen and unseen users. Human evaluation also shows that our model outperforms the baselines in terms of relevance and style similarity.

    Download PDF (284K)
  • Taichi Ishiwatari, Yuki Yasuda, Taro Miyazaki, Jun Goto
    2021 Volume 28 Issue 4 Pages 1141-1161
    Published: 2021
    Released on J-STAGE: December 15, 2021
    JOURNAL FREE ACCESS

    Interest in emotion recognition in conversations (ERC) has been increasing in various fields, because it can be used in tasks such as analyzing user behavior and detecting fake news. Many recent ERC methods use graph neural networks to consider the relationships between the utterances of the speakers. In particular, a strong method considers self-speaker and inter-speaker dependencies in conversations by using relational graph attention networks (RGAT). However, graph neural networks do not consider sequential information. In this paper, we propose relational position encodings that provide RGAT with sequential information reflecting the relational graph structure. Our RGAT model can therefore capture both speaker dependencies and sequential information. Experiments on three ERC datasets show that our model is effective for recognizing emotions expressed in conversations. In addition, our approach empirically outperforms the state-of-the-art on several benchmark datasets.

    Download PDF (592K)
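
    The core idea of relational position encodings can be sketched as a learned attention bias that depends on the clipped relative position between utterances, so that graph attention sees sequential order. The shapes, clipping window, and plain dot-product attention below are illustrative; the paper's full RGAT model is not reproduced.

    # Attention over utterances with a learned bias per clipped relative offset.
    import torch

    n_utt, dim, max_dist = 6, 32, 3
    torch.manual_seed(0)

    h = torch.randn(n_utt, dim)                          # utterance representations
    pos_bias = torch.nn.Embedding(2 * max_dist + 1, 1)   # one bias per clipped offset

    idx = torch.arange(n_utt)
    rel = (idx[None, :] - idx[:, None]).clamp(-max_dist, max_dist) + max_dist  # (n_utt, n_utt)

    scores = h @ h.T / dim ** 0.5                        # plain attention logits
    scores = scores + pos_bias(rel).squeeze(-1)          # + sequential-position bias
    attn = scores.softmax(dim=-1)
    context = attn @ h                                   # position-aware utterance mixing
    print(context.shape)                                 # torch.Size([6, 32])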
  • Hideya Mino, Hitoshi Ito, Isao Goto, Ichiro Yamada, Takenobu Tokunaga
    2021 Volume 28 Issue 4 Pages 1162-1183
    Published: 2021
    Released on J-STAGE: December 15, 2021
    JOURNAL FREE ACCESS

    This paper proposes a method for context-aware neural machine translation (NMT) that uses both the ground-truth and the machine-translated previous sentences on the target side. Building on progress in sentence-level NMT, context-aware NMT has been rapidly developed to exploit previous sentences as context. Recent work in context-aware NMT incorporates source- or target-side contexts. In contrast to the source-side context, the target-side context causes a gap between training, which uses a ground-truth sentence as context, and inference, which must use a machine-translated sentence. This gap degrades translation quality because the translation model is trained only with ground-truth context that is unavailable at inference time. The proposed method makes the translation model robust against the mistakes and biases introduced at inference time. We confirmed the improvements of our approach over models using previous approaches in English ↔ Japanese and English ↔ German translation tasks.

    Download PDF (417K)
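
    A minimal sketch of the training-time idea follows: rather than always feeding the ground-truth previous target sentence as context, sometimes feed the model's own translation of it, so that training better matches inference. The sampling probability and the stubbed-out translation call are placeholders, not the paper's exact scheme.

    # Mix ground-truth and machine-translated target-side context during training.
    import random

    def choose_target_context(gold_prev, model_translate, src_prev, p_machine=0.5):
        """With probability p_machine, use the machine-translated previous sentence."""
        if random.random() < p_machine:
            return model_translate(src_prev)   # what the model actually sees at inference
        return gold_prev                       # ground-truth context (training-only signal)

    # Usage with stand-in values:
    random.seed(0)
    fake_translate = lambda s: f"<MT of {s!r}>"
    ctx = choose_target_context("She left early.", fake_translate, "彼女は早く帰った。")
    print(ctx)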
  • Itsugun Cho, Hiroaki Saito
    2021 Volume 28 Issue 4 Pages 1184-1209
    Published: 2021
    Released on J-STAGE: December 15, 2021
    JOURNAL FREE ACCESS

    We constructed a high-quality open-domain dialogue generation model called Anna that is composed of a hierarchical self-attention network with multiple convolution filters and a topic-augmented network. During daily conversations, humans typically respond by understanding a dialogue history and assembling their knowledge regarding the topic. However, existing dialogue generation models are weak at capturing the dependencies among words or utterances, resulting in an insufficient understanding of context and the generation of irrelevant responses. Previous works have largely ignored topic information modeling in multi-turn dialogue, making responses overly generic. Although pre-training using large-scale transformer models has recently resulted in enhanced performance, large parameter sizes complicate such models. Anna effectively captures contextual dependencies and assigns greater weight to important words and utterances to compute context representations. We incorporate topic information into our model as prior knowledge to synthesize topic representations. Two types of representations jointly determine the probability distributions of responses, which effectively simulates how people behave in real conversations. Empirical studies on both Chinese and English corpora demonstrate that Anna outperforms baseline models in terms of response quality, parameter size and decoding speed.

    Download PDF (602K)
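
    The last step described above, in which context and topic representations jointly determine the response distribution, can be sketched very roughly as below. The dimensions, the fusion by summed logits, and all tensors are invented; Anna's hierarchical self-attention and topic networks are not reproduced here.

    # Context and topic representations jointly produce the next-token distribution.
    import torch

    vocab, dim = 1000, 64
    torch.manual_seed(0)

    context_vec = torch.randn(1, dim)   # from the hierarchical self-attention encoder (stub)
    topic_vec = torch.randn(1, dim)     # from the topic-augmented network (stub)

    to_vocab_ctx = torch.nn.Linear(dim, vocab)
    to_vocab_topic = torch.nn.Linear(dim, vocab)

    logits = to_vocab_ctx(context_vec) + to_vocab_topic(topic_vec)   # joint contribution
    probs = logits.softmax(dim=-1)
    print(probs.shape, round(float(probs.sum()), 3))                 # (1, 1000), sums to 1.0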
  • Soichiro Murakami, Sora Tanaka, Masatsugu Hangyo, Hidetaka Kamigaito, ...
    2021 Volume 28 Issue 4 Pages 1210-1246
    Published: 2021
    Released on J-STAGE: December 15, 2021
    JOURNAL FREE ACCESS

    This research investigates the task of generating weather-forecast comments from simulation results of numerical weather prediction. This task has the following requirements. (i) The changes in numerical values for various physical quantities must be considered; (ii) the weather comments should be dependent on delivery time and area information; and (iii) the comments should provide useful information for users. To meet these requirements, we propose a data-to-text model, incorporating three types of encoders for numerical forecast maps, observation data, and metadata. We also introduce weather labels representing weather information, such as sunny or rain, in our model to describe useful information explicitly. Furthermore, we conducted automatic and human evaluations. The results indicate that our model exhibits the best performance when compared with baseline models in terms of informativeness.

    Download PDF (1138K)
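
    The three-encoder design can be sketched schematically: forecast maps, observation data, and metadata are encoded separately, fused, and used both to predict a weather label and to condition the comment decoder. All sizes and the toy linear modules below are placeholders, not the paper's architecture.

    # Schematic fusion of three encoders plus a weather-label head.
    import torch

    torch.manual_seed(0)
    enc_map = torch.nn.Linear(16, 32)    # stand-in for the forecast-map encoder
    enc_obs = torch.nn.Linear(8, 32)     # stand-in for the observation encoder
    enc_meta = torch.nn.Linear(4, 32)    # stand-in for the metadata (time/area) encoder

    label_head = torch.nn.Linear(96, 3)      # weather label: sunny / cloudy / rain
    decoder = torch.nn.Linear(96 + 3, 500)   # next-token logits over a toy vocabulary

    maps, obs, meta = torch.randn(1, 16), torch.randn(1, 8), torch.randn(1, 4)
    fused = torch.cat([enc_map(maps), enc_obs(obs), enc_meta(meta)], dim=-1)   # (1, 96)
    label_logits = label_head(fused)
    token_logits = decoder(torch.cat([fused, label_logits.softmax(-1)], dim=-1))
    print(label_logits.shape, token_logits.shape)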
  • Lya Hulliyyatus Suadaa, Hidetaka Kamigaito, Manabu Okumura, Hiroya Tak ...
    2021 Volume 28 Issue 4 Pages 1247-1269
    Published: 2021
    Released on J-STAGE: December 15, 2021
    JOURNAL FREE ACCESS

    Numerical tables are widely used to present experimental results in scientific papers. For table understanding, a metric-type is essential to discriminate the numbers in a table. Herein, we introduce a new information extraction task, metric-type identification from multi-level header numerical tables, and provide a dataset extracted from scientific papers comprising header tables, captions, and metric-types. We propose joint-learning neural classification and generation schemes featuring pointer-generator-based and pretrained models. Our results show that the joint models can handle both in-header and out-of-header metric-type identification problems. Furthermore, transfer learning using fine-tuned pretrained models successfully improves the performance. The domain-specific BERT-based model, SciBERT, achieves the best performance. Results achieved by a fine-tuned T5-based model are comparable to those obtained with our BERT-based model under a multitask setting.

    Download PDF (1195K)
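
    A pointer-generator suits this task because an in-header metric-type can be copied from the table header while an out-of-header one must be generated from a vocabulary. The sketch below shows only that mixing step, with random tensors standing in for model outputs.

    # Pointer-generator mixing: copy from the header or generate from a vocabulary.
    import torch

    torch.manual_seed(0)
    header_tokens = ["BLEU", "F1", "accuracy"]       # candidate metric-types in the header
    vocab = ["precision", "recall", "perplexity"]    # generation vocabulary (out-of-header)

    copy_dist = torch.softmax(torch.randn(len(header_tokens)), dim=0)   # attention over header
    gen_dist = torch.softmax(torch.randn(len(vocab)), dim=0)            # generator distribution
    p_gen = torch.sigmoid(torch.randn(()))                              # copy/generate gate

    final = torch.cat([(1 - p_gen) * copy_dist, p_gen * gen_dist])      # one mixed distribution
    labels = header_tokens + vocab
    print(labels[int(final.argmax())], round(float(final.sum()), 3))    # sums to 1.0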
System Paper
  • Masao Ideuchi, Yohei Sakamoto, Yoshiaki Oida, Isaac Okada, Shohei Higa ...
    2021 Volume 28 Issue 4 Pages 1270-1298
    Published: 2021
    Released on J-STAGE: December 15, 2021
    JOURNAL FREE ACCESS

    An enterprise resource planning (ERP) package consists of software to support day-to-day business activities and contains multiple components. System engineers combine the most appropriate software components for system integration using ERP packages. Because component selection is a very difficult task, even for experienced system engineers, there is a demand for machine-learning-based systems that support appropriate component selection by reading the text of requirement specifications and predicting suitable components. However, sufficient prediction accuracy has not been achieved thus far as a result of the sparsity and diversity of training data, which consist of specification texts paired with their corresponding components. We implemented round-trip translation at both training and testing times to alleviate the sparsity and diversity problems, adopted pre-trained models to exploit the similarity of text data, and utilized an ensemble of diverse models to take advantage of models for both the original and round-trip translated data. Through experiments with actual project data from ERP system integration, we confirmed that round-trip translation alleviates the problems mentioned above and improves prediction accuracy. As a result, our method achieved sufficient accuracy for practical use.

    Download PDF (252K)
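
    The round-trip-translation idea can be sketched as follows: paraphrase each specification text by translating it to a pivot language and back, then let models trained on the original and round-tripped variants vote on the component. The translate() stub and the majority vote are placeholders for the actual MT system and ensemble.

    # Round-trip translation augmentation plus a simple ensemble vote.
    from collections import Counter

    def round_trip(text, translate):
        return translate(translate(text, "ja", "en"), "en", "ja")

    def ensemble_predict(text, models, translate):
        variants = [text, round_trip(text, translate)]
        votes = [m(v) for m in models for v in variants]   # every model sees both variants
        return Counter(votes).most_common(1)[0][0]

    # Stand-ins so the sketch runs end to end:
    fake_translate = lambda t, src, tgt: f"{t}|{src}->{tgt}"
    models = [lambda t: "ComponentA", lambda t: "ComponentA", lambda t: "ComponentB"]
    print(ensemble_predict("受注伝票を自動で作成したい", models, fake_translate))   # ComponentA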
Society Column
Supporting Member Column
Information