Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 29, Issue 3
Preface (Non Peer-Reviewed)
General Paper (Peer-Reviewed)
  • Tareq Alkhaldi, Chenhui Chu, Sadao Kurohashi
    2022 Volume 29 Issue 3 Pages 762-784
    Published: 2022
    Released on J-STAGE: September 15, 2022
    JOURNAL FREE ACCESS

    Recent research shows that Transformer-based language models (LMs) store considerable factual knowledge from the unstructured text datasets on which they are pre-trained. The existence and amount of such knowledge have been investigated by probing pre-trained Transformers to answer questions without access to any external context or knowledge, a setting known as closed-book question answering (QA). However, this factual knowledge is spread across the parameters in ways that are difficult to interpret, and it is unclear which parts of the model are most responsible for producing an answer from the question alone. This study aims to understand which parts of the Transformer-based T5 model are responsible for reaching an answer in a closed-book QA setting. We introduce a head importance scoring method and compare it with other methods on three datasets. We investigate the important parts by looking inside the attention heads in a novel manner, examine why some heads are more critical than others, and suggest an effective approach for identifying them. Through a series of pruning experiments, we demonstrate that some parts of the model are more important than others for retaining knowledge. We also investigate the roles of the encoder and decoder in the closed-book setting.
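
    The following is a minimal sketch of one way to score and prune attention heads in this setting. It is not the paper's scoring method: it follows the gradient-of-head-mask idea of Michel et al. (2019), assumes the Hugging Face t5-small checkpoint and its head_mask interface, and uses a toy QA pair in place of a real closed-book QA dataset.

    import torch
    from transformers import T5ForConditionalGeneration, T5TokenizerFast

    model = T5ForConditionalGeneration.from_pretrained("t5-small")
    tok = T5TokenizerFast.from_pretrained("t5-small")
    model.eval()

    n_layers, n_heads = model.config.num_layers, model.config.num_heads
    head_mask = torch.ones(n_layers, n_heads, requires_grad=True)  # 1.0 = keep the head
    importance = torch.zeros(n_layers, n_heads)

    qa_pairs = [("Where is the Eiffel Tower?", "Paris")]  # toy stand-in for a QA dataset
    for question, answer in qa_pairs:
        enc = tok(question, return_tensors="pt")
        labels = tok(answer, return_tensors="pt").input_ids
        loss = model(**enc, labels=labels, head_mask=head_mask).loss
        # Importance of an encoder head = accumulated |d loss / d mask| for that head.
        grads = torch.autograd.grad(loss, head_mask)[0]
        importance += grads.abs()

    # Mask out (prune) the k least important encoder heads before re-evaluation.
    k = 8  # hypothetical pruning budget
    flat = importance.flatten()
    pruned = torch.ones_like(flat)
    pruned[flat.argsort()[:k]] = 0.0
    pruned_head_mask = pruned.view(n_layers, n_heads)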

  • Yukun Feng, Chenlong Hu, Hidetaka Kamigaito, Hiroya Takamura, Manabu O ...
    2022 Volume 29 Issue 3 Pages 785-806
    Published: 2022
    Released on J-STAGE: September 15, 2022
    JOURNAL FREE ACCESS

    We propose a simple and effective method for incorporating word clusters into the Continuous Bag-of-Words (CBOW) model. Specifically, we propose replacing infrequent input and output words in CBOW with their clusters. The resulting cluster-incorporated CBOW model produces embeddings for frequent words together with a small number of cluster embeddings, which can then be fine-tuned in downstream tasks. We empirically demonstrate that this replacement method works well on several downstream tasks. Through our analysis, we also show that our method is potentially useful for other similar models that produce word embeddings.
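
    Below is a minimal sketch of the preprocessing idea described above, assuming gensim's CBOW implementation (sg=0) and a precomputed word-to-cluster mapping (e.g., from Brown clustering); the cluster-token format, frequency threshold, and toy data are illustrative, not the paper's settings.

    from collections import Counter
    from gensim.models import Word2Vec

    def replace_rare_with_clusters(corpus, word2cluster, min_freq=5):
        """corpus: list of token lists; word2cluster: dict mapping a word to a cluster id."""
        freq = Counter(w for sent in corpus for w in sent)
        def map_token(w):
            if freq[w] >= min_freq:
                return w                                # frequent words keep their own embedding
            return "<CL_%d>" % word2cluster.get(w, 0)   # rare words share a cluster embedding
        return [[map_token(w) for w in sent] for sent in corpus]

    corpus = [["the", "cat", "sat"], ["the", "ocelot", "sat"]]   # toy corpus
    word2cluster = {"ocelot": 3}                                 # hypothetical cluster assignment
    mapped = replace_rare_with_clusters(corpus, word2cluster, min_freq=2)
    model = Word2Vec(mapped, vector_size=100, window=5, min_count=1, sg=0)  # sg=0 selects CBOW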

  • Hirokazu Kiyomaru, Sadao Kurohashi
    2022 Volume 29 Issue 3 Pages 807-834
    Published: 2022
    Released on J-STAGE: September 15, 2022
    JOURNAL FREE ACCESS

    Volitionality and subject animacy are fundamental and closely related properties of an event. Their classification is challenging because it requires contextual text understanding and a large amount of labeled data. This paper proposes a novel method that jointly learns volitionality and subject animacy at low cost by heuristically labeling events in a raw corpus. Volitionality labels are assigned using a small lexicon of volitional and non-volitional adverbs such as “deliberately” and “accidentally”; subject animacy labels are assigned using a list of animate and inanimate nouns obtained from ontological knowledge. We then consider the problem of learning a classifier from the labeled data so that it performs well on unlabeled events that do not contain the words used for labeling. We regard this as a bias reduction or unsupervised domain adaptation problem and apply techniques from those areas. We conduct experiments with crowdsourced gold data in Japanese and English and show that our method effectively learns volitionality and subject animacy without manually labeled data.
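
    A minimal sketch of the heuristic labeling step follows; the lexicons here are tiny stand-ins for the paper's adverb lexicon and ontology-derived noun lists, and real events would of course require parsing to identify the subject head.

    VOLITIONAL_ADVERBS = {"deliberately", "intentionally"}
    NONVOLITIONAL_ADVERBS = {"accidentally", "unintentionally"}
    ANIMATE_NOUNS = {"person", "dog", "student"}        # stand-ins for ontology entries
    INANIMATE_NOUNS = {"rock", "price", "temperature"}

    def label_event(tokens, subject_head):
        """Return (volitionality, animacy) heuristic labels; None where no cue applies."""
        if VOLITIONAL_ADVERBS & set(tokens):
            vol = "volitional"
        elif NONVOLITIONAL_ADVERBS & set(tokens):
            vol = "non-volitional"
        else:
            vol = None
        if subject_head in ANIMATE_NOUNS:
            anim = "animate"
        elif subject_head in INANIMATE_NOUNS:
            anim = "inanimate"
        else:
            anim = None
        return vol, anim

    example = ["the", "student", "deliberately", "broke", "the", "rule"]
    print(label_event(example, "student"))   # -> ('volitional', 'animate')
    # The cue adverbs are then removed (or down-weighted) so that the trained
    # classifier does not simply memorize the labeling words.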

  • Kazuki Akiyama, Akihiro Tamura, Takashi Ninomiya, Tomoyuki Kajiwara
    2022 Volume 29 Issue 3 Pages 835-853
    Published: 2022
    Released on J-STAGE: September 15, 2022
    JOURNAL FREE ACCESS

    This paper proposes a new abstractive document summarization model, hierarchical BART (Hie-BART), which captures the hierarchical structure of documents (i.e., their sentence-word structure) within the BART model. Although the existing BART model has achieved state-of-the-art performance on document summarization tasks, it does not account for interactions between sentence-level and word-level information. In machine translation, the performance of neural machine translation models can be improved by incorporating multi-granularity self-attention (MG-SA), which captures the relationships between words and phrases. Inspired by that work, the proposed Hie-BART model incorporates MG-SA into the encoder of BART to capture sentence-word structure. In an evaluation on the CNN/Daily Mail dataset, the proposed method improves summarization performance by 0.1 points on ROUGE-L.
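
    The sketch below illustrates only the masking idea behind multi-granularity self-attention, not the Hie-BART implementation: some heads are restricted to attend within the current sentence while others attend over the whole document, so the sentence-word structure is visible to the encoder.

    import torch

    def sentence_local_mask(sent_ids):
        """sent_ids: LongTensor [seq_len] giving the sentence index of each token.
        Returns a boolean [seq_len, seq_len] mask, True where attention is allowed."""
        return sent_ids.unsqueeze(0) == sent_ids.unsqueeze(1)

    sent_ids = torch.tensor([0, 0, 0, 1, 1, 2])              # 6 tokens in 3 sentences
    local_mask = sentence_local_mask(sent_ids)               # heads with sentence granularity
    doc_mask = torch.ones_like(local_mask)                   # heads with document granularity
    # Per head, the disallowed positions of the chosen mask would be added as -inf
    # to the attention logits before the softmax.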

  • Shinji Kono, Kanako Komiya, Hiroyuki Shinnou
    2022 Volume 29 Issue 3 Pages 854-874
    Published: 2022
    Released on J-STAGE: September 15, 2022
    JOURNAL FREE ACCESS

    BERT is a pre-trained model that can achieve high accuracy on various NLP tasks with fine-tuning. However, BERT has many parameters that must be tuned, which makes training and inference time-consuming. In this study, we propose applying a smaller BERT model to Japanese parsing by dropping some of its layers. In our experiments, we compared the parsing accuracy and processing time of the BERT model developed by Kyoto University with those of smaller BERT models obtained by dropping some of its layers, using the Kyoto University Web Document Leads Corpus (referred to as the web corpus) and the Kyoto University Text Corpus (referred to as the text corpus). The experiments revealed that the smaller BERT reduced training time by 83%, and inference time by 65% for the web corpus and 85% for the text corpus, while keeping the accuracy degradation relative to the Kyoto University BERT to 0.87 points for the web corpus and 0.91 points for the text corpus.
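
    A minimal sketch of the layer-dropping idea follows, using the Hugging Face BertModel interface; the model path is a placeholder rather than the actual Kyoto University BERT identifier, and the number of kept layers is a hypothetical choice.

    import torch.nn as nn
    from transformers import BertModel

    def drop_upper_layers(model, keep_layers):
        """Keep only the first `keep_layers` encoder layers of a BertModel."""
        model.encoder.layer = nn.ModuleList(model.encoder.layer[:keep_layers])
        model.config.num_hidden_layers = keep_layers
        return model

    model = BertModel.from_pretrained("path/to/kyoto-univ-bert")   # placeholder path
    small_model = drop_upper_layers(model, keep_layers=3)          # e.g., 12 -> 3 layers
    # small_model is then fine-tuned for parsing in place of the full model,
    # reducing both training and inference time.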

  • Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Ma ...
    2022 Volume 29 Issue 3 Pages 875-900
    Published: 2022
    Released on J-STAGE: September 15, 2022
    JOURNAL FREE ACCESS

    Recent Rhetorical Structure Theory (RST)-style discourse parsing methods are trained by supervised learning and therefore require an annotated corpus of sufficient size and quality. However, the RST Discourse Treebank, the largest such corpus, consists of only 385 documents, which is insufficient for learning a long-tailed distribution of rhetorical-relation labels. To address this problem, we propose a novel approach that improves performance on low-frequency labels. Our approach utilizes a silver dataset obtained from multiple existing parsers, which serve as teacher parsers: we extracted agreement subtrees from the RST trees built by the teacher parsers to obtain a more reliable silver dataset. We used a span-based top-down RST parser, a state-of-the-art neural model, as the student parser. In our training procedure, we first pre-trained the student parser on the silver dataset and then fine-tuned it on a gold, human-annotated dataset. Experimental results showed that our parser achieved excellent scores for nuclearity and relation, namely 64.7 and 54.1, respectively, on the Original Parseval.
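
    The following is a simplified sketch of the agreement-subtree idea: a constituent of a teacher-parsed RST tree enters the silver data only if every teacher parser produced the same span with the same nuclearity-relation label. The tree encoding and labels are illustrative, not the paper's data format.

    def tree_to_constituents(tree):
        """tree: nested tuples (label, left, right) whose leaves are EDU indices.
        Returns {(start, end): label} for every internal node."""
        spans = {}
        def visit(node):
            if isinstance(node, int):                 # a leaf is a single EDU index
                return node, node
            label, left, right = node
            left_start, _ = visit(left)
            _, right_end = visit(right)
            spans[(left_start, right_end)] = label
            return left_start, right_end
        visit(tree)
        return spans

    def agreement_constituents(teacher_trees):
        span_maps = [tree_to_constituents(t) for t in teacher_trees]
        shared = set(span_maps[0]).intersection(*map(set, span_maps[1:]))
        return {s: span_maps[0][s] for s in shared
                if all(m[s] == span_maps[0][s] for m in span_maps)}

    # Two teacher parsers over three EDUs; only the span they agree on survives.
    t1 = ("NS-Elaboration", ("NS-Attribution", 0, 1), 2)
    t2 = ("NS-Elaboration", ("SN-Attribution", 0, 1), 2)
    print(agreement_constituents([t1, t2]))   # {(0, 2): 'NS-Elaboration'}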

  • Kazuaki Hanawa, Ryo Nagata, Kentaro Inui
    2022 Volume 29 Issue 3 Pages 901-924
    Published: 2022
    Released on J-STAGE: September 15, 2022
    JOURNAL FREE ACCESS

    Feedback comment generation is the task of generating explanatory notes for language learners. Although various generation techniques are available, little is known about which methods are appropriate for this task. Nagata (2019) demonstrates the effectiveness of neural retrieval-based methods in generating feedback comments on preposition use. Retrieval-based methods are limited, however, in that they can only output feedback comments that already exist in the training data. Moreover, feedback comments can address grammatical and writing items other than preposition use, which has not yet been explored. To shed light on these points, we investigate a wider range of methods for generating various types of feedback comments. Our close analysis of the features of the task leads us to investigate three architectures for comment generation: (i) a neural retrieval-based method as a baseline, (ii) a pointer-generator-based method as a neural seq2seq approach, and (iii) a retrieve-and-edit method, a hybrid of (i) and (ii). Intuitively, the pointer-generator should outperform neural retrieval, and retrieve-and-edit should perform best. In our experiments, however, this expectation is completely overturned. We closely analyze the results to reveal the major causes of these counter-intuitive results and report our findings, which should lead to further developments in feedback comment generation.
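
    For illustration, here is a minimal retrieval-based baseline in the spirit of architecture (i); it uses TF-IDF character n-grams rather than the paper's neural retrieval model, and the training pairs are invented. It also makes the stated limitation concrete: the system can only return comments that already exist in the training data.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    train_sentences = ["I am interested on music.", "He arrived to the station."]
    train_comments = ["Use 'interested in', not 'interested on'.",
                      "The verb 'arrive' takes the preposition 'at' here."]

    vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
    train_vecs = vectorizer.fit_transform(train_sentences)

    def retrieve_comment(sentence):
        sims = cosine_similarity(vectorizer.transform([sentence]), train_vecs)[0]
        return train_comments[sims.argmax()]   # restricted to comments seen in training

    print(retrieve_comment("She is interested on history."))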

Survey Paper (Peer-Reviewed)
Society Column (Non Peer-Reviewed)
Information (Non Peer-Reviewed)