Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 30, Issue 3
Displaying 1-16 of 16 articles from this issue
Preface (Non Peer-Reviewed)
General Paper (Peer-Reviewed)
  • Suzuko Nishino, Tatsuya Ishigaki, Sohei Washino, Hiroki Igarashi, Akih ...
    2023 Volume 30 Issue 3 Pages 883-906
    Published: 2023
    Released on J-STAGE: September 15, 2023
    JOURNAL FREE ACCESS

    We propose document retrieval and comment generation tasks for automating horizon scanning (Sardar 2010). The tasks consist of two steps: 1) retrieving news articles that imply drastic changes in society, and 2) writing subjective comments on each article to help others understand it. We manually collected the articles to be retrieved and the subjective comments, and we compare several methods for the two proposed settings. Our analysis shows that 1) the manually collected articles differ from general articles in the words used and their semantic distances, and 2) the contents of the comments can be classified into several categories. Our experiments show that our BERT model achieves sufficiently high performance, while comment generation remains challenging.

    Download PDF (682K)
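
    As a rough illustration of the retrieval setting and the semantic-distance comparison mentioned in the abstract above, the sketch below ranks candidate news articles by embedding similarity to a small set of manually collected horizon-scanning articles. It is not the authors' model; the sentence-transformers encoder and the toy texts are assumptions for the example.

    ```python
    # Illustrative only: rank candidate articles by semantic similarity to
    # manually collected horizon-scanning articles. The embedding model and
    # the example texts are assumptions, not the authors' configuration.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder

    collected = [
        "Lab-grown meat approved for sale, signalling a shift in food production.",
        "A city replaces all public buses with autonomous shuttles.",
    ]
    candidates = [
        "Local bakery wins regional bread competition.",
        "New battery chemistry could make grid-scale storage far cheaper.",
    ]

    collected_emb = model.encode(collected, convert_to_tensor=True)
    candidate_emb = model.encode(candidates, convert_to_tensor=True)

    # Score each candidate by its maximum cosine similarity to the collected set.
    scores = util.cos_sim(candidate_emb, collected_emb).max(dim=1).values
    for text, score in sorted(zip(candidates, scores.tolist()),
                              key=lambda x: x[1], reverse=True):
        print(f"{score:.3f}  {text}")
    ```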
  • Koh Mitsuda, Ryuichiro Higashinaka, Yuhei Oga, Sen Yoshida
    2023 Volume 30 Issue 3 Pages 907-934
    Published: 2023
    Released on J-STAGE: September 15, 2023
    JOURNAL FREE ACCESS

    To develop a dialogue system that can build common ground with users, the process of building common ground through dialogue needs to be clarified. However, this process has not been well studied; much of the previous work has treated the final result of a collaborative task performed by users as the common ground itself. In this study, to clarify how common ground is built in dialogues where workers collaboratively perform a given task, we propose a data collection method that automatically records the process of building common ground through dialogue by using the intermediate results of the task. We analyzed 984 dialogues; the analysis suggests that several typical patterns exist in the process of building common ground and that common ground tends to be built when each worker's understanding is conveyed by affirming the counterpart's utterances. In addition, toward dialogue systems that can build common ground with users, we conducted automatic estimation of the degree of built common ground and found that the degree can be estimated from the dialogue and the intermediate task results.

    Download PDF (2328K)
  • Mana Ishida, Hitomi Yanaka, Daisuke Bekki
    2023 Volume 30 Issue 3 Pages 935-958
    Published: 2023
    Released on J-STAGE: September 15, 2023
    JOURNAL FREE ACCESS

    The increasing number of clinical texts, such as electronic medical records and discharge summaries, has led to ongoing research on natural language processing in the medical field. Since medical compound words frequently appear in clinical texts, in-depth semantic analysis and natural language inference for clinical texts are challenging. In this study, we developed a semantic analysis and logical inference system for clinical texts, called Medc2l, by extending ccg2lambda, an integrated system that maps a sentence to a higher-order logical formula through syntactic analysis and semantic composition and performs inference between logical formulas. To handle compound words in clinical texts, we added a compound-word analysis module to ccg2lambda. The module first assigns semantic tags to the constituents of compound words using sequence-labeling models. Based on this information, it derives Combinatory Categorial Grammar (CCG) trees for the compound words and obtains semantic representations that allow logical inference. In addition, we created an inference test set involving compound words from clinical texts. Experiments on this test set showed that our inference system achieved performance comparable or superior to that of deep learning-based NLI models; in particular, it predicted labels accurately for non-entailment problems.

    Download PDF (1248K)
  • Itsugun Cho, Dongyang Wang, Ryota Takahashi, Hiroaki Saito
    2023 Volume 30 Issue 3 Pages 959-990
    Published: 2023
    Released on J-STAGE: September 15, 2023
    JOURNAL FREE ACCESS

    Current studies on personalized dialogue generation primarily focus on having an agent present a consistent personality and produce more informative responses. However, we found that the responses generated by most previous models were self-centered, with little consideration for the user in the dialogue. Moreover, we consider human-like conversation to be based essentially on inferring information about the other party's persona. Therefore, we propose a novel personalized dialogue generator that detects implicit user personas. Because it is difficult to collect a large amount of detailed personal facts for each user, we attempted to model the potential persona of a user and its representation from the dialogue history alone, without external knowledge. Two latent variables, perception and fader, are introduced using conditional variational inference; they simulate the process by which people become aware of each other's personas and produce corresponding expressions in conversation. Posterior-discriminated regularization is then applied to enhance the training procedure. Finally, a selector is designed to help our model provide long-sighted responses. Comprehensive experiments demonstrate that, compared with state-of-the-art methods, our approach attends more closely to the user's persona and achieves notable improvements on both automatic metrics and human evaluations.

    Download PDF (3585K)
  • Tatsuya Zetsu, Tomoyuki Kajiwara, Yuki Arase
    2023 Volume 30 Issue 3 Pages 991-1010
    Published: 2023
    Released on J-STAGE: September 15, 2023
    JOURNAL FREE ACCESS

    In this study, we propose a method for controllable text simplification using lexically constrained decoding. Existing methods often leave out difficult words from the output sentences and lack flexibility in sentence generation. The proposed method creates constraints that identify words that should not appear in the simplified output and words that should appear in it. Three elements are involved in creating the constraints: edit-operation prediction for each word in the sentence, difficulty determination based on a word-level lexicon, and replacement-word identification. A sequence-to-sequence (seq2seq) model then simplifies the text under these lexical constraints while controlling the difficulty level of the output sentence. The proposed method can simplify text to a target difficulty level without losing grammatical correctness or distorting the meaning of the sentences.

    Download PDF (679K)
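
    The lexically constrained decoding described above can be approximated off the shelf with positive and negative constraints in beam search. The sketch below is only a rough illustration using Hugging Face Transformers; the model name and the constraint words are assumptions, and it is not the authors' system.

    ```python
    # Rough illustration of decoding with lexical constraints: words that must
    # appear (force_words_ids) and words that must not appear (bad_words_ids).
    # The model and constraint words are assumptions, not the paper's setup.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    name = "t5-small"  # stand-in seq2seq model
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)

    text = "paraphrase: The committee endeavoured to ameliorate the situation."

    # Positive constraint: a simpler replacement word that should appear.
    force_words_ids = tokenizer(["improve"], add_special_tokens=False).input_ids
    # Negative constraint: a difficult word that should not appear.
    bad_words_ids = tokenizer(["ameliorate"], add_special_tokens=False).input_ids

    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        num_beams=5,                     # constrained decoding requires beam search
        force_words_ids=force_words_ids,
        bad_words_ids=bad_words_ids,
        max_new_tokens=40,
    )
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    ```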
  • Dongyuan Li, Kotaro Funakoshi, Manabu Okumura
    2023 Volume 30 Issue 3 Pages 1011-1041
    Published: 2023
    Released on J-STAGE: September 15, 2023
    JOURNAL FREE ACCESS

    Text infilling aims to restore incomplete texts by filling in blanks and has recently attracted increasing attention because of its wide application in ancient text restoration, conversation generation, and text rewriting. However, attribute-aware text infilling has yet to be explored, and existing methods seldom account for the infilling length of each blank or for the number and location of the blanks. In this study, we propose a plug-and-play Attribute-aware Text Infilling method based on a Pre-trained language model (A-TIP), which contains a text-infilling component and a plug-and-play discriminator. Specifically, we first design a unified text-infilling component with modified attention mechanisms and intra- and inter-blank positional encoding to better perceive the number of blanks and the infilling length of each blank. We then propose a plug-and-play discriminator that guides generation toward higher attribute relevance without decreasing text fluency. Finally, automatic and human evaluations on three open-source datasets indicate that A-TIP achieves state-of-the-art performance compared with all baselines, and an ablation study demonstrates its robustness.

    Download PDF (2363K)
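
    For readers unfamiliar with the text-infilling task itself, the snippet below fills blanks with a pretrained seq2seq model via sentinel tokens. It illustrates only the plain task, not A-TIP's attribute-aware architecture; the choice of T5 is an assumption for the example.

    ```python
    # Plain text infilling with T5 sentinel tokens (<extra_id_0>, <extra_id_1>, ...).
    # This only illustrates the task; it is not the A-TIP model described above.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained("t5-base")
    model = T5ForConditionalGeneration.from_pretrained("t5-base")

    # Two blanks of unknown length, marked by sentinel tokens.
    text = "The <extra_id_0> walks in <extra_id_1> park every morning."
    inputs = tokenizer(text, return_tensors="pt")

    outputs = model.generate(**inputs, num_beams=4, max_new_tokens=20)
    # The output interleaves sentinel tokens with the predicted fillers.
    print(tokenizer.decode(outputs[0], skip_special_tokens=False))
    ```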
  • Keisuke Shirai, Atsushi Hashimoto, Taichi Nishimura, Hirotaka Kameko, ...
    2023 Volume 30 Issue 3 Pages 1042-1060
    Published: 2023
    Released on J-STAGE: September 15, 2023
    JOURNAL FREE ACCESS

    We present a new multimodal dataset called Visual Recipe Flow, which enables learning of the result of each cooking action described in a recipe text. The dataset consists of object state changes and the workflow of the recipe text: each state change is represented as an image pair, while the workflow is represented as a recipe flow graph (r-FG). We explain the data collection and annotation procedure and evaluate the dataset by measuring inter-annotator agreement. Finally, we investigate the importance of each annotation component through multimodal information retrieval experiments.

    Download PDF (3320K)
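
    The annotations described above pair each recipe step with an object state change (an image pair) and link steps in a recipe flow graph. A record might be structured as in the hypothetical sketch below; the field names and file paths are invented for illustration and are not the dataset's actual schema.

    ```python
    # Hypothetical structure for one annotated cooking action; field names and
    # paths are invented for illustration, not the actual Visual Recipe Flow schema.
    annotation = {
        "step_id": 3,
        "instruction": "Slice the onion thinly.",
        "state_change": {                    # image pair: before / after the action
            "before_image": "images/step3_before.jpg",
            "after_image": "images/step3_after.jpg",
        },
        "r_fg_edges": [                      # workflow edges in the recipe flow graph
            {"from": 2, "to": 3, "relation": "next-action"},
        ],
    }
    print(annotation["instruction"], "->", annotation["state_change"]["after_image"])
    ```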
  • Qiang Zhang, Jason Naradowsky, Yusuke Miyao
    2023 Volume 30 Issue 3 Pages 1061-1087
    Published: 2023
    Released on J-STAGE: September 15, 2023
    JOURNAL FREE ACCESS

    We introduce the task of implicit offensive-text detection (OTD) in dialogues, where a statement may have either an offensive or a non-offensive interpretation depending on the listener and the context. We argue that reasoning is crucial for understanding this broader class of offensive utterances and release SLIGHT, a test dataset to support research on this topic. Experiments on the data show that state-of-the-art methods for offense detection perform poorly when tasked with detecting implicitly offensive statements, achieving only ∼11% accuracy. In contrast to existing OTD datasets, SLIGHT features human-annotated chains of reasoning that describe the mental process through which an offensive interpretation can be reached from an ambiguous statement. We explore the potential of a multihop reasoning approach that uses existing entailment models to evaluate the probabilities of these chains. Our results demonstrate that reasoning through chains yields better performance than a baseline entailment setting without chains. Furthermore, analysis of the chains provides insights into the human interpretation process and emphasizes the importance of incorporating additional commonsense knowledge.

    Download PDF (762K)
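
    A minimal sketch of scoring a reasoning chain with an off-the-shelf entailment model is shown below: each step's entailment probability is computed and the chain score is taken as their product. The NLI model, the example chain, and the product aggregation are assumptions for illustration, not the paper's exact procedure.

    ```python
    # Sketch: score a chain of reasoning steps with an NLI model by multiplying
    # per-step entailment probabilities. Model, chain, and aggregation are
    # assumptions for illustration, not the paper's exact method.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    name = "roberta-large-mnli"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name)

    # Find which output index corresponds to the "entailment" label.
    entail_idx = next(i for i, lab in model.config.id2label.items()
                      if "entail" in lab.lower())

    chain = [
        ("You look much better in photos.",
         "The listener may look worse in person."),
        ("The listener may look worse in person.",
         "The statement can be taken as an insult."),
    ]

    score = 1.0
    for premise, hypothesis in chain:
        inputs = tokenizer(premise, hypothesis, return_tensors="pt")
        with torch.no_grad():
            probs = torch.softmax(model(**inputs).logits, dim=-1)[0]
        score *= probs[entail_idx].item()

    print(f"chain entailment score: {score:.3f}")
    ```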
Society Column (Non Peer-Reviewed)
Information (Non Peer-Reviewed)