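Transactions of the Japanese Society for Artificial Intelligence

Regular Papers

Original Paper

Improving Federated Learning through Continual Learning for Common and Uncommon Features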

森隼基, 寺西勇, 古川諒

Article type: Original Paper (Technical)
2024, Volume 39, Issue 3, p. A-N72_1-11
Published: 2024/05/01
Released online: 2024/05/01

DOI: https://doi.org/10.1527/tjsai.39-3_A-N72

Journal, free access

Federated learning is a promising machine learning technique that enables multiple clients to collaboratively build a model without revealing their raw data to each other. Among the various types of federated learning methods, horizontal federated learning (HFL) is the best-studied category and handles homogeneous feature spaces. However, when feature spaces are heterogeneous, HFL uses only the common features and leaves client-specific features unutilized. In this paper, we propose an HFL method using neural networks, named Federated Learning Enhanced by Continual learning for common and uncommon features (FLEC), which improves the performance of HFL by taking advantage of the unique features of each client via a continual learning approach. FLEC splits the whole network into two networks corresponding to common features and unique features, respectively. It jointly trains the first network on the common features through vanilla HFL, and locally trains the second network on the unique features while leveraging the knowledge of the first network via lateral connections, without interfering with the first network's federated training. We conduct experiments on various real-world datasets and show that FLEC greatly outperforms several baselines such as a vanilla HFL that uses only common features, a local learning method that uses all the features each client has, and a missing-data imputation method that fills in the features each client lacks with zeros or averages.
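The two-network split with lateral connections described in this abstract can be pictured with a small forward-pass sketch. This is only an illustration under stated assumptions, not the authors' implementation: the single hidden layer, the ReLU activation, the additive form of the lateral connection, and every name used here (`common_hidden`, `unique_hidden`, `W_c`, `W_u`, `W_lat`) are hypothetical, and training, aggregation, and the output head are omitted.

```python
def dense(weights, bias, inputs):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def common_hidden(x_common, params):
    # First network (assumed shape): sees only the common features.
    # In FLEC this is the part trained jointly through vanilla HFL.
    return relu(dense(params["W_c"], params["b_c"], x_common))


def unique_hidden(x_unique, h_common, params):
    # Second network (assumed shape): sees the client's unique features
    # plus a lateral input from the first network's hidden activation.
    # h_common is only read here, so nothing in this function changes
    # the parameters of the first network.
    own = dense(params["W_u"], params["b_u"], x_unique)
    lateral = dense(params["W_lat"], [0.0] * len(own), h_common)
    return relu([a + b for a, b in zip(own, lateral)])


# Hypothetical toy parameters for one client.
params = {
    "W_c": [[0.5, -0.25], [1.0, 1.0]], "b_c": [0.0, 0.0],
    "W_u": [[1.0], [-1.0]], "b_u": [0.0, 0.5],
    "W_lat": [[1.0, 0.5], [0.0, 0.5]],
}
h_common = common_hidden([1.0, 2.0], params)
h_unique = unique_hidden([2.0], h_common, params)
```

Keeping the lateral input read-only is one simple way to mirror the abstract's requirement that local training must not interfere with the federated training of the first network.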

MAX-MIN Ant System with Two Memories Considering Ant Decision-Making Based on Social and Individual Information

遠藤博人, 穴田一

Article type: Original Paper (Technical)
2024, Volume 39, Issue 3, p. B-NC3_1-12
Published: 2024/05/01
Released online: 2024/05/01

DOI: https://doi.org/10.1527/tjsai.39-3_B-NC3

Journal, free access

One method for solving combinatorial optimization problems is Ant Colony Optimization (ACO), which models ants' efficient foraging behavior through global communication via pheromones. However, conventional ACO does not take into account important ant decision-making processes other than global communication via pheromones. Therefore, we propose a new ACO that introduces into the model decision-making processes based on both social information (information obtained through global and local communication) and individual information (ants' own past experience), which are considered important for ants in the real world. In evaluation experiments, we applied the proposed ACO to the traveling salesman problem, a typical combinatorial optimization problem, and confirmed that the solution search performance is significantly improved compared to conventional methods. This indicates that modeling ants' decision-making based on social and individual information is effective in ACO. In addition, we believe that our approach to algorithm construction, which incorporates interactions between individuals into the model, has shown the potential to be effective in ACO.
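For readers unfamiliar with the conventional pheromone-only decision rule that this abstract contrasts itself with, the sketch below shows the standard Ant System city-selection probabilities for the traveling salesman problem, together with the pheromone clamping that gives MAX-MIN Ant System its name. It does not model the proposed two-memory method, whose details are not given in the abstract; the function names and the default values of `alpha` and `beta` are assumptions chosen for illustration.

```python
def clamp_pheromone(tau, tau_min, tau_max):
    # MAX-MIN Ant System keeps every pheromone value within [tau_min, tau_max].
    return min(tau_max, max(tau_min, tau))


def transition_probabilities(current, unvisited, pheromone, distance,
                             alpha=1.0, beta=2.0):
    # Weight of moving to city j: pheromone^alpha * (1 / distance)^beta.
    weights = {
        j: (pheromone[current][j] ** alpha)
           * ((1.0 / distance[current][j]) ** beta)
        for j in unvisited
    }
    total = sum(weights.values())
    # Normalize so the probabilities over unvisited cities sum to 1.
    return {j: w / total for j, w in weights.items()}


# Toy example: from city 0, city 1 is twice as close as city 2.
pheromone = {0: {1: 1.0, 2: 1.0}}
distance = {0: {1: 1.0, 2: 2.0}}
probs = transition_probabilities(0, [1, 2], pheromone, distance)
```

With equal pheromone on both edges, the choice is driven entirely by distance, which is the kind of purely global, pheromone-mediated decision that the abstract proposes to extend with local communication and individual experience.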

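Special Issue Papers: "Intelligent Dialogue Systems"

Original Paper

Construction and Evaluation of an Interviewer Response Generation Model Based on Semantic Content: Acquiring Users' Food Preferences through Dialogue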

曽傑, 中野有紀子, 坂戸達陽

Article type: Original Paper (Technical)
2024, Volume 39, Issue 3, p. IDS6-A_1-15
Published: 2024/05/01
Released online: 2024/05/01

DOI: https://doi.org/10.1527/tjsai.39-3_IDS6-A

Journal, free access

Obtaining user preferences facilitates a better understanding of users to provide them with customized services. This paper proposes an interviewer response generation model for eliciting users’ food preferences. We collected 118 text-based dialogues in which an interviewer asked an interviewee about their food preferences. We then assessed the responses to elicit detailed preference information, and represented the intention (communicative function) and meaning (semantic content) of these responses associated with objects, constructed from information pertaining to them, such as the names and ingredients of dishes, and their attributes, such as taste or cooking method. We created a GPT-3-based model which, through fine-tuning, simultaneously generates the communicative function, semantic content, and response sentences from the dialogue history. We investigated the performance of the proposed model by comparing it with the ground-truth interviewer utterances and with two baselines: zero-shot ChatGPT and a fine-tuned GPT-3 model that directly generates only response sentences. A user study evaluating impressions of the response sentences with a questionnaire showed that, in terms of eliciting interviewee food preferences, the proposed model’s response sentences were superior to those of the baseline models and comparable to real human interviews. These results are attributed to the proposed model’s frequent generation of questions, which contributes to information extraction across various conversational contexts. We further found that, in comparison to ChatGPT, the questions generated by the proposed model are characterized by detailed questions concerning the words and content mentioned in or associated with the conversation context.

Listener-Aware Spoken Guidance that Adaptively Changes Utterance Timing

森大毅, 森本洋介

Article type: Original Paper (Technical)
2024, Volume 39, Issue 3, p. IDS6-B_1-10
Published: 2024/05/01
Released online: 2024/05/01

DOI: https://doi.org/10.1527/tjsai.39-3_IDS6-B

Journal, free access

We aim to realize an automated spoken guidance system that monitors the listener’s response tokens, such as backchannels and fillers, and adapts its behavior to them. Such a system is expected to improve the efficiency of the explanation and reduce the user’s mental workload. As long as backchannels are detected regularly, the system continues to explain. In contrast, if backchannels are not detected for a certain period of time, the system confirms the user’s understanding. In addition, when a filler is detected, the system stops talking immediately and waits for the user’s utterance. To realize the system, we worked on real-time detection of the listener’s response tokens and its integration into a dialogue system. To confirm the effectiveness of the system, an interaction experiment was conducted. The experiment was designed to compare the proposed listener-aware system, which adapts its behavior according to the listener’s response tokens, with a system that does not adapt to the listener. The results suggested that the adaptive behavior of the listener-aware speech guidance influenced users’ strategies of social signaling to artifacts. They also showed a greater variability in the listener-aware system’s pause length and shorter explanation time, depending on the user’s level of understanding. On the other hand, no positive effect of the proposed system on the user’s level of understanding was observed.
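The three behaviors described in this abstract (continue, confirm, stop and wait) amount to a small decision rule. The sketch below is one possible reading of it, not the authors' system: the function name `next_action`, the `Action` labels, and the 5-second `timeout_s` default are hypothetical, since the abstract only says "a certain period of time", and real-time detection of backchannels and fillers is assumed to happen elsewhere.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue explaining"
    CONFIRM = "confirm the user's understanding"
    WAIT = "stop talking and wait for the user's utterance"


def next_action(filler_detected, seconds_since_backchannel, timeout_s=5.0):
    # A filler takes priority: the system stops immediately and waits.
    if filler_detected:
        return Action.WAIT
    # No backchannel for the (assumed) timeout: check understanding.
    if seconds_since_backchannel >= timeout_s:
        return Action.CONFIRM
    # Backchannels are arriving regularly: keep explaining.
    return Action.CONTINUE
```

For example, `next_action(False, 1.2)` yields `Action.CONTINUE`, while `next_action(True, 1.2)` yields `Action.WAIT` regardless of how recently a backchannel was heard.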

Response Timing Estimation with a Generative Model and Parameter Control of Response Substantialness Using Dynamic Prompt-Tune

室町俊貴, 狩野芳伸

Article type: Original Paper (Technical)
2024, Volume 39, Issue 3, p. IDS6-C_1-8
Published: 2024/05/01
Released online: 2024/05/01

DOI: https://doi.org/10.1527/tjsai.39-3_IDS6-C

Journal, free access

A spoken dialogue system is required to continuously listen to a human user for smooth conversation. We propose a method that simultaneously performs response generation and response timing estimation. Our proposed method estimates response timing by adding pseudo-samples where a response should be irrelevant, which allows the use of text-only conversation datasets without audio information. Furthermore, our proposed method can control the substantialness of responses through a user-specified parameter integrated with the Dynamic-Prompt-Tune method, which uses prompt token embeddings dynamically generated from the parameter. Our automatic and manual evaluations showed that, compared to the baseline model, the proposed method can generate responses with more natural timing that are more in line with the response substantialness parameter.

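Evaluating the Effect of a Dialogue Robot that Mentions Common Preferences Based on Information Content on People's Motivation to Talk with Each Other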

古志野瑛元, 内田貴久, 吉川雄一郎, 伴碧, 三野星弥, 酒井和紀, 石黒浩

Article type: Original Paper (Technical)
2024, Volume 39, Issue 3, p. IDS6-D_1-11
Published: 2024/05/01
Released online: 2024/05/01

DOI: https://doi.org/10.1527/tjsai.39-3_IDS6-D

Journal, free access

The goal of this study is to maintain dialogue motivation between users through the mediation of a dialogue robot. Previous studies proposed artificial agents that present dialogue topics to facilitate relationship-building among individuals over a short period. However, sustaining dialogue motivation between users with such agents has not been investigated. This study proposes an algorithm for presenting topics about other people to sustain dialogue motivation between users for certain periods. Specifically, we designed a dialogue robot that discusses topics of common preference, emphasizing high information content between users. To validate our approach, we applied the proposed algorithm to a dialogue robot and conducted experiments involving university and graduate students belonging to the same community. The results confirmed that our proposed method prevented their engagement in dialogue from decreasing even after one month of interaction with the robot. Additionally, topics with high information content were more likely to be remembered by users. In other words, these findings indicate that when a robot introduces a topic with high information content, participants might perceive it as unusual. This perception prompts them to retain the topic with clarity; as a consequence, it might enhance the participants’ willingness to engage in dialogue with each other. Based on the findings of this research, using highly informative topics in dialogue can be an effective strategy for cultivating long-term relationships among individuals. This insight also emphasizes the potential role of robots in facilitating human relationship-building. Future studies need to examine the effectiveness of the proposed method with various communities.
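One way to make "high information content" concrete is Shannon self-information, where a preference shared by few people carries more bits than one shared by almost everyone. The sketch below ranks the preferences two users have in common by that measure. The abstract does not state how the paper computes information content, so the use of self-information, the `prevalence` table (fraction of the community holding each preference), and both function names are assumptions made purely for illustration.

```python
import math


def self_information_bits(probability):
    # Shannon self-information: rarer events carry more bits.
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


def rank_common_topics(prefs_a, prefs_b, prevalence):
    # Keep only preferences both users share, most informative first.
    common = prefs_a & prefs_b
    return sorted(common,
                  key=lambda topic: self_information_bits(prevalence[topic]),
                  reverse=True)


# Hypothetical data: natto is liked by 1 in 8 people, ramen by 1 in 2.
prevalence = {"ramen": 0.5, "natto": 0.125}
ranked = rank_common_topics({"ramen", "natto", "curry"},
                            {"natto", "ramen", "sushi"},
                            prevalence)
```

Under these assumed numbers the shared liking for natto (3 bits) ranks above the shared liking for ramen (1 bit), which matches the abstract's intuition that an unusual common preference is the more memorable topic.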

DCZAR: Dialogue Response Generation with Completion of Omitted Arguments Based on Zero Anaphora Resolution

上山彩夏, 狩野芳伸

Article type: Original Paper (Technical)
2024, Volume 39, Issue 3, p. IDS6-E_1-8
Published: 2024/05/01
Released online: 2024/05/01

DOI: https://doi.org/10.1527/tjsai.39-3_IDS6-E

Journal, free access

Human conversation attempts to build common ground consisting of shared beliefs, knowledge, and perceptions that form the premise for understanding utterances. Recent deep learning–based dialogue systems use human dialogue data to train a mapping from a dialogue history to responses, but common ground not directly expressed in words makes it difficult to generate coherent responses by learning statistical patterns alone. Inspired by the idea of zero anaphora resolution (ZAR), we propose Dialogue Completion using Zero Anaphora Resolution (DCZAR), a framework that explicitly completes omitted information in a dialogue history and generates responses from the completed history. The DCZAR framework consists of three models: a predicate-argument structure analysis (PAS) model, a dialogue completion (DC) model, and a response generation (RG) model. The PAS model analyzes the omitted arguments (zero pronouns) in the dialogue, and the DC model determines which arguments to complete and where to complete them, and explicitly completes the omissions in the dialogue history. The RG model, trained on pairs of completed dialogue histories and responses, generates a response. The PAS and RG models are constructed by fine-tuning a common pretrained model on a dataset corresponding to each task, while the DC model uses a pretrained model without fine-tuning. We used the Japanese Wikipedia dataset and Japanese postings to Twitter to build our pretrained models. Since tweets resemble dialogues in that they contain many abbreviations and short sentences, the model pretrained with tweets is expected to improve the performance of ZAR and dialogue response generation. Experimental results show that the DCZAR framework can be used to generate more coherent and engaging responses. Analysis of the responses shows that the model generated responses that were highly relevant to the dialogue history in dialogues with many characters.