JSAI SIG Technical Reports: Special Interest Group on Spoken Language Understanding and Dialogue Processing (SIG-SLUD)
Online ISSN : 2436-4576
Print ISSN : 0918-5682
73rd Meeting (March 2015)
Displaying 1-9 of 9 articles from the selected issue
  • 石塚 浩之
    Article type: SIG technical report
    p. 01-
    Published: 2015/03/05
    Released online: 2021/06/28
    Conference proceedings / abstracts: free access

    Through the qualitative analysis of a transcript from an English-Japanese simultaneous interpreting performance, this study explores the actuality of the interpreter's mental operations in utterance comprehension. The transcript prepared for this study is a set of parallel texts, which represents temporal correspondence between the source utterances and the interpreter's translation. This study focuses on how the interpreter structures and retains topical information, and on how she uses it throughout the rest of her performance. The analysis suggests that the interpreter's mental representation is not simply an accumulation of linguistic information received from the source speech, but a complex that can include an implicit structure as a result of cognitive operations such as pragmatic inferences and the construction of mental models.

  • 堀田 尚希, 駒谷 和範, 佐藤 理史, 中野 幹生
    Article type: SIG technical report
    p. 02-
    Published: 2015/03/05
    Released online: 2021/06/28
    Conference proceedings / abstracts: free access

    A spoken dialogue system should respond quickly after a user finishes speaking, but this requirement often causes user utterances to be segmented incorrectly by erroneous voice activity detection. We previously developed a method that performs a posteriori restoration of such incorrectly segmented utterances. A crucial part of the method is classifying whether restoration is required or not. In this paper, we improve the classification accuracy by adapting the classifier to each user. We focus on each user's speaking tempo, which can be obtained during dialogues. We reveal a correlation between users' tempos and the thresholds appropriate for them in the classification, and derive a linear regression function that converts a tempo into a threshold. We adapt two classifiers: one that simply uses a threshold and one based on decision tree learning. Experimental results showed that the proposed user adaptation improved the classification accuracies of the two classifiers by 3.3% and 2.1%, respectively.
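    The user adaptation described above can be pictured with a minimal sketch: a linear regression learned on development users maps speaking tempo (sec/mora) to a per-user threshold, which then decides whether a segmented utterance needs restoration. All names, values, and the pause-based decision rule below are illustrative assumptions, not the authors' implementation.

        # Illustrative sketch (not the authors' code): per-user adaptation of the
        # threshold used to decide whether a segmented utterance needs restoration.
        import numpy as np

        def fit_tempo_to_threshold(dev_tempos, dev_best_thresholds):
            """Fit a linear regression mapping a user's speaking tempo to the
            classification threshold that works best for that user."""
            a, b = np.polyfit(dev_tempos, dev_best_thresholds, deg=1)
            return lambda tempo: a * tempo + b

        def needs_restoration(pause_len, user_tempo, tempo_to_threshold):
            """Classify a pause as an erroneous segmentation point (restore = True)
            when it is shorter than the user-adapted threshold."""
            return pause_len < tempo_to_threshold(user_tempo)

        # Example: regression fitted on development users, applied to a new user.
        tempo_to_threshold = fit_tempo_to_threshold(
            dev_tempos=[0.10, 0.12, 0.15, 0.18],           # sec/mora (made-up values)
            dev_best_thresholds=[0.55, 0.65, 0.80, 0.95],  # sec (made-up values)
        )
        print(needs_restoration(pause_len=0.7, user_tempo=0.15,
                                tempo_to_threshold=tempo_to_threshold))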

  • 市川 熹, 川端 良子, 菊池 英明, 堀内 靖雄, 黒岩 眞吾
    Article type: SIG technical report
    p. 03-
    Published: 2015/03/05
    Released online: 2021/06/28
    Conference proceedings / abstracts: free access

    In dialogue between native speakers, overlapping utterances occur at transition-relevance places (TRPs). This phenomenon seems to result from a capability that lightens the cognitive burden of dialogue in one's mother tongue. We have been examining the age at which native speakers of Japanese acquire this capability. Last year, we analyzed dialogues of 6-year-old nursery school children and found that the capability had already been acquired. This time, we analyzed dialogues of 5-year-old kindergartners. Individual differences in the acquisition were observed among these children.

  • 山口 貴史, 井上 昂治, 吉野 幸一郎, 高梨 克也, 河原 達也
    Article type: SIG technical report
    p. 04-
    Published: 2015/03/05
    Released online: 2021/06/28
    Conference proceedings / abstracts: free access

    We investigate the relationship between backchannels and the syntactic structure of the delimited preceding utterances in attentive listening, such as counseling. First, we identify a relationship between particular patterns of backchannels and the category of the clause boundary. Next, we analyze the syntactic structure using the depth of the syntax tree and the number of case elements related to the end of the utterance. It is shown that there is a relationship between particular patterns of backchannels and the complexity of the preceding utterance. The results suggest that different kinds of backchannels can be chosen depending on the preceding utterance.
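    As a rough illustration of how syntactic complexity could drive backchannel selection, the sketch below computes the depth of a parse tree and maps the clause-boundary category plus that depth to a backchannel pattern. The tree encoding, the depth threshold, and the pattern mapping are hypothetical and not taken from the paper.

        # Minimal sketch (assumed names and rules, not the authors' model): choose a
        # backchannel pattern from the clause-boundary category and the syntactic
        # complexity (parse-tree depth) of the preceding utterance.

        def tree_depth(node):
            """Depth of a syntax tree given as nested lists, e.g. ['NP', 'w1']."""
            if not isinstance(node, list):
                return 0
            return 1 + max((tree_depth(child) for child in node[1:]), default=0)

        def choose_backchannel(clause_boundary, depth):
            # Hypothetical mapping: richer responses after sentence-final, complex utterances.
            if clause_boundary == "sentence_end" and depth >= 5:
                return "naruhodo"   # assessment-like backchannel
            if clause_boundary == "sentence_end":
                return "hai"        # plain acknowledgement
            return "un"             # short continuer at clause-internal boundaries

        tree = ["S", ["NP", "w1"], ["VP", ["NP", ["N", "w2"]], ["V", "w3"]]]
        print(choose_backchannel("sentence_end", tree_depth(tree)))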

  • 高梨 克也, 堀 謙太, 内藤 知佐子, 黒田 知宏
    Article type: SIG technical report
    p. 05-
    Published: 2015/03/05
    Released online: 2021/06/28
    Conference proceedings / abstracts: free access

    In tele-auscultation, since the doctor cannot operate the auscultator herself, she must indicate the target point with a marker and have a helper at the remote site move the auscultator on her behalf. This article analyzes a simulated tele-auscultation experiment and proposes an interactional pattern observed in the process of multimodal communication, from the doctor's pointing with the marker, through the helper's operation of the auscultator, to the doctor's auscultation. This pattern is then considered in terms of the division of transmission among the multiple channels of the tele-auscultation system and of the tele-conference system used for conversation. Finally, problems with the system environment found in the experiment are addressed.

  • 蛇穴 祐稀, 今渕 貴志, プリマ オキ ディッキ A., 伊藤 久祥, 安田 清
    Article type: SIG technical report
    p. 06-
    Published: 2015/03/05
    Released online: 2021/06/28
    Conference proceedings / abstracts: free access

    In this study, we developed interactive conversational agent software to ameliorate the symptoms of dementia patients. The software works as a speech therapy tool that acts as a conversation partner for a patient. We defined three sets of reminiscent questions in the software, each containing 15 questions. The software uses a constrained local model (CLM) and voice detection to determine the patient's utterances. Once the CLM recognizes the patient's facial landmarks, the agent starts asking the pre-defined questions. The software continues with subsequent questions when it detects no utterance, judging from either changes in the distance between mouth landmarks or changes in the patient's voice. Our experiments show that voice detection alone enables utterance detection only under low environmental noise, whereas the CLM succeeds in detecting utterances regardless of the environmental noise.
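    The question-asking loop can be summarized with the conceptual sketch below: the agent waits until the CLM has found a face, asks each of the 15 questions in a set, and moves on once neither mouth-landmark motion nor voice activity has been detected for a while. The detector functions and the silence timeout are placeholders, not the actual CLM or voice-detection code.

        # Conceptual sketch of the question-asking loop (detector internals are
        # placeholders; the real system uses a constrained local model and voice detection).
        import time

        REMINISCENT_QUESTIONS = [f"question {i + 1}" for i in range(15)]  # one of three sets

        def face_detected():
            """Placeholder: True once the CLM has located the patient's facial landmarks."""
            return True

        def mouth_is_moving():
            """Placeholder: True when the distance between CLM mouth landmarks changes."""
            return False

        def voice_detected():
            """Placeholder: True when the voice activity detector fires."""
            return False

        def run_session(silence_to_advance=3.0):
            while not face_detected():
                time.sleep(0.1)                     # wait for a patient to appear on camera
            for question in REMINISCENT_QUESTIONS:
                print("Agent:", question)
                silent_since = time.time()
                while time.time() - silent_since < silence_to_advance:
                    if mouth_is_moving() or voice_detected():
                        silent_since = time.time()  # patient is answering; keep listening
                    time.sleep(0.1)
                # no utterance detected for a while -> move on to the next question

        run_session(silence_to_advance=0.5)  # short timeout just for this demo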

  • 土肥 健太, 寺岡 丈博, 榎本 美香
    Article type: SIG technical report
    p. 07-
    Published: 2015/03/05
    Released online: 2021/06/28
    Conference proceedings / abstracts: free access

    To reveal methods of making effective pauses and communicating in character in a comedy skit, we analyzed strategies for creating a histrionic, i.e., exaggerated and overly theatrical, comic performance that is often observed in comedy skits. We compared a manzai performance, which is considered a realistic style of comedy, with a histrionically performed comedy skit of the same material by the comedy duo ``Sandwich Man'', in order to investigate the differences in the inter-utterance structure and posture-configuration structure of the two comedy styles. The results for the inter-utterance structure indicated differences in the pauses between utterances, but no differences in the speech rates (sec/mora) between the comedy styles. For the posture-configuration structure, we found that one of the performers turned his face toward his partner's face for a longer time in the histrionic comedy skit than in the manzai performance. Also, in the histrionic comedy skit, the sum of the performers' shoulder widths as seen from the audience was smaller than in the manzai performance. We therefore concluded that utterance pauses and the performers' shoulder widths are important factors in creating a histrionic comic performance.
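    The two inter-utterance measures compared above can be computed directly from timed utterance annotations, as in the toy example below; the timings and mora counts are made up for illustration and are not data from the paper.

        # Toy illustration of the two inter-utterance measures: pause length between
        # utterances and speech rate (sec/mora) within each utterance.

        # Each utterance: (start_sec, end_sec, number_of_morae)
        utterances = [(0.0, 1.2, 10), (2.0, 3.5, 12), (3.6, 5.0, 11)]

        pauses = [nxt[0] - cur[1] for cur, nxt in zip(utterances, utterances[1:])]
        speech_rates = [(end - start) / morae for start, end, morae in utterances]

        print("pauses (sec):", pauses)                  # pauses were found to differ between styles
        print("speech rate (sec/mora):", speech_rates)  # speech rate was found not to differ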

  • 白土 峻平, 寺岡 丈博, 榎本 美香
    Article type: SIG technical report
    p. 08-
    Published: 2015/03/05
    Released online: 2021/06/28
    Conference proceedings / abstracts: free access

    The purpose of this paper is to demonstrate a general pattern of sequences of speech acts between a commander and a large number of receivers carrying out the commanded action. We analyze interaction data in which multiple participants collaboratively drag huge trees (about 18 m) from a mountainside to a village for the fire festival at Nozawa Onsen in Nagano. As a result, we reveal that the basic sequence consists of `a command to start an action', `an acceptance of the command', `a rallying cry to start the action', and `a responding cry to start the action'. When commanding proceeds smoothly, `a command to end the action' is placed at the end of the sequence.

  • 坊農 真弓
    Article type: SIG technical report
    p. 09-
    Published: 2015/03/05
    Released online: 2021/06/28
    Conference proceedings / abstracts: free access

    This study offers a critique of representationalist theories of cognition by observing how embodied actions, such as speakers' mouth movements during speech and listeners' nodding to indicate a collaborative attitude, are encoded as bodily memories. This paper draws on a corpus-based micro-analysis of multimodal interaction using sign language and tactile sign language and considers two phenomena: (1) the use of mouthing during sign language interaction, and (2) the use of nodding and backchannel cues during tactile sign language interaction. In analysis 1, I found that native signers used mouthing in ways that resembled its original function (e.g., for conveying images of unknown words in their minds). In analysis 2, I found examples in which, at early stages of using tactile sign language, deafblind individuals with congenital deafness used nodding and backchannel cues similar to a visual signer's. However, deafblind individuals with a long history of tactile signing shifted drastically toward a more tactile modality for expressing backchannel cues. As a result of these observations, I apply insights from research regarding embodied actions to communication involving sign language and tactile sign language.
