-
Mori KIYOTADA, Miyoshi YASUO
Article type: SIG paper
Pages: 01-04
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Many videos in various languages are posted on video-sharing websites such as YouTube, and watching them is a promising form of listening practice for second-language learners. However, many of these videos were not produced as listening materials, and some speakers have distinctive accents and other characteristics that make them difficult for learners to understand. For this reason, learners often adjust the playback speed to one that is easier for them to follow. This research aims to provide an environment in which learners can adjust the accent of the speaker in a video to be more like that of their mother tongue, making the speech easier to listen to and providing further scaffolding in combination with speed adjustment. We investigated generative adversarial networks (GANs) and other speech conversion methods for this purpose and conducted experiments using MelGAN-VC to convert speech. As a result, we confirmed that it is difficult to suppress noise to a level that does not disturb learners.
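A minimal sketch of the spectrogram-domain conversion pipeline the abstract describes, assuming librosa for mel-spectrogram extraction and a hypothetical pretrained MelGAN-VC-style generator; it is an illustration of the general approach, not the authors' implementation.

```python
# Illustrative spectrogram-to-spectrogram conversion pipeline.
# `generator` is a hypothetical trained network mapping the source accent to
# the target accent in the log-mel domain (not provided here).
import numpy as np
import librosa

def convert_accent(wav_path, generator, sr=22050, n_mels=80):
    y, sr = librosa.load(wav_path, sr=sr)
    # Source representation: log-mel spectrogram.
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = np.log(np.clip(mel, 1e-5, None))
    converted = generator(log_mel)  # assumed to return a converted log-mel
    # Rough waveform reconstruction (Griffin-Lim) for listening checks.
    return librosa.feature.inverse.mel_to_audio(np.exp(converted), sr=sr)
```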
-
Iguchi KENTA, Kojima YUGO, Kobayashi YOH, Takai TOMORU, Hirata AYANO, ...
Article type: SIG paper
Pages: 05-10
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Today, COVID-19 has spread all over the world, and occasions for clapping have been increasing. Clapping, especially as a vehicle of communication, is therefore an interesting object of research. This research clarifies what features each instance of clapping by an audience has and what reasons can account for these features. The keywords of this research are individuality and interactivity: individuality refers to the rate at which a person claps without any influence from other people, and interactivity refers to the rate at which a person claps under the influence of other people. We observed three clapping events formed by more than 100 people and found several common features. First, no clear spreading of clapping was observed, and clapping may instead spread through aural signals. Second, clear patterns were found in how audiences stop clapping. Third, a sense of belonging affects clapping. This study is expected to be useful for future communication research.
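An illustrative sketch of how the individuality and interactivity rates defined in the abstract could be computed from clap onset times; the 0.2 s influence window and the data format are assumptions, not the authors' procedure.

```python
# Compute "individuality" and "interactivity" rates from clap onsets (seconds).
def clap_rates(own_onsets, others_onsets, window=0.2):
    influenced = 0
    for t in own_onsets:
        # A clap counts as "interactive" if someone else clapped just before it.
        if any(0 < t - s <= window for s in others_onsets):
            influenced += 1
    n = len(own_onsets)
    interactivity = influenced / n if n else 0.0
    individuality = 1.0 - interactivity if n else 0.0
    return individuality, interactivity

# Example: two of three claps follow another person's clap within the window.
print(clap_rates([1.00, 1.15, 2.50], [0.95, 1.10]))
```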
-
Hashimoto EKAI, Shun SHIRAMATSU, Sora MATSUMOTO, Hidekazu AOSHIMA
Article type: SIG paper
Pages: 11-14
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
In recent years, there has been an increasing movement away from traditional top-down organizational management toward flat organizational management in which members can perform their work more autonomously. However, compared to hierarchical organizations, overall management becomes more difficult when individual autonomy is respected. Therefore, this study aims to develop a matching mechanism that can place the right people in the right positions while respecting the autonomy of each individual. In this paper, we develop a prototype dialogue system on Slack that estimates user skills and interests and collects personal attribute tags. We also conducted evaluation experiments to verify the tagging performance. As a result, although skill tags could be collected, hope tags could not be sufficiently collected by our prototype system. As future work, we plan to generate appropriate questions to collect users' hope tags.
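A minimal sketch of collecting skill and hope tags through a Slack bot, using the slack_bolt library; the tag format ("#skill:python"), environment variable names, and dialogue flow are assumptions and simplify whatever the paper's prototype actually does.

```python
# Collect personal attribute tags posted in Slack messages.
import os
import re
from slack_bolt import App

app = App(token=os.environ["SLACK_BOT_TOKEN"],
          signing_secret=os.environ["SLACK_SIGNING_SECRET"])

user_tags = {}  # user_id -> {"skill": set(), "hope": set()}

@app.message(re.compile(r"#(skill|hope):(\S+)"))
def collect_tag(message, context, say):
    # For regex listeners, Bolt exposes the captured groups in context["matches"].
    kind, value = context["matches"]
    tags = user_tags.setdefault(message["user"], {"skill": set(), "hope": set()})
    tags[kind].add(value)
    say(f"Registered {kind} tag: {value}")

if __name__ == "__main__":
    app.start(port=3000)
```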
-
Mizukami ETSUO, Murata KAZUYO, Morimoto IKUYO
Article type: SIG paper
Pages: 15-18
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Ito TAKAYUKI
Article type: SIG paper
Pages: 19
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Sakai YU, Shiramatsu SHUN, Oda MOTOKI, Onochi MITSUHIRO
Article type: SIG paper
Pages: 20-23
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Usuda YASUYUKI
Article type: SIG paper
Pages: 24-29
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
This study demonstrates that an everyday conversation corpus can be seen as an archive of society, through a series of analyses of conversations in the Corpus of Everyday Japanese Conversation (CEJC). The CEJC, constructed by the National Institute for Japanese Language and Linguistics (NINJAL), consists of 200 hours of natural conversation recorded from 2016 to 2020. During that period, everyday life changed significantly due to COVID-19, and this is also reflected in the conversations in the corpus. To show the availability of the corpus as an archive, we analyze a few excerpts of office meetings recorded at a dental clinic at the beginning of the COVID-19 pandemic, in which the participants talk about infection prevention. The analysis shows that social concerns are dealt with in relation to the positions of the participants. The data in the corpus are substantial enough to be analyzed in detail, which is a strength of everyday conversation corpora as archives of society.
-
Shiramatsu SHUN, Suenaga AYAHA, Yoshimura YUKI, Ito TAKAYUKI
Article type: SIG paper
Pages: 30-37
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
ChatGPT and GPT-3.5, released by OpenAI in 2022, have become a social phenomenon, sometimes described as the "democratization of AI," as they have been widely adopted by people with no connection to programming. Although the responses generated by these large-scale language models have a high probability of containing false or fake information, they can generate logical responses across a very wide range of domains. As long as the possibility of fakes is taken into account, such language models can be used to support ideation in the divergent phases of discussions and in idea workshops for solving social issues. Large-scale language models may also be used to generate questions and structure discussions, as human facilitators do. In this paper, we introduce an evaluation experiment of a prototype using GPT-3 and dialogue examples about solving social issues with ChatGPT, and consider the possibilities of such utilization.
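A hedged sketch of using a GPT-3-style completion API to generate facilitator-like questions for the divergent phase of a discussion; the prompt wording and model name are assumptions, and the call uses the legacy OpenAI Completion endpoint available at the time of the paper, not necessarily the authors' prototype.

```python
# Generate open-ended, facilitator-style questions from a discussion summary.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

def facilitator_questions(discussion_summary, n_questions=3):
    prompt = (
        "You are a workshop facilitator. Based on the discussion below, "
        f"ask {n_questions} open-ended questions that broaden the ideas.\n\n"
        f"Discussion:\n{discussion_summary}\n\nQuestions:"
    )
    resp = openai.Completion.create(
        model="text-davinci-003",  # assumed GPT-3-family model
        prompt=prompt,
        max_tokens=200,
        temperature=0.8,
    )
    return resp["choices"][0]["text"].strip()
```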
-
Takagi KENTO, Ryo INUI, Tsuyoshi YAMAMURA
Article type: SIG paper
Pages: 38-43
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
SNS posts are an effective source of information because they cover a wide variety of content. However, posts on SNS contain unique expressions that differ from those used in newspapers and other media, so they are difficult to analyze with conventional natural language processing and require special handling. In this study, we focus on Split-Characters among these unique expressions. Split-Characters are expressions in which a single character is divided into multiple characters. In a previous study, OCR was used to process Split-Characters visually. However, because OCR identifies Split-Characters by character recognition alone, it uses no contextual information and does not consider the appropriateness of the corrected sentence. In this study, we propose methods for interpreting Split-Characters using contextual information, based on three models: N-gram, RNN, and BERT. We verify whether the proposed methods can convert Split-Characters into the correct characters.
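An illustrative use of a BERT masked language model to check whether merging a split character yields a plausible sentence in context; the model name and the example (treating "木毎" as a split form of "梅") are assumptions, not the paper's exact setup.

```python
# Score a merged-character candidate for a split span using a masked LM.
from transformers import pipeline

unmasker = pipeline("fill-mask",
                    model="cl-tohoku/bert-base-japanese-whole-word-masking")

def merge_score(text_with_split, split_span, candidate):
    # Replace the split characters with a mask and score the merged candidate.
    masked = text_with_split.replace(split_span, unmasker.tokenizer.mask_token)
    results = unmasker(masked, targets=[candidate])
    return results[0]["score"]

# Higher scores suggest the merged character fits the surrounding context.
print(merge_score("木毎の花が咲いた", "木毎", "梅"))
```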
-
Yasukawa HIROKI, Masahiro MIZUKAMI, Seitaro SHINAGAWA, Hiroaki SUGIYAM ...
Article type: SIG paper
Pages: 44-49
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
For a response generation model that reflects a person's characteristics (e.g., interests and preferences) to work in practice, the model must acquire a person embedding space that can be interpolated, so that responses can be generated for a speaker intermediate between different persons, and the person embedding must be easy to control. In this study, we trained a model by mixing two types of dialogue data: a large amount of dialogue data with user identifiers, which is suitable for acquiring an interpolable person embedding space, and dialogue data with persona sentences (sentences describing a person's characteristics), which offer high controllability of person representation. We propose a dialogue model that can generate responses via this person embedding. To demonstrate the effectiveness of the proposed method, we compared it with a conventional response generation model that does not explicitly model person embeddings and evaluated the interpolability and controllability of the person embedding obtained by the proposed method.
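A minimal sketch of the interpolation idea in the abstract: blending two person embeddings to obtain an "intermediate speaker". The embedding dimension and the decoder call are hypothetical placeholders.

```python
# Linear interpolation between two person embeddings.
import numpy as np

def interpolate_persona(emb_a, emb_b, alpha):
    """Interpolate in the person embedding space (alpha in [0, 1])."""
    return (1.0 - alpha) * emb_a + alpha * emb_b

emb_a = np.random.randn(128)   # embedding of speaker A (placeholder)
emb_b = np.random.randn(128)   # embedding of speaker B (placeholder)
mid = interpolate_persona(emb_a, emb_b, 0.5)
# response = decoder.generate(context, persona_embedding=mid)  # hypothetical
```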
-
Takasaki MEGURU, Naoki YOSHINAGA, Masashi TOYODA
Article type: SIG paper
Pages: 50-55
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
When a dialogue system has a long-term conversation with a person, it is desirable to generate responses that take past dialogue sessions into account. However, the conversation logs used for training dialogue systems do not necessarily contain many responses that consider the past dialogue context. It is therefore difficult to generate responses that fully respect the past dialogue context if the system is trained only by concatenating the past dialogue context with the current one. In this paper, we propose a multi-task learning method for response generation that forces the dialogue system to consider the past context adequately. The auxiliary self-supervised task is to generate the system-side utterance included in the past dialogue context most similar to the current context. In the experiment, we trained the proposed models on the Multi-session Twitter Dialogue Dataset and verified the effect of our data augmentation methods.
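A sketch of the retrieval step implied by the auxiliary task: find the past dialogue context most similar to the current one and take its system-side utterance as an additional generation target. TF-IDF cosine similarity and the data format are assumptions; the paper does not necessarily use this similarity measure.

```python
# Pick the auxiliary target utterance from the most similar past session.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def auxiliary_target(current_context, past_sessions):
    # past_sessions: list of (context_text, system_utterance) pairs
    texts = [ctx for ctx, _ in past_sessions] + [current_context]
    tfidf = TfidfVectorizer().fit_transform(texts)
    sims = cosine_similarity(tfidf[-1], tfidf[:-1]).ravel()
    best = sims.argmax()
    return past_sessions[best][1]  # system utterance used as the auxiliary target
```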
-
Mizukami ETSUO
Article type: SIG paper
Pages: 56-61
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Yang SEUNGKYOO
Article type: SIG paper
Pages: 62-67
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Nakamoto SEIYA, Horiuchi YASUO, Hara DAISUKE, Kuroiwa SHINGO
Article type: SIG paper
Pages: 68-73
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
We analyze the phonemes of linear hand movements in Japanese Sign Language using data measured with high-precision optical motion capture. The analysis shows that, for words with movement phonemes in six directions (Up, Down, Outward, Toward, Right, and Left), the movement trajectories fall within a certain range, suggesting that the linguistically defined movement phonemes can be distinguished as movements in three-dimensional space.
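An illustrative classification of a linear hand movement into one of the six phonemic directions from a 3D motion-capture trajectory; the axis convention (x: right/left, y: up/down, z: outward/toward) is an assumption, not the paper's coordinate system.

```python
# Classify the dominant movement direction of a hand trajectory.
import numpy as np

DIRECTIONS = {("x", 1): "Right", ("x", -1): "Left",
              ("y", 1): "Up", ("y", -1): "Down",
              ("z", 1): "Outward", ("z", -1): "Toward"}

def movement_direction(trajectory):
    """trajectory: (T, 3) array of hand positions over time."""
    displacement = trajectory[-1] - trajectory[0]
    axis = int(np.argmax(np.abs(displacement)))   # dominant axis of motion
    sign = 1 if displacement[axis] > 0 else -1
    return DIRECTIONS[("xyz"[axis], sign)]

traj = np.array([[0.0, 0.0, 0.0], [0.01, 0.15, 0.02], [0.02, 0.30, 0.03]])
print(movement_direction(traj))  # dominant displacement is along +y -> "Up"
```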
-
Saito KOKI, Furuya YUKI, Ogura KOSUKE, Mitsuda KOH, Higashinaka RYUICH ...
Article type: SIG paper
Pages: 74-79
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Building common ground in dialogue is important for effective communication. Our previous study demonstrated that rich modality and close social relationships between workers can facilitate building common ground in a remote collaborative task. In this study, we analyzed the factors that contribute to building common ground using the dialogue data collected in our previous study. The results showed that switching pauses were significantly longer in the condition where the modality was rich and the workers were close than in the other conditions. When the switching pause was longer, one worker tended to respond more thoughtfully to the other worker's questions. These findings suggest that the number of utterances containing concrete information about the collaborative task is a key factor in building common ground smoothly.
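A minimal sketch of measuring switching pauses (the gap between one worker's utterance ending and the other worker's next utterance starting); the (speaker, start, end) tuple format is an assumption about how the dialogue data might be represented.

```python
# Compute switching pauses from a time-ordered list of utterances.
def switching_pauses(utterances):
    pauses = []
    for prev, curr in zip(utterances, utterances[1:]):
        if prev[0] != curr[0]:                 # speaker change
            pauses.append(curr[1] - prev[2])   # next start minus previous end
    return pauses

utts = [("A", 0.0, 1.0), ("B", 1.5, 3.0), ("B", 3.25, 4.0), ("A", 5.0, 5.5)]
print(switching_pauses(utts))  # [0.5, 1.0] seconds
```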
-
Mizutani RINTARO, Suzuki HISASHI
Article type: SIG paper
Pages: 80-85
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Attentive listening is an effective communication technique for rapport building. Our project developed a multimodal attentive listening system with a 3D-visible avatar. The proposed system projects a 3D avatar on a naked-eye stereoscopic display in order to effectively communicate nonverbal information such as posture mirroring. The system supports spoken communication based on attentive-listening responses such as backchannels, expressions of empathy, repeats, modality-based responses, and mutual questions. Furthermore, it conveys nonverbal information through multiple channels, including facial expressions, eye gaze, blinking, nodding, and posture mirroring. In this article, we report an experiment with 60 participants, consisting of university undergraduate and graduate students. Each participant talked freely with the system for three minutes in a one-on-one setting about hobbies, favorite and disliked foods, future goals, worries, how they spend their holidays, childhood memories, club activities, part-time jobs, failures, and other topics, and then responded to a 22-item sensitivity evaluation. The results show that the avatar's listening attitude, such as the way she listens and shows empathy, was highly rated by the participants. Moreover, the majority of participants felt that the avatar was friendly and that they wanted to talk to her.
-
Mitui RIKUYA, Horiuchi YASUO, Kuroiwa SHINGO
Article type: SIG paper
Pages: 86-91
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
We propose a set of features that describe the degree of variation in prosody, in terms of voice pitch, speed, and timbre, for pairs of neutral and emotional speech with the same utterance content. These features can be measured as time series and can also serve as utterance-level features when averaged over the entire utterance. Regression analysis of the three features against emotional voice intensity ratings showed that each feature is valid for expressing the intensity of prosody. Some examples of time-series analysis of prosody using these features are also shown.
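An illustrative computation of a pitch-variation feature for a neutral/emotional utterance pair, sketching the general idea of comparing F0 statistics between the two; the F0 extractor settings and the ratio-based definition are assumptions and may differ from the paper's feature definitions.

```python
# Compare F0 variability between a neutral and an emotional utterance.
import numpy as np
import librosa

def f0_contour(path, sr=16000):
    y, sr = librosa.load(path, sr=sr)
    f0, voiced, _ = librosa.pyin(y, fmin=80, fmax=400, sr=sr)
    return f0[voiced]                      # keep voiced frames only

def pitch_variation(neutral_path, emotional_path):
    f0_n, f0_e = f0_contour(neutral_path), f0_contour(emotional_path)
    # Degree of variation: how much wider the emotional F0 spread is.
    return np.std(f0_e) / (np.std(f0_n) + 1e-8)
```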
-
Fukuda MIKITO, Arimoto YOSHIKO
Article type: SIG paper
Pages: 92-97
Published: February 27, 2023
Released on J-STAGE: February 27, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
This report examined the psychological and physiological effects of game events generated in response to a player's laughter, measuring players' heart rate (HR), skin conductance level (SCL), zygomaticus major activity (ZYG), and corrugator supercilii activity (COR) to elucidate whether a virtual world that responds to players' laughter attracts them more. Participants played an online game under two conditions while their HR, SCL, ZYG, and COR were recorded. In the laugh-event condition, the system responds to the player's laughter with a game event; in the non-laugh-event condition, the system presents game events when the player is not laughing. A three-way analysis of variance was performed on the HR, SCL, ZYG, and COR signals to test the hypothesis that each physiological response varies over time between event presentations (laugh/non-laugh) and between event types (advantageous/disadvantageous). As a result, presenting an event in response to the player's laughter decreased HR, significantly activated SCL, and significantly deactivated ZYG, whereas presenting an event to non-laughing players decreased HR and significantly activated ZYG and COR. These results suggest that the presentation of game events makes laughing players more emotionally aroused while suppressing their pleasant emotions, and that the events affect non-laughing players differently.
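A hedged sketch of the three-way ANOVA layout described above (factors: event presentation, event type, and time window), applied to one signal such as SCL; the long-format column names are assumptions, and the repeated-measures structure is ignored here for brevity.

```python
# Three-way ANOVA on one physiological signal using statsmodels.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# df: long-format data with columns
#   "scl"          : mean skin conductance level in a time window
#   "presentation" : "laugh" or "non_laugh"
#   "event_type"   : "advantageous" or "disadvantageous"
#   "time"         : time-window index
def three_way_anova(df: pd.DataFrame) -> pd.DataFrame:
    model = ols("scl ~ C(presentation) * C(event_type) * C(time)", data=df).fit()
    return sm.stats.anova_lm(model, typ=2)
```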