Proceedings of the Technical Committee on Speech Communication
Online ISSN : 2758-2744
Volume 3, Issue 2
Displaying 1-7 of 7 articles from this issue
  • Yuki Fukushima, Motoharu Tajima, Hironori Takemoto
    2023, Volume 3, Issue 2, Article ID: SC-2023-7
    Published: February 24, 2023
    Released on J-STAGE: February 15, 2024
    RESEARCH REPORT / TECHNICAL REPORT RESTRICTED ACCESS

    The finite-difference time-domain (FDTD) method can calculate the acoustic properties of a geometrical model of the nasal and paranasal cavities extracted from CT data. The calculated results should be validated by acoustic measurements of a physical model constructed from the same CT data. Such measurements, however, have so far been unsuccessful, because the measurement signals input through the nostrils are not observed at the glottis with a sufficient signal-to-noise ratio. To overcome this problem, the present study introduced an exponential horn, which supplies measurement signals at high amplitude, and a physical model with thick walls to suppress wall vibration. As a result, a transfer function was successfully measured and used to evaluate the calculated one. The evaluation implied that the fine structure of the paranasal cavities could not be reproduced with sufficient accuracy in the physical model.

    Download PDF (1535K)
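    The class of method named in the abstract above can be illustrated with a minimal one-dimensional FDTD sketch of acoustic propagation in a uniform tube. All grid sizes and constants here are illustrative assumptions; the paper itself uses a full 3-D model extracted from CT data.

    ```python
    import numpy as np

    # Minimal 1-D FDTD sketch of acoustic wave propagation in a uniform tube.
    # Illustrative only -- not the authors' 3-D nasal/paranasal model.
    C = 343.0          # speed of sound [m/s]
    RHO = 1.2          # air density [kg/m^3]
    DX = 1e-3          # spatial step [m]
    DT = DX / (2 * C)  # time step; Courant number 0.5 keeps the scheme stable

    N = 200                 # number of pressure cells (a 0.2 m tube)
    p = np.zeros(N)         # pressure grid
    u = np.zeros(N + 1)     # particle-velocity grid (staggered by half a cell)

    for n in range(400):
        # source: a Gaussian pulse injected at the tube entrance
        p[0] += np.exp(-((n - 40) / 10.0) ** 2)
        # update velocity from the spatial pressure gradient
        u[1:-1] -= DT / (RHO * DX) * (p[1:] - p[:-1])
        # update pressure from the velocity divergence
        p -= RHO * C**2 * DT / DX * (u[1:] - u[:-1])

    print(f"peak pressure after {n + 1} steps: {p.max():.3f}")
    ```

    Measuring a transfer function then amounts to exciting the model at one end and observing the pressure response at the other, which is exactly what the physical measurement in the paper attempts.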
  • Hideki KAWAHARA, Ken-Ichi SAKAKIBARA, Nao HODOSHIMA, Hideki BANNO, ...
    2023, Volume 3, Issue 2, Article ID: SC-2023-8
    Published: February 24, 2023
    Released on J-STAGE: February 15, 2024
    RESEARCH REPORT / TECHNICAL REPORT RESTRICTED ACCESS

    The physical instantiation of speech communication depends on the acoustic environment. The acoustic environment modifies not only listening behavior but also speech production behavior, making speech attributes different. Establishing a precisely controllable acoustic environment simulator is necessary to investigate such effects experimentally. Enabling flexible, interactive manipulation of such a simulation environment, built on real-time signal processing, will help researchers efficiently acquire tacit and deep knowledge of speech communication and investigate and quantify these effects by experiment. This paper introduces preliminary investigations and implementations of such tools to stimulate discussion.

    Download PDF (3856K)
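    The core operation of such an acoustic-environment simulator can be sketched as convolving a signal with a room impulse response and mixing dry and reverberant paths. The impulse response below is a synthetic decaying-noise stand-in, not a measured room, and the mix ratio is an assumed control parameter.

    ```python
    import numpy as np

    # Sketch of an acoustic-environment simulator's core: convolution with a
    # room impulse response (RIR). The RIR is synthetic (exponentially decaying
    # noise), a stand-in for a measured or simulated room response.
    rng = np.random.default_rng(0)
    fs = 16000                                 # sampling rate [Hz]
    tail = fs // 4                             # 250 ms reverberant tail
    rir = rng.standard_normal(tail) * np.exp(-np.arange(tail) / (fs * 0.05))
    rir[0] = 1.0                               # direct path

    speech = rng.standard_normal(fs)           # placeholder for a speech signal
    wet = np.convolve(speech, rir)[: len(speech)]

    # mix dry and reverberant paths; the ratio is the controllable parameter
    ratio = 0.5
    out = (1 - ratio) * speech + ratio * wet
    print(f"output RMS: {np.sqrt(np.mean(out**2)):.3f}")
    ```

    A real-time version would process audio block by block with overlap-add rather than in one offline convolution, but the signal path is the same.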
  • -Results of a preliminary survey by native Japanese speakers-
    Lae Lae Htun, Tetsuya SHIMAMURA, Mee SONU
    2023, Volume 3, Issue 2, Article ID: SC-2023-9
    Published: February 24, 2023
    Released on J-STAGE: February 15, 2024
    RESEARCH REPORT / TECHNICAL REPORT RESTRICTED ACCESS

    Our aim was to understand the scientific features of emotional expressions in Japanese by native and non-native speakers. Specifically, this study focused on how non-native speakers recognize the emotional expressions of native speakers and vice versa. As a first attempt, the present study analyzed various emotional expressions using the one-word utterance “n” by young female native Japanese speakers. Based on the preliminary survey, the analysis showed three types of F0 dynamic pattern in “n.” Positive and agreeable emotions exhibited a “Rise and Fall” pattern, negative emotions exhibited a “Gradual Fall” pattern, and doubtful emotions exhibited a “Rise” pattern. The minimum F0 did not differ considerably across these emotions, whereas the maximum F0 was high for all emotions except negative ones. These results suggest that F0 movement may be related to emotional expression, that is, positive versus negative.

    Download PDF (1123K)
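    The three contour shapes reported above can be caricatured by a toy rule that classifies an F0 contour by the position of its peak. The rule and the example contours are illustrative assumptions, not the paper's analysis procedure.

    ```python
    # Toy classifier for the three F0 contour shapes reported in the paper.
    # The rule (peak position) and the contours are illustrative assumptions.
    def classify_f0(contour):
        peak = contour.index(max(contour))
        if peak == 0:                     # peak at onset -> falling overall
            return "Gradual Fall"
        if peak == len(contour) - 1:      # peak at offset -> rising overall
            return "Rise"
        return "Rise and Fall"            # internal peak

    rise_fall = [120, 160, 210, 180, 130]   # positive/agreeable pattern [Hz]
    fall      = [210, 190, 160, 140, 120]   # negative pattern [Hz]
    rise      = [120, 130, 150, 180, 220]   # doubtful pattern [Hz]

    for name, c in [("rise_fall", rise_fall), ("fall", fall), ("rise", rise)]:
        print(name, "->", classify_f0(c))
    ```

    A real analysis would work on extracted F0 tracks with smoothing and voicing decisions rather than on clean five-point contours.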
  • Matsuri Yasuda, Tatsuya Kitamura
    2023, Volume 3, Issue 2, Article ID: SC-2023-10
    Published: February 24, 2023
    Released on J-STAGE: February 15, 2024
    RESEARCH REPORT / TECHNICAL REPORT RESTRICTED ACCESS

    Expressions associated with the voice quality of female cartoon voice actors were extracted to reduce the mismatch between their voice quality and the characters they play in animated films and video games. We first collected expressions describing voice quality and then investigated their understandability, synonymity, and similarity. Five expression pairs were extracted through a cluster analysis.

    Download PDF (1404K)
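    The pair-extraction step can be sketched on toy data: given pairwise similarity ratings between voice-quality expressions, keep the pairs that are each other's nearest neighbour. The expressions and ratings below are hypothetical, and mutual-nearest-neighbour pairing stands in for the cluster analysis used in the paper.

    ```python
    # Sketch of extracting expression pairs from similarity judgments.
    # Expressions and ratings are hypothetical; mutual-nearest-neighbour
    # pairing stands in for the paper's cluster analysis.
    expressions = ["clear", "transparent", "husky", "breathy", "cute"]
    ratings = {  # symmetric similarity ratings in [0, 1] (hypothetical)
        ("clear", "transparent"): 0.90,
        ("husky", "breathy"): 0.85,
        ("clear", "cute"): 0.40,
    }

    def mutual_pairs(expressions, sim):
        """Return pairs of expressions that are each other's nearest neighbour."""
        def s(a, b):
            return sim.get((a, b), sim.get((b, a), 0.0))
        nearest = {e: max((x for x in expressions if x != e),
                          key=lambda x: s(e, x))
                   for e in expressions}
        return sorted({tuple(sorted((a, b))) for a, b in nearest.items()
                       if nearest[b] == a})

    print(mutual_pairs(expressions, ratings))
    ```

    Here “cute” is most similar to “clear” but the relation is not mutual, so it remains unpaired, mirroring how clustering leaves weakly related items outside the extracted pairs.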
  • Shintaro SODEYA, Motoyuki SUZUKI
    2023, Volume 3, Issue 2, Article ID: SC-2023-11
    Published: February 24, 2023
    Released on J-STAGE: February 15, 2024
    RESEARCH REPORT / TECHNICAL REPORT RESTRICTED ACCESS

    In recent years, methods using GPT have attracted attention in the research field of chat dialogue systems. In such methods, responses can be generated automatically by inputting the situation and context into GPT, but it is not always possible to generate natural responses; this may cause errors such as nonsensical remarks. Therefore, methods have been proposed in which multiple candidate response sentences are generated by GPT and an appropriate one is selected from among them using a selection model. In this paper, we used SentenceBERT as a feature extractor, retrained it with a small amount of data to create a model that selects responses consistent with the context and the speaker, and evaluated its performance.

    Download PDF (1296K)
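    The selection step described above can be sketched as scoring each generated candidate against the dialogue context and picking the best one. In this toy version a bag-of-words cosine similarity stands in for the SentenceBERT encoder, and the context and candidates are invented examples.

    ```python
    from collections import Counter
    import math

    # Sketch of candidate selection: GPT proposes several responses and a
    # scorer picks the one most consistent with the context. Bag-of-words
    # cosine similarity stands in for the SentenceBERT encoder of the paper.
    def embed(text):
        return Counter(text.lower().split())

    def cosine(a, b):
        dot = sum(a[w] * b[w] for w in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def select_response(context, candidates):
        ctx = embed(context)
        return max(candidates, key=lambda c: cosine(ctx, embed(c)))

    context = "I went hiking in the mountains last weekend"
    candidates = [
        "The weather in the mountains must have been lovely",
        "I prefer cooking pasta at home",
        "Stock prices fell sharply today",
    ]
    print(select_response(context, candidates))
    ```

    Swapping the stand-in `embed` for a retrained sentence encoder, and scoring against speaker history as well as context, gives the shape of the model the paper evaluates.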
  • Naoki KANAZAWA, Motoyuki SUZUKI
    2023, Volume 3, Issue 2, Article ID: SC-2023-12
    Published: February 24, 2023
    Released on J-STAGE: February 15, 2024
    RESEARCH REPORT / TECHNICAL REPORT RESTRICTED ACCESS

    In recent years, there have been several studies on speech generation from lip videos. Many conventional methods use DNN models based on CNNs or RNNs to generate speech waveforms. In such methods, the model learns speaker-specific features such as skin color and moles, and performance degrades when data from speakers other than the training speaker is used as input. We therefore proposed a method that removes speaker-specific features from the input features in order to generate speech waveforms with high performance for any speaker. In this paper, we generated speech waveforms using the proposed input features and evaluated them using STOI. As a result, the performance of the proposed method was worse than that of the raw lip-video input method, but we confirmed its effectiveness in suppressing the degradation caused by differences between speakers.

    Download PDF (1707K)
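    One simple instance of removing speaker-specific components from input features is per-speaker mean/variance normalization, sketched below on synthetic data. This is an assumed illustration of the idea; the paper's actual feature transform is not reproduced here.

    ```python
    import numpy as np

    # Sketch of suppressing speaker-specific components in input features via
    # per-speaker mean/variance normalization (an assumed, simple instance of
    # the idea -- not the paper's exact transform).
    def normalize_per_speaker(features):
        """features: dict mapping speaker id -> (frames, dims) array."""
        out = {}
        for spk, x in features.items():
            mu, sigma = x.mean(axis=0), x.std(axis=0) + 1e-8
            out[spk] = (x - mu) / sigma   # removes speaker-level offset/scale
        return out

    rng = np.random.default_rng(0)
    # two speakers whose raw features differ by a constant speaker offset
    feats = {"spkA": rng.standard_normal((100, 4)) + 5.0,
             "spkB": rng.standard_normal((100, 4)) - 3.0}
    norm = normalize_per_speaker(feats)
    print({s: float(x.mean().round(6)) for s, x in norm.items()})
    ```

    After normalization the constant per-speaker offset is gone, so a downstream waveform generator sees feature distributions that no longer identify the speaker.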
  • Nagisa KATO, Ayako SHIROSE
    2023, Volume 3, Issue 2, Article ID: SC-2023-13
    Published: February 24, 2023
    Released on J-STAGE: February 15, 2024
    RESEARCH REPORT / TECHNICAL REPORT RESTRICTED ACCESS

    Among the acts of reading a text, "initial reading" is the first encounter with it. The method of initial reading, that is, how a text is read at the beginning of a unit, particularly requires consideration of its effect and purpose. Therefore, in order to understand which methods of initial reading, oral or silent, are common in actual educational settings, and how the method of initial reading is selected and judged, we surveyed teachers' textbook instruction manuals and conducted a questionnaire survey of Japanese language teachers.

    Download PDF (1022K)