バイオメカニズム
Online ISSN : 1349-497X
Print ISSN : 1348-7116
ISSN-L : 1348-7116
Part 3: Model Analysis
Chaotic Analysis of Japanese Vowel Speech and Its Characteristics
大 聖一郎, 和田 充雄, 山口 明宏, 広奥 暢

2002, Volume 16, pp. 285-299

Abstract

Nonstationary properties of various chaotic phenomena in biological systems have recently been studied extensively, and speech signals are well known to contain nonlinearities. With this in mind, we evaluated the chaotic properties of the five Japanese vowels. Building on our previous studies, we give a comprehensive account of the nonlinear analysis method. The results show that positive components appear in the Lyapunov spectrum of all five Japanese vowels. Furthermore, we attempt to apply recurrence imaging analysis, a type of nonstationary analysis, to this biological system. We propose a new technique, based on a resolution parameter, for categorizing the five Japanese vowels without disturbance from nonstationary signals in the vowels. We hope this report will be useful in the fields of speech engineering and complex systems.
(Study content and flow)
1. Building on our previous report, we examined whether chaotic dynamics appear in the five Japanese vowels by estimating the Lyapunov exponents. In that report, each vowel showed chaotic behavior, since a positive exponent was observed in its Lyapunov spectrum.
To construct the embedding space from the time series, we estimated a delay time and an embedding dimension. False nearest neighbor analysis was used to determine the dimension, and the mutual information criterion was used to estimate the delay time.
2. Since the value of the delay time influences the estimate of the embedding dimension, we tried to determine both values consistently. This procedure is intended to ensure a correct embedding. Even when the embedding dimension d and the delay time τ are estimated accurately, however, the speech data (dynamics) include linear stochastic properties, which would be computed as spurious exponents under an improper embedding. We therefore also confirmed, using surrogate data methods, that the positive component is not caused by internal Gaussian noise but is generated by a nonlinear vocal source.
3. Surrogate data can be generated by a variety of algorithms. We used the FT (Fourier transform) and AAFT (amplitude-adjusted Fourier transform) algorithms and tested whether each could reject its null hypothesis.
4. We also used recurrence plot (RP) imaging, a nonstationary analysis of the time series. We found that each vowel has a particular pattern in its RP image and that attractor information can be extracted from the image.
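As an illustration of the embedding and recurrence plot steps listed above, the following is a minimal Python sketch, not the authors' implementation. The function names `delay_embed` and `recurrence_matrix`, the threshold `eps`, and the values chosen for the dimension and delay are assumptions made for the example, and a synthetic 200 Hz tone stands in for a vowel segment.

```python
import math

def delay_embed(x, dim, tau):
    """Return delay vectors (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("time series too short for this dim and tau")
    return [tuple(x[i + k * tau] for k in range(dim)) for i in range(n)]

def recurrence_matrix(points, eps):
    """R[i][j] = 1 when the Euclidean distance between points i and j is <= eps."""
    n = len(points)
    rp = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            if math.dist(points[i], points[j]) <= eps:
                rp[i][j] = rp[j][i] = 1  # the matrix is symmetric
    return rp

# Synthetic stand-in for a vowel segment (assumption): 200 Hz tone at 20 kHz.
fs = 20_000
signal = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]

# dim=3 and tau=25 are placeholder values, not estimates from the paper.
points = delay_embed(signal, dim=3, tau=25)
rp = recurrence_matrix(points, eps=0.1)
```

For a periodic signal such as this tone, the recurrence matrix shows diagonal lines spaced one period apart; the paper's claim is that real vowels produce more structured, vowel-specific patterns.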
The vowel data are recordings of the voice of a male subject (age 40) taken from an ATR speech database. The data are time series of sound pressure sampled at 20 kHz with 16-bit quantization.
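For a sampled series of this kind, the mutual information criterion mentioned in item 1 could be approximated with a simple histogram estimate, taking the first local minimum over the lag as a candidate delay time. The sketch below is only one possible reading of that criterion. The function names, the bin count of 16, and the synthetic test tone are assumptions for illustration. Histogram estimates are noisy, so the first minimum should be inspected rather than trusted blindly.

```python
import math
from collections import Counter

def mutual_information(x, lag, bins=16):
    """Histogram estimate of I(x[n]; x[n+lag]) in bits."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # constant signal: everything falls into one bin
    # Map each sample to a bin index in [0, bins - 1].
    idx = [min(int((v - lo) / width), bins - 1) for v in x]
    pairs = list(zip(idx[:-lag], idx[lag:]))
    n = len(pairs)
    joint = Counter(pairs)
    pa = Counter(a for a, _ in pairs)
    pb = Counter(b for _, b in pairs)
    return sum(
        (c / n) * math.log2(c * n / (pa[a] * pb[b]))
        for (a, b), c in joint.items()
    )

def first_minimum_lag(x, max_lag, bins=16):
    """Return the first lag with a local minimum of mutual information, or None."""
    mi = [mutual_information(x, lag, bins) for lag in range(1, max_lag + 1)]
    for k in range(1, len(mi) - 1):
        if mi[k - 1] > mi[k] <= mi[k + 1]:
            return k + 1  # mi[k] corresponds to lag k + 1
    return None

# Synthetic stand-in for a vowel segment (assumption): 200 Hz tone at 20 kHz.
fs = 20_000
signal = [math.sin(2 * math.pi * 200 * n / fs) for n in range(2000)]
tau = first_minimum_lag(signal, max_lag=60)
```

The resulting `tau` would then be passed to the delay embedding, and the embedding dimension re-estimated with it, in line with the consistency concern raised in item 2.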

© 2002 バイオメカニズム学会