Information Localization in the Time Domain of Speech Waveforms

吉田 秀樹; 中野 正博; 行正 徹; 松村 翔平

doi:10.24466/jbfsa.12.2_11

Abstract

The present study addresses the extraction of speech information over time from waveforms filtered with a one-octave passband. A great many studies have reported significant findings from the precise analysis of resonances in the frequency domain and have proposed many practical applications such as reproduction, recognition, and compression for storage. Some neurophysiological studies, however, have reported that frequency analysis in the auditory system is mechanically simple: it generates a series of spikes time-locked to the narrow-band vibrations of the basilar membrane of the cochlea in the inner ear. Because the subsequent extraction process is far from understood, we synthesized speech in which the minimal parts of the amplitude envelope were raised. A conflict between subjective scores and quantization errors was observed when all the minimal parts of the amplitude envelope were changed to 70% of the amplitude of the preceding envelope maxima. Although the computed quantization error is 88.1%, which is larger than the allowance (>20-40%), the noise is subjectively inaudible. In light of the forward masking effect, we interpret the results as evidence that the envelope maxima in the repeated structure correspond to the key information, whereas the envelope minima correspond to short-term connections that are barely perceived. We estimate that inaudible structures amount to 70.4% for six channels covering a bandwidth of 80-5,120 Hz and contribute to a storage saving of around 20%.
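As a rough illustration of the manipulation described in the abstract, the following Python sketch splits the 80-5,120 Hz range into one-octave passbands and lifts the low parts of an amplitude envelope to 70% of the preceding envelope maximum. The function names `octave_bands` and `raise_envelope_minima`, the simple local-peak rule, and the toy envelope are assumptions made for illustration. They are not the authors' implementation, and the abstract does not specify how the envelope or its maxima were actually obtained.

```python
def octave_bands(f_low=80.0, f_high=5120.0):
    """Split [f_low, f_high] into consecutive one-octave passbands (Hz)."""
    bands = []
    lo = f_low
    while lo * 2 <= f_high:
        bands.append((lo, lo * 2))  # each band spans exactly one octave
        lo *= 2
    return bands


def raise_envelope_minima(envelope, ratio=0.7):
    """Lift envelope samples that fall below ratio * (preceding local maximum).

    Assumption: a "maximum" is any interior sample that is larger than its
    left neighbour and not smaller than its right neighbour. Samples before
    the first maximum are left unchanged.
    """
    out = list(envelope)
    last_max = None
    n = len(envelope)
    for i, value in enumerate(envelope):
        is_peak = 0 < i < n - 1 and envelope[i - 1] < value >= envelope[i + 1]
        if is_peak:
            last_max = value
        elif last_max is not None and value < ratio * last_max:
            out[i] = ratio * last_max  # raise the dip to the 70% floor
    return out


if __name__ == "__main__":
    # Six one-octave channels cover 80-5,120 Hz, as stated in the abstract.
    for lo, hi in octave_bands():
        print(f"{lo:.0f}-{hi:.0f} Hz")

    # Hypothetical envelope of one channel (arbitrary units).
    toy_envelope = [0.0, 1.0, 0.5, 0.2, 0.8, 0.3, 0.1]
    print(raise_envelope_minima(toy_envelope))
```

With the default arguments, `octave_bands` yields six bands (80-160, 160-320, 320-640, 640-1,280, 1,280-2,560, and 2,560-5,120 Hz), matching the six channels mentioned above. The peak rule is deliberately naive: a small bump inside a dip resets the reference maximum, so a real envelope would likely need smoothing first.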
