Abstract
We are interested in modeling speech waveforms in the time domain rather than in the spectral domain. Following the information processing performed in the cochlea of the inner ear and/or the inferior colliculus of the midbrain, speech waveforms were divided into multiple spectral channels of one octave each; motivated by phase locking in the auditory nerve fibers, the extrema of each narrowband waveform were sampled at irregular timings. Because the amplitude envelope plays a key role in auditory processing in the nerve nuclei and/or cerebral cortices, including the central nucleus of the inferior colliculus, the kernel function was designed as a concatenation of fewer than seven cosine functions, each one period in duration. Although linear estimation of the angular velocity in the fine structure of the amplitude envelope yielded a phase error of less than nine percent, suggesting that no subjective roughness will be perceived, a quantization error of more than one hundred percent was observed for kernels with bandwidths above 1,280 Hz. In contrast to the high correlation among the pitch-synchronous short-term epochs of a speech waveform, 4.2 percent of the kernels were exceptions with large deviations, which are unsuitable for linear approximation.
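As an illustrative sketch only (not the authors' implementation), a kernel formed by concatenating a handful of cosine segments, one period each, might be built as follows; the segment count, frequencies, amplitudes, and sampling rate below are all assumed for demonstration.

```python
import numpy as np

def cosine_kernel(freqs_hz, amps, fs=16000):
    """Concatenate one full cosine period per (frequency, amplitude) pair.

    freqs_hz, amps : sequences of equal length (fewer than seven entries,
    per the design described in the abstract); fs is an assumed sampling
    rate. All numeric values here are illustrative.
    """
    assert len(freqs_hz) == len(amps) <= 7
    segments = []
    for f, a in zip(freqs_hz, amps):
        n = max(1, round(fs / f))          # samples in one period at f
        t = np.arange(n) / fs              # time axis for this segment
        segments.append(a * np.cos(2 * np.pi * f * t))
    return np.concatenate(segments)

# Example: three segments at octave spacing with decaying amplitudes.
kernel = cosine_kernel([200.0, 400.0, 800.0], [1.0, 0.5, 0.25])
```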