The nature of pitch and its neural coding have been studied for over a century. A popular debate has revolved around the question of whether pitch is coded via ``place'' cues in the cochlea, or via timing cues in the auditory nerve. In the most recent incarnation of this debate, the role of temporal fine structure has been emphasized in conveying important pitch and speech information, particularly because the lack of temporal fine structure coding in cochlear implants might explain some of the difficulties faced by cochlear implant users in perceiving music and pitch contours in speech. In addition, some studies have postulated that hearing-impaired listeners may have a specific deficit related to processing temporal fine structure. This article reviews some of the recent literature surrounding the debate, and argues that much of the recent evidence suggesting the importance of temporal fine structure processing can also be accounted for using spectral (place) or temporal-envelope cues.
The present study examined the dynamic properties of the across-frequency integration mechanism, specifically the extent to which information about the direction of changes in the interaural time difference (ITD) is integrated or compared across frequencies. The stimulus was a complex tone consisting of two sinusoidal carriers, one at 400 Hz and the other at 700 Hz. A sinusoidal modulation in the ITD was imposed on one carrier alone or on the two carriers simultaneously. The ITD of each carrier was centered at 0 µs, and the modulation started and ended at zero phase. When imposed on the two carriers simultaneously, the ITD modulations were either in-phase or anti-phase between them. Experiment 1 measured the threshold modulation depth for detecting the modulation with an adaptive method. The thresholds were generally lower when both carriers were modulated than when only one was, indicating across-frequency integration of information about the presence of modulation. The threshold, however, did not differ significantly between the in-phase and anti-phase conditions, even when the modulation rate was as low as 1 Hz. Experiment 2 measured the discriminability between in-phase and anti-phase modulations, with the modulation depth fixed at a supra-threshold value (600 µs). Performance varied widely among listeners, and it was near chance level for half of the listeners even at a 1-Hz rate. The study thus found no compelling evidence that the auditory system is sensitive to the relative phase of ITD modulations. This suggests that the directional information of even slow (∼1 Hz) ITD modulation is not combined effectively across frequencies, at least for the conditions tested.
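The stimulus described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' stimulus-generation code. It assumes that the modulation depth is the peak ITD deviation, that the ITD is applied as equal and opposite time shifts to the two ears, and that the sampling rate and duration (not given in the abstract) take arbitrary values. The names `itd_us`, `itd_modulated_tone`, and `signs` are hypothetical.

```python
import math

def itd_us(t, depth_us, rate_hz, sign=1.0):
    """Instantaneous ITD in microseconds: sinusoidal, centered at 0, zero starting phase."""
    return sign * depth_us * math.sin(2.0 * math.pi * rate_hz * t)

def itd_modulated_tone(depth_us, rate_hz, signs=(1, 1),
                       carriers=(400.0, 700.0), duration_s=1.0, fs=48000):
    """Return (left, right) sample lists for a two-carrier tone with modulated ITD.

    signs gives one factor per carrier (assumed convention):
      (1, 1)  in-phase modulation on both carriers
      (1, -1) anti-phase modulation
      (1, 0) or (0, 1) modulation on one carrier only
    Choose duration_s as a whole number of modulation cycles so that the
    modulation also ends at zero phase.
    """
    if len(signs) != len(carriers):
        raise ValueError("signs and carriers must have the same length")
    n = int(round(duration_s * fs))
    left = [0.0] * n
    right = [0.0] * n
    for fc, sign in zip(carriers, signs):
        for i in range(n):
            t = i / fs
            # Convert the ITD to seconds and split it evenly between the ears.
            half_shift = 0.5 * itd_us(t, depth_us, rate_hz, sign) * 1e-6
            left[i] += math.sin(2.0 * math.pi * fc * (t + half_shift))
            right[i] += math.sin(2.0 * math.pi * fc * (t - half_shift))
    return left, right
```

For example, `itd_modulated_tone(600.0, 1.0, signs=(1, -1))` would give a 1-s anti-phase stimulus at the supra-threshold depth used in Experiment 2, under the assumptions listed above.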
Perceptual learning was used to examine mechanisms of pitch perception. Thresholds (F0DLs) were measured for discrimination of the fundamental frequency (F0) of complex tones with a nominal F0 of 100 Hz and cosine-phase or random-phase harmonics. Tones were bandpass filtered and presented in threshold equalizing noise. A group trained using stimuli with the filter centered on LOW harmonics (1–5) showed a large training effect, with transfer to stimuli with MID harmonics (11–15) or MID-HIGH harmonics (14–18), but no transfer to stimuli with HIGH harmonics (28–32). A group trained with MID or MID-HIGH stimuli showed a large training effect, with transfer to the LOW stimuli and no transfer to the HIGH stimuli. A group trained with HIGH stimuli showed no training effect for any stimuli. The results suggest that similar mechanisms were used for F0 discrimination of the LOW, MID, and MID-HIGH stimuli, and that a different mechanism was used for the HIGH stimuli. It is proposed that the LOW, MID, and MID-HIGH stimuli were discriminated using temporal fine structure (TFS) information: for the LOW stimuli, TFS information about individual resolved harmonics, and for the MID and MID-HIGH stimuli, TFS information about the periodicity of the waveform evoked by interfering harmonics.
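As a rough illustration of the stimulus classes named above, the following Python sketch sums equal-amplitude harmonics of a 100-Hz F0 in cosine or random phase. It is a simplification under stated assumptions: the bandpass filter is approximated by simply selecting a range of harmonic numbers, the threshold equalizing noise is omitted, and the duration and sampling rate are arbitrary. The function name `harmonic_complex` and its parameters are hypothetical.

```python
import math
import random

def harmonic_complex(f0=100.0, harmonics=range(1, 6), phase="cosine",
                     duration_s=0.2, fs=48000, seed=None):
    """Return samples of an equal-amplitude harmonic complex tone.

    harmonics: harmonic numbers to include, e.g. range(1, 6) for the LOW
               region or range(28, 33) for the HIGH region (assumed mapping).
    phase:     "cosine" (all starting phases 0) or "random" (uniform in [0, 2*pi)).
    """
    if phase not in ("cosine", "random"):
        raise ValueError("phase must be 'cosine' or 'random'")
    harmonics = list(harmonics)
    rng = random.Random(seed)
    phases = [0.0 if phase == "cosine" else rng.uniform(0.0, 2.0 * math.pi)
              for _ in harmonics]
    n = int(round(duration_s * fs))
    return [
        sum(math.cos(2.0 * math.pi * h * f0 * i / fs + p)
            for h, p in zip(harmonics, phases))
        for i in range(n)
    ]
```

With cosine phase the components add at the start of every period, producing a peaky waveform. Random phase flattens the envelope while leaving the F0 and the long-term spectrum unchanged.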
Tympanic membrane (TM) vibration under bone-conducted ultrasonic (BCU) stimulation was measured in four living human subjects using a laser Doppler vibrometer (LDV) to investigate whether nonlinear distortions in the osseotympanic effects and/or the inertial effects of the middle-ear ossicles contribute to ultrasonic perception in bone conduction. A signal processing algorithm is presented that increases the signal-to-noise ratio of the measured LDV signals by removing only the optical spike noise components from the waveforms. Evidence of nonlinear distortions, especially the generation of audible subharmonics in the outer and middle ear, was then examined. We did not find any audible signals corresponding to the subjective pitch of a BCU tone in the TM vibrations. This suggests that nonlinear distortions in the osseotympanic and inertial effects do not contribute to BCU perception. Specific properties of BCU perception may instead be related to mechanisms in the cochlea or the afferent neural pathway. With this in mind, we discuss the possibility that the pitch perception of BCU is not related to tonotopic motion of the basilar membrane corresponding to the subjective pitch, given that TM vibration can reflect the motion of the cochlear fluid and hence the motion of the basilar membrane.
Technical ear training aims to improve the listening skills of sound engineers so that they can skillfully modify and edit the structure of sound. Despite growing interest in listening ability and subjective evaluation in audio- and acoustics-related fields, and the subsequent appearance of various technical ear-training methods, the question of how to provide efficient training for a self-trainee has not yet been studied. This paper investigated trainees' performance and showed that the (inherent or learned) ability to correctly describe spectral differences in terms of the parameters of a parametric equalizer (center frequency, Q, and gain) differed from person to person. To cope with such individual differences in spectral identification, the authors proposed a novel method that adaptively controls the training task based on a trainee's prior performance. Specifically, the method estimates the trainee's weakness and generates a training routine that focuses on that weakness. The authors then tested whether the proposed method, adaptive feedback, helps self-learners improve their performance in technical listening that involves identifying spectral differences. The results showed that the proposed method helped trainees improve their ability to identify differences more effectively than trainees in the counterpart group. Together with other features required for effective self-training, this adaptive feedback would assist a trainee in acquiring timbre-identification ability.
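The idea of estimating a trainee's weakness and focusing the routine on it can be sketched as follows. This is not the authors' algorithm, whose details the abstract does not give. It is a minimal illustration that assumes weakness is measured as a smoothed per-category error rate and that the next task is drawn with probability proportional to that rate. The class `AdaptiveTrainer` and its methods are hypothetical.

```python
import random

class AdaptiveTrainer:
    """Pick training tasks in proportion to a trainee's per-category error rate."""

    def __init__(self, categories, seed=None):
        # Per category: [number of errors, number of trials].
        self.stats = {c: [0, 0] for c in categories}
        self.rng = random.Random(seed)

    def record(self, category, correct):
        """Store the outcome of one trial."""
        entry = self.stats[category]
        entry[1] += 1
        if not correct:
            entry[0] += 1

    def error_rate(self, category):
        """Laplace-smoothed error rate, so untested categories start at 0.5."""
        errors, trials = self.stats[category]
        return (errors + 1) / (trials + 2)

    def weakest(self):
        """Category with the highest estimated error rate."""
        return max(self.stats, key=self.error_rate)

    def next_task(self):
        """Sample the next category, weighted toward the trainee's weaknesses."""
        categories = list(self.stats)
        weights = [self.error_rate(c) for c in categories]
        return self.rng.choices(categories, weights=weights, k=1)[0]
```

Weighted sampling, rather than always drilling the single weakest category, keeps some practice on the other parameters. Whether that matches the balance used in the study is not stated in the abstract.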