Abstract
This paper presents an unsupervised speaker adaptation method that works from short utterances. The code spectra for the templates are adapted to those of an input speaker by interpolating the estimated speaker-difference vectors for given typical spectra. These difference vectors are estimated so as to minimize the fuzzy objective function for the adapted reference codebook under some constraints. The fuzziness (F) and constraint parameters are examined using SPLIT-based word recognition tests with a 28-word vocabulary and reference patterns from a male speaker. Using 1.8 s training samples, the results show that the proposed method with F=1.5 gives a 9.0% higher recognition rate for four male speakers than the minimum VQ distortion method (F=1.0). For 20 male speakers, this method improves the average recognition rate from 92.5% without adaptation to 97.5% using 3.6 s training samples. Furthermore, a sequential adaptation scheme attains an average recognition rate of 97.4% using the test speech itself for adaptation.
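To make the role of the fuzziness F concrete, the following is a minimal sketch, not the method as specified in the paper. It assumes the standard fuzzy c-means membership formula with squared Euclidean distance, and it estimates per-code difference vectors without the constraints or the interpolation step mentioned above. The function names `fuzzy_memberships` and `difference_vectors` are hypothetical.

```python
import numpy as np


def fuzzy_memberships(frames, codebook, fuzziness=1.5):
    """Membership of each input frame in each code vector; rows sum to 1.

    Assumption: standard fuzzy c-means memberships with squared Euclidean
    distance. fuzziness == 1.0 falls back to hard nearest-code assignment,
    which corresponds to minimum VQ distortion.
    """
    frames = np.asarray(frames, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Squared distance from every frame to every code vector.
    d2 = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    if fuzziness == 1.0:
        u = np.zeros_like(d2)
        u[np.arange(len(frames)), d2.argmin(axis=1)] = 1.0
        return u
    d2 = np.maximum(d2, 1e-12)  # avoid division by zero on exact matches
    inv = d2 ** (-1.0 / (fuzziness - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def difference_vectors(frames, codebook, fuzziness=1.5):
    """Unconstrained shift of each code vector toward the input speaker.

    Assumption: each shift is the membership-weighted centroid of the input
    frames minus the reference code vector. Codes that receive no weight
    keep a zero shift.
    """
    frames = np.asarray(frames, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    w = fuzzy_memberships(frames, codebook, fuzziness) ** fuzziness
    totals = w.sum(axis=0)
    centroids = codebook.copy()
    seen = totals > 1e-12
    centroids[seen] = (w.T @ frames)[seen] / totals[seen, None]
    return centroids - codebook
```

With F=1.0 each frame moves only its nearest code vector, so codes that are not observed in a short utterance receive no update. With F>1 every frame contributes to every code vector with a distance-dependent weight, which is one plausible reading of why a fuzzy objective helps when training speech is short.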