IEEJ Transactions on Electronics, Information and Systems
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
Representations of Speech by Sparse Coding Algorithm
Kotani Manabu, Shirata Yasunobu, Satoshi Maekawa, Ozawa Seiichi, Akazawa Kenzo
JOURNAL FREE ACCESS

2000 Volume 120 Issue 12 Pages 1996-2002

Abstract

A sparse coding algorithm has been reported to produce, for natural images, a set of basis functions that are spatially localized, oriented, and bandpass. Applying Independent Component Analysis (ICA) to natural images has been shown to yield similar results. However, ICA can be applied only when the basis function matrix is non-singular and invertible. The sparse coding algorithm has no such limitation, which allows the code to be overcomplete; that is, the number of code elements can be greater than the effective dimensionality of the input space. The purpose of this paper is to examine what characteristics of speech the sparse coding algorithm extracts from natural sounds. The speech data were the five Japanese vowels, uttered by a female speaker for about 1 s. After training, most of the basis functions were localized in frequency. Some basis functions resembled each other and differed only by a shift in time. Comparing each basis function with the speech data showed that some basis functions responded selectively to each vowel. Frequency analysis of the basis functions showed that some of them extracted the pitch frequency and the formant of each vowel.
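
The idea of learning an overcomplete set of basis functions from short signal windows can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the cost function or learning rule, so the sketch assumes an L1 sparseness penalty, infers coefficients with ISTA (iterative soft thresholding), and updates the basis with a simple gradient step. The training signal is a synthetic two-tone waveform standing in for a vowel, and all names and parameter values (`infer_codes`, `learn_basis`, `lam`, `lr`, window length 64, 128 basis functions) are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    """Shrink values toward zero by t; entries smaller than t become exactly 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def infer_codes(X, Phi, lam=0.1, n_iter=100):
    """ISTA for min_A 0.5*||X - Phi A||^2 + lam*||A||_1 (assumed cost function)."""
    L = np.linalg.norm(Phi, 2) ** 2          # Lipschitz constant of the gradient
    A = np.zeros((Phi.shape[1], X.shape[1]))
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ A - X)
        A = soft_threshold(A - grad / L, lam / L)
    return A

def learn_basis(X, n_basis, lam=0.1, n_epochs=30, lr=0.5, seed=0):
    """Alternate sparse inference and a gradient step on the basis matrix Phi."""
    rng = np.random.default_rng(seed)
    Phi = rng.standard_normal((X.shape[0], n_basis))
    Phi /= np.linalg.norm(Phi, axis=0, keepdims=True)
    for _ in range(n_epochs):
        A = infer_codes(X, Phi, lam)
        residual = X - Phi @ A
        Phi += lr * residual @ A.T / X.shape[1]
        # Renormalize columns so the basis cannot grow to shrink the codes.
        Phi /= np.linalg.norm(Phi, axis=0, keepdims=True) + 1e-12
    return Phi

# Synthetic stand-in for a vowel: a 200 Hz "pitch" plus an 800 Hz component
# (assumed values), sampled at 8 kHz with a little noise.
rng = np.random.default_rng(1)
t = np.arange(2048) / 8000.0
signal = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 800 * t)
signal += 0.1 * rng.standard_normal(t.size)

# Cut 200 random windows of 64 samples; each window is one column of X.
win = 64
starts = rng.integers(0, signal.size - win, size=200)
X = np.stack([signal[s:s + win] for s in starts], axis=1)

# Overcomplete code: 128 basis functions for 64-dimensional inputs.
Phi = learn_basis(X, n_basis=128)
A = infer_codes(X, Phi)

# Magnitude spectrum of each learned basis function, for frequency analysis.
spectra = np.abs(np.fft.rfft(Phi, axis=0))
```

Because the basis matrix here is 64 by 128, it is not square and has no inverse, which is the case that the abstract says ICA cannot handle. Inspecting the columns of `spectra` is one way to check whether individual basis functions concentrate around particular frequencies.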

© The Institute of Electrical Engineers of Japan