IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Interactive Learning of Spoken Words and Their Meanings Through an Audio-Visual Interface
Naoto IWAHASHI

2008, Volume E91.D, Issue 2, pp. 312-321

Abstract
This paper presents a new interactive learning method for spoken-word acquisition through human-machine audio-visual interfaces. During the course of learning, the machine decides whether an orally input word belongs to the lexicon it has already learned, using both speech and visual cues. Learning is carried out on-line and incrementally, based on a combination of active and unsupervised learning principles. If the machine judges with a high degree of confidence that its decision is correct, it learns the statistical models of the word and of a corresponding image category, which represents the word's meaning, in an unsupervised way. Otherwise, it actively asks the user a question. The function used to estimate the degree of confidence is also learned adaptively on-line. Experimental results show that the combination of active and unsupervised learning principles enables the machine and the user to adapt to each other, making the learning process more efficient.
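
The decision loop described above can be illustrated with a minimal Python sketch. Everything named here (the WordLearner class, the exact-match similarity score, the fixed threshold, and the ask_user callback) is an assumption standing in for the paper's statistical speech and image-category models and its adaptively learned confidence function; it is a sketch of the idea, not the authors' implementation.

class WordLearner:
    """Toy sketch: the paper's statistical speech/image models are
    replaced by exact-match similarity over stored samples."""

    def __init__(self, confidence_threshold=0.8):
        self.lexicon = {}                  # word label -> list of (speech, image) samples
        self.confidence_threshold = confidence_threshold
        self.conf_scale = 1.0              # crude stand-in for the adaptive confidence function

    def score(self, speech, image):
        """Return (best matching word, similarity) against the learned lexicon."""
        best_word, best_sim = None, 0.0
        for word, samples in self.lexicon.items():
            for s, v in samples:
                sim = 0.5 * float(s == speech) + 0.5 * float(v == image)
                if sim > best_sim:
                    best_word, best_sim = word, sim
        return best_word, best_sim

    def confidence(self, decision_strength):
        """Confidence that the in-lexicon / out-of-lexicon decision is correct.
        In the paper this estimator is itself adapted on-line."""
        return min(1.0, decision_strength * self.conf_scale)

    def observe(self, speech, image, ask_user):
        """One interaction: decide, then learn unsupervised or ask the user."""
        word, sim = self.score(speech, image)
        in_lexicon = sim > 0.5
        conf = self.confidence(sim if in_lexicon else 1.0 - sim)

        if conf >= self.confidence_threshold:
            # High confidence: unsupervised update of the word and image models.
            label = word if in_lexicon else speech
            self.lexicon.setdefault(label, []).append((speech, image))
        else:
            # Low confidence: active learning, ask the user a question.
            label = ask_user(speech, image)        # user supplies the correct word
            self.lexicon.setdefault(label, []).append((speech, image))
            # Adapt the confidence estimator using the user's answer.
            guessed = word if in_lexicon else speech
            self.conf_scale *= 1.05 if label == guessed else 0.95
        return label


if __name__ == "__main__":
    learner = WordLearner()
    user = lambda speech, image: speech    # simulated user returns the true word
    for pair in [("apple", "red"), ("apple", "red"), ("apple", "yellow"), ("banana", "yellow")]:
        print(pair, "->", learner.observe(*pair, user))

The design point mirrored in the sketch is that a single confidence estimate routes each input either to an unsupervised model update or to a clarifying question, and that estimator is adjusted whenever the user's answer confirms or contradicts the machine's guess.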
© 2008 The Institute of Electronics, Information and Communication Engineers