Interdisciplinary Information Sciences
Online ISSN : 1347-6157
Print ISSN : 1340-9050
ISSN-L : 1340-9050
New Informatics Paradigm to Manage Quality and Value of Information
Spoken Term Detection of Zero-Resource Language Using Posteriorgram of Multiple Languages
Satoru MIZUOCHI, Takashi NOSE, Akinori ITO
JOURNAL FREE ACCESS

2022 Volume 28 Issue 1 Pages 1-13

Abstract

This paper proposes a query-by-example spoken term detection (QbE-STD) method for detecting keywords in speech databases of zero-resource languages. The method represents speech with phonetic posteriorgrams (PPGs) trained on multiple resource-rich languages and combines these multilingual PPGs. Keywords are detected using dynamic time warping. We examined three ways of combining multiple languages: concatenating PPGs (PPG_CONC), combining the language resources to calculate a single multilingual PPG (PPG_ALL), and multi-task training of the PPG using multiple languages (PPG_DIV). In a QbE-STD experiment on Kaqchikel speech, PPGs gave better detection performance than the conventional speech feature (MFCC), and using multiple languages improved detection further.
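
The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`concat_ppg`, `cosine_distance`, `subsequence_dtw`), the cosine local distance, the step pattern, and the length normalization are all assumptions made for illustration, since the abstract does not specify them. The sketch concatenates per-language posteriorgrams frame by frame, in the spirit of PPG_CONC, and then searches an utterance for the best-matching region with subsequence dynamic time warping.

```python
import math


def concat_ppg(ppgs):
    """Concatenate per-language posteriorgrams frame by frame.

    ppgs: list of posteriorgrams, one per language. Each posteriorgram is a
    list of frames, and each frame is a list of posterior probabilities.
    All posteriorgrams must have the same number of frames.
    """
    n_frames = len(ppgs[0])
    if any(len(p) != n_frames for p in ppgs):
        raise ValueError("all posteriorgrams must have the same frame count")
    return [[v for p in ppgs for v in p[t]] for t in range(n_frames)]


def cosine_distance(a, b):
    """Local frame distance (an assumption; the abstract names none)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return max(0.0, 1.0 - dot / (norm_a * norm_b))


def subsequence_dtw(query, utterance):
    """Find the best match of `query` anywhere inside `utterance`.

    Returns (cost, end_frame): the accumulated distance divided by the query
    length, and the utterance frame index where the best match ends.
    """
    n, m = len(query), len(utterance)
    # First query frame may align to any utterance frame at no extra cost,
    # which lets the match start anywhere in the utterance.
    prev = [cosine_distance(query[0], utterance[j]) for j in range(m)]
    for i in range(1, n):
        cur = [math.inf] * m
        for j in range(m):
            best = prev[j]  # query advances, utterance frame repeats
            if j > 0:
                # diagonal step, or utterance advances while query repeats
                best = min(best, prev[j - 1], cur[j - 1])
            cur[j] = cosine_distance(query[i], utterance[j]) + best
        prev = cur
    end = min(range(m), key=lambda j: prev[j])
    return prev[end] / n, end


# Toy example: a two-frame query that occurs at frames 1..2 of the utterance.
query = [[1, 0, 0], [0, 1, 0]]
utterance = [[0, 0, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
cost, end_frame = subsequence_dtw(query, utterance)
```

In a detector built this way, a keyword would be reported when `cost` falls below a threshold tuned on held-out data; that threshold is likewise hypothetical and not taken from the paper.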

© 2022 by the Graduate School of Information Sciences (GSIS), Tohoku University

This article is licensed under a Creative Commons Attribution 4.0 International license.
https://creativecommons.org/licenses/by/4.0/