Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Mixture Probabilistic Context-Free Grammar
An Improvement of a Probabilistic Context-Free Grammar Using Cluster-Based Language Modeling
Kenji Kita

1996 Volume 3 Issue 4 Pages 103-113

Abstract
This paper proposes an improved probabilistic CFG (Context-Free Grammar), called the mixture probabilistic CFG, based on the idea of cluster-based language modeling. This model assumes that the language model parameters have different probability distributions in different topics or domains. In order to perform topic- or domain-dependent language modeling, we first divide the training corpus into a number of subcorpora according to their topics or domains, and then estimate a separate probability distribution from each subcorpus. A mixture probabilistic CFG therefore has several different probability distributions for the CFG productions. The language model probability of a sentence is calculated as a mixture of these probability distributions. The mixture probabilistic CFG enables us to build a context- or topic-dependent language model, and thus more accurate language modeling becomes possible. The proposed model was evaluated by calculating test-set perplexity using the ADD (ATR Dialogue Database) corpus and a Japanese intra-phrase grammar. The mixture probabilistic CFG had a test-set perplexity of 2.47 per phone, while the simple probabilistic CFG had a test-set perplexity of 2.77 per phone. We also conducted speech recognition experiments using three language models: a pure CFG (without probabilities), a simple probabilistic CFG, and the mixture probabilistic CFG. In our experiments, the mixture probabilistic CFG attained the best performance. The proposed model was also evaluated using sentence-level clustering. This evaluation used a dialogue corpus in which each utterance is annotated with an utterance type called an IFT (Illocutionary Force Type). Using these IFTs, we divided the corpus into 9 clusters and then estimated production probabilities from these clusters. Without IFT clustering, the perplexity was 2.18 per phone, but with IFT clustering, it was reduced to 1.82 per phone.
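The mixture computation described above can be written schematically as follows. This is a minimal sketch; the notation and the form of the mixture weights are assumed here rather than quoted from the paper:

\[
P(s) \;=\; \sum_{i=1}^{k} \lambda_i \, P_i(s), \qquad \sum_{i=1}^{k} \lambda_i = 1,
\]

where \(P_i\) denotes the probabilistic CFG whose production probabilities were estimated from the \(i\)-th subcorpus (topic, domain, or IFT cluster), and \(\lambda_i\) is the mixture weight assigned to that cluster.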
© The Association for Natural Language Processing