Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper
Monolingual Phrase Alignment Based on Word Embedding
Masato Yoshinaka, Tomoyuki Kajiwara, Yuki Arase
Keywords: Phrase Alignment
JOURNAL FREE ACCESS

2021 Volume 28 Issue 2 Pages 508-531

Abstract

We present a word embedding-based monolingual phrase aligner. In monolingual phrase alignment, an aligner identifies the set of phrasal paraphrases in a sentence pair. Previous methods required large-scale lexica or high-quality parsers, which makes them difficult to apply to languages other than English. In contrast, the proposed method uses only a pre-trained word embedding model and thus relies solely on raw monolingual corpora. Our method first obtains word alignments using pre-trained word embeddings and extends them to phrase alignments using a heuristic approach. It then composes phrase representations from the word embeddings and searches for a set of consistent phrase alignments on a lattice of phrase alignment candidates. Experimental results on the English dataset show that our method outperforms the previous phrase aligner. We also constructed a Japanese dataset for analysis, which confirms that our method works with languages other than English.
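As a rough illustration of the first stage described above, the following sketch aligns words by embedding similarity and composes a phrase vector from word vectors. It is only a sketch under stated assumptions: the toy vectors, the mutual-best-match rule, the similarity threshold, and the averaging used for phrase composition are all illustrative choices, not details confirmed by the abstract. The heuristic phrase extension and the lattice search are not shown.

```python
import math

# Made-up 3-dimensional vectors standing in for a pre-trained embedding model.
TOY_EMBEDDINGS = {
    "buy": [0.9, 0.1, 0.0],
    "purchase": [0.85, 0.15, 0.05],
    "a": [0.1, 0.1, 0.9],
    "an": [0.12, 0.08, 0.88],
    "car": [0.0, 0.9, 0.1],
    "automobile": [0.05, 0.88, 0.12],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def align_words(source, target, emb, threshold=0.5):
    """Return (i, j) pairs where source[i] and target[j] are mutual best matches.

    The mutual-best-match rule and the threshold are assumptions of this sketch.
    """
    sim = [[cosine(emb[s], emb[t]) for t in target] for s in source]
    pairs = []
    for i, row in enumerate(sim):
        # Best target word for source word i.
        j = max(range(len(target)), key=row.__getitem__)
        # Best source word for that target word.
        best_i = max(range(len(source)), key=lambda k: sim[k][j])
        if best_i == i and row[j] >= threshold:
            pairs.append((i, j))
    return pairs


def phrase_vector(tokens, emb):
    """Compose a phrase representation by averaging word vectors (an assumption)."""
    dim = len(emb[tokens[0]])
    return [sum(emb[t][d] for t in tokens) / len(tokens) for d in range(dim)]


source = ["buy", "a", "car"]
target = ["purchase", "an", "automobile"]

word_pairs = align_words(source, target, TOY_EMBEDDINGS)
phrase_sim = cosine(
    phrase_vector(source, TOY_EMBEDDINGS),
    phrase_vector(target, TOY_EMBEDDINGS),
)
print(word_pairs, round(phrase_sim, 3))
```

On this toy input, each source word pairs with the target word in the same position, and the two averaged phrase vectors are nearly parallel. A real setup would load vectors from a pre-trained model and would need a policy for out-of-vocabulary words, which this sketch omits.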

© 2021 The Association for Natural Language Processing