IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Special Section on Recent Advances in Machine Learning for Spoken Language Processing
Short Text Classification Based on Distributional Representations of Words
Chenglong MAQingwei ZHAOJielin PANYonghong YAN
著者情報
ジャーナル フリー

2016 年 E99.D 巻 10 号 p. 2562-2565

詳細
抄録

Short texts usually encounter the problem of data sparseness, as they do not provide sufficient term co-occurrence information. In this paper, we show how to mitigate the problem in short text classification through word embeddings. We assume that a short text document is a specific sample of one distribution in a Gaussian-Bayesian framework. Furthermore, a fast clustering algorithm is utilized to expand and enrich the context of short text in embedding space. This approach is compared with those based on the classical bag-of-words approaches and neural network based methods. Experimental results validate the effectiveness of the proposed method.

著者関連情報
© 2016 The Institute of Electronics, Information and Communication Engineers
前の記事 次の記事
feedback
Top