Short Text Classification Based on Distributional Representations of Words

Chenglong MA; Qingwei ZHAO; Jielin PAN; Yonghong YAN

doi:10.1587/transinf.2016SLL0006

Special Section on Recent Advances in Machine Learning for Spoken Language Processing



Keywords: short text classification, word embedding, Gaussian model

Journal, free access

2016, Volume E99.D, Issue 10, pp. 2562-2565

DOI https://doi.org/10.1587/transinf.2016SLL0006


Abstract

Short texts usually suffer from data sparseness because they provide too little term co-occurrence information. In this paper, we show how word embeddings can mitigate this problem in short text classification. We model each short text document as a sample from a single distribution within a Gaussian-Bayesian framework. Furthermore, we use a fast clustering algorithm to expand and enrich the context of each short text in the embedding space. We compare this approach with classical bag-of-words approaches and neural-network-based methods. Experimental results validate the effectiveness of the proposed method.
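
The abstract gives no formulas, so the following is only an illustrative sketch of one way a Gaussian-Bayesian classifier over word embeddings could look, not the authors' actual method. It assumes one Gaussian per class fitted to the word vectors of that class, and it scores a short text by summing the log-likelihoods of its word vectors plus a log class prior. The toy 2-D embeddings in `EMB`, the function names, and the regularization value `reg` are all hypothetical, and the clustering-based context expansion mentioned in the abstract is not covered.

```python
import numpy as np

# Hypothetical 2-D word embeddings; a real system would load pretrained vectors.
EMB = {
    "goal": np.array([1.0, 0.1]),
    "match": np.array([0.9, 0.2]),
    "team": np.array([1.1, -0.1]),
    "stock": np.array([-1.0, 0.1]),
    "market": np.array([-0.9, -0.2]),
    "bank": np.array([-1.1, 0.2]),
}
DIM = 2


def embed(text):
    """Return an (n_tokens, DIM) matrix of embeddings for in-vocabulary tokens."""
    vecs = [EMB[t] for t in text.lower().split() if t in EMB]
    return np.array(vecs).reshape(-1, DIM)


def fit(docs, labels, reg=1e-2):
    """Fit one Gaussian per class over all word vectors seen in that class."""
    model = {}
    for c in sorted(set(labels)):
        X = np.vstack([embed(d) for d, y in zip(docs, labels) if y == c])
        mean = X.mean(axis=0)
        # Regularize the covariance so it stays invertible on tiny data (assumption).
        cov = np.cov(X, rowvar=False, bias=True) + reg * np.eye(DIM)
        log_prior = np.log(labels.count(c) / len(labels))
        model[c] = (mean, cov, log_prior)
    return model


def log_gaussian(X, mean, cov):
    """Log density of each row of X under a multivariate Gaussian."""
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    return -0.5 * (DIM * np.log(2 * np.pi) + logdet + maha)


def predict(model, text):
    """Pick the class maximizing log prior + summed word log-likelihoods."""
    X = embed(text)
    scores = {
        c: log_prior + log_gaussian(X, mean, cov).sum()
        for c, (mean, cov, log_prior) in model.items()
    }
    return max(scores, key=scores.get)


docs = ["goal match", "team goal", "stock market", "bank stock"]
labels = ["sports", "sports", "finance", "finance"]
model = fit(docs, labels)
```

Calling `predict(model, "team match")` then scores the text against both class Gaussians. Because similar words lie close together in the embedding space, a text can be scored sensibly even when it shares few exact terms with the training documents, which is the kind of sparsity relief the abstract refers to.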
