Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Paper
Detecting Nonstandard Word Usages on Social Media
Tatsuya Aoki, Ryohei Sasano, Hiroya Takamura, Manabu Okumura
JOURNAL FREE ACCESS

2019 Volume 26 Issue 2 Pages 381-406

Abstract

We focus on nonstandard usages of common words on social media, where words are sometimes used in a manner entirely different from their original or standard usage. In this work, we attempt to distinguish nonstandard usages on social media from standard ones in an unsupervised manner. We also constructed a new Twitter dataset consisting of 40 words with nonstandard usages and used it for evaluation. Our basic idea is that nonstandard usage can be measured by the inconsistency between the target word’s expected meaning and the given context. To measure this inconsistency, we use context embeddings derived from word embeddings. Our experimental results show that the model leveraging context embeddings outperforms other methods, and they also provide findings on, for example, how to construct context embeddings and which corpus to use.
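The idea of scoring a usage by how well the target word agrees with its context can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring rule (mean cosine similarity between the target word's vector and the context vectors of nearby words), the function name `usage_score`, the window size, and the two-dimensional toy vectors are all assumptions made only for illustration. A real system would load word and context embeddings trained on a large corpus.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def usage_score(tokens, target_index, word_vecs, context_vecs, window=2):
    """Mean similarity between the target word and its surrounding words.

    Hypothetical scoring rule: a low score means the context is
    inconsistent with the word's expected meaning, which hints at a
    nonstandard usage.
    """
    target_vec = word_vecs[tokens[target_index]]
    start = max(0, target_index - window)
    end = min(len(tokens), target_index + window + 1)
    sims = [
        cosine(target_vec, context_vecs[tokens[i]])
        for i in range(start, end)
        if i != target_index and tokens[i] in context_vecs
    ]
    # With no usable context words there is no evidence either way.
    return sum(sims) / len(sims) if sims else 0.0


# Toy, hand-made vectors (assumed values, not trained embeddings).
word_vecs = {"apple": np.array([1.0, 0.0])}
context_vecs = {
    "eat": np.array([0.9, 0.1]),
    "juicy": np.array([1.0, 0.2]),
    "stock": np.array([-0.8, 0.6]),
    "shares": np.array([-0.9, 0.4]),
}

# Context that matches the expected meaning of the target word.
standard = usage_score(["eat", "juicy", "apple"], 2, word_vecs, context_vecs)
# Context that conflicts with the expected meaning.
nonstandard = usage_score(["stock", "apple", "shares"], 1, word_vecs, context_vecs)

print(standard > nonstandard)
```

In this sketch, usages would be ranked by score and the lowest-scoring ones flagged as candidates for nonstandard usage. No labeled examples are needed, which is in keeping with the unsupervised setting described in the abstract.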

© 2019 The Association for Natural Language Processing