To develop high-performance natural language understanding (NLU) models, a benchmark is needed to evaluate and analyze NLU ability from various perspectives. The English NLU benchmark GLUE (Wang et al., 2018) was the forerunner, and benchmarks have since been constructed for other languages, such as CLUE (Xu et al., 2020) for Chinese and FLUE (Le et al., 2020) for French. However, no such benchmark exists for Japanese, which is a serious problem for Japanese NLP. We therefore build a Japanese NLU benchmark, JGLUE, from scratch, without translation, to measure general NLU ability in Japanese. JGLUE consists of three kinds of tasks: text classification, sentence pair classification, and question answering (QA). We hope that JGLUE will facilitate NLU research in Japanese.
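For readers who want to experiment with the benchmark, the sketch below shows one way to load a JGLUE task with the Hugging Face `datasets` library. The repository ID `shunk031/JGLUE` is a community mirror and an assumption here; the paper itself does not prescribe a distribution channel, so treat the ID and config names as illustrative.

```python
# Minimal sketch of loading a JGLUE task via Hugging Face `datasets`.
# JGLUE's tasks span the three categories in the abstract:
#   text classification:          MARC-ja
#   sentence pair classification: JSTS, JNLI
#   QA:                           JSQuAD, JCommonsenseQA
# The repo ID "shunk031/JGLUE" is an assumed community mirror,
# not an official distribution named in the paper.
from datasets import load_dataset

jcqa = load_dataset("shunk031/JGLUE", name="JCommonsenseQA")  # assumed repo/config
print(jcqa["train"][0])  # inspect one QA example
```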