This paper presents an overview of the SemEval-2 Japanese word sense disambiguation (WSD) task. The task has two new characteristics: (1) it uses the first balanced Japanese sense-tagged corpus, and (2) it takes into account not only instances whose sense is in the given set but also instances whose sense cannot be found in the set. It is a lexical sample task, and word senses are defined according to a Japanese dictionary, the Iwanami Kokugo Jiten. This dictionary and a training corpus were distributed to participants. There were 50 target words: 22 nouns, 23 verbs, and 5 adjectives. Fifty instances of each target word were provided, for a total of 2,500 instances for the evaluation. Nine systems from four organizations participated in the task.