In this study, we generated word vectors (Word2Vec) from the data of the Corpus of Historical Japanese and, based on the similarity of these vectors, constructed a parallel corpus of the Takano edition (established in the Kamakura period) and the Amakusa edition (established in the Muromachi period) of The Tales of the Heike. This parallel corpus enabled us to identify linguistic changes between the Kamakura and Muromachi periods.
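The alignment idea can be illustrated with a minimal sketch. The abstract does not state how vector similarity was turned into alignments, so everything below is an assumption made for illustration: sentences are represented by the mean of their word vectors, compared by cosine similarity, and each Takano sentence is greedily paired with its best-scoring Amakusa sentence above a hypothetical threshold. The vectors in `toy_vectors` are invented placeholders, not values from a trained Word2Vec model, and the function names are hypothetical.

```python
import math

def mean_vector(tokens, vectors):
    """Average the vectors of the tokens that have one; None if no token is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def align(source, target, vectors, threshold=0.5):
    """Pair each source sentence with its most similar target sentence.

    Returns (source_index, target_index, similarity) triples. The greedy
    best-match strategy and the threshold are assumptions of this sketch.
    """
    target_vecs = [mean_vector(t, vectors) for t in target]
    pairs = []
    for i, sentence in enumerate(source):
        src_vec = mean_vector(sentence, vectors)
        if src_vec is None:
            continue
        best_j, best_sim = None, threshold
        for j, tgt_vec in enumerate(target_vecs):
            if tgt_vec is None:
                continue
            sim = cosine(src_vec, tgt_vec)
            if sim >= best_sim:
                best_j, best_sim = j, sim
        if best_j is not None:
            pairs.append((i, best_j, best_sim))
    return pairs

# Invented 3-dimensional placeholder vectors for romanized tokens.
toy_vectors = {
    "heike": [1.0, 0.0, 0.0],
    "monogatari": [0.9, 0.1, 0.0],
    "fune": [0.0, 1.0, 0.0],
    "umi": [0.1, 0.9, 0.0],
    "tote": [0.0, 0.0, 1.0],
    "toiute": [0.0, 0.1, 0.9],
}

# Hypothetical tokenized sentences; "ga" has no vector and is skipped.
takano = [["heike", "tote"], ["fune", "umi"]]
amakusa = [["umi", "fune", "ga"], ["heike", "toiute"]]

pairs = align(takano, amakusa, toy_vectors)
for i, j, sim in pairs:
    print(f"Takano {i} <-> Amakusa {j} (similarity {sim:.3f})")
```

In practice the vectors would come from a model trained on the corpus, and a real alignment would likely need to respect sentence order and allow one-to-many matches, which this sketch does not attempt.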
A case study revealed the following four points about the Amakusa edition:
(i) It tends to add particles such as wa, mo, made, ga, and wo to bare noun phrases in the Takano edition.
(ii) It sometimes adds nouns to verb phrases that are nominalized with the adnominal form in the Takano edition.
(iii) It tends to translate the quotative particle tote (quotative + sequential) in the Takano edition to toiute (quotative say-sequential), except when the use is purposive.
(iv) It tends to use compound sentences where the Takano edition uses two separate sentences.