The Long Arc of History: Neural Network Approaches to Diachronic Linguistic Change

Eun Seo Jo; Mark Algee-Hewitt

doi:10.17928/jjadh.3.1_1

Abstract

Historians have traditionally relied on close readings of select primary sources to evaluate linguistic and discursive changes over time, but this approach is limited in scope. Numeric representations of language allow us to statistically quantify and compare the significance of discursive changes and to capture linguistic relationships over time. Here, we compare two deep learning methods for quantitatively identifying the chronology of linguistic shifts: RNN classification and RNN language modeling. In particular, we examine deep learning methods for isolating stylistic from topical changes, generating “decade embeddings,” and charting the changing average perplexity of a language model trained on chronologically sorted data. We apply these models to a historical diplomatic corpus and find that the two world wars were notable moments of linguistic change in American foreign relations. With this example, we show how text-based deep learning methods can be applied in the digital humanities.
