Since the early 2010s, quantitative text analysis has gained renewed prominence across the social sciences, yet the method remains both familiar and novel within the field of China studies. This article examines this hybrid status by (1) tracing the method’s Cold War origins, the intellectual and geopolitical conditions that nurtured its first wave, and the factors behind its subsequent eclipse; (2) analyzing the method’s recent resurgence and the principal ways it has been redeployed in contemporary scholarship; and (3) discussing the method’s future potential along with unresolved theoretical, empirical, and data-access challenges. First, we show that quantitative text analysis was actively adopted under stringent data constraints during the Cold War, but that its popularity faded after the 1980s as the data environment changed. Second, drawing on publication data, we document a marked resurgence since the 2010s and highlight recent studies that have deepened understanding of authoritarian governance in contemporary China, especially in the areas of censorship and information manipulation. Finally, we assess future prospects. We note opportunities for deeper integration with machine-learning techniques and for broadening the range of research questions. At the same time, we identify key obstacles: tightening data regulations, the need to link quantitative findings with qualitative insights, and a persistent tension between this approach and traditional area-studies scholarship.