We conducted a study using Twitter, a social networking service (SNS) widely used to post daily messages in real time. Recently, the use of big data to predict the prevalence of infectious diseases, such as influenza and COVID-19, has been attracting attention. Pollen dispersal is influenced by real-time weather conditions and complex external factors, such as temperature, humidity, and wind direction, and allergic symptoms appear immediately after pollen exposure. Infectious disease outbreaks and the onset of hay fever are thus similar in that both are influenced by complex external factors. We therefore investigated whether tweets related to hay fever are associated with the amount of pollen dispersal. Using the Python programming language, we retrieved data through the Twitter API (Application Programming Interface), obtaining a total of 316,505 tweets posted between February 3 and May 22, 2022. We examined the relationship between the daily number of tweets related to hay fever and the amount of cedar and cypress pollen dispersal in Tokyo and Matsumoto. In Tokyo, the number of tweets related to hay fever increased with the amount of cedar pollen dispersal, showing a strong correlation (correlation coefficient 0.85; p<0.001); no such correlation was observed in Matsumoto. Next, we analyzed the contents of the tweets related to hay fever using morphological analysis. Among nasal symptoms, the most frequently used terms were “sneezing” and “runny nose,” while “stuffy nose” was used infrequently. In addition, “itch” and “itchiness” were frequently used to express the feeling of itchiness. These results indicate that Twitter, a typical SNS, can be used to grasp real-time trends in hay fever.
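The correlation analysis described above can be illustrated with a minimal sketch. It assumes a Pearson correlation coefficient, which the abstract does not specify, and the function name `pearson_r` and the daily values shown are hypothetical placeholders, not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical daily values, for illustration only (not the study's data):
# number of hay-fever-related tweets and pollen count on the same days.
daily_tweets = [120, 340, 560, 410, 900, 150]
daily_pollen = [10, 45, 80, 50, 130, 12]

r = pearson_r(daily_tweets, daily_pollen)
print(f"correlation coefficient: {r:.2f}")
```

A rank-based coefficient such as Spearman's would be a reasonable alternative if the daily counts are heavily skewed; the choice here is only for illustration.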