Early SNS-Based Monitoring System for the COVID-19 Outbreak in Japan: A Population-Level Observational Study

Background: The World Health Organization declared the novel coronavirus outbreak (COVID-19) to be a pandemic on March 11, 2020. Large-scale monitoring for capturing the current epidemiological situation of COVID-19 in Japan would improve preparation for and prevention of a massive outbreak.
Methods: A chatbot-based healthcare system named COOPERA (COvid-19: Operation for Personalized Empowerment to Render smart prevention And care seeking) was developed using the LINE app to evaluate the current Japanese epidemiological situation. LINE users could participate in the system either through a QR code page on the prefectures' websites or through a banner at the top of the LINE app screen. COOPERA asked participants questions regarding personal information, preventive actions, and non-specific symptoms related to COVID-19 and their duration. We calculated daily cross correlation functions between the reported number of infected cases confirmed using polymerase chain reaction and the symptom-positive group captured by COOPERA.
Results: We analyzed data from 206,218 participants in three prefectures collected between March 5 and 30, 2020. The mean age of participants was 44.2 (standard deviation, 13.2) years. No symptoms were reported by 96.93% of participants, but there was a significant positive correlation between the reported number of COVID-19 cases and self-reported fevers, suggesting that massive monitoring of fever might help to estimate the scale of the COVID-19 epidemic in real time.
Conclusions: COOPERA is the first real-time system being used to monitor trends in COVID-19 in Japan and provides useful insights to assist political decisions to tackle the epidemic.


INTRODUCTION
The 2019 coronavirus disease (COVID-19) outbreak was first reported in Wuhan City, Hubei Province, China in late December 2019. 1,2 Since then, the causative virus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has spread rapidly throughout China and to 184 countries and territories; as of April 7, 2020, a total of 1,430,141 confirmed cases and 82,119 deaths had been reported worldwide. 3,4 In Japan, there were 2,586 polymerase chain reaction (PCR)-confirmed cases with symptoms and 80 deaths as of April 7, which is relatively low compared with other countries. 5 A clearer understanding of the current epidemiological situation of COVID-19 in Japan would improve preparation for and prevention of a massive outbreak, but the situation remains unclear due to the limited number of PCR tests conducted. 6,7 In particular, untraceable cases have been increasing in larger cities, such as Tokyo, 5 which suggests a shift from sporadic traceable transmission to an exponentially growing major outbreak. Unfortunately, surveillance using PCR tests or serological surveillance is not feasible because of limited resources, 8 and because testing of mild cases often requires patients to travel to clinics by public transport, endangering the health of others. Such surveillance would also take a long time and would not be capable of responding to outbreaks in a timely manner. Therefore, rapid and large-scale monitoring for capturing the current epidemiological situation of COVID-19 in Japan is needed. 9 On March 5, 2020, the Kanagawa prefectural government in Japan entered into a collaboration with LINE Corporation, 10 the provider of one of Japan's largest mobile messenger applications, which claims 83 million monthly active users, accounting for 65% of Japan's total population. They launched a health care support system to support monitoring and follow-up of high-risk groups and potential cases of COVID-19, as well as to provide efficient support for those with mild symptoms. 11
This system was named COvid-19: Operation for Personalized Empowerment to Render smart prevention And care seeking (COOPERA). COOPERA is also intended to help the prefectural government rapidly grasp the epidemiological situation in the region by analyzing the data provided by users. Kanagawa Prefecture plans to collaborate with other prefectures to deploy COOPERA across a wide range of areas in Japan. Launch dates varied by prefecture: following Kanagawa, COOPERA launched on March 17 in Aichi Prefecture and on March 18 in Shiga Prefecture.
In this paper, we describe the novel health care support system, COOPERA, and the results of analyses of data collected in Kanagawa, Aichi, and Shiga Prefectures by March 30, 2020. To validate whether the system captures the COVID-19 situation, we compared the data collected by COOPERA with the confirmed cases of COVID-19 reported in each prefecture.

METHODS
The COOPERA system
COOPERA uses a chatbot that asks participants to provide basic information, such as their current physical condition (non-specific symptoms, such as fatigue and fever) and their residence. Based on the information provided by the user, COOPERA has three major objectives:
(1) Providing individualized support for self-care and suggesting preventive behavior to avoid infection events. COOPERA supports better health behavior among participants through chatbots based on the input data. The system returns individualized information related to self-care or consultation with health care providers, and further suggests preventive actions to avoid infections (eg, washing hands).
(2) Real-time follow-up and feedback to participants. COOPERA follows up with participants by asking about their health condition every other day. Depending on the user's answers about their physical condition, age, and medical conditions, COOPERA provides the user with information that will help them take appropriate action. For example, a fever that lasts more than 4 days, a strong feeling of weariness (fatigue), or shortness of breath are among the guideposts for contacting the Coronavirus Consultation Center as defined by the Ministry of Health, Labour and Welfare. In accordance with this guideline, COOPERA also provides respondents with information on the need to contact the Coronavirus Consultation Center, as well as contact information for the center depending on where the respondent lives.
(3) Capturing the epidemiological situation to assist public health action. Information collected and analyzed is shared with the public health sector, which helps them implement effective measures at local to national levels.
Due to the company policy of LINE Corporation, LINE users (and therefore COOPERA participants) are 13 years old or older. For participants who submitted multiple answers within 1 day, only the first answer was extracted.
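The guidepost quoted under objective (2) can be written as a simple rule. The following is a minimal sketch, not COOPERA's actual logic: the `SymptomReport` structure and the function name are hypothetical, and the sketch encodes only the single criterion stated in the text (it ignores the age and medical-condition inputs that COOPERA is also described as using).

```python
from dataclasses import dataclass


@dataclass
class SymptomReport:
    """One participant's self-reported symptoms (hypothetical structure)."""
    fever_days: int            # consecutive days with fever; 0 if none
    strong_fatigue: bool       # strong feeling of weariness
    shortness_of_breath: bool


def should_contact_consultation_center(report: SymptomReport) -> bool:
    """Return True if the guidepost quoted in the text is met.

    Encodes only the stated rule: fever lasting more than 4 days,
    OR strong fatigue, OR shortness of breath.
    """
    return (
        report.fever_days > 4
        or report.strong_fatigue
        or report.shortness_of_breath
    )


if __name__ == "__main__":
    # A 5-day fever alone meets the guidepost; a 4-day fever alone does not.
    print(should_contact_consultation_center(SymptomReport(5, False, False)))
    print(should_contact_consultation_center(SymptomReport(4, False, False)))
```

Keeping the rule in one pure function would make it easy to revise if the guidepost definition changes.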

Prefectural information
Maps of the three prefectures are shown in eFigure 1. The populations of Kanagawa, Aichi, and Shiga are 9.2 million, 7.5 million, and 1.4 million, respectively, as of March 2020. In addition to the data collected through COOPERA, we extracted the number of COVID-19 cases (confirmed using PCR test) reported by each prefecture. 12-14 As of March 30, 2020, the three prefectures had identified 130, 169, and 6 COVID-19 cases, respectively. As an overview of the health care resources of each prefecture, as of 2018, the number of hospitals per 100,000 population was 3.7, 4.3, and 4.0, respectively. The number of doctors per 100,000 population was 50.7/158.7, 45.8/161.9, and 42.9/178.0 for women/men, respectively.
Questionnaire
COOPERA asks about age, gender, occupation, medical history (malignant tumor with anticancer drugs, malignant tumor without anticancer drugs, cardiovascular diseases, kidney diseases, diabetes mellitus, receiving dialysis treatment, chronic obstructive pulmonary disease (COPD), treatment with immunosuppressants, and pregnancy), preventive behaviors, residence information (zip code), and the onset date and duration of current and past-month symptoms that are non-specific surrogate indicators of COVID-19 infection (presence or absence of fever, strong feeling of weariness [fatigue], or shortness of breath). For those who report symptoms, COOPERA asks additional questions about medical visits and clinical diagnoses at that time. In the follow-up questions, COOPERA asks about their health condition again. In this study, symptoms are grouped into four categories: fever above 37.5°C (Condition a), strong feeling of weariness or shortness of breath (Condition b), both Condition a and b (Condition c), or either Condition a or b (Condition d).
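The four symptom categories described above overlap: Condition c is the intersection of a and b, and Condition d is their union. A minimal sketch of this mapping is shown below; the function and argument names are illustrative assumptions, not part of COOPERA.

```python
def classify_conditions(fever: bool, fatigue_or_dyspnea: bool) -> dict:
    """Map two self-reported flags to Conditions a-d.

    fever: fever above 37.5 degrees C (Condition a)
    fatigue_or_dyspnea: strong weariness or shortness of breath (Condition b)
    """
    return {
        "a": fever,
        "b": fatigue_or_dyspnea,
        "c": fever and fatigue_or_dyspnea,  # both a and b
        "d": fever or fatigue_or_dyspnea,   # either a or b
    }


if __name__ == "__main__":
    # A participant with fever only belongs to Conditions a and d.
    print(classify_conditions(fever=True, fatigue_or_dyspnea=False))
```

Because a participant can fall into several categories at once, the percentages for Conditions a to d are not expected to sum to the symptomatic total.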
In this study, we analyzed and reported only the initial response to the questionnaire; follow-up data were not included. In other words, cases that developed symptoms after the initial response were not included in this analysis.
Yoneoka D, et al.

Percentage of people with symptoms
In this study, unless otherwise noted, the percentages for Conditions a to d use both current and past-month symptom responses. For example, if a person responding on March 20 reports "fever above 37.5°C" as a current symptom, that response is counted in both the denominator and the numerator of the March 20 percentage. Similarly, for past-month symptoms, if the person had a "fever above 37.5°C" on March 15, the response is counted in the denominator for the past month, as well as in the numerator of the March 15 percentage.
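One way to read the counting rule above is sketched below. This is an interpretation under stated assumptions, not the authors' code: the "past month" is taken to be the response date plus the preceding 30 days, each respondent is added to the denominator of every day in that window, and the numerator is incremented only on reported symptom dates. The function name and input layout are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta

WINDOW_DAYS = 30  # assumption: "past month" = 30 days before the response


def daily_percentages(responses, window_days=WINDOW_DAYS):
    """Compute the daily percentage of respondents reporting a symptom.

    responses: iterable of (response_date, symptom_dates), where
    symptom_dates lists the dates on which the respondent reported the
    symptom (current or within the past month); it may be empty.
    """
    denominator = Counter()
    numerator = Counter()
    for response_date, symptom_dates in responses:
        window = {response_date - timedelta(days=k)
                  for k in range(window_days + 1)}
        for day in window:
            denominator[day] += 1          # counted for every covered day
        for day in set(symptom_dates):
            if day in window:
                numerator[day] += 1        # counted only on symptom days
    return {day: 100.0 * numerator[day] / denominator[day]
            for day in sorted(denominator)}


if __name__ == "__main__":
    mar20 = date(2020, 3, 20)
    responses = [
        (mar20, [mar20]),              # current fever on March 20
        (mar20, [date(2020, 3, 15)]),  # past-month fever on March 15
        (mar20, []),                   # no symptoms
    ]
    pct = daily_percentages(responses)
    print(pct[mar20], pct[date(2020, 3, 15)])
```

Under this reading, a single day's respondents contribute to the denominators of a whole month, which is why the resulting series is smoother than one based on current symptoms alone.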

Statistical analysis
Baseline data were reported as mean (standard deviation [SD]) or proportion. For change point detection in the proportion of daily reported cases (for each Condition), a piecewise linear regression model was fitted with at most ten change points. 15 The difference in slopes before and after the estimated change point(s) was tested using the Davies test. 16,17 To validate that the system captures the epidemiological situation, the relationship between two time series, (T1) the number of cases confirmed with PCR test reported each day and (T2) the proportion of participants with any symptom onset, was examined in Kanagawa Prefecture using the sample cross correlation function (CCF). 16 Given the delay from symptom onset to confirmation observed elsewhere, 2 we expected T1 and T2 to be correlated with some time lag. We shifted T1 by n days (−10 < n < 10) and calculated the correlation with T2. As a sensitivity check, the CCF between T2 and the weekly number of influenza cases in Kanagawa Prefecture during the study period was calculated with lags of −5 to 5 days. 18 Statistical analyses were conducted with R software (version 3.6.0; R Foundation for Statistical Computing, Vienna, Austria). The type I error rate was fixed at 0.05.
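The lagged correlation between T1 and T2 can be illustrated with a short sketch of the sample cross correlation function. The study used R; the Python version below is an illustrative stand-in, and the function name is an assumption. It follows the convention of R's `ccf`, in which the value at lag k pairs x[t+k] with y[t] and is normalized by the full-sample standard deviations.

```python
import numpy as np


def cross_correlation(t1, t2, max_lag):
    """Sample cross correlation of two equal-length series.

    Returns {lag: r}, where the value at lag k correlates t1[t + k]
    with t2[t], for k in [-max_lag, max_lag].
    """
    x = np.asarray(t1, dtype=float)
    y = np.asarray(t2, dtype=float)
    if x.shape != y.shape or x.ndim != 1:
        raise ValueError("t1 and t2 must be 1-D series of equal length")
    n = len(x)
    x = x - x.mean()
    y = y - y.mean()
    denom = n * x.std() * y.std()  # population SD, as in the sample CCF
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            cov = np.sum(x[lag:] * y[:n - lag])
        else:
            cov = np.sum(x[:n + lag] * y[-lag:])
        result[lag] = float(cov / denom)
    return result


if __name__ == "__main__":
    # Synthetic data only: t1 repeats t2 two days later.
    rng = np.random.default_rng(0)
    base = rng.normal(size=60)
    t1, t2 = base[0:50], base[2:52]
    ccf = cross_correlation(t1, t2, max_lag=9)  # -10 < n < 10
    print(max(ccf, key=ccf.get))
```

A commonly used approximate 95% bound for judging individual lags under independence is ±1.96/√n, where n is the series length; whether the study's P values were obtained this way is not stated in the text.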

Ethics statement
Ethical approval was granted by the ethics committee of Keio University School of Medicine, under authorization number 20190338. We only obtained data from those who have given consent for the prefecture that administers the questionnaire to provide their response data to a third party for research use. Respondents must give their consent on the LINE chatbot before they proceed to the questionnaire response page.

RESULTS
A total of 206,218 participants responded between March 5 and 30, 2020, including 124,766 (60.5%), 66,558 (32.3%), and 14,894 (7.2%) in Kanagawa, Aichi, and Shiga Prefectures, respectively. Table 1 shows the basic characteristics of the participants at the initial response date for the three prefectures combined (see eTable 1 for the prefecture-specific data). Most participants (96.93%) did not have any symptoms when enrolled in the system (ie, No-symptom group). The distribution of symptomatic conditions was 1.37%, 2.38%, 0.68%, and 3.07% for Condition a, b, c, and d, respectively. Mean age at baseline was 44.2 (SD, 13.2) years. The age distribution of the group without symptoms was right skewed with a mean age of 44.36 (SD, 13.18) years, while the mean age of the group with any symptoms (ie, Condition a to d) was 38.1 (SD, 13.2) years. More women participated: 68.5% of the sample were female, 31.3% were male, and 0.2% other. The most common preventive actions in the group without symptoms were covering the mouth and nose (eg, with masks or handkerchiefs) when coughing or sneezing (90.4%), washing hands with soap (89.8%), and hand disinfection with alcohol (66.4%). Table 2 shows the proportion of participants taking each preventive action, stratified by symptomatic condition, at two or three time points in each prefecture (the dates when the banners were presented and a massive inflow of participants was observed); preventive actions did not change substantially during the study period.
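The proportions reported above are internally consistent, which can be checked with simple arithmetic. Because Condition c is the overlap of Conditions a and b, the share with either symptom (Condition d) should equal a + b − c. The snippet below is only a cross-check of the published figures; the variable and function names are illustrative.

```python
def union_percentage(pct_a, pct_b, pct_both):
    """Inclusion-exclusion: share with a or b = a + b - (a and b)."""
    return pct_a + pct_b - pct_both


# Participant counts as reported in the text.
total = 206_218
by_prefecture = {"Kanagawa": 124_766, "Aichi": 66_558, "Shiga": 14_894}
shares = {name: round(100 * n / total, 1) for name, n in by_prefecture.items()}

# Condition percentages as reported in the text.
condition_d = round(union_percentage(1.37, 2.38, 0.68), 2)
no_symptom = round(100 - condition_d, 2)

if __name__ == "__main__":
    print(sum(by_prefecture.values()) == total)  # counts add up to the total
    print(shares)                                # 60.5, 32.3, 7.2
    print(condition_d, no_symptom)               # 3.07 and 96.93
```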
The timeline of the proportion of each condition for each day from February 1 to March 30, 2020, is shown in Figure 1 and Figure 2. Figure 1 depicts the proportions and the number of confirmed cases in the three prefectures. Significant change points were observed on March 24, 2020 for Condition a (P = 0.036), and on March 23, 2020 for Conditions b, c, and d (P = 0.012, P < 0.001, and P = 0.024, respectively). The proportion of participants with each non-specific symptom among participants with comorbidities and those who were pregnant is shown in Figure 3. Participants with COPD, immunosuppressant treatment, or malignant tumor with anticancer drugs reported fever more frequently than those with other comorbidities or pregnancy.
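The idea behind the change points reported above can be illustrated with a deliberately simplified stand-in. The study fitted a piecewise linear model with up to ten change points and tested slope differences with the Davies test in R; the sketch below instead fits a single breakpoint by grid search over least-squares fits and performs no significance test. The function name and the synthetic data are assumptions for illustration only.

```python
import numpy as np


def fit_single_breakpoint(t, y):
    """Fit a continuous two-segment linear model by grid search.

    Model: y = b0 + b1 * t + b2 * max(t - c, 0), with the breakpoint c
    chosen among the observed t values to minimize squared error.
    Returns (breakpoint, slope_before, slope_after).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    if len(t) < 5:
        raise ValueError("need at least 5 observations")
    best = None
    for candidate in t[2:-2]:  # keep at least two points on each side
        hinge = np.maximum(t - candidate, 0.0)
        design = np.column_stack([np.ones_like(t), t, hinge])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((y - design @ coef) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(candidate), coef)
    _, breakpoint, coef = best
    return breakpoint, float(coef[1]), float(coef[1] + coef[2])


if __name__ == "__main__":
    # Synthetic series: slope 0.1 until day 20, then slope 0.6.
    days = np.arange(30)
    series = 1.0 + 0.1 * days + 0.5 * np.maximum(days - 20, 0)
    print(fit_single_breakpoint(days, series))
```

A full analysis would also need a test of whether the slope change is real, which is the role the Davies test plays in the study.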
Last, we validated whether COOPERA captures the COVID-19 situation by comparing the proportions in Kanagawa Prefecture with the confirmed cases of COVID-19 (Figure 4). We found a significant correlation between T1 and T2 with a delay of 0 to 3 days, suggesting that the non-specific symptoms reported in COOPERA capture the COVID-19 epidemic in the region and that the system is a powerful tool for inferring the trend of the COVID-19 pandemic. Similar results were obtained in a sensitivity analysis where the study period was divided into two periods before and after the start date. In addition, the CCF between the proportion of participants with fever and the reported number of influenza cases in Kanagawa Prefecture was calculated (eFigure 2), showing no significant positive correlation between them (with P = 0.56-0.99 for each lag from −5 to 5 days).

DISCUSSION
In this study, we used a novel health care support system, COOPERA, and analyzed over 200,000 observations in three prefectures in Japan. We found that 96.93% of participants had no symptoms, and the proportion of participants with fever significantly correlated with the number of COVID-19-positive cases with a lag of 0 to 3 days, suggesting that questionnaire-based massive epidemiological monitoring may be able to capture the actual epidemic situation. This result was robust even when we divided the study period into two periods before and after the start date. However, the number of participants with fever includes both COVID-19 cases and other cases with fever due to the common cold or influenza. Thus, it is possible that the proportions we have shown here do not directly represent COVID-19 prevalence. To address this, we calculated the CCF between the proportion and the reported number of influenza cases and obtained no significant positive correlation between them (eFigure 2). This suggests the system has (at least partially) captured the COVID-19 epidemiological situation. Like SARS-CoV and MERS-CoV, SARS-CoV-2 is a coronavirus capable of human-to-human transmission, and those who have comorbidities are at high risk of mortality. 19-21 Our results imply that those who have chronic comorbidities, especially COPD, immunosuppressant treatment, or malignant tumor, might be more susceptible to COVID-19 infection. Further, in terms of preventive action, cough etiquette and handwashing with soap were common (around 90%), while less than 10% of the participants practiced teleworking or staggered commuting hours during the study period. 
This result might indicate that most people are already implementing individual-level preventive actions, such as handwashing, while society-level preventive efforts, which require governmental or community support, are still not being properly implemented and need to be more strongly encouraged. Given that teleworking in particular is not an individual choice and is a very important non-pharmaceutical intervention, the governments of these prefectures need to consider further measures to ensure companies make teleworking available to their staff on a wider basis.
The characteristics of COOPERA can be contrasted with existing infectious disease surveillance. 22,23 For example, there is a survey in Japan that collects information on pharmacy dispensing with the aim of early detection of domestic outbreaks of influenza. In addition, there is a survey on the implementation of school closure due to outbreaks of influenza in order to capture the epidemic. On the other hand, COOPERA is characterized by the ability to detect signs of an epidemic before it occurs in an area. With COOPERA, LINE users can immediately register their symptoms via LINE. In other words, it has a high degree of immediacy in that it can collect data before visits to pharmacies, school closures, and other measures are taken. In addition, COOPERA can also track the change over time of a user's individual symptoms (eg, when they appeared and how many days until they recovered).
Because personal protective equipment is in short supply and there is a risk of infection among healthcare professionals, it is not realistic to increase the number of PCR tests immediately. The fact that there is a correlation between the number of PCR-positive cases and the proportion of symptoms detected using COOPERA in this study suggests that COOPERA may be able to detect the scale of the spread of infection and the location of the outbreak when the number of infected cases increases and it becomes even more difficult to grasp the actual situation by PCR.
This study has several limitations. The first is participant bias. Since COOPERA monitoring was based on the online platform LINE and used a non-random sampling scheme, our findings might be influenced by selection bias: those who have no access to the internet or a smartphone may be underrepresented. In addition, LINE users who have a relatively high level of health awareness or are worried about some symptoms may be more likely to respond to the system. Therefore, the proportion of participants with symptoms might overestimate the prevalence of people with symptoms. It should also be noted that the results showed a low percentage of positive symptoms on the days when the banner was displayed. It is possible that the participants who answered the questions on the day the banner was displayed were those not suffering from symptoms or disease conditions that would have prevented them from joining the system at that moment, leading to underestimation of the actual proportion of people with symptoms. Conversely, participants who answered the questions before or after the banner display period could have been more motivated to participate if they had any symptoms, which might lead to overestimation of the percentage of symptomatic participants. Unfortunately, the only information we have on the population of LINE users is the total number, claimed by the company to be about 83 million; no detailed data are available on the distribution of users' background characteristics, such as sex, age, or prefecture. Therefore, it is difficult to verify selection bias because it is not possible to evaluate the response rate among users or to compare the demographic characteristics of non-respondents and respondents. In addition, since detailed and analyzable data on those who received PCR testing were not publicly available, it should be noted that the CCF result might also include such bias, which is difficult to examine. 
Second, it should be noted that the system could not avoid recall bias. In this study, COOPERA monitoring included questions regarding symptoms in the last month, and people may not remember these symptoms accurately. This recall bias may have created an upward trend in the proportion with symptoms toward the start of the system. Third, given the selection bias and recall bias in COOPERA, Figure 1 may give an exaggerated impression that the number of COVID-19 cases changed dramatically during the study period. In the figure, we showed the proportion of participants with symptoms, reflecting past symptomatic experience. Trends in the daily proportion using only the "current symptom at each study period" data are shown in eFigure 3. No exponential trends were found, suggesting that the results of this study may overestimate the true prevalence of symptoms. However, if only "current symptoms" are used, the number of respondents varies greatly from day to day, and the calculation of the percentage is not stable, as can be observed in eFigure 3. For example, in Kanagawa Prefecture, there are several peaks, and the timing of the lowest point coincides with the release of COOPERA information at the Kanagawa Governor's press conference. This trough may have occurred as a result of the increase in the number of respondents following the press conference and media coverage of COOPERA, with the percentage stabilizing only for that day. By taking into account the "symptoms of the past month", the percentage calculation is stabilized, because the information from respondents on a given day is used to calculate the percentages over the past month (with a recall bias, of course).

Conclusions
In summary, this study is the first report based on a large-scale (over 200,000 participants) health care support system in Japan. The significant correlation between the time trend of participants with symptoms and the time trend of PCR-confirmed cases supports the utility of the system in monitoring the COVID-19 epidemiological situation in Japan.