An interpretation of COVID-19 in Tokyo using a combination of SIR models

Koichiro MAKI

doi:10.2183/pjab.98.006

Abstract

A year and a half has passed since the outbreak of the COVID-19 pandemic. Mathematical models that can predict the spread of infection are in demand, and many studies have been conducted. In this study, a new interpretation was developed that reproduces the daily positive cases in Tokyo using only a simple SIR model. In addition, the data on the proportion of cases shifting to the delta variant could also be simulated. It is anticipated that this interpretation will be a basis for the development of forecasting methods.

1. Introduction

Over the past year and a half, COVID-19 has caused significant damage to the health of people and economic activities around the world.¹⁾,²⁾ As a countermeasure, technology to forecast the spread of infection using mathematical models is indispensable. The SIR model (susceptible, infected, recovered) has been proposed as a mathematical approach,³⁾–⁵⁾ and further development, such as the SEIR model, has been advanced.⁶⁾–⁸⁾

Despite the simplicity and effectiveness of SIR-based models in predicting a variety of other infectious diseases, researchers in many countries have reported that they fit COVID-19 data poorly.⁹⁾–¹⁵⁾ Therefore, it is necessary to further improve the SIR model and to develop a prediction tool for common infectious diseases, including COVID-19.

In this study, the applicability of the SIR model was scrutinized using data on newly positive individuals in Tokyo. The key point is the condition that the system to which the model is applied must be completely mixed.¹⁶⁾

2. Characteristics of the SIR model and its arguments

In the basic SIR model, the differential equations for the susceptible fraction (S(t)), infected fraction (I(t)) and recovered fraction (R(t)) at each time are as follows:

\begin{equation} \frac{dS}{dt} = -\beta \mathit{SI}, \end{equation}

[1]

\begin{equation} \frac{dI}{dt} = \beta \mathit{SI} - \gamma I, \end{equation}

[2]

\begin{equation} \frac{dR}{dt} = \gamma I, \end{equation}

[3]

where β is the infection rate and γ is the recovery rate. Further, the sum of fractions is equal to 1 for each time:

\begin{equation} S + I + R = 1. \end{equation}

[4]

R0 is the basic reproduction number, as follows:

\begin{equation} \text{R}0 = \frac{\beta}{\gamma}. \end{equation}

[5]

Furthermore, the fraction of new positives (daily positive cases) is denoted by P(t), and since it is the reduction of the susceptible fraction per day, it can be obtained by the following equation:

\begin{equation} P = -\frac{dS}{dt} = \beta \mathit{SI}. \end{equation}

[6]

From Eqs. [1] and [3], a formula for the relationship between S and R is obtained as:

\begin{equation} S = \eta \exp \left(-\frac{\beta}{\gamma}R\right), \end{equation}

[7]

where η is the initial value of S when R = 0, and its corresponding time is the start of the infection, t = 0. Then, η has the form

\begin{equation} \eta = 1 - I(0). \end{equation}

[8]
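Equation [7] follows in one step from the rate equations. As a short worked derivation, using only the symbols already defined above, dividing Eq. [1] by Eq. [3] eliminates both time and I:

\begin{equation*} \frac{dS}{dR} = \frac{-\beta \mathit{SI}}{\gamma I} = -\frac{\beta}{\gamma}S \quad \Longrightarrow \quad \ln S = -\frac{\beta}{\gamma}R + \text{const}. \end{equation*}

Imposing S = η at R = 0 fixes the constant as ln η, which gives Eq. [7]. Equation [8] is then Eq. [4] evaluated at t = 0 with R(0) = 0.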

The solution of the SIR model can be obtained numerically as a relationship between R and time from Eqs. [3], [4] and [7].

If the spread of infection starts with one person, the initial condition (I(0)) in a fractional representation requires:

\begin{equation} I(0) = \frac{1}{N}. \end{equation}

[9]
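The numerical solution mentioned above, R as a function of time from Eqs. [3], [4] and [7] with the initial condition of Eq. [9], might be sketched as follows. This is a minimal illustration rather than the author's actual procedure: the function name `solve_sir`, the fourth-order Runge-Kutta scheme and the step size are assumptions made here, and the parameter values are taken from the examples discussed later in the text.

```python
import math


def solve_sir(r0, n, gamma=0.1, days=400, steps_per_day=10):
    """Integrate dR/dt = gamma * (1 - R - eta * exp(-r0 * R)) with RK4.

    Returns four lists sampled once per day: S, I, R (fractions) and
    P = beta * S * I (daily new-positive fraction, Eq. [6]).
    """
    beta = r0 * gamma        # Eq. [5]
    eta = 1.0 - 1.0 / n      # Eqs. [8] and [9]
    h = 1.0 / steps_per_day  # integration step in days (assumed)

    def drdt(r):
        # Eq. [3] with I = 1 - S - R (Eq. [4]) and S from Eq. [7]
        return gamma * (1.0 - r - eta * math.exp(-r0 * r))

    r = 0.0
    s_out, i_out, r_out, p_out = [], [], [], []
    for _ in range(days + 1):
        s = eta * math.exp(-r0 * r)
        i = 1.0 - s - r
        s_out.append(s)
        i_out.append(i)
        r_out.append(r)
        p_out.append(beta * s * i)
        for _ in range(steps_per_day):  # advance R by one day
            k1 = drdt(r)
            k2 = drdt(r + 0.5 * h * k1)
            k3 = drdt(r + 0.5 * h * k2)
            k4 = drdt(r + h * k3)
            r += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return s_out, i_out, r_out, p_out


S, I, R, P = solve_sir(r0=2.7, n=100_000)
peak_day = max(range(len(I)), key=I.__getitem__)
print(f"peak infected fraction {I[peak_day]:.3f} on day {peak_day}")
```

Solving for R alone keeps the problem one-dimensional; S and I are then recovered algebraically, so S + I + R = 1 holds by construction.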

An important point is found in Eq. [6]: the number of new positive cases (P) is proportional to the number of infected (I). Assigning the same proportionality coefficient (βS) to every infected person is an approximation that uses a constant (β) and the average susceptible fraction. In a region where this approximation by the average value (mean approximation) holds, the calculated values of the theory should correspond closely to the measured values. Furthermore, since the extent of this region is not infinite, it must be closed at the boundary with a region that has a lower average infection rate. If, on the other hand, the average infection rate of the external region is higher, the region is merely one fluctuation within the faster and larger external spread of infection, and the same argument has to be repeated for the larger region. Within a closed region, there should be a saturation effect as the spread of infection progresses. In the mathematical model, this saturation effect is expressed by the factor S in the proportionality coefficient.

In order to determine the population size of the region, it is necessary to identify the characteristics of such a closed system in real data. The characteristics are an exponential rise due to a constant infection rate and an exponential fall due to the saturation effect. Each closed region with these characteristics is a kind of “subcommunity”, and its population can be estimated from the mean approximation model. If the mean approximation is adequate, there is sufficient mixing in the region. How far the region of sufficient mixing extends is therefore important information, which has to be inferred from actual data. Applying the SIR model to an unreasonably large system makes it difficult to match the shape of the data; the model is useful only within a regional range where the recovered and infected are completely mixed. This is the condition of “Complete Mixing”¹⁶⁾ that the SIR model requires. We define the people included in this completely mixed region as a “Basic Community”.
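The exponential rise and fall named above can be read directly off Eq. [2]. As a hedged sketch (the symbol S_∞ for the final susceptible fraction is introduced here for illustration and does not appear in the original), early in the outbreak S ≈ η ≈ 1 and late in the outbreak S ≈ S_∞, so that

\begin{equation*} \frac{dI}{dt} \approx (\beta - \gamma)I \quad (S \approx 1), \qquad \frac{dI}{dt} \approx -(\gamma - \beta S_{\infty})I \quad (S \approx S_{\infty}). \end{equation*}

The first expression is exponential growth at rate β − γ when R0 > 1. The second is exponential decay, because in the SIR model the epidemic ends with βS_∞ < γ.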

Consequently, a larger population region should be an aggregation of these perfectly mixed subcommunities. To be precise, it should be possible to express the shape of the data over a long period of time in large population regions by a linear combination of SIR model solutions. This method allows us to extract the basic information about each community that is required to fit the mathematical model to the data on new positive cases: the size of the population (N), the number of communities present at the same time (m), the infection rate (β), and the time when the infection started (1st day).

According to the above concept, the actual number of daily positive cases in the basic community is represented as

\begin{equation} \psi (t,N,\beta ,\eta) = N \cdot P(t,\beta ,\eta), \end{equation}

[10]

where R0 can be used instead of β, because γ should have the same value for all communities.

Figure 1 shows the time variation of the number of infected people for each parameter set. As the population of the basic community increases, the number of infected people increases proportionally; at R0 = 2.7, 26,000 infected people (hospitalized patients) are estimated for a total community of 100,000 people. For R0 = 2.7, 2.0 and 1.5, the spread of infection rises steeply after about 2, 3 and 6 months, respectively.

Fig. 1.

Time variation of the number of infected people for each R0, N and γ = 0.1.
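The peak height in Fig. 1 can be cross-checked without integrating in time. Dividing Eq. [2] by Eq. [1] gives dI/dS = −1 + 1/(R0·S), so I = 1 − S + ln(S/η)/R0, and the maximum lies at S = 1/R0. The sketch below evaluates the resulting closed form; the function name `peak_infected_fraction` is chosen here for illustration, and η = 1 is assumed as an approximation to 1 − 1/N.

```python
import math


def peak_infected_fraction(r0, eta=1.0):
    """Maximum of I(t) in the SIR model: 1 - (1 + ln(r0 * eta)) / r0.

    Only meaningful when r0 * eta > 1, i.e. when an outbreak occurs.
    """
    if r0 * eta <= 1.0:
        raise ValueError("no epidemic peak when r0 * eta <= 1")
    return 1.0 - (1.0 + math.log(r0 * eta)) / r0


# Peak number of infected people in a community of 100,000
for r0 in (2.7, 2.0, 1.5):
    print(r0, round(100_000 * peak_infected_fraction(r0)))
```

For R0 = 2.7 this gives a peak fraction of roughly 0.26, in line with the 26,000 infected people per 100,000 quoted in the text.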

Figure 2 shows the profile of new positive cases. The maximum number of new positive cases in a community with a population of 100,000 at R0 = 2.7, 2.0 and 1.5 is 3340, 1750 and 660, respectively. For each parameter set, the peak of P occurs about 8 days earlier than the peak of I.

Fig. 2.

Daily positive cases (P) for each parameter.
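The peak of P can also be estimated from the phase-plane relation I(S) = 1 − S + ln(S/η)/R0, which follows from Eqs. [1] and [2]. The following is a rough sketch under the assumptions η = 1 and γ = 0.1 per day; the function name and the bisection search are illustrative choices, not the procedure used for Fig. 2.

```python
import math


def peak_daily_positive_fraction(r0, gamma=0.1, eta=1.0):
    """Maximum of P = beta * S * I(S) over S, found by bisection."""
    beta = r0 * gamma

    def infected(s):
        # I as a function of S along the SIR trajectory
        return 1.0 - s + math.log(s / eta) / r0

    def slope(s):
        # d(S * I(S)) / dS; positive at S = 1/r0, negative at S = eta
        return 1.0 - 2.0 * s + (1.0 + math.log(s / eta)) / r0

    lo, hi = 1.0 / r0, eta
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    s_star = 0.5 * (lo + hi)
    return beta * s_star * infected(s_star)


for r0 in (2.7, 2.0, 1.5):
    print(r0, round(100_000 * peak_daily_positive_fraction(r0)))
```

The maximum of P lies at a larger S than the maximum of I (which is at S = 1/R0), which is why the peak of P precedes the peak of I. Small differences from the values quoted in the text may arise from daily sampling, the initial condition, or the integration scheme.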

3. Result of adaption to daily positive cases in Tokyo

The solution (P) of the SIR model for the i-th basic community is expressed by the following function:

\begin{equation} \psi_{i} = \psi (t_{i},N_{i},\text{R}0_{i},\eta_{i}), \end{equation}

[11]

where t, N, R0 and η are the values for each individual community. Since the spread of an infection in a general community can be represented by a set of basic communities, the overall infection data can be reproduced by a simple sum of the ψ_i:

\begin{equation} \varPsi = \sum_{i}\psi_{i}. \end{equation}

[12]

Figure 3 shows the data for daily positive cases in Tokyo (available from Tokyo Metropolitan Government’s new coronavirus countermeasure website (https://stopcovid19.metro.tokyo.lg.jp/, in Japanese and English) and NHK website (https://www3.nhk.or.jp/news/special/coronavirus/data/pref/tokyo.html, in Japanese)) and the theoretical curve Ψ that reproduces the data. Each parameter that makes up the theoretical curve (Ψ) is shown in Table 1. As a result, Fig. 3 shows a high degree of agreement between the data for Tokyo and the calculated values of the summed SIR model solution.

Fig. 3.

Daily positive cases in Tokyo and theoretical curve Ψ. The SIR solutions (ψ_i) of the basic communities that are summed to form Ψ are also shown in the figure.

Table 1. Parameter set for reproducing the data of daily positive cases in Tokyo. Equation [12] was fitted to the Tokyo data with these parameters (see Fig. 3)

Solution No.	1st day	β (R0 = β/γ)	N	m	Remarks (wave order, variant type)
ψ1	2020/2/5	0.22	6000	1	1st
ψ2	2020/4/24	0.17	5000	5	2nd
ψ3	2020/7/20	0.21	5000	1	none
ψ4	2020/8/10	0.16	5000	9	3rd
ψ5	2020/11/28	0.31	8000	4	3rd
ψ6	2021/1/11	0.18	20000	3	4th
ψ7	2021/5/6	0.2	6500	5	5th
ψ8	2021/5/28	0.235	40000	4	5th, δ (L452R)

In Table 1, the second column, “1st day”, gives the first day of the outbreak, and m is the number of “basic communities” that are simultaneously spreading the infection. The notable point is that the first day of each outbreak falls during the peak period of the previous expansion wave. This indicates that the seeds of the spread of infection a few months later had already been carried over to other basic communities at that time. In the table, solutions ψ3 and ψ4 are not identified as large waves in Japan.

The solutions (P) of ψ₇ and ψ₈ in Table 1 are the two spreads of infection included in the fifth wave. Using the value of R0 in the table and the size of the basic community, the proportion of the delta variant reported by the Tokyo Metropolitan Government (data available from https://www.bousai.metro.tokyo.lg.jp/_res/projects/default_project/_page_/001/015/548/63/20210916_10.pdf (in Japanese)) was reproduced as shown in Fig. 4. The good agreement between Tokyo’s data and our calculations in the time course of the transition to the delta variant suggests that we were able to accurately separate the two infection peaks within the larger peak of the fifth wave.
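To make the roles of “1st day”, N and m concrete, the sketch below assembles a composite curve in the sense of Eq. [12] from the Table 1 parameters. Several points are assumptions made here and are not confirmed by the text: each row is taken to contribute m·N·P(t − 1st day), every community starts from I(0) = 1/N, γ = 0.1 per day, and the end date and the names `daily_positive_fraction` and `composite_curve` are illustrative.

```python
import math
from datetime import date

GAMMA = 0.1  # recovery rate (1/day), as stated in the text

# (1st day, beta, N, m) copied from Table 1
TABLE_1 = [
    (date(2020, 2, 5), 0.22, 6000, 1),
    (date(2020, 4, 24), 0.17, 5000, 5),
    (date(2020, 7, 20), 0.21, 5000, 1),
    (date(2020, 8, 10), 0.16, 5000, 9),
    (date(2020, 11, 28), 0.31, 8000, 4),
    (date(2021, 1, 11), 0.18, 20000, 3),
    (date(2021, 5, 6), 0.2, 6500, 5),
    (date(2021, 5, 28), 0.235, 40000, 4),
]


def daily_positive_fraction(beta, n, days, gamma=GAMMA, steps_per_day=10):
    """Daily samples of P = beta * S * I, integrating R(t) with RK4."""
    r0 = beta / gamma
    eta = 1.0 - 1.0 / n
    h = 1.0 / steps_per_day

    def drdt(r):
        return gamma * (1.0 - r - eta * math.exp(-r0 * r))

    r, out = 0.0, []
    for _ in range(days + 1):
        s = eta * math.exp(-r0 * r)
        i = max(0.0, 1.0 - s - r)  # clamp rounding noise in the late tail
        out.append(beta * s * i)
        for _ in range(steps_per_day):
            k1 = drdt(r)
            k2 = drdt(r + 0.5 * h * k1)
            k3 = drdt(r + 0.5 * h * k2)
            k4 = drdt(r + h * k3)
            r += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return out


def composite_curve(rows, start, end):
    """Sum m * N * P(t - first_day) over all rows (assumed reading of Eq. [12])."""
    total_days = (end - start).days
    psi_total = [0.0] * (total_days + 1)
    for first_day, beta, n, m in rows:
        offset = (first_day - start).days
        if offset < 0:
            raise ValueError("start must not be later than any 1st day")
        if offset > total_days:
            continue  # community starts after the requested window
        p = daily_positive_fraction(beta, n, total_days - offset)
        for k, value in enumerate(p):
            psi_total[offset + k] += m * n * value
    return psi_total


START, END = date(2020, 2, 5), date(2021, 9, 30)  # END is an assumed cutoff
psi_total = composite_curve(TABLE_1, START, END)
peak_index = max(range(len(psi_total)), key=psi_total.__getitem__)
print("largest modeled daily count:", round(psi_total[peak_index]))
```

Because each ψ_i is shifted by its own start date before summation, the waves appear one after another in the composite curve even though every community obeys the same equations.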

Fig. 4.

Proportion of positive cases of the L452R(delta) variant compared to other strains. The staircase data are the actual values of the one-week average in Tokyo. The solid line shows the ratio of the calculated values of the other strains (ψ₇) to the delta strains (ψ₈).
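A comparison of this kind could be computed from two modeled daily series as sketched below. The definition used here, delta share = ψ₈ / (ψ₇ + ψ₈) aggregated over non-overlapping 7-day blocks, is an assumption for illustration; the exact ratio and averaging used for Fig. 4 are not specified in the text. The input numbers in the example are made up and are not Tokyo data.

```python
def weekly_delta_share(psi_other, psi_delta):
    """Share of delta cases per 7-day block: sum(delta) / sum(other + delta).

    Blocks with no cases at all yield None. Incomplete trailing blocks
    are dropped.
    """
    n_days = min(len(psi_other), len(psi_delta))
    shares = []
    for start in range(0, n_days - 6, 7):
        other = sum(psi_other[start:start + 7])
        delta = sum(psi_delta[start:start + 7])
        total = other + delta
        shares.append(delta / total if total > 0 else None)
    return shares


# Synthetic two-week example (hypothetical values)
other_cases = [3] * 7 + [1] * 7
delta_cases = [1] * 7 + [3] * 7
print(weekly_delta_share(other_cases, delta_cases))
```

Aggregating counts before taking the ratio avoids giving undue weight to days with very few cases.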

Based on the Japanese Ministry of Health, Labour and Welfare’s report (Guideline for Medical Treatment of New Coronavirus Infection (COVID-19), edition 5.3 (2021), available from https://www.mhlw.go.jp/content/000825966.pdf (in Japanese)) that the infectivity of COVID-19 lasts for 7 to 10 days, the recovery rate was set to γ = 0.1 (1/day) in the calculation of these SIR model solutions.
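The link between the infectious period and γ can be stated explicitly. In the SIR model with a constant recovery rate, Eq. [3] implies that the mean time spent in the infected state is 1/γ, so the chosen value corresponds to

\begin{equation*} \frac{1}{\gamma} = \frac{1}{0.1\ \text{day}^{-1}} = 10\ \text{days}, \end{equation*}

which is the upper end of the reported range of 7 to 10 days. By the same relation, a 7-day period would correspond to γ ≈ 0.14 per day.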

4. Discussion

The targets of analysis reported so far seem to be large and complex infectious situations, such as whole countries or local government areas.⁹⁾,¹⁷⁾ In such a large system, the SIR model cannot be fitted without a convincing mean approximation. The nonconformity of the SIR model reported in Italy,¹³⁾ China¹⁰⁾,¹⁴⁾ and Iran⁹⁾ is attributed to the inability to represent the variation in details due to the large population, N. In order for the SIR model to decompose the data, some of the features of the model need to be exposed in the shape of the data, and such features are easier to find in small communities than in large ones.

The overall spread of an infection, even in a large system, is explainable by a composite model that combines several basic communities. For example, a mathematical model of Japan as a whole could be created by superimposing the surrogate functions of the SIR solution constructed for individual prefectures.

The parameter set given in Table 1 is one of the solutions of the mathematical model for understanding the infection situation in Tokyo, a city with a population of 14 million. The communities that actually caused the spread of infection are classified into eight groups (ψ₁ to ψ₈) by start time. It is estimated from the corresponding peaks that they contain tens or hundreds of thousands of people. The members forming a basic community are connected by face-to-face relationships in human behavior; the term does not imply a community separated by physical space. Although such a community is constantly recombining, it is treated in the mathematical model of infection as if it had always existed. In this picture, Tokyo appears as follows: many basic communities with high infection rates are scattered within a huge community with a low infection rate that extends everywhere, and several neighboring communities with high infection rates were infected in a chain reaction. It is as if piles of dead grass dotting a lawn were ignited one after another.

The fact that the first day of infection is close to the preceding peak suggests that the probability of “the fire jumping to the next pile” is highest when the previous pile is burning most intensely. As found in Fig. 2, the time to reach the peak is three to four months for R0 = 2.0, and the half-width of the peak is within two months. With this feature, the spread of the infection is discrete, like a series of waves that are triggered sequentially.

One basic community is sufficient to explain the fifth wave; nevertheless, two communities, ψ₇ and ψ₈, are used in order to explain the data on the delta strain (L452R). Thus, a spread of infection that appears to be one basic community sometimes turns out to be a combination of several communities as more detailed data are acquired.

With regard to the peak of the fifth wave, which produced a large number of infections, the infection rate (β = 0.235) is not much different from that of the previous peak, so the conspicuous difference is the large population of each basic community.

The basic community is a closed system that is instantaneously affected by recovered and infected people under completely mixed conditions. Complete mixing is required at least until the peak of the infection is reached, since that is sufficient for the peak to be formed by the saturation effect. Within this closed system, the number of infections rises exponentially, reaches a maximum value, and then drops off exponentially. If the population of the basic community is even larger, the time dependence of the number of infected people will exhibit steeper ups and downs, even for the same infection rate. The mathematical model fits the data on the number of infections in Tokyo so closely because of the simplicity of the model, which consists of a few isolated, large, well-formed waves.

Ultimately, any analysis of the actual situation of an epidemic should focus on elucidating the underlying basic communities and their trends. It will then be possible to forecast infections, and to avoid diffusion of the disease among communities as a preventive measure.

Notes

Edited by Toshimitsu YAMAZAKI, M.J.A.

Correspondence should be addressed to: K. Maki, Sasazuka 2-5-2-806, Shiroi, Chiba 270-1426, Japan (e-mail: makik@pri.nir.jp).

References

1) Sohrabi, C., Alsafi, Z., O’Neill, N., Khan, M., Kerwan, A., Al-Jabir, A. et al. (2020) World Health Organization declares global emergency: a review of the 2019 novel coronavirus (COVID-19). Int. J. Surg. 76, 71–76.
2) Wang, L.-S., Wang, Y.-R., Ye, D.-W. and Liu, Q.-Q. (2020) A review of the 2019 novel coronavirus (COVID-19) based on current evidence. Int. J. Antimicrob. Agents 55, 105948.
3) Kermack, W.O. and McKendrick, A.G. (1927) A contribution to the mathematical theory of epidemics. Proc. R. Soc. Lond. A 115, 700–721.
4) Brauer, F., van den Driessche, P. and Wu, J. (2008) Mathematical Epidemiology. Springer, Berlin.
5) Giuseppe, G. (2020) A simple SIR model with a large set of asymptomatic infectives. arXiv:2003.08720v4 (revised and augmented version).
6) Carcione, J.M., Santos, J.E., Bagaini, C. and Ba, J. (2020) A simulation of a COVID-19 epidemic based on a deterministic SEIR model. Front. Public Health 8, 230.
7) Chowell, G., Sattenspiel, L., Bansal, S. and Viboud, C. (2016) Mathematical models to characterize early epidemic growth: a review. Phys. Life Rev. 18, 66–97.
8) Weinstein, S.J., Holland, M.S., Rogers, K.E. and Barlow, N.S. (2020) Analytic solution of the SEIR epidemic model via asymptotic approximant. Physica D 411, 132633.
9) Moein, S., Nickaeen, N., Roointan, A., Borhani, N., Heidary, Z., Javanmard, S.-H. et al. (2021) Inefficiency of SIR models in forecasting COVID-19 epidemic: a case study of Isfahan. Sci. Rep. 11, 4725.
10) Hu, Z., Ge, Q., Jin, L. and Xiong, M. (2020) Artificial intelligence forecasting of Covid-19 in China. arXiv:2002.07112.
11) Maier, B.F. and Brockmann, D. (2020) Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China. Science 368, 742–746.
12) Postnikov, E.-B. (2020) Estimation of COVID-19 dynamics “on a back-of-envelope”: does the simplest SIR model provide quantitative parameters and predictions? Chaos Solitons Fractals 135, 109841.
13) Giordano, G., Blanchini, F., Bruno, R., Colaneri, P., Filippo, A.-D., Matteo, A.-D. et al. (2020) Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy. Nat. Med. 26, 855–860.
14) Lin, Q., Zhao, S., Gao, D., Lou, Y., Yang, S., Musa, S. et al. (2020) A conceptual model for the coronavirus disease 2019 (COVID-19) outbreak in Wuhan, China with individual reaction and governmental action. Int. J. Infect. Dis. 93, 211–216.
15) Roda, W.C., Varughese, M.B., Han, D. and Li, M.Y. (2021) Why is it difficult to accurately predict the COVID-19 epidemic? Infect. Dis. Model. 5, 271–281.
16) Seno, H. (2012) Reproduction numbers of infectives for a time-discrete epidemic population dynamics model. RIMS Kôkyûroku 1789, 35–45 (in Japanese with English abstract).
17) Postnikov, E.B. (2021) Reproducing country-wide COVID-19 dynamics can require the usage of a set of SIR systems. PeerJ 9, e10679.

Non-standard abbreviation list

SIR: susceptible, infected, recovered
