
On R-naught (R0), Modeling, and Outbreak Potential
Whenever pathogens like H5N1, Ebola, or, more recently, Andes Hantavirus hit the news, social media is flooded with questions about the pathogen's estimated R0, or with speculative claims that because the pathogen has an R0 of, say, 2.2, it is certain to become the next pandemic. I wrote this post to define R0, describe its limitations, and caution against its broad application as a predictor of pandemic risk. I have also seen several posts with SIR, SEIR, or SEIRD models, which, while valuable in context, also have major limitations in their application. Diseases are not always predictable, and a pathogen's behavior cannot simply be distilled down to R0 and compartmental models. This is a fairly lengthy post, but I do think the information is incredibly valuable.
What R0 Actually Means
R0, or the basic reproduction number, is the average number of secondary infections caused by one infected individual in a fully susceptible population under a specific set of conditions. Keep that last phrase in mind: a specific set of conditions.
R0 is hardly a fixed biological constant. It changes with healthcare infrastructure, population density, infection control practices, and behavior modifications, among other factors. A pathogen does not have one single R0.
The Average Can Hide A Lot
Imagine two pathogens, Pathogen A and Pathogen B. Both have an R0 of 2.0.
- In an outbreak of Pathogen A, each infected person infects two others.
- In an outbreak of Pathogen B, nine infected people infect no one, and one infected person infects 20 in a superspreader event.
Both pathogens have an R0 of 2.0, but their real-world behavior is drastically different. This is where the dispersion parameter (k) comes in.
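To make the arithmetic concrete, here is a minimal sketch of the two scenarios above. I am assuming 10 cases for each pathogen so the numbers line up; the variable names are just for illustration.

```python
from statistics import mean, pvariance

# Secondary infections caused by each of 10 cases (illustrative, matching the scenarios above).
pathogen_a = [2] * 10          # every case infects exactly two others
pathogen_b = [0] * 9 + [20]    # nine cases infect no one, one case infects 20

for name, cases in (("Pathogen A", pathogen_a), ("Pathogen B", pathogen_b)):
    # Same mean (the "R0" of the scenario), very different spread around it.
    print(name, "mean:", mean(cases), "variance:", pvariance(cases))
```

The mean is identical for both lists, but the variance is zero for Pathogen A and large for Pathogen B. That spread is the part R0 alone does not report.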
What is k?
The dispersion parameter (k) describes how much the number of secondary infections per case varies around the average. In plain terms, R0 tells you the average number of people infected, while k tells you whether that average is shared by most cases or driven by a small number of superspreaders. The smaller k is, the more uneven transmission is. For many emerging pathogens, k is just as informative as R0.
For a given pathogen with an R0 of 2.0:
- If k = 10, most people infect somewhere around 2 others
- If k = 0.1, most people infect nobody, and a few people infect many others
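The two bullets above can be checked with a quick sketch. It assumes secondary infections per case follow a negative binomial distribution with mean R0 and dispersion k, which is a common modeling choice for superspreading but is my assumption here, not something built into the definition of R0. The function names and the "10 or more" cutoff are hypothetical.

```python
import math

def nb_pmf(x, r0, k):
    """Probability that one case causes exactly x secondary infections,
    under a negative binomial with mean r0 and dispersion k (an assumption)."""
    log_p = (
        math.lgamma(x + k) - math.lgamma(k) - math.lgamma(x + 1)
        + k * math.log(k / (k + r0))
        + x * math.log(r0 / (k + r0))
    )
    return math.exp(log_p)

def summarize(r0, k, many=10):
    """Return P(infect nobody) and P(infect `many` or more) for one case."""
    p_zero = nb_pmf(0, r0, k)
    p_many = 1.0 - sum(nb_pmf(x, r0, k) for x in range(many))
    return p_zero, p_many

for k in (10, 0.1):
    p_zero, p_many = summarize(2.0, k)
    print(f"k={k}: P(infect nobody)={p_zero:.2f}, P(infect 10 or more)={p_many:.3f}")
```

Under this assumption, roughly 16% of cases infect nobody when k = 10, compared with roughly 74% when k = 0.1, even though the average is 2.0 in both cases.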
A pathogen can have an R0 greater than 1 and still fail to spark a sustained outbreak. When k is very small, most infected people infect no one, so many introductions fizzle out on their own. When a superspreader event does give an outbreak an explosive start, interventions focused on high-risk groups and settings can tamp down spread much more effectively than they could if transmission were evenly distributed. In short, k describes the heterogeneity of transmission and can paint a much better picture than R0 alone when assessing the risk of broad disease spread.
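Here is a rough sketch of the "R0 above 1 but no outbreak" idea. It treats early spread as a simple branching process with negative binomial offspring (again my assumption, with a hypothetical function name) and estimates the chance that a single introduction dies out on its own by iterating the distribution's generating function from zero.

```python
def extinction_probability(r0, k, iterations=10_000):
    """Chance that a chain started by one case eventually dies out,
    assuming negative binomial offspring with mean r0 and dispersion k."""
    q = 0.0
    for _ in range(iterations):
        # Generating function of the negative binomial, evaluated at q.
        # Iterating from 0 converges to the smallest fixed point.
        q = (1.0 + (r0 / k) * (1.0 - q)) ** (-k)
    return q

for k in (10, 0.1):
    print(f"R0=2.0, k={k}: P(single introduction dies out) = {extinction_probability(2.0, k):.2f}")
```

With the same R0, the small-k pathogen is far more likely to fizzle after a single introduction. This ignores interventions, depletion of susceptibles, and everything else about the real world, so treat it as an illustration rather than a forecast.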
Secondary Attack Rate (SAR)
Another measure we consider when assessing the likelihood of disease spread is the secondary attack rate (SAR). SAR is the proportion of susceptible contacts who go on to develop disease after exposure to an index case. A low household SAR suggests that while transmission is possible, it is quite inefficient. Again, there is so much more to consider than just R0.
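As a quick illustration of the calculation, here is a small helper. The function name and the household numbers are made up for the example and do not come from any real outbreak.

```python
def secondary_attack_rate(secondary_cases, susceptible_contacts):
    """Proportion of exposed susceptible contacts who went on to develop disease."""
    if susceptible_contacts <= 0:
        raise ValueError("need at least one susceptible contact")
    if not 0 <= secondary_cases <= susceptible_contacts:
        raise ValueError("secondary cases must be between 0 and the number of contacts")
    return secondary_cases / susceptible_contacts

# Hypothetical numbers: 3 household contacts became ill out of 40 who were exposed.
sar = secondary_attack_rate(3, 40)
print(f"Household SAR: {sar:.1%}")
```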
Compartmental Modeling
S(usceptible), E(xposed), I(nfectious), R(ecovered) [and sometimes D(ead), V(accinated), or S(usceptible) again] models are known as compartmental models. Compartmental models simulate a population moving between defined compartments. In SIR, SIRV, SIRS, SEIR, SEIS, SEIRD, SEIRV, and so on, people move between being susceptible, exposed, infectious, recovered, or whatever other states the model includes.
These models use R0 to set the transmission rate, and as discussed earlier, R0 itself relies heavily on assumptions. These models are fairly inflexible and assume homogeneous mixing, a stable population, and average contact rates. Real populations cannot be compartmentalized that neatly. R0 may differ from one population to the next, k may differ, and so the assumptions we rely on for modeling these scenarios are not as simple as S, I, and R.
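To show how much is baked into the assumptions, here is a bare-bones deterministic SIR sketch using simple Euler steps. The function name, the 5-day infectious period, and the population size are all hypothetical choices for illustration. Notice that the transmission rate is derived directly from R0, everyone mixes evenly, and there is no k anywhere in the model.

```python
def simulate_sir(r0, infectious_days, population, initial_infected=1, days=365, dt=0.1):
    """Basic SIR model with homogeneous mixing and a closed, stable population."""
    gamma = 1.0 / infectious_days   # recovery rate per day
    beta = r0 * gamma               # transmission rate implied by the chosen R0
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    peak_infectious = i
    for _ in range(int(days / dt)):
        # Every infectious person is assumed to contact everyone else equally.
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak_infectious = max(peak_infectious, i)
    return {"susceptible": s, "infectious": i, "recovered": r, "peak_infectious": peak_infectious}

# Hypothetical inputs: R0 of 2.0, 5-day infectious period, population of 100,000.
result = simulate_sir(r0=2.0, infectious_days=5, population=100_000)
print(f"Share ever infected: {result['recovered'] / 100_000:.0%}")
```

Run it as many times as you like and the answer never changes: with these inputs, one case always grows into a large epidemic. That is exactly the limitation, since the model has no way to represent an introduction that fizzles out or an outbreak driven by a handful of superspreaders.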
Bottom Line
I think it is great to see laypeople interested in epidemiology and infectious disease dynamics. I would argue (though I am biased, as it is my field) that it is one of the most interesting fields there is. However, the advent of social media has also empowered people to fearmonger, mislead, misinform, and flat-out lie. I caution you all not to distill these things down to metrics that do not account for real life. Remember, an R0 of 2 can look like a single superspreader event, or it can look like a protracted outbreak. Bundibugyo Ebola and Andes Hantavirus are absolutely things to watch and have some level of concern about, but jumping to the conclusion that they are imminent pandemic-causers based simply on an R0 greater than 1 is not something I would advise. It really is not that simple.
Some further reading that I think helps explain these concepts:
https://pmc.ncbi.nlm.nih.gov/articles/PMC10227392/
https://pmc.ncbi.nlm.nih.gov/articles/PMC7442271/
https://wwwnc.cdc.gov/eid/article/25/1/17-1901_article
https://pmc.ncbi.nlm.nih.gov/articles/PMC3935673/
https://www.sciencedirect.com/science/article/pii/S0010482521004510