
Modeling for dummies

May 11, 2020


After a few days’ break from the “for dummies” series, I figured it was time to expand our thoughts on the science of pandemics again. We have covered Herd Immunity, Rt, Prevalence, The Mortality, and Your Mortality. You now have the basic science concepts to be able to run a simple pandemic model. But first, the classic quote about models.
                All models are wrong but some are useful.  

                                                                - George Box

Models are a way to make predictions about the future. We keep seeing COVID death models changing all the time and I want to explain why that is happening. Spoiler alert - it’s not because the models were necessarily poorly built. It’s because the modelers are trying to model something that is almost impossible to model (a pandemic of a novel virus in 2020 across 50 states) with both hands tied behind their backs (not enough good data).

But before we get into those scenarios, let’s talk about a couple of domains that we are more familiar with... sports and business. 

In sports, at the beginning of the season, there are always predictions of who will make it to the Super Bowl, the Finals, etc. Even before the first game of the season, sports enthusiasts start talking about whether a team will do better or worse than last year. After each week, those predictions change, the odds of each bet change, and certainly who we think will win the title changes. We would be foolish not to change our predictions based on the performance of the prior week.

The same thing applies to business. Pro formas and forecasts become more and more uncertain the further out you go. There is one thing we know for certain - models are almost certainly wrong the second they are made, especially those that predict an outcome of a constantly changing system with a lot of variables.

I linked to an EMRAP lecture in the 4/11 Letter where Andrea Bertozzi Ph.D., a smart epidemiologist, gave a short primer on pandemic modeling. We are going to do an even simpler model with some easily available data. It will provide insight into how changing just ONE variable wildly changes the model’s outcome. Also, keep in mind that when we hear about the predicted number of deaths from COVID, these predictions usually forecast deaths through a certain date. Most seem to pick about 4 months out. We will instead predict the number of deaths until we reach the desperately desired state of herd immunity (see Herd Immunity for dummies).

We are going to use just three data points:

  • Number of people in the US = approximately 320 million

  • Number of people diagnosed with COVID as of 5/7/20 = approximately 1.28 million 

  • Number of people who have died from COVID as of 5/7/20 = approximately 76k

Knowing these numbers, we can come up with a VERY simple model to estimate the number of deaths, an estimate that will CERTAINLY BE WRONG after the dust settles. In these three scenarios, I will use 80% as the share of the population that needs to have had the disease in order to achieve herd immunity. We have 320m people, so we need 256m to have had COVID to achieve herd immunity. (320m x .8 = 256m)

Let’s do three scenarios just changing one variable… prevalence (see Prevalence for dummies for a refresher)... 

Scenario 1: We know the exact number of cases (again this is certainly wrong but it is important to calculate the easiest application of these concepts first) 

  • Let’s start off with the worst-case scenario where the prevalence is 0.4%. (1.28m/320m = .4% or 1 in 250 people have had the illness)

  • This would give us a case fatality rate (see Mortality of COVID for dummies) of 5.9%. (76k/1280k = 5.9%)

  • So, 256m x .059 = 15.1m deaths. 

  • So, in this prediction, we would estimate that 15,100,000 people will die from COVID before we achieve herd immunity. 

Scenario 2: We are underestimating the number of cases (this is certainly a bit closer to reality)

  • What happens if we use the prevalence number of 5%? So in this scenario, we are significantly underestimating the number of cases. 

  • The case fatality rate goes way down. Right? Because 5% prevalence would mean that we really have 16m cases (not 1.28m). (320m x .05 = 16m or 1 in 20 people have had the illness)

  • Of course, the number of deaths would stay the same at 76k.

  • In this scenario, we have a rough case fatality rate of .475%. (76k/16m = .475%)

  • So, in this model, we would estimate that 1,216,000 people will die from COVID before we achieve herd immunity. (256m x .00475 = 1.216m)

Scenario 3: We are grossly underestimating the number of cases (this is certainly the scenario we are all hoping for)

  • Well, what happens if we use the prevalence number of 25%? So in this scenario, we are grossly underestimating the number of cases. 

  • The case fatality rate plummets. Right? Because 25% prevalence would mean that we really have 80m cases (not 1.28m). (320m x .25 = 80m or 1 in 4 people have had the illness)

  • Again, the number of deaths would stay the same at 76k.

  • We have a rough case fatality rate of .095% (76k/80m = .095%) 

  • We would estimate 243k people will die from COVID before we achieve herd immunity. (256m x .00095 = 243k)
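The arithmetic in the three scenarios can be sketched in a few lines of Python. This is only a restatement of the steps above, not a real epidemiological model. The population, death count, and 80% herd immunity figure come from this letter; the function and variable names are made up for illustration.

```python
# Inputs taken from the letter (approximate figures as of 5/7/20).
US_POPULATION = 320_000_000
DEATHS_TO_DATE = 76_000
HERD_IMMUNITY_SHARE = 0.80  # the letter's assumption, not a measured value


def project_deaths(prevalence):
    """Return (estimated cases, fatality rate, projected deaths) for a prevalence."""
    cases = US_POPULATION * prevalence          # how many people have had it so far
    fatality_rate = DEATHS_TO_DATE / cases      # deaths so far divided by cases so far
    infections_needed = US_POPULATION * HERD_IMMUNITY_SHARE
    return cases, fatality_rate, infections_needed * fatality_rate


scenarios = [("Scenario 1", 0.004), ("Scenario 2", 0.05), ("Scenario 3", 0.25)]
for label, prevalence in scenarios:
    cases, rate, deaths = project_deaths(prevalence)
    print(f"{label}: prevalence {prevalence:.1%}, cases {cases:,.0f}, "
          f"fatality rate {rate:.3%}, projected deaths {deaths:,.0f}")
```

Scenario 1 comes out near 15.2 million here rather than 15.1 million because the code does not round the fatality rate to 5.9% before multiplying. That small gap is itself a reminder of how touchy these estimates are.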

These scenarios are the simplest models you can do. They do not consider a HUGE number of variables. Such as:

  • Effects of improved therapeutics

  • Effects of a vaccine

  • Effects of the potential seasonality

  • Effects of each state enacting different measures

  • Effects of widespread contact tracing

  • Effects of widespread testing

I bring this model up not to say it is correct but to show how changing just one variable can give you a wildly different outcome (Scenario 1 was 15m deaths and Scenario 3 was 243k deaths). This article traverses some of the nuances of how hard this process is. This article also provides three scenarios.
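To see why one variable swings the result so much, it helps to notice that the population cancels out of the arithmetic: (0.80 x population) x (deaths / (population x prevalence)) is just 0.80 x deaths / prevalence. The sketch below, under the same assumptions as this letter, sweeps a few prevalence values. Only 0.4%, 5%, and 25% appear above; the 1% and 10% values and the function name are added purely for illustration.

```python
DEATHS_TO_DATE = 76_000      # from the letter, as of 5/7/20
HERD_IMMUNITY_SHARE = 0.80   # the letter's assumption


def projected_deaths(prevalence):
    """Projected deaths at herd immunity; the population term cancels out."""
    return HERD_IMMUNITY_SHARE * DEATHS_TO_DATE / prevalence


# Illustrative prevalence values; 1% and 10% are not from the letter.
for prevalence in [0.004, 0.01, 0.05, 0.10, 0.25]:
    deaths = projected_deaths(prevalence)
    print(f"prevalence {prevalence:>5.1%} -> {deaths:>12,.0f} projected deaths")
```

Because the estimate is inversely proportional to prevalence, every doubling of the assumed prevalence cuts the projected deaths in half. That is the whole reason Scenario 1 and Scenario 3 land so far apart.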

Ok, my brain hurts. I’ll bring you part 2 of this thought process in a few days. 

Stay emotionally distant and physically connected, 

Just making sure you are paying attention…

Ok, for real now… stay emotionally connected and physically distant,  


PS: Thank you to the two people that helped make this letter readable and accurate. You know who you are. 

PPS: If you want to laugh a little at the ridiculousness of it all, read this

PPPS: If you want to laugh at the beauty of what we can do, watch this

PPPPS: If you want to see how helping is the natural thing we do, read (second half of the article) how registrations for the MCAT (the medical school entrance exam) quintupled.

PPPPPS: This is a post-script deep cut… my doctor friend who has recovered said... The whole world is simultaneously hoping they don’t get it, yet praying they already had it. That about sums this one up.