It is unlikely that COVID-19 vaccination reduced US mortality in 2021
The low 2021 mortality in high-vaccination counties, and vice versa, is a correlation, not a causal relationship.
CDC data show that, in 2021, (1) all-cause excess mortality was lower in US counties with high COVID-19 vaccine uptake than in counties with low uptake. The results were consistent, yet weaker, when testing the vaccine effect on (2) COVID-19-related excess mortality and (3) the ratio between COVID-19-related vs. -unrelated excess mortality. As the analyses accounted for alternative explanations, genuine vaccine protection is a likely interpretation. However, effect (3) became non-significant when assuming that 25% of COVID-19-related deaths were misclassified, a conservatively low estimate, as one study has indicated 45%. On the other hand, even assuming no misclassification, or even underreporting, it is unlikely that vaccine uptake by the end of 2020 reduced mortality for the whole population in 2021 as strongly as the estimates seemingly show, as only about 1% of the population had received a dose at that time. Altogether, my analyses suggest that the low 2021 mortality in high-vaccination counties, and vice versa, is a correlation, not a causal relationship.
(What happened in 2022 and 2023 is a different story, which I address later.)
Suppose a small US county experienced 120 deaths in 2021, compared to a baseline of 100 expected deaths. Its all-cause excess mortality rate was then 20% (Note 1). If it reported 12 COVID-19 deaths, COVID-19-related excess mortality was 12%, COVID-19-unrelated excess mortality was 8%, and the ratio between COVID-19-related vs. -unrelated excess mortality was (100+12)/(120-12)=112/108=1.037. Multiplied by 100, the ratio is 103.7 (Note 2).
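The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration of the definitions above, not the code behind the analyses; the function and argument names are hypothetical, and the ratio formula is the one implied by the 112/(120-12) calculation.

```python
def excess_mortality_metrics(expected, observed, covid_deaths):
    """Return the four quantities from the worked example, in percent.

    Hypothetical helper for illustration; not the code behind the tables.
    """
    all_cause = 100 * (observed - expected) / expected
    covid_related = 100 * covid_deaths / expected
    covid_unrelated = all_cause - covid_related
    # Ratio as implied by the example: (baseline + COVID-19 deaths)
    # divided by (all deaths - COVID-19 deaths), times 100.
    ratio = 100 * (expected + covid_deaths) / (observed - covid_deaths)
    return all_cause, covid_related, covid_unrelated, ratio


all_cause, covid_related, covid_unrelated, ratio = excess_mortality_metrics(
    expected=100, observed=120, covid_deaths=12
)
print(all_cause, covid_related, covid_unrelated, round(ratio, 1))
# 20.0 12.0 8.0 103.7
```

A ratio above 100 means COVID-19-related excess mortality exceeds the COVID-19-unrelated part; a ratio below 100 means the opposite.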
In the following, I apply the concepts explained here to investigate whether COVID-19 vaccination had a genuine impact on 2021 mortality rates in US counties. All data are from the US Centers for Disease Control and Prevention (CDC) and are publicly available (Note 3).
All-cause excess mortality
Table 1 presents regression models of the association between US county-level per capita COVID-19 vaccine uptake by the end of March (Model 1), June (Model 2), and September 2021 (Model 3), respectively, as the independent variables (Note 4), and all-cause excess mortality in 2021 as the dependent variable. All models and analyses were weighted for counties’ population size (Note 5). I included the lagged dependent variable, i.e., all-cause excess mortality in 2020 (DV20), as a control. The reason is that this approach “provides a simple way to account for historical factors that cause current differences in the dependent variable that are difficult to account for in other ways” (Note 6), according to Wooldridge, probably among the most authoritative voices in econometrics today.
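To make the model structure concrete, here is a minimal sketch of a population-weighted regression with a lagged dependent variable, written in plain Python. It is not the Stata code behind Table 1: it computes point estimates only (no robust standard errors), the function name is hypothetical, and the five "counties" are made-up toy numbers constructed so that the true coefficients are known.

```python
def weighted_least_squares(rows, weights):
    """Solve the weighted normal equations (X'WX) b = X'Wy.

    Each row is (y, x1, x2, ...); an intercept is added automatically.
    Hypothetical helper for illustration only.
    """
    k = len(rows[0])  # intercept + (k - 1) regressors
    # Augmented matrix [X'WX | X'Wy].
    a = [[0.0] * (k + 1) for _ in range(k)]
    for (y, *xs), w in zip(rows, weights):
        x = [1.0, *xs]
        for i in range(k):
            for j in range(k):
                a[i][j] += w * x[i] * x[j]
            a[i][k] += w * x[i] * y
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Toy data (NOT CDC data): (excess mortality 2021, doses per 100 by March,
# excess mortality 2020), built as y = 100 - 0.2 * uptake + 0.3 * dv20.
toy_counties = [
    (128.5, 30.0, 115.0),
    (124.0, 45.0, 110.0),
    (126.0, 50.0, 120.0),
    (123.9, 38.0, 105.0),
    (121.6, 60.0, 112.0),
]
toy_population_thousands = [120, 250, 40, 80, 90]  # regression weights

coefs = weighted_least_squares(toy_counties, toy_population_thousands)
print([round(b, 3) for b in coefs])  # approximately [100.0, -0.2, 0.3]
```

The second coefficient plays the role of the vaccine-uptake estimate and the third that of the lagged dependent variable (DV20). Because the toy outcome fits the model exactly, the weights do not change the estimates here; with real data they would.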
Table 1. Regressions weighted for counties’ 2021 population size with robust standard errors. Dependent variable is 2021 all-cause excess mortality.
Model 1 shows a significant negative association between per capita vaccine uptake by the end of March 2021 and all-cause excess mortality that year (Note 5). I.e., the higher a county’s vaccine uptake by the end of March 2021, the lower its all-cause excess mortality that year. The model further reports that, by the end of March, the average per capita vaccine uptake was 41.0 doses per 100 people among the 2,817 counties included, with a total population exceeding 300 million (Note 7).
Using the average per capita vaccine uptake (41.0 doses per 100 people) as the input value in Stata’s margins command returned a value of 119.6. I.e., average vaccine uptake by March corresponded with 19.6% all-cause excess mortality. The same exercise, with zero vaccine uptake as the input value, returned a value of 127.7. I.e., zero vaccine uptake by March corresponded with 27.7% excess mortality (Note 8). These numbers indicate that counties with average vaccine uptake, corresponding to the overall vaccine uptake, had 6.34% lower all-cause mortality than counties with zero vaccine uptake ((127.7-119.6)/127.7=.0634).
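The 6.34% figure follows from the two predicted values alone. A minimal sketch of that calculation, with a hypothetical function name:

```python
def relative_reduction(reference, value):
    """Percentage by which `value` lies below `reference`."""
    return (reference - value) / reference * 100


predicted_zero_uptake = 127.7     # margins prediction at 0 doses per 100
predicted_average_uptake = 119.6  # margins prediction at 41.0 doses per 100

reduction = relative_reduction(predicted_zero_uptake, predicted_average_uptake)
print(round(reduction, 2))  # 6.34
```

Note that the reduction is expressed relative to total mortality at zero uptake (127.7), not relative to the excess part alone (27.7).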
The results reported in Models 2 and 3 are similar to those in Model 1. I.e., the higher a county’s vaccine uptake by the end of June and September 2021, the lower its all-cause excess mortality that year (Note 9).
COVID-19-related excess mortality
Intuitively, the COVID-19 vaccine cannot prevent deaths not related to COVID-19. Therefore, I replicated the analyses, applying COVID-19-related excess mortality as the dependent variable (instead of all-cause excess mortality), and report the results in Table 2. Statistically, the conclusion was unchanged. I.e., the higher (lower) a county’s vaccine uptake, the lower (higher) its 2021 COVID-19-related excess mortality. However, the effects were weaker compared to all-cause excess mortality (Note 10).
Table 2. Regressions weighted for counties’ 2021 population size with robust standard errors. Dependent variable is 2021 COVID-19-related excess mortality.
The ratio between COVID-19-related vs. -unrelated excess mortality
Assuming a county experienced high COVID-19-related excess mortality, low vaccine uptake is an unlikely explanation if the county also experienced high COVID-19-unrelated excess mortality. The reason is that vaccine uptake cannot reduce COVID-19-unrelated excess mortality. Thus, factors other than low vaccine uptake likely explain both the high COVID-19-related and the high -unrelated excess mortality.
On the other hand, assuming a county experienced high COVID-19-related excess mortality, low vaccine uptake is a plausible explanation if the county experienced low, absent, or even negative COVID-19-unrelated excess mortality. The reason is that the discrepancy has no obvious alternative explanation.
To address the issue, I modeled a variable taking the ratio between counties’ COVID-19-related vs. -unrelated excess mortality (cf. my introductory example where it took a value of 103.7). Next, I replicated the analyses using this dependent variable (instead of those applied in previous tables), and report the results in Table 3.
Table 3. Regressions weighted for counties’ 2021 population size with robust standard errors. Dependent variable is the ratio between COVID-19-related vs. -unrelated excess mortality in 2021.
Assuming a genuine vaccine effect, ceteris paribus, one would expect a low ratio in high-vaccination counties, and vice versa. Statistically, according to Table 3, the conclusion remained unchanged compared to the previous tables. I.e., the higher (lower) a county’s vaccine uptake, the lower (higher) its ratio between COVID-19-related vs. -unrelated excess mortality. However, the effects were weaker than those for COVID-19-related excess mortality and weaker still than those for all-cause excess mortality.
Overreporting COVID-19-related deaths?
The validity of the above ratio-concept hinges on correctly classifying COVID-19-related deaths, which has been debated. One study argued that they were underreported, but regions reporting more COVID-19 deaths than excess deaths indicate the opposite.
Moreover, in the US, “COVID-19 deaths included all death certificates in which there was any mention of COVID-19”, which, in my opinion, at least with improved testing capacity in 2021, has led to overreporting rather than underreporting of deaths genuinely caused by COVID-19. An incentive for overreporting is that hospitals were paid more for patients listed as COVID-19 cases. In addition, a recent Greek study found that among “530 in-hospital deaths, registered as COVID-19 deaths, in seven hospitals in Athens during the Omicron wave, 240 (45.28%) were reassessed as not directly attributable to COVID-19.”
Altogether, I conclude that COVID-19-related deaths were overreported rather than underreported. Also, as COVID-19 largely causes deaths in elderly and comorbid people, a share of them would have died from other causes during a given calendar year, even if the death was correctly attributed to the virus infection.
To address the issue, I reestimated the ratio between COVID-19-related vs. -unrelated excess mortality by conservatively assuming that 25% of COVID-19-related deaths were misclassified (Note 11).
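The reclassification can be illustrated with the county from the introductory example (100 expected deaths, 120 observed, 12 classified as COVID-19-related), which matches the example given in Note 11. The sketch below is an assumption about the mechanics, not the original code: function names are hypothetical, and it assumes the adjusted ratio is computed with the same formula as the unadjusted one.

```python
def reclassify(covid_deaths, other_deaths, misclassified_share=0.25):
    """Move a share of COVID-19-classified deaths to the unrelated group."""
    moved = covid_deaths * misclassified_share
    return covid_deaths - moved, other_deaths + moved


def related_vs_unrelated_ratio(expected, covid_deaths, other_deaths):
    """(baseline + COVID-19 deaths) / (deaths not classified as COVID-19) * 100."""
    return 100 * (expected + covid_deaths) / other_deaths


covid, other = reclassify(covid_deaths=12, other_deaths=108)
print(covid, other)  # 9.0 111.0

original_ratio = related_vs_unrelated_ratio(100, 12, 108)
adjusted_ratio = related_vs_unrelated_ratio(100, covid, other)
print(round(original_ratio, 1), round(adjusted_ratio, 1))  # 103.7 98.2
```

Under these assumptions, the reclassification lowers the numerator and raises the denominator at the same time, so the example ratio moves from above 100 to below 100.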
I replicated the analyses using this dependent variable (instead of those applied in previous tables), and the results in Table 4 show non-significant effects. I.e., there were no significant associations between counties’ vaccine uptake and the ratio between COVID-19-related vs. -unrelated excess mortality when conservatively assuming 25% misclassification.
Table 4. Regressions weighted for counties’ 2021 population size with robust standard errors. Dependent variable is the ratio between COVID-19-related vs. -unrelated excess mortality in 2021, assuming that 25% of COVID-19-related deaths are misclassified.
Vaccine uptake by the end of 2020
If one assumes no overreporting, or perhaps even underreporting, of COVID-19-related deaths, the significant results in Tables 1-3 should be considered more valid than the non-significant results in Table 4. Even so, the results in Table 5, where vaccine uptake was measured at the end of December 2020, should be taken into consideration (Note 12).
Table 5. Regression weighted for counties’ 2021 population size with robust standard errors.
The seemingly marked negative effect on 2021 all-cause excess mortality (Model 1), COVID-19-related excess mortality (Model 2), and the ratio between COVID-19-related vs. -unrelated excess mortality (albeit non-significant for this variable, due to the low number of observations) may, in line with Tables 1-3, indicate a genuine vaccine effect (Note 13). However, one should note that average vaccine uptake was only 1.02%, i.e., only a little more than 1% of the population had received their first dose. Such low uptake is unlikely to have reduced, for instance, 2021 all-cause mortality for the whole population by 8.59%, or COVID-19-related excess mortality by 6.08%.
Therefore, I conclude that the low 2021 mortality in high-vaccination counties, and vice versa, is a correlation, not a causal relationship.
Conclusion
CDC data show that, in 2021, (1) all-cause excess mortality was lower in US counties with high COVID-19 vaccine uptake than in counties with low uptake. The results were consistent, yet weaker, when testing the vaccine effect on (2) COVID-19-related excess mortality and (3) the ratio between COVID-19-related vs. -unrelated excess mortality.
As the analyses accounted for alternative explanations, genuine vaccine protection is a likely interpretation. However, effect (3) became non-significant when assuming that 25% of COVID-19-related deaths were misclassified, a conservatively low estimate, as one study has indicated 45%.
On the other hand, even assuming no misclassification, or even underreporting, it is unlikely that vaccine uptake by the end of 2020 reduced mortality for the whole population in 2021 as strongly as the estimates seemingly show, as only about 1% of the population had received a dose at that time.
Altogether, my analyses suggest that the low 2021 mortality in high-vaccination counties, and vice versa, is a correlation, not a causal relationship.
Notes
The baseline for this study is the county-level deaths in 2018 and 2019, divided by the population size in those years. For example, if 100 people died in a county both years and the population was 10,000 both years, the baseline was (100+100)/(10,000+10,000)=.01. If 120 died in 2021 and the population was the same, the county’s 2021 all-cause excess mortality was 20% (.012/.01=1.2).
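A minimal sketch of the baseline calculation in this note, using its own example numbers. The function name is hypothetical; the pooling of 2018 and 2019 follows the formula given above.

```python
def pooled_rate(deaths, populations):
    """Deaths per capita, pooled over several years."""
    return sum(deaths) / sum(populations)


baseline = pooled_rate(deaths=[100, 100], populations=[10_000, 10_000])
rate_2021 = 120 / 10_000
relative_mortality = rate_2021 / baseline
excess_pct = (relative_mortality - 1) * 100

print(round(baseline, 4), round(relative_mortality, 2), round(excess_pct, 1))
# 0.01 1.2 20.0
```

Pooling deaths and population before dividing weights each baseline year by its population, which differs from averaging two yearly rates only when the population changes between the years.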
Regarding validity issues, please refer to my discussion below.
The CDC does not report data if a county’s deaths number between 1 and 9; such cases were coded as missing. A very small number of cases with zero reported deaths were also coded as missing. If data for either 2018 or 2019 were missing, the baseline data for both years were coded as missing. The US has a little over 3,200 counties, and this study has baseline data from 3,096 of them.
It is not straightforward to choose a period for vaccine uptake, as vaccination started at the end of 2020 and was rolled out during 2021. Data from the end of 2020 showed only very low vaccine uptake from a small number of counties. Therefore, in the initial analyses, I decided to use vaccine uptake at the three points in time mentioned. The end of March 2021 captured at least the first dose for the elderly and vulnerable. The end of June captured the rollout among young adults, and the end of September the rollout among young people, as well as booster vaccinations among the elderly and vulnerable. Comparing three different periods is also a good sensitivity check, as it informs us about the consistency or inconsistency in the association between counties’ vaccine uptake at different periods in 2021 and mortality that year. Finally, choosing periods around midyear and three months before and after midyear, respectively, mitigated immortal time bias. I.e., vaccination at a given period does not affect previous mortality, and the issue is more prevalent at the end of the year than earlier. Vaccine data were included from counties reporting positive values on Completeness_pct. To model a proxy for doses per capita, I first summed the number of doses administered in each county across Administered_Dose1_Recip, Series_Complete_Yes (as it typically includes two doses), Booster_Doses, Second_Booster_50Plus, and Bivalent_Booster_5Plus (the last boosters are most relevant for periods later than March 2021). Next, I divided the sum by the population size in 2021 and multiplied it by 100. I.e., per capita vaccine uptake refers to the number of doses administered per 100 people at a given period.
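The doses-per-100 proxy described in this note can be sketched as follows. The column names are the CDC fields listed above; everything else (the function name, the dictionary layout, the toy county, and treating absent booster columns as zero) is an assumption made for illustration, with each column counted once.

```python
DOSE_COLUMNS = (
    "Administered_Dose1_Recip",
    "Series_Complete_Yes",
    "Booster_Doses",
    "Second_Booster_50Plus",
    "Bivalent_Booster_5Plus",
)


def doses_per_100(record, population_2021):
    """Proxy for doses administered per 100 people in one county.

    Returns None when Completeness_pct is missing or not positive,
    mirroring the inclusion rule described in the note.
    """
    if not (record.get("Completeness_pct") or 0) > 0:
        return None
    # Assumption: columns that are absent or empty count as zero doses.
    total_doses = sum(record.get(column) or 0 for column in DOSE_COLUMNS)
    return 100 * total_doses / population_2021


# Toy county (NOT CDC data).
toy_county = {
    "Completeness_pct": 95.0,
    "Administered_Dose1_Recip": 3000,
    "Series_Complete_Yes": 1000,
    "Booster_Doses": 0,
}
print(doses_per_100(toy_county, population_2021=10_000))  # 40.0
```

Because a fully vaccinated person appears in both Administered_Dose1_Recip and Series_Complete_Yes, the sum approximates doses rather than people, and the proxy can exceed 100.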
The models report robust standard errors and two-tailed tests of significance concerning the regression coefficients; † p < .10; * p < .05; ** p < .01; *** p < .001. 95% confidence intervals are reported in parentheses, and values in brackets are log-log transformed estimates. Regarding vaccine uptake, I added the constant 1 before performing the log transformation.
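A small sketch of the transformation applied to vaccine uptake, assuming the natural logarithm (the function name is hypothetical). Adding 1 keeps counties with zero uptake in the sample, since the logarithm of zero is undefined.

```python
import math


def log_uptake(doses_per_100):
    """Natural log of (uptake + 1), so that zero uptake maps to 0."""
    return math.log(doses_per_100 + 1)


print(log_uptake(0.0))              # 0.0
print(round(log_uptake(41.0), 3))   # 3.738, i.e. log(42)
```

In a log-log model, the coefficient is read as an elasticity: the approximate percentage change in the dependent variable for a 1% change in (uptake + 1).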
I.e., factors of which we are unaware that cause differences in all-cause excess mortality in 2020 likely cause similar differences in 2021. For example, an abnormally low (high) number of deaths in a county in 2018 and 2019 leads to a relatively high (low) estimate of excess deaths in the following years (cf. Note 1). Additionally, county-level interventions related to COVID-19 in 2020 may have had lasting effects on mortality over the years. By including all-cause excess mortality in 2020 as a control variable, this approach accordingly controls, at least partly, for other factors we are unaware of. Another way to explain the inclusion of a lagged dependent variable is that it holds all-cause excess mortality in 2020 fixed. That is, the models estimate the vaccine effect on all-cause excess mortality in 2021 while cancelling out differences between counties in 2020, which helps isolate the vaccine effect we are interested in. Not controlling for all-cause excess mortality in 2020 would instead increase noise in the data, as counties with high or low 2020 all-cause excess mortality would likely experience the same in 2021. That would induce overestimations (underestimations) concerning counties with high (low) 2021 values. For a further discussion of the pros of including the lagged dependent variable, and the cons in a few instances, see, for instance, Rönkkö.
The reason we have fewer observations than the 3,096 counties from which we have 2018-2019 baseline data (Note 1) is chiefly missing vaccine data (Note 3). Nonetheless, as the 2,817 counties have a 2021 population of more than 300 million, it is mostly smaller counties that are not included.
Notice that these are the numbers obtained after controlling for the lagged dependent variable, 2020 all-cause excess mortality, which, according to previous arguments (Note 6), is positively associated with the dependent variable, 2021 all-cause excess mortality. In later models, we will observe mostly robust, positive associations.
However, average vaccine uptake, at 89.4 by the end of June and 110.7 by the end of September, was associated with a larger reduction in all-cause mortality than in March, at 8.61% and 10.8%, respectively. A further look at the data shows that average vaccine uptake by the end of all three months corresponded with 19.6% all-cause excess mortality, while the values for zero-uptake counties increased, from 27.7% by the end of March to 30.9% by the end of June and 34.0% by the end of September. A likely reason is that low-vaccination counties increased their uptake after March at a higher rate than high-vaccination counties. The reasoning is consistent with the increasing log-transformed estimate of vaccine uptake from March to September, reported in brackets (cf. Note 5).
Table 2 has fewer observations than Table 1 because some counties’ COVID-19-related deaths were less than 10 and therefore coded as missing (cf. Note 3). However, there is only a slight reduction in population size.
E.g., if the CDC officially classified 12 deaths as COVID-19-related, they were reclassified to 9, and 3 more cases were added to the COVID-19-unrelated deaths.
As the vaccine rollout had just begun, only data from 34 counties were available, covering a population of approximately 46.5 million people.
The comparable log-transformed measures concerning vaccine uptake are much higher than those reported in Tables 1-3.






