December 15, 2021 (updated: May 13, 2022)

New Maps, New Dem Strategy: Our NC Analysis

Updates 3/4/2022: This post has been updated to reflect the new NC maps.

Updates 5/6/2022: We’ve finalized our 2022 model and updated our demographic data with the 2020 American Community Survey. We’ve updated the numbers in this post accordingly.

The 2022 Congressional elections will involve new district maps in all 50 states. Districts will be reshaped, created, and eliminated, which will change Dems’ odds in each race. How can donors figure out which races have the biggest “bang for buck” for flipping tenuous GOP seats or protecting vulnerable Dems?

To help answer that question, we’ve been building and refining a demographic model that predicts Democratic lean in each district based on its makeup in terms of race, sex, education, and population density. We then compare those results to an existing model based on historical data to help Dem donors identify races that we think deserve support for “offense” or “defense”.

In this post, we’re focusing on North Carolina. Here’s what we’ll cover:

Dem-lean by district in NC: our demographic model vs. historical data (updated)
Districts worthy of Dem donor support
Coming next from Blue Ripple
Coda #1: Demographics of the new NC districts
Coda #2: Brief intro to our methods (for non-experts)

1. Dem-lean by district in NC: our demographic model vs. historical data

Our “demographic” model forecasts the potential Democratic lean of each new district in NC based on attributes like race, sex, education, and population density. In the graph and table below, we compare our predictions to a “historical” model (from the excellent Dave’s Redistricting (DR) web-site) built up from precinct-level results in prior elections¹. (See methods at the end of this post for more details.) The axes show the projected 2-party Dem vote share with each model. The diagonal line represents where districts would fall on this scatter-plot if the two models agreed precisely. In districts to the left of the line, our demographic model thinks the D vote share is higher than historical results, and to the right of the line, we think it’s lower than the historical model predicts².

NB: For this and all scatter charts to follow, you can pan & zoom by dragging with the mouse or moving the scroll wheel. To reset the chart, hold shift and click with the mouse.

We generally focus our attention on districts that fall in the 45-55% range of Dem share in our demographic model and 47-53% in the historical model. That’s because we think a 3-point gap is one that either party could potentially close with some focused energy, resources, and strategic thinking. Our demographic model carries some additional uncertainties, so we expand the range a bit there. Our methodology for using our model and the historical data to classify districts is explained in this post.

With that in mind, here are a few observations on the new NC districts:

First, let’s dispense with the obvious ones. Several NC districts are clearly far outside the competitive range and don’t merit serious investment by Dems looking to maximize their impact. Both the demographic model and the historical model agree that NC-2, NC-4, NC-6, NC-12, and NC-14 are safe D. Both models also agree that NC-3, NC-5, NC-8, NC-10 and NC-11 are far out of reach for Dems.
We think the rest are competitive. We see NC-1 as D+4, but the historical model says safe-D (D+5). Both models agree that NC-13 and NC-9 are toss-ups. NC-7 looks barely competitive to us (R+4) and marginally worse historically (R+6).
In sum, we think there are 5 safe D (NC-2, NC-4, NC-6, NC-12, NC-14), 3 competitive (NC-1, NC-9, and NC-13), 1 marginally competitive/safe R (NC-7), and 5 safe-R (NC-3, NC-5, NC-8, NC-10, NC-11³).

Here’s a different look at the data, in a table sorted by the Dem share in our demographic model.

Calculated Dem Vote Share, NC 2022: Demographic Model vs. Historical Model (DR)

State	District	Demographic Model (Blue Ripple)	Historical Model (Dave's Redistricting)	BR Stance
NC	12	66	64	Safe D (No near-term D risk)
NC	2	64	63	Safe D (No near-term D risk)
NC	14	60	56	Safe D (No near-term D risk)
NC	4	57	68	Safe D (No near-term D risk)
NC	6	57	56	Safe D (No near-term D risk)
NC	13	54	51	Toss-up (Down to the Wire)
NC	1	54	55	Becoming At-Risk (More Balanced than Advertised)
NC	9	51	47	Toss-up (Down to the Wire)
NC	7	46	44	Becoming Flippable (More balanced than Advertised)
NC	3	43	38	Safe R (No near-term D hope)
NC	8	43	33	Safe R (No near-term D hope)
NC	5	41	40	Safe R (No near-term D hope)
NC	10	41	31	Safe R (No near-term D hope)
NC	11	36	45	Safe R (No near-term D hope)
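
The competitive ranges quoted earlier (45-55% for the demographic model, 47-53% for the historical model) can be turned into a small classifier. This is only a sketch: the rules below are our guess at how the two ranges might combine into the "BR Stance" labels, the function names are hypothetical, and the actual methodology is the one described in the linked post.

```python
# Hypothetical sketch: these rules are inferred from the ranges quoted in the
# post. They are NOT taken from Blue Ripple's actual classification code.
DEMOGRAPHIC_RANGE = (45, 55)  # competitive band for the demographic model
HISTORICAL_RANGE = (47, 53)   # competitive band for the historical model


def position(share, band):
    """Return 'below', 'in', or 'above' relative to an inclusive band."""
    low, high = band
    if share < low:
        return "below"
    if share > high:
        return "above"
    return "in"


def classify(demographic, historical):
    """Map a pair of Dem vote shares (in percent) to a stance label."""
    d = position(demographic, DEMOGRAPHIC_RANGE)
    h = position(historical, HISTORICAL_RANGE)
    if d == "above" and h == "above":
        return "Safe D"
    if d == "below" and h == "below":
        return "Safe R"
    if d == "in" and h == "in":
        return "Toss-up"
    if d == "in" and h == "above":
        return "Becoming At-Risk"
    if d == "in" and h == "below":
        return "Becoming Flippable"
    return "Unclassified"  # combinations this sketch does not cover


# Rounded shares copied from the table: district -> (demographic, historical)
nc_table = {
    12: (66, 64), 2: (64, 63), 14: (60, 56), 4: (57, 68), 6: (57, 56),
    13: (54, 51), 1: (54, 55), 9: (51, 47), 7: (46, 44), 3: (43, 38),
    8: (43, 33), 5: (41, 40), 10: (41, 31), 11: (36, 45),
}

for district, (dem, hist) in nc_table.items():
    print(f"NC-{district}: {classify(dem, hist)}")
```

Because the table shows rounded shares, districts sitting right on a boundary (such as NC-9 at 47 in the historical model) could land in a different bucket if unrounded numbers were used.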

2. Districts worthy of Dem donor support

Based on the results above, we think there are three good options for Dem donors in NC: NC-1, NC-9, and NC-13.

Our findings in NC-9 provide a good opportunity to discuss why these models may differ. NC-9 looks like a lean-R district given the voting patterns in the precincts within it, but our demographic model suggests it’s a toss-up. What might this mean? One way to answer this question is to consider the difference between the two models. Our demographic model asks: if the voting-age citizens of this district turned out and voted like similar people in other parts of the country, what would we expect the outcome of this election to be? The historical model instead asks how we'd expect the election to turn out if the voters in this district turn out and vote as they have in previous elections. This points to a few possible reasons why a historically lean-R district like NC-9 might look like a toss-up in our model, including but not limited to the following:

Our model may be wrong about how we define "similar" voters. We've incorporated factors like education and race, but maybe we've missed key things that make voters in NC-9 different from superficially "similar" voters in other districts nationwide.
Location-specific factors may have held down Dem voting. E.g., perhaps the Democratic party or local organizations are not particularly well-organized in NC-9, and that weakness is baked into the historical model, which relies on how those voters have actually voted.
Democrats may, in fact, have underperformed relative to their potential. Or there may have been demographic shifts in the district since the last election which favor Dems. It's possible that more effective candidate recruitment, campaigning, or organizing could, in fact, yield a more competitive race than history would suggest.

We don't know which (if any) of these explanations is correct. But our model suggests that if you want to support Team Blue in NC, and are open to the idea of playing some offense, then donating to a Dem in NC-9 might fit the bill.

3. Coming next from Blue Ripple

Here’s where we’re planning to take these analyses over the next few months:

We’re going to do the same type of analysis in many other states, in order to identify the best options for Dem donors in 2022 on both offense and defense nationwide. Here are our takes on Arizona, Michigan, Pennsylvania, and Texas.
We’re going to continue to refine and improve our demographic model, and we’ll update this post and others as we do so. Feel free to contact us if you want more details on the mechanics, or if you’d like to propose changes or improvements.
As maps get solidified, we’ll set up ActBlue donation links for candidates (after the primaries) to make it easy for you to donate.

If you want to stay up-to-date, please sign up for our email updates! We’re also on Twitter, Facebook, and Github.

4. Coda #1: Demographics of the new NC districts

One thing we haven’t seen discussed very much is how redistricting in NC has changed the demographics in each district. As a way of putting the demographic model results in context, let’s look at the underlying population two different ways:

The first chart below shows each of NC’s proposed 2022 districts, with the population broken down by race/ethnicity (Black, Hispanic, Asian, White-non-Hispanic and other) and education (college graduate and non-college graduate). Each bar also has a dot representing the (logarithmic) population density⁴ of the district. The scale for that dot is on the right-side axis of the chart. For reference, a natural-log density of 5 represents about 150 people per square mile and a natural-log density of 8 represents about 3000 people per square mile. We’ve ordered the districts by D-share based on our demographic model, which is helpful for understanding how the model responds to demographics and density.
In the second chart, we look at these demographics a different way, placing each NC district according to its proportion of college graduates and non-white citizens of voting age. We also indicate (logarithmic) population density via the size of the circle and modeled D-edge (D-share minus 50%) via color. This makes it easier to see that the model predicts larger D vote-share as the district becomes more educated, more non-white and more dense.

It’s hard to see anything specific from these charts, though we are continuing to examine them as we try to understand what might be happening in each specific district. Overall, our impression is that this new map in NC has made the safe D districts safer by adding voters-of-color (mostly Black voters but also some Hispanic and Asian voters) and thus made the rest of the districts easier for Republicans to win by removing those same voters from places where they might have made districts competitive.
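
To make the density measure in the charts more concrete, here is a minimal sketch of a population-weighted log density. It assumes we average the natural log of each tract's density, weighted by tract population. Whether the real pipeline averages the log or takes the log of a weighted average is our assumption, and the tract numbers below are invented for illustration.

```python
import math


def unweighted_density(tracts):
    """Total people divided by total area (people per square mile)."""
    total_pop = sum(pop for pop, _ in tracts)
    total_area = sum(area for _, area in tracts)
    return total_pop / total_area


def population_weighted_log_density(tracts):
    """Population-weighted mean of the natural log of tract density.

    Assumption: tracts with zero population are skipped, since they
    carry no weight and their log density is undefined.
    """
    populated = [(pop, area) for pop, area in tracts if pop > 0]
    total_pop = sum(pop for pop, _ in populated)
    weighted = sum(pop * math.log(pop / area) for pop, area in populated)
    return weighted / total_pop


# Hypothetical district: a dense city holding 90% of the people,
# plus sprawling exurbs holding the other 10%.
# Each entry is (population, area in square miles).
tracts = [(90_000, 10.0), (10_000, 990.0)]

print("unweighted log density:", math.log(unweighted_density(tracts)))
print("population-weighted log density:",
      population_weighted_log_density(tracts))
```

In this made-up district the unweighted density is 100 people per square mile, while the population-weighted figure is much higher because most residents live in the dense city.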

5. Coda #2: Brief intro to our methods (for non-experts)

This part of the post contains a general summary of the math behind what we’re doing here intended for non-experts. If you want even more technical details, check out the links at the end of this section, visit our Github page, or contact us directly.

Our model is demographic. We use turnout data from the 2020 CPS voter supplement (a self-reported survey); voting and turnout data from the 2020 CES (a validated survey); and election result data from the 2020 presidential, senate and house elections. The survey data from the CPS and CES is broken down by several demographic categories, including sex, education and race/ethnicity.

The election results are trickier to use in the model since we don’t have demographic information paired with turnout or vote choice. What we do know is the overall demographics of the state or house district. So we use the election data to assign a likelihood to the post-stratification of our parameters across the demographics of the relevant region (from the ACS micro-data).

Then we look at the demographics of a particular house or state-legislative district (using tract-level census data from the ACS), break it down into the same categories, and apply our model of turnout and voter preference to estimate the 2-party vote share we expect for a Democratic candidate.
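
The last step, applying modeled turnout and preference to a district's demographic breakdown, can be sketched as follows. The cell labels, counts, and probabilities are all invented for illustration, and the class and function names are hypothetical. The real model estimates these quantities statistically and carries uncertainty, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One demographic category within a district (hypothetical structure)."""
    label: str
    citizens: float        # voting-age citizens in this category
    turnout: float         # modeled probability that a citizen votes
    dem_preference: float  # modeled 2-party probability of voting D


def expected_dem_share(cells):
    """Expected 2-party Dem share: modeled D votes over modeled total votes."""
    votes = sum(c.citizens * c.turnout for c in cells)
    dem_votes = sum(c.citizens * c.turnout * c.dem_preference for c in cells)
    return dem_votes / votes


# Invented numbers for a toy district with three categories.
district = [
    Cell("college grad, non-white", 100_000, 0.70, 0.75),
    Cell("college grad, white", 150_000, 0.75, 0.50),
    Cell("non-college, white", 250_000, 0.55, 0.35),
]

print(f"Expected Dem share: {expected_dem_share(district):.1%}")
```

Note that each category is weighted by its expected number of voters, not by its raw population, so a high-turnout group counts for more than its headcount alone would suggest.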

This is in contrast to what we call the historical model: a standard way to predict “partisan lean” for any district, old or new: break it into precincts with known voting history (usually a combination of recent presidential, senate and governors races) and then aggregate those results to estimate expected results in the district.
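
As a rough illustration of the historical approach, the sketch below sums precinct-level votes and computes the 2-party Dem share, D Votes/(D Votes + R Votes). The precinct data and field names are made up, and this is not how Dave’s Redistricting actually builds its composite, which blends several past elections.

```python
def two_party_dem_share(dem_votes, rep_votes):
    """2-party Dem share: D votes over D plus R votes, ignoring other parties."""
    return dem_votes / (dem_votes + rep_votes)


def district_dem_share(precincts):
    """Aggregate precinct votes into a district-level 2-party Dem share."""
    dem = sum(p["dem"] for p in precincts)
    rep = sum(p["rep"] for p in precincts)
    return two_party_dem_share(dem, rep)


# Hypothetical precincts assigned to one new district.
precincts = [
    {"dem": 600, "rep": 400, "other": 50},
    {"dem": 300, "rep": 700, "other": 20},
]

print(f"Historical 2-party Dem share: {district_dem_share(precincts):.1%}")
```

Votes for other candidates are dropped from the denominator, which is what makes the result comparable to a model that only produces 2-party shares.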

The historical model is likely to be a pretty accurate “predictor” if you think the same people will vote the same way in subsequent elections, regardless of where the district lines lie. So why did we build a demographic model? Three reasons:

We’re interested in places where the history may be misleading, either because of the specific story in a district or because changing politics or demographics may have altered the balance of likely voters.⁵
Our demographic analysis is potentially more useful when the districts are new, since voting history may be less “sticky” there. For example, if I’m a Dem-leaning voter in a strong-D district, I might not have bothered voting much in the past because I figured my vote didn’t matter. But if I now live in a district that’s more competitive in the new map, I might be much more likely to turn out.
We’re less interested in predicting what will happen in each district than in what plausibly could happen if Dems applied resources in the right way, or failed to when the Republicans did. The historical model is backward-looking, whereas our demographic model is forward-looking, which makes them complementary when it comes to strategic thinking.

Two final points. First, when it comes to potential Dem share in each district, we’re continuing to improve and refine our demographic model. The Blue Ripple web-site contains more details on how it works and some prior results of applying a similar model to state legislative districts, something we will also do more of in the near future. Second, for the historical model comparator, we use data from the excellent “Dave’s Redistricting”, which is also the source of our maps for the new districts.

Want to read more from Blue Ripple? Visit our website, sign up for email updates, and follow us on Twitter and Facebook. Folks interested in our data and modeling efforts should also check out our Github page.

¹ One important note about the numbers. Dave’s Redistricting gives estimates of Democratic candidate votes, Republican candidate votes and votes for other candidates. We’ve taken those numbers and computed 2-party vote share for the Democratic candidate, that is, D Votes/(D Votes + R Votes). That makes it comparable with the demographic model, which also produces 2-party vote share.
² We’ve also done this modeling for the old districts and compared that result to the actual 2020 election results. See here.
³ In NC-11, our model prediction is substantially below the historical prediction. We’ve seen this pattern in a few other districts which roughly fit the description “college town,” in this case, Asheville, NC. One hypothesis is that in ignoring the age of voters, our model misses a factor that matters most in places with an unusual concentration of young likely voters. For technical reasons having to do with the way the census distributes tract-level data, it is challenging to get age, education and race/ethnicity information at the same time, and we’ve chosen to keep education and race/ethnicity. There are some ways to extract a probabilistic estimate of age distribution in a tract, and implementing something of that sort is on our 2024 road map.
⁴ We use logarithms here because density varies tremendously over districts, from tens to hundreds of thousands of people per square mile. We use population-weighting because the resulting average more closely expresses the density of where people actually live. For example, consider a district made up of a high-density city where 90% of the population live and then large but low-density exurbs where the other 10% live. Most people in that district live at high density and we want our density to reflect that even though the unweighted average density (people/district size) might be smaller.
⁵ We’re also interested in voter empowerment strategies. In particular, we want to know where, and among whom, extra turnout might make a difference. The historical model is no help here since it does not attempt to figure out who is voting or who they are voting for in a demographically specific way.