How to rig a polling average
GOP-aligned pollsters are propping up Trump in polling averages that don't account for pollster "house effects." This risks misleading the public.
This was a premium post for paying members of Strength In Numbers. However, I think it is important that people who want to can read this article and digest its arguments, so I have removed the paywall. Please support work like this by becoming a paying member. You get lots of great benefits and access to a community of like-minded news nerds.
Polls of President Donald Trump's job approval are giving wildly different results depending on which pollster is conducting the survey. Check this out in our public data portal: Some polls show the difference between Trump's approval rating and disapproval rating to be 10 points positive (54% approve, 44% disapprove), while others give a rating of negative 16 (41% approve, 57% disapprove). That’s a 26-point spread!
Someone is going to be wrong here.
When poll results vary significantly among certain pollsters, we call those differences "pollster effects" or "house effects." House effects are generally moderate in size, about 2 or 3 points for a single proportion (or 4-6 points for a net approval measure, which is how Trump's approval is commonly expressed). A smart average of polls takes house effects into account; if one pollster has previously released a bunch of polls at Trump +10 and then publishes a new +11, your average should discount that raw reading.[1]
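For the technically inclined, here is a minimal sketch of that kind of discounting. It is not the Strength In Numbers model, just an illustration of the idea that a pollster's new reading gets adjusted by its estimated lean before it enters the average (the numbers and function names below are made up):

```python
# Minimal sketch (not the actual Strength In Numbers model): estimate a
# pollster's lean from its past deviations from the field, then discount its
# newest poll by that lean before averaging it in.
import numpy as np

def house_effect(pollster_net, field_net):
    """Average gap between a pollster's past net-approval readings and the
    field-wide average around the same dates."""
    return float(np.mean(np.array(pollster_net) - np.array(field_net)))

# A firm that has been publishing ~+10 while the rest of the field sat near -5:
lean = house_effect([+10, +9, +11], [-5, -4, -6])   # roughly +15 points of lean
new_raw_reading = +11
adjusted_reading = new_raw_reading - lean            # roughly -4 after discounting
```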
But several prominent averages do not apply such house-effect adjustments. As a result, their estimates of Trump’s approval rating have been artificially inflated as more pro-Trump data has entered their models. These models reflect some of that bias back to the public, showing “phantom swings” in the data that are driven by who publishes what, and when.
Approval numbers are cited frequently by the press, so getting the average right is important, public-interest work. In this post, I’m going to show you how we know these firms are biased, how we can control for that, and what happens if you don’t.
The polls where Trump is always popular
Let me start by laying out the topline, descriptive differences in Trump approval polls from right-leaning pollsters and all other polls.
In the chart below, I have calculated three weighted moving averages of Trump's net approval rating (% approve - % disapprove) in each poll. One line (in red) represents the average among polls that are associated with the Republican Party in some way, either because they were commissioned by a right-leaning organization or conducted by a pollster that we here at Strength In Numbers deem partisan.[2] Another (gray) is the average among all other polls that survey only registered or likely voters. The final line (black) is an average of all non-partisan polls.
(I broke the non-partisan polls into a separate registered/likely-voter average because the right-leaning pollsters tend to stick with that population. This makes the lines more directly comparable.)
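If you want a feel for what a weighted moving average like this does, here is a rough, illustrative sketch. The column names and the specific weighting scheme (exponential decay on poll age plus a mild sample-size bonus) are my stand-ins, not the exact Strength In Numbers recipe:

```python
# Illustrative weighted moving average of net approval. Column names
# ("date", "net_approval", "sample_size", "gop_aligned") are hypothetical.
import numpy as np
import pandas as pd

def weighted_moving_average(polls: pd.DataFrame, as_of: pd.Timestamp,
                            halflife_days: float = 10.0) -> float:
    """Average net approval with exponential decay on poll age and a mild
    bonus for larger samples."""
    age = (as_of - polls["date"]).dt.days.clip(lower=0)
    weights = 0.5 ** (age / halflife_days) * np.sqrt(polls["sample_size"])
    return float(np.average(polls["net_approval"], weights=weights))

# Usage (one line of the chart, e.g. the GOP-aligned average):
# gop_line = weighted_moving_average(polls[polls["gop_aligned"]],
#                                    pd.Timestamp("2025-06-30"))
```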
Note two things from this chart. First, there is currently a large difference (about 9 points) between the average of GOP polls and the average of all other polls. This difference is large enough to be statistically significant; it is not a product of randomness or timing of poll releases, etc.
Second, note that this difference is larger now than it was earlier in Trump's term. Before roughly mid-April, there was no statistically significant difference between these two sources of polling (especially after accounting for differences in poll population).
I will dwell on this point for a second: Visually, it looks as though something happened in April to cause GOP-aligned polls to detach from everyone else. People have raised to me the possibility that some pro-Republican pollsters may have adjusted the way their surveys work to produce better numbers for Trump over time, relative to other polls. Technically, we do not know how each poll is conducted, especially for some of the more serious right-wing offenders, so we cannot rule out this possibility.
But more realistically, I believe the methods these pollsters use are simply less prone to picking up changes in opinion (for complicated reasons), and we’ve just been getting more of the most pro-Trump polls over time.[3] I’ll show more on this in the next section.
The bigger point is that what you think of the president's popularity is largely a function of which polling firms you believe. If you live in a right-wing media environment that elevates data generated by GOP pollsters and far-right media outlets, you end up thinking Trump is net popular (or at worst even). But polls from basically everyone else show a much more negative picture for the president — in line with the signals from real-world data like anti-Trump protests and rallies for Democratic leaders.
The pollsters with the biggest pro-Trump house effects
Let’s take a closer look at the pollsters driving this trend.
We have polls marked as partisan from the following 8 pollsters: Quantus Insights, co/efficient, InsiderAdvantage, OnMessage Inc., TIPP Insights, McLaughlin & Associates, North Star Opinion Research, and the Trafalgar Group. Again, “marked as partisan” means the individual poll was conducted for a partisan sponsor or Republican candidate, or by a pollster we consider partisan.
Over the last month, polls from these 8 firms are 9 points more favorable to Trump on net approval than polls from all other pollsters.
As a point of reference, while we do not have any officially Democratic partisan polls in our Trump approval dataset, generally left-leaning pollsters such as Navigator Research have tended to be about 2 points worse for Trump on net approval than polls from all other non-partisan firms.
Let me show you just how large these pollster-level effects are.
In the chart below, I show the actual estimates of several pollsters' house effects from the statistical model generating our public average of Trump's approval rating. Note that these house effects are just for Trump's approval percentage — they show how much higher or lower than the average poll you'd expect the percentage of Americans who approve of Trump to be if the poll came from each of these pollsters.
And because I was curious about the weird mid-April disjunction in the time trends from above, I also show the house effect for each pollster in surveys conducted before versus after April 12th.
In the first line, for example, our model says the percentage of the public that approves of Trump's job as president tends to be about 5 points higher than the average survey in the subset of polls published by InsiderAdvantage. And that house effect may have modestly grown since earlier in Trump's term, from about 3 points before April to 6-7 points today.
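If you want a rough, descriptive approximation of that before/after split, here is one way to sketch it: compare each pollster's approval readings with a weekly baseline built from everyone else, then average the deviations on either side of the cutoff. The real estimates come from the full Bayesian model, not this shortcut, and the column names here are hypothetical:

```python
# Rough descriptive approximation of a pollster's house effect, split before
# and after April 12. Columns ("date", "pollster", "approve") are hypothetical.
import pandas as pd

def split_house_effect(polls: pd.DataFrame, pollster: str,
                       cutoff: str = "2025-04-12") -> pd.Series:
    df = polls.copy()
    df["week"] = df["date"].dt.to_period("W")
    # Weekly baseline from everyone *except* the pollster in question.
    baseline = df[df["pollster"] != pollster].groupby("week")["approve"].mean()
    own = df[df["pollster"] == pollster].copy()
    own["deviation"] = own["approve"] - own["week"].map(baseline)
    own["period"] = (own["date"] >= pd.Timestamp(cutoff)).map(
        {False: "before Apr 12", True: "after Apr 12"})
    return own.groupby("period")["deviation"].mean()
```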
Statistically speaking, all of the 8 pollsters who at least occasionally engage in partisan work have a marked statistical lean toward Trump. This means their data is friendlier to the president than the average poll, all else being equal. And the "all else" here is a lot: Our aggregate is a Bayesian time-series model that also accounts for the methodology a pollster used, its population, whether a poll was released by a partisan firm/conducted for a partisan client, and the overall level of Trump support on a given day.
We also account for non-sampling error, which increases the uncertainty in surveys beyond the traditionally reported margin of sampling error from each pollster. In other words, we're being very generous here and giving pollsters the benefit of the doubt when their data deviates from the average. Yet the pro-Trump pollsters end up with big house effects anyway.
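To make the structure of such an aggregate concrete, here is a toy stand-in. Our actual average is a Bayesian time-series model, not the simple weighted regression below, but the regression captures the same idea: estimate a time trend while soaking up pollster, mode, population, and sponsor effects. All column names are hypothetical:

```python
# Toy stand-in for an aggregate that adjusts for house effects: a weighted
# regression with a time trend plus pollster, mode, population, and sponsor
# terms. Columns ("date", "approve", "mode", "population", "partisan_sponsor",
# "pollster", "sample_size") are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

def fit_average(polls: pd.DataFrame, house_effects: bool = True):
    polls = polls.copy()
    polls["week"] = polls["date"].dt.to_period("W").astype(str)
    rhs = "C(week) + C(mode) + C(population) + partisan_sponsor"
    if house_effects:
        rhs += " + C(pollster)"   # the house-effect adjustment
    model = smf.wls(f"approve ~ {rhs}", data=polls,
                    weights=polls["sample_size"])
    return model.fit()

# Comparing fit_average(polls, house_effects=True) against house_effects=False
# is the same with/without exercise described in the next section.
```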
And the house effects on the other end of the spectrum are also much smaller. I've included rows in that chart for Navigator Research (a firm aligned with Democrats), AP-NORC (one of those polls that uses probability panels, which tend to be worse for Trump for whatever reason — this was also true for Biden, by the way!), and Marist University, which is a non-partisan firm but constantly gets criticized as being Democratic by the GOP pollsters in question here (obviously, those charges carry little weight). I’ve included YouGov, one of the pollsters with the smallest house effects, as a reference. None of the house effect estimates for these comparison pollsters is statistically significant across time.
Even the house effect for Navigator Research is statistically indistinguishable from zero, in contrast to the positive, pro-Trump effects for the GOP-aligned firms publishing the most data.
Averages need house effects
Let’s add this back up to the level of the aggregate. How much does the decision to account for house effects affect our estimate of Trump’s approval, anyway?
To answer this question, I've created a model of Trump's approval rating that doesn't adjust for house effects, and compared it to one that does. You will be able to see that even with a lot of other high-quality data, if you don't take house effects into account, your average ends up getting anchored to these big pro-Trump outliers anyway. This problem is worse the less data you have.
In the chart below, I show you Trump's approval rating in two averages. One model, represented by the dashed line, is the normal one we run here at Strength In Numbers — the fancy Bayesian average with all the adjustments, yada yada yada technical jargon etc. The second average, represented by the solid line, is the same model, but I've taken out the component that adjusts for house effects.
As of Monday morning, June 30 (when I'm writing this newsletter), Trump's approval rating in the full Strength In Numbers average is 44%, and we estimate that 53% of adults disapprove of the job he's doing as president. But in the model without the house-effect adjustment, Trump's approval is 45% and his disapproval is 52%. In net terms, Trump fares a full 2 points better if you don't account for pollster house effects, using the same data, in a model that is otherwise the same.
In statistical terms, this is not so big a difference. After all, who is going to support Trump at a 45% approval but not 44%?
However, 2 points on net can be meaningful narratively. In our normal average, Trump's approval went negative in mid-February. But in the model without house effects, that didn't happen until mid-March. That's 30 days where the White House was able to say the public was on its side, when empirically, it wasn't. Thirty days is a long time in politics.
This two-point difference should also be considered an underestimate. That’s because the no-house-effects model is still taking things like poll mode, population, and extra noise into account. If you don’t do any of that, his net rating creeps up to -4 (46% approve, 50% disapprove).
This exercise shows that a modeler needs to be careful to take into account the source that’s generating a particular piece of polling data, or else they're getting a tilted picture of how the public feels. It's the same adjustment you would make if you were trying to predict how well your kids would do on a test. The grades come from different kids, with different skills and abilities, so you'd want to take that into account. For example, my triplet brothers were good at physics, while I favored history and English. We had “house effects” on standardized tests.
Composition effects and phantom trends
House effects matter not just for getting the level of Trump support right, but for identifying the correct trend. Take the Decision Desk HQ average, for example. DDHQ runs a polling average with a good method for identifying trends (it resembles some of my past work combining splines and weighted averages) — but it doesn't adjust for any factors other than time, sample size, and whether a survey is an internal poll released by a campaign. Quoting their methodology:
If a poll meets the standards set by the American Association of Public Opinion Research (AAPOR) and releases its methodology, field dates, and sample size, we include it in our average. The only factors that affect how DDHQ incorporates a poll into its analysis are its recency and whether it is an internal poll—meaning it was commissioned by a candidate or their campaign committee. Additionally, a minimum of two polls are required for us to publish an average for any race.
DDHQ says this is intentional; they want an average that just captures the data. Again, quoting them:
[Different methodological approaches] are all valid - but they also introduce many subjective decisions about the exact way to treat certain types of data, and these decisions are sometimes made without having enough data to justify one decision over another. […]
Our goal is to provide a straightforward, consumer-facing average that accurately represents the state of available polls while being easy to interpret and not overly sensitive to recent news events.
In my opinion, this is an obvious oversight. They have designed an average that measures the data, but is blind to the thing it is ultimately measuring: the actual state of public opinion.
Think about an extreme scenario where, for some reason, starting tomorrow, only InsiderAdvantage could release polls of Trump's approval rating. The DDHQ average would immediately jump 12-15 points on net toward Trump. Most people who read the average would infer that Trump was "gaining ground" in public opinion. But in reality, he would have gained ground only in the data, and the data would have become completely disconnected from opinion. DDHQ is betting people will realize the difference, but from a decade of making these averages and writing about this for a public audience, I can tell you the average person does not know the difference.
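A small simulation makes the danger obvious. In the sketch below, true net approval is flat at -7, but after day 40 only pollsters with a made-up +12 pro-Trump lean happen to publish, and a recency-only average with no source adjustment registers a phantom swing of roughly 12 points. Every number here is invented for illustration:

```python
# Composition-effect simulation: true opinion stays flat, but a naive
# recency-only average jumps when only pro-Trump-leaning firms publish.
import numpy as np

rng = np.random.default_rng(0)
true_net = -7.0                                    # flat "true" net approval
days = np.arange(60)
# Days 0-39: mixed field (house effects near 0); days 40-59: only +12-leaning firms.
house = np.where(days < 40, rng.normal(0, 2, 60), 12.0)
polls = true_net + house + rng.normal(0, 3, 60)    # observed net approval

def naive_average(day: int, window: int = 14) -> float:
    """Recency-only average: no house-effect or source adjustment."""
    recent = polls[max(0, day - window):day + 1]
    return float(recent.mean())

print(round(naive_average(35), 1))   # close to -7: tracks true opinion
print(round(naive_average(59), 1))   # close to +5: a "phantom" swing
```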
This is not a problem with imaginary consequences. In early June, Trump's net approval rating in the DDHQ average briefly went positive, having gained 9 points from his low in mid-April. Reference this screenshot from their website below:
That’s huge for Trump! At the time, he was sending the Marines to Los Angeles and had just announced an expanded deportation program for non-criminal unauthorized immigrants. How popular those policies must be to have resulted in such a surge!
Bummer for Trump, then, that this movement appears to be almost entirely driven by an onslaught of GOP data. From June 1-5, the DDHQ model ingested an Approve +5 poll from RMG Research conducted for a pro-Trump news website called Napolitan News Service, an Approve +1 poll by Quantus Insights for a right-leaning blog called TrendingPolitics, and an Approve +1 reading from Rasmussen Reports, a right-wing pollster that, in my opinion, has a very questionable commitment to ethics and good-faith measurement of public opinion — and got caught secretly working for the Trump campaign in 2024.[4]
And don't take my word for it. Over the same period, the Trump polling average at the New York Times bottomed out at disapprove +4. And an average of strictly non-partisan data from VoteHub didn't move at all.
Public opinion matters. We should get it right
I don't mean to pick on Decision Desk here. They are smart people (and some employees there are personal friends of mine), and notably not the only ones prone to phantom trends from composition effects in their data. RealClearPolitics notoriously suffers from this as a result of not taking house effects into account, but they also have a lot of other problems that make them an unreliable source for trends in public opinion data.
In summary, this episode should be a lesson (to both aggregators and the people who rely on them) on the importance of taking the source of a data point into account. The reality of the science of measuring beliefs is that public opinion polling is hard, and not all polls are created equal. We know that in our mental models of the real world, and a statistical model of polling data — one that tries to speak for the people — needs to take it into account, too.
What the public thinks is an important force in modern democracies. We owe the people our time and devotion to getting these things right. If all models are wrong, but some are useful, we want to be in the group that’s at the very least useful.
[1] Some people will have their averages register this poll as a 1-point increase from the last poll by that pollster, rather than as a new, raw +11.
[2] We have “deemed partisan” one firm, The Trafalgar Group, because the poll research team at FiveThirtyEight previously caught it releasing surveys conducted for partisan sponsors without disclosing this. The safest assumption we can make is that all the data generated by Trafalgar is done for partisan clients.
[3] Most of the pollsters in question adjust their samples to a fixed benchmark for the share of voters who identify as Democrats, Republicans, or independents, or to match the outcome of the last presidential election (a minimal sketch of this kind of party-ID weighting appears after these footnotes). One perfectly desirable consequence of these adjustments is to minimize survey-to-survey noise caused by changes in the types of people who are taking polls. But many of these firms also use cheap contact methods such as opt-in online marketplaces or robo-calling landlines, which yield very low-quality data. In my own work, I have found that the combination of these two qualities — partisanship weighting and noisy, cheap data from marketplaces — produces toplines that are weirdly stable across surveys. I think this is due to the quotas from the marketplaces or the preference for party in the marginal targets for a raking algorithm. Anyway, just a theory.
[4] The Strength In Numbers average does not include any data published by Rasmussen Reports.
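As promised in footnote [3], here is a hedged, minimal sketch of what weighting a sample to a fixed party-ID benchmark looks like. Real pollster weighting schemes rake on many more variables at once; the targets and sample here are invented:

```python
# Minimal illustration of weighting to a fixed party-ID benchmark, as
# described in footnote [3]. Real pollsters rake on many variables at once;
# this handles party ID alone, with invented targets.
import numpy as np

def party_id_weights(party: np.ndarray, targets: dict) -> np.ndarray:
    """Give each respondent a weight so the weighted party-ID shares match
    the fixed targets (which should sum to 1)."""
    weights = np.empty(len(party), dtype=float)
    for group, target_share in targets.items():
        mask = party == group
        observed_share = mask.sum() / len(party)
        weights[mask] = target_share / observed_share
    return weights

# A sample that came back 45% Democratic gets pulled to a 33/33/34 benchmark:
sample = np.array(["D"] * 45 + ["R"] * 30 + ["I"] * 25)
w = party_id_weights(sample, {"D": 0.33, "R": 0.33, "I": 0.34})
```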