Presidential Polls And Party Identification: Debunking The Myths...

Leon Troutsky

So lots of conservatives are spouting nonsense about a supposed liberal "bias" in the polls, claiming that over-sampled Democrats skew the results. Here is why it's nonsense:

First, the following article goes a long way to dispel the myth being touted by conservatives:

Thursday, September 27, 2012

The discussion of the party identification composition of poll samples comes up in every presidential election with which I've been involved. Interested observers often opine that when a given poll shows that Candidate X is ahead, it cannot be correct because there is a higher percentage of voters who identify with Candidate X’s party in the sample than there should be, based on comparison to some previous standard.

There are several reasons why this is a faulty approach to evaluating a poll's results.

Party identification is basically an attitudinal variable, not a stable population parameter. It is designed to vary. This is distinct from demographic variables such as age, gender, race/ethnicity, and education, which are, generally speaking, stable indicators measured by the U.S. Census Bureau. The only issues relating to demographic variables are measurement concerns -- e.g., how the census, which creates the targets, measures ethnicity versus how individual pollsters measure it. But, generally speaking, these are fairly stable targets.

Party identification is not measured by the U.S. Census Bureau, nor are there any other official state or national standards for what party identification "should be" in terms of the percent per party as it relates to the general population.

Many people use the exit polls as a standard. But exit polls use different question wording, a different methodology (in-person interviews at the polling place as opposed to telephone interviews), a different environment (people are asked their party identification just after having voted, which could affect how they answer), and different sampling techniques to determine who is asked the question. So party identification figures as measured by a specific poll aren't easily compared to party identification as measured by an exit poll because of these and other potential issues.

Party identification changes as political tides change. General shifts in the political environment can affect party identification just as they can affect presidential job approval and results of the “Who are you going to vote for?” question.

Here is how Gallup asks party identification: “In politics, as of today, do you consider yourself a Republican, a Democrat, or an independent?”

Note that this question does not ask, “What was your party identification in November 2008?” Nor does it ask, “Are you registered with one party or the other in your state?” Our question uses the words "as of today" and "consider." It is designed to measure fluidity in political self-identification.

We know that party identification moves over time -- sometimes in very short periods of time, just like other political variables. Generally, if there is a political tide toward either of the two major parties, all questions we ask that are of a political nature will move in that direction. This includes the ballot, job approval, and party identification, among others.

So, it would not be surprising to find that if Barack Obama is enjoying a surge in popularity in any given state, that surge will show up on the ballot question, on his job approval measure, and on the measure of party identification. Data showing that Obama is ahead on the ballot in a specific state poll and that Democrats have a higher-than-expected representation on the party identification question are basically just two measures of the same underlying phenomenon.

This doesn’t obviate the possibility that a sample is a “spurt” -- a sample that happens to pick up higher-than-usual support for one candidate or the other for whatever reason. But if it is a spurt, the cause is not “getting too many Democrats/Republicans in the sample.” It is instead a matter of getting too many people who, in response to all political questions, answer in a more Democratic or Republican way.

Basically, if an observer is concerned about a poll’s results, that observer should skip over the party identification question and just look at the ballot directly. In other words, cut to the chase. Don’t bother with party identification sample numbers. Look directly at the ballot.

For example, we know that in Ohio:

  • Obama won by 5 points in 2008
  • Bush won by 2 points in 2004
  • Bush won by 3 points in 2000

Now if a given poll in Ohio in this election shows Obama with a 10-percentage-point lead, one should just ask, “How likely is it that Obama would be ahead by 10 points if he won by five points in 2008?” -- forgetting party identification, which we assume is going to be higher for the Democratic Party if Obama is ahead, anyway. The discussion of the ballot in the context of previous ballots is, in fact, a reasonable discussion. It may be unlikely that Obama will double his margin in 2012 from what occurred in Ohio in 2008. Or maybe not. But the focus should be directly on the ballot, and discussions of reasons why it might be different than one expects should not involve an attempt to explain the results by focusing on changes in party identification -- which is basically a tautological argument.

In Florida:

  • Obama won by 3 points in 2008
  • Bush won by 5 points in 2004
  • Bush won by [much] less than one point in 2000

So, if one sees a poll saying that Obama is leading Romney by nine points in Florida, then one should ask how likely it is that Obama will exceed his 2008 margin by six points. That is a reasonable discussion. But one need not attempt to say that the nine-point lead in the poll is suspect because there were too many Democrats and not enough Republicans in the sample compared to 2008. The finding of differences in party identification is, instead, simply reflecting what one sees on the ballot.

Essentially, it is much more direct to just focus on the trends and comparisons of the ballot question than it is to introduce an extraneous look at trends in party identification.

I’ve been analyzing election surveys at Gallup since the 1992 presidential election, and I don’t personally put a great deal of stock in survey-to-survey variations in party identification. All of our weighting focus is on the effort to bring more solid demographic variables into alignment with census figures -- including in recent years cell phone and landline phone use. We don't find that party identification is stable enough to be of much use when it comes to comparing sample-to-sample variations, or sample to exit poll differences.

Now, to explain some of the implications of the above article. The last paragraph is key, especially the part talking about weighting the results according to demographics. What is left unsaid, however, is that correcting the sample based on demographics actually corrects a lot of whatever partisan bias there may be.

This is because party identification is determined in large part by demographics. A black person who grows up in a poor, inner-city, single-parent home is more likely to be a Democrat. Thus, if a poll over-samples low-income, urban, and single-parent households, then it should also have a higher proportion of Democrats as a consequence. When the polling firm corrects the over-sampled demographics, it is SIMULTANEOUSLY correcting whatever partisan sampling issues exist as well.
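
A toy sketch makes the mechanism concrete. Everything below is invented for illustration -- the group names, population targets, and party-ID rates are not any pollster's actual data or method. Each respondent is weighted by population share divided by sample share for a single demographic variable, and the party-ID composition shifts as a side effect even though party ID is never targeted:

```python
# Toy post-stratification weighting on one demographic variable.
# All figures are made up for illustration.

from collections import Counter

# Each respondent: (demographic group, party ID).
# The sample over-represents "urban" respondents: 60% vs a 40% target.
sample = [("urban", "D")] * 45 + [("urban", "R")] * 15 + \
         [("rural", "D")] * 12 + [("rural", "R")] * 28

# Census-style population targets for the demographic variable only --
# note that no target for party ID is used anywhere.
targets = {"urban": 0.40, "rural": 0.60}

def weights(sample, targets):
    """Weight = population share / sample share for the respondent's group."""
    n = len(sample)
    counts = Counter(group for group, _ in sample)
    return [targets[group] / (counts[group] / n) for group, _ in sample]

def weighted_share(sample, w, party):
    """Weighted fraction of the sample identifying with `party`."""
    return sum(wi for (_, p), wi in zip(sample, w) if p == party) / sum(w)

w = weights(sample, targets)
raw_d = sum(1 for _, p in sample if p == "D") / len(sample)
wtd_d = weighted_share(sample, w, "D")

print(f"unweighted Democratic share: {raw_d:.3f}")
print(f"weighted Democratic share:   {wtd_d:.3f}")
```

Here the raw sample is 57% Democratic because urban respondents are over-represented; after weighting the demographics alone back to their targets, the Democratic share falls to 48%. The partisan correction comes along for free.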

This is not reflected in the .pdf write-ups on pollsters' websites, nor is it discussed in the news articles covering the polling results. What gets reported is the weighted marginal (the percentage for Obama/Romney) and a separate result for responses to the party identification question. Thus, not only is it erroneous to assume that there is some "correct" composition of the electorate that each poll "should" have in its sample, it is also erroneous to assume that the reported weighted vote share is seriously biased by a sample with a (supposed) over-sampling of Democrats.

Ultimately, we'll find the truth of this matter on Election Day. The conservative argument is that the vast majority of polls (except Gallup) are severely biased. Thus, the polls show a close race when the outcome will "really" be a blowout Romney win of 5-7%. If Romney doesn't win by such a huge margin, the polls will be right and the people running around screaming about +7% D samples will be proven (once again, in most cases) to be morons who don't know what they're talking about.

Another very good article about Nate Silver's model and the notion of "biased" polls:

The Nate Silver backlash

By Ezra Klein , Updated: October 30, 2012

It seems to happen every few weeks. The race tightens or widens or simply continues on exactly as it’s been, and some pundit or reporter declares it a staggering humiliation for the burgeoning world of election quants.

Niall Ferguson took his turn at bat in early September. “The economy is in the doldrums,” he wrote. “Yet the incumbent is ahead in the polls. According to a huge body of research by political scientists, this is not supposed to happen.”

Actually, according to a huge body of research by political scientists, that was exactly what was supposed to happen.

After the first debate, David Frum stepped up to the plate. “Political science proclaims, ‘debates don’t matter,’ ” he wrote. “After this election, we may need to retire a lot of political science.”

Actually, “political science” never declared debates don’t matter. A political scientist — George Washington University’s John Sides — had reviewed the evidence and written that “when it comes to shifting enough votes to decide the outcome of the election, presidential debates have rarely, if ever, mattered.”

That was true then. It’s true now. It will be true if Mitt Romney wins the election (saying something has “rarely, if ever, happened” does not mean it cannot happen, as a look at the rather unusual weather outside your window will prove). And it will be true if President Obama, as continues to look slightly likelier than not, wins the election.

Which brings me to the backlash against Nate Silver.

Before we get too deep in the weeds here, it’s worth being clear about exactly what Silver’s model — and that’s all it is, a model — is showing. As of this writing, Silver thinks Obama has a 75 percent chance of winning the election. That might seem a bit high, but note that the Betfair markets give him a 67.8 percent chance, the Intrade markets give him a 61.7 percent chance and the Iowa Electronic Markets give him a 61.8 percent chance. And we know from past research that political betting markets are biased toward believing elections are more volatile in their final weeks than they actually are. So Silver’s estimate doesn’t sound so off.

Moreover, Silver’s model is currently estimating that Obama will win 295 electoral votes. That’s eight fewer than predicted by Sam Wang’s state polling meta-analysis and 37 fewer than Drew Linzer’s Votamatic.

So before we deal with anything Silver has specifically said, it’s worth taking in the surrounding landscape: Every major political betting market and every major forecasting tool is predicting an Obama victory right now, and for the same reason: Obama remains ahead in enough states that, unless the polls are systematically wrong, or they undergo a change unlike any we’ve yet seen in the race, Obama will win the election.

There’s no doubt about that. Real Clear Politics, which leans right, shows Romney up by 0.8 percent nationally, but shows Obama up in Ohio, New Hampshire, Iowa, Nevada, Wisconsin, Pennsylvania and Michigan. Romney is up in Florida and North Carolina, but note that his lead in Florida is smaller than Obama’s lead in Ohio. And RCP shows Colorado and Virginia tied. Pollster.com, meanwhile, shows Obama leading by a point in Colorado and Virginia and the race tied in Florida.

It’s important to be clear about this: If Silver’s model is hugely wrong — if all the models are hugely wrong, and the betting markets are hugely wrong — it’s because the polls are wrong. Silver’s model is, at this point, little more than a sophisticated form of poll aggregation.
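
For a feel of what poll aggregation means mechanically, here is a deliberately bare-bones sketch. The poll figures and the weighting scheme are made up for illustration; Silver's actual model is more elaborate and its code is not public:

```python
# A deliberately simple poll aggregator -- a toy illustration of the
# poll-averaging idea, not FiveThirtyEight's actual model.
# All poll figures below are invented.

import math

# (days before the average is taken, sample size, Obama % minus Romney %)
polls = [(1, 800, +2.0), (3, 1200, +1.0), (7, 600, +3.5), (14, 1000, -0.5)]

def aggregate(polls, half_life=7.0):
    """Average poll margins, weighting by sample size and recency.

    Each poll's weight decays exponentially with age (half_life in days)
    and grows with the square root of its sample size, so one large old
    poll cannot swamp several fresh ones.
    """
    num = den = 0.0
    for age, n, margin in polls:
        w = math.sqrt(n) * 0.5 ** (age / half_life)
        num += w * margin
        den += w
    return num / den

print(f"aggregated margin: {aggregate(polls):+.2f}")
```

The aggregate lands between the individual poll margins, pulled toward recent, large samples; any single outlier poll moves it far less than it moves a headline.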

But it’s just as important to be clear about this: If Mitt Romney wins on election day, it doesn’t mean Silver’s model was wrong. After all, the model has been fluctuating between giving Romney a 25 percent and 40 percent chance of winning the election. That’s a pretty good chance! If you told me I had a 35 percent chance of winning a million dollars tomorrow, I’d be excited. And if I won the money, I wouldn’t turn around and tell you your information was wrong. I’d still have no evidence I’d ever had anything more than a 35 percent chance.
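
That point about probability can be checked numerically. The sketch below simulates an event with a genuine 35 percent chance (the probability and trial count are arbitrary illustrative choices): it occurs in roughly a third of runs, so a single occurrence is no evidence that the 35 percent figure was wrong.

```python
# Monte Carlo sketch: an event with a genuine 35% probability still
# happens about one run in three.

import random

random.seed(42)  # fixed seed so the run is reproducible

def simulate(p=0.35, trials=100_000):
    """Fraction of trials in which the p-probability event occurs."""
    return sum(random.random() < p for _ in range(trials)) / trials

freq = simulate()
print(f"event frequency over many runs: {freq:.3f}")  # close to 0.35
```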

There are good criticisms to make of Silver’s model, not the least of which is that, while Silver is almost tediously detailed about what’s going on in the model, he won’t give out the code, and without the code, we can’t say with certainty how the model works. But the model is, at this point, Silver’s livelihood, and so it’s somewhat absurd to assume he’d hand it out to anyone who asks. For better or worse, those aren’t rules we apply in other markets, or even in journalism, where off-the-record conversations inform much that journalists say.

Another criticism is that Silver’s model is so based on polls by the end that it barely actually counts as a model, and Silver is simply doing a savvy job repackaging and commercializing polling data that is out there already. This criticism, of course, says nothing about his accuracy.

Then there’s the argument that Silver adds unnecessary factors, though note that models without those factors, like Wang’s, are showing an even more lopsided win for the president.

But the good arguments over Silver’s model are being overwhelmed by very bad ones.

First, there are the conservatives who don’t like Silver’s model because, well, they don’t like it. Obama’s continued strong showing is prima facie evidence of bias. Or, to put it slightly differently, the model must be skewed.

The answer to this is simple enough: If Silver’s model is systematically biased, there’s a market opportunity for anyone who wants to build a better model. That person would stand to gain hugely if they outpredicted punditry’s reigning forecaster (not to mention all the betting markets and all the other forecasters). The math behind what Silver is doing isn’t that complicated and the polls are easily available. But so far, the most popular conservative take on the polls was UnskewedPolls.com, to which … LOL. If Silver’s model is so easy to best, then what’s the market failure keeping a less-biased source from besting it?

Then there’s the backlash from more traditional media figures. Some of the arguments here have been downright weird, as when Politico’s Josh Gerstein wrote, “Isn’t the basic problem with the Nate Silver prediction in question, and the critique, that it puts a percentage on a one-off event?” Or when Politico’s Jonathan Martin wrote, “Avert your gaze, liberals: Nate Silver admits he’s simply averaging public polls and there is no secret sauce.” Or when Politico’s Dylan Byers wrote, “So should Mitt Romney win on Nov. 6, it’s difficult to see how people can continue to put faith in the predictions of someone who has never given that candidate anything higher than a 41 percent chance of winning.”

Come to think of it, a lot of the odder critiques of Silver have been coming out of Politico. But that makes a kind of sense. Silver’s work poses a threat to more traditional — and, in particular, to more excitable — forms of political punditry and horse-race journalism.

If you had to distill the work of a political pundit down to a single question, you’d have to pick the perennial “who will win the election?” During election years, that’s the question at the base of most careers in punditry, almost all cable news appearances, and most A1 news articles. Traditionally, we’ve answered that question by drawing on some combination of experience, intuition, reporting and polls. Now Silver — and Silver’s imitators and political scientists — are taking that question away from us. It would be shocking if the profession didn’t try and defend itself.

More recently, we in the media — and particularly we in the media at Politico — have tried to grab an edge in the race for Web traffic by hyping our election stories far beyond their actual importance. The latest gaffe is always a possible turning point, the momentum is always swinging wildly, the race is endlessly up in the air. It thus presents a bit of a problem for us if our readers then turn to sites like Silver’s and find that none of this actually appears to be true and a clear-eyed look at the data shows a fairly stable race over long periods of time.

My guess is Silver and his successors will win this one, if only because, for all the very real shortcomings of models, election forecasters have better incentives than homepage editors. For instance, note that all these attacks on Silver take, as their starting point, Silver’s continuously updated prediction for the presidential election, which includes point estimates for the popular vote and electoral college, and his predictions for the Senate races. Those predictions let readers check Silver’s track record and they force Silver, if he wants to keep his readers’ trust, to make his model as accurate as he can. That’s a good incentive structure — certainly a better one than much of the rest of the media has — and my guess is his results, over time, will prove it.

© The Washington Post Company
