Last week, Public Policy Polling revealed it had conducted a poll in Colorado Senate District 3 prior to the recall election on Sept. 10, but had not released the results before the election. Now, post-election, PPP was claiming that its results were surprisingly accurate, showing Democrat Angela Giron losing the recall by 12 percentage points (54% saying yes to the recall, 42% saying no) – almost the exact margin of the actual vote.
The announcement produced a barrage of criticisms, but I was surprised that not one of the critics asked the first question that came to my mind: Did PPP also poll in the other district where there was a recall? That was District 11, where the recall focused on Colorado Senate President John Morse. If PPP had conducted a poll there also, why weren’t we hearing anything about those results?
If PPP did not conduct a pre-election poll in what was clearly the more significant district, why not? What prompted them to go into District 3, to see whether the results they got would “make sense,” and not do the same in District 11 where the stakes seemed even higher?
In either case, there appears to be more to the story than PPP’s withholding information in just the one district.
(I left a phone message with PPP’s Tom Jensen, and will post an update when he gets back to me.)
Jensen explained in his post that he hadn’t published the results before the election because:
“In a district that Barack Obama won by almost 20 points I figured there was no way that could be right and made a rare decision not to release the poll. It turns out we should have had more faith in our numbers because she was indeed recalled by 12 points.”
I suppose we could offer congratulations to PPP for being accurate, but that was not the general reaction of its readers, nor of other pundits. Indeed, many – including former New York Times election guru Nate Silver – excoriated PPP for withholding its results simply because they were not what the researchers at PPP expected. The decision was particularly questionable, since PPP is widely known as a “left-leaning” polling organization, so there is the suspicion that PPP withheld its results for partisan reasons.
Huffington Post subsequently reported an analysis by the New Republic’s Nate Cohn arguing that PPP’s polling methodology more generally is “unscientific and unsettling,” mostly because of PPP’s ad hoc approach to weighting the results. (There is an extended exchange between Cohn and Jensen posted on PPP’s website.)
Most pollsters weight their results after completion of interviewing, in order to make sure the sample is generally representative of the population. If the final sample has more men than women, while the population has more women than men, the program would give more “weight” to female responses to compensate for their under-representation.
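The arithmetic behind that adjustment can be sketched briefly. The numbers below are purely illustrative – they are not from any actual PPP poll – but they show how a group’s weight is simply its population share divided by its sample share:

```python
# Illustrative post-stratification weighting by sex.
# All proportions are made-up numbers, not from any actual poll.

population_share = {"women": 0.52, "men": 0.48}   # known population proportions
sample_share     = {"women": 0.40, "men": 0.60}   # proportions in the raw sample

# Weight for each group = population share / sample share.
# Women are under-represented, so their responses get a weight above 1.
weights = {g: population_share[g] / sample_share[g] for g in population_share}

# Suppose 70% of sampled women and 40% of sampled men answer "yes":
raw_yes = {"women": 0.70, "men": 0.40}

unweighted = sum(sample_share[g] * raw_yes[g] for g in raw_yes)
weighted   = sum(sample_share[g] * weights[g] * raw_yes[g] for g in raw_yes)

print(f"unweighted yes: {unweighted:.1%}")  # 52.0%
print(f"weighted yes:   {weighted:.1%}")    # 55.6%
```

In this toy example the raw sample’s 52% “yes” becomes 55.6% once women’s responses are up-weighted to their true population share – a reminder that the choice of weighting targets can move the headline number by several points.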
In pre-election polling, it’s more difficult to know what the “true” proportions of sub-groups are, because there is never 100% turnout. Thus, the sample may have many more men than women, but we wouldn’t know if that’s because women just didn’t respond as much as men, or because the true electorate — the people who will actually turn out to vote — will include more men than women.
Still, pollsters typically base their estimates of the true electorate on past elections, or they don’t weight at all — relying instead on their interviewing process to come up with the correct proportions. Almost all pollsters, however, will screen their samples to make sure that only the people most likely to vote are counted in the results. Designing questions to determine “likely voters” is itself an art form requiring a great deal of judgment.
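A likely-voter screen can be sketched in the same spirit. The question wording and the cutoff below are hypothetical – PPP’s actual screen is not public in this form – but they illustrate where the judgment call lives: in deciding which self-reported answers count as “likely”:

```python
# Hypothetical likely-voter screen. The response categories and the
# cutoff are illustrative assumptions, not any pollster's actual screen.

respondents = [
    {"id": 1, "vote_likelihood": "almost certain", "choice": "yes"},
    {"id": 2, "vote_likelihood": "probably",       "choice": "no"},
    {"id": 3, "vote_likelihood": "almost certain", "choice": "no"},
    {"id": 4, "vote_likelihood": "will not vote",  "choice": "yes"},
]

# The judgment call: which answers make someone a "likely voter"?
LIKELY = {"almost certain", "probably"}

likely_voters = [r for r in respondents if r["vote_likelihood"] in LIKELY]
print(len(likely_voters), "of", len(respondents), "respondents counted")
```

Move the cutoff (say, drop “probably” from the set) and the counted electorate – and potentially the horse-race number – changes, which is why screen design takes the judgment the paragraph above describes.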
Typically, pollsters determine ahead of time what parameters they will use for their weighting and screening process, based on previous data and the best scientific evidence they have available. Apparently, PPP waits until it sees what the results are before applying its weights, thus adding a layer of personal judgment to its process. The significance of PPP’s ad hoc approach to weighting is that it can “adjust” its results to make sure they are not outliers, compared with other polls. As Cohn says of PPP’s 2012 polling,
“Throughout its seemingly successful run, PPP used amateurish weighting techniques that distorted its samples—embracing a unique, ad hoc philosophy that, time and time again, seemed to save PPP from producing outlying results.”
In reaction to criticism about the post-election release of its results, PPP’s Jensen posted a defense, which in my view raises additional ethical questions. What bothered me the most was the statement,
“In the case of the Giron recall [in District 3], this was the first legislative recall election in Colorado history. There’s been a lot of voter confusion. We decided to do a poll there over the weekend and decide whether to release it publicly depending on whether the results made sense or not.” [No mention made of District 11.]
What this suggests is that if the poll showed the Democrat winning, the results would have “made sense” to him, and thus would have been released.
That is hardly the standard that pollsters should adopt before posting their results.
Unfortunately, that approach does seem consistent with PPP’s flexible and ad hoc weighting scheme, allowing the pollster to weight the data so that it fits the pollster’s a priori expectations – or the results that other polls are showing.
But the whole point of polls is to find out what exactly does make sense about public sentiment. If Jensen believes he has some extra-polling method of determining what the public is thinking, why do the polls?
PPP posted a rejoinder to Cohn’s and others’ criticisms about its methodology:
“15 things PPP got right going it alone
“In the slew of criticism of PPP this week, the most insidious suggestion has been that we ‘copy’ our results off of other pollsters. The other criticisms are basically differences of opinion about our methodology, but this one goes to a whole different level because it implies seriously unethical behavior on our part.
“That attack neglects the fact that we have polled more races that no one else wanted to poll than anyone else in the country over the last 5 years, and generally gotten it right. Here are 15 examples of where we were either the only pollster to look at a race, or the first to pick up on a surprising shift in a contest…”
The truth is, there is no evidence that PPP has ever “copied” its results off other pollsters, or made any adjustments in its weighting scheme for partisan or any other non-objective reasons. David Weigel of Slate was mostly forgiving:
“Let’s give a few whacks to PPP but agree that this doesn’t ruin its credibility. ‘We didn’t think the candidate could possibly be losing this badly’ is an insane reason to [with]hold a poll—that’s a poll people want to read, especially on a race that had no public data at all! But if PPP was trying to cover up the decision, it could have by never admitting that the poll existed. Who was asking? There was no heat; the mistake was answered by transparency.”
I’m not convinced the transparency is complete – the lack of any mention of District 11 still bothers me.
Still, without inside information, it’s difficult to make a judgment. In any case, the whole episode is rather curious: a pollster claiming accuracy for a poll whose results were published only after the election.
More problematic is the firm’s general philosophy – that it would conduct a pre-election poll with the expectation that it would publish the results only if they made sense.
That’s a bit too squishy for me.