How should we respond to electability polling?

February 05, 2024 9 minute read

Photo by Elliott Stallion on Unsplash

With Indonesia’s presidential election falling on Valentine’s Day this year, pollsters are racing to publish the presidential candidates’ electability statistics. Some argue that the polling results do not reflect reality, some say that the pollsters were paid by one of the candidates, and some simply accept the results. This split in public opinion, together with a vague understanding of what electability polls actually measure, raises the question of how the public should respond to polling results.

This article discusses the issue further.

TL;DR, it’s a very simple question, but answering it requires a complex thought process. For me, we first need to ask further questions: Why do we need electability polling in the first place? Why can’t we entirely trust polling results? When can we trust them? And what should we do about it?

Why do we need electability polling?

This question is a little philosophical, but we can explore a possible answer on a very practical level. Fundamentally, electability polling mostly benefits political parties and presidential candidates, who use it to see where they stand in the race. The same logic applies to legislative candidates running for election. By checking their electability score, candidates can strategically improve their campaign methods. For instance, they can see which age or income groups they dominate and which groups need more campaign coverage, so that they can spend their campaign budget efficiently.

Moreover, a good electability polling result is also worth media coverage, making it a “free”, if not cheap, campaign program for candidates.

Similarly, for the general public, electability polling can help us decide which presidential or legislative candidate to support. For instance, a voter who sees no hope of their preferred presidential candidate winning can switch to another candidate with similar values to make that candidate more competitive. Of course, all of this applies only on the condition that the polling results are trustworthy.

Photo by Bernard Hermant on Unsplash

Why can’t we entirely trust polling results?

Pollsters might produce deceitful polling results either intentionally or unintentionally. In other words, although most pollsters aim to generate a high-quality and accurate survey, there is a ‘chance’ that pollsters can lie through statistics or be reckless with them.

When designing a survey, pollsters control which confidence level and margin of error they use to estimate a candidate’s electability, and slight differences in those choices can lead to noticeably different conclusions.
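To make this concrete, here is a minimal sketch of how the same poll can read differently depending on the confidence level chosen. It assumes a simple random sample and the usual normal approximation for a proportion; the poll numbers (1,200 respondents, 52.6% support in a two-candidate race) and the function name margin_of_error are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat: float, n: int, confidence: float = 0.95) -> float:
    """Normal-approximation margin of error for a sample proportion."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical poll: 1,200 respondents, 52.6% support candidate A.
p_hat, n = 0.526, 1200

for confidence in (0.90, 0.95, 0.99):
    moe = margin_of_error(p_hat, n, confidence)
    low, high = p_hat - moe, p_hat + moe
    verdict = "lead holds" if low > 0.5 else "50% still inside the interval"
    print(f"{confidence:.0%} confidence: {low:.1%} to {high:.1%} ({verdict})")
```

With these made-up numbers, the interval sits entirely above 50% at the 90% level but includes 50% at the 95% and 99% levels, so a headline could honestly say either "ahead" or "too close to call" depending on a choice most readers never see.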

This reminds me of what one of my statistics professors said in class: “Statistics can be the most powerful lying tool we have ever made”. This rings true when we consider how easily researchers can manipulate statistical results, for example by playing with the level of confidence or rerunning an analysis until it yields the desired result (a practice commonly called p-hacking).

Another way to ‘half-lie’, or just be reckless with statistics, is through the framing of the questions. Pollsters can design a questionnaire that leans toward certain answers. One simple example is placing a preferred question or answer option at the top of the list. This can produce the so-called survey “framing effect”.

Furthermore, the sampling method a pollster uses can also result in ‘selection bias’. This bias arises when the randomization condition is not fully achieved. It can happen when the poll is only able to reach certain groups of people rather than the entire survey population, for instance only those with access to email or a phone. In that case, the survey result may lean toward the views of people with those characteristics.
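A small worked example shows how large this effect can be. The sketch below is purely illustrative: the two groups, their shares of the electorate, and their support levels are invented numbers, and the function names are hypothetical.

```python
# Hypothetical electorate, split by whether a phone poll can reach people.
# Every number here is invented for illustration.
population = {
    # group: (share of electorate, support for candidate A)
    "reachable by phone": (0.60, 0.55),
    "not reachable by phone": (0.40, 0.40),
}


def true_support(groups: dict) -> float:
    """Support for candidate A across the whole electorate."""
    return sum(share * support for share, support in groups.values())


def phone_only_estimate(groups: dict) -> float:
    """What a poll measures if it can only sample the reachable group."""
    _, support = groups["reachable by phone"]
    return support


actual = true_support(population)
polled = phone_only_estimate(population)
print(f"True support:      {actual:.1%}")
print(f"Phone-only result: {polled:.1%}")
print(f"Selection bias:    {polled - actual:+.1%}")
```

In this invented case the phone-only poll overstates candidate A by six percentage points, and a larger sample from the same reachable group would not fix it, because the error comes from who is reachable rather than from how many people are asked.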

However, alleviating selection bias is sometimes difficult because of funding limitations. A pollster who cannot fully achieve the randomization condition should include a clear disclaimer in their publication. Acknowledging this limitation and the nature of the survey helps the public interpret the polling results with a grain of salt.

Hacking p-values, designing a poll with an intentional framing effect, and ignoring selection bias are considered violations of the academic code of conduct, and they can produce disinformation in public discourse.

Photo by lilartsy on Unsplash

When can we trust polling results?

For me, there are at least three main considerations when interpreting polling results and judging whether they are trustworthy. The first and most basic is to carefully check the sampling and data collection methods that the pollster used.

The sampling method is crucial, as it determines the individuals or units selected to represent the broader population. A reliable poll should utilize a representative sample that accurately reflects the demographics and characteristics of the target population. The randomization of the sampling process is essential to ensure that every member of the population has an equal chance of being included, minimizing biases and enhancing the sample’s representativeness. Additionally, the mode of data collection, whether through phone calls, online surveys, or in-person interviews, plays a pivotal role, as different demographics may respond differently based on the chosen communication method.

Second, one can reduce the risk of selection bias when interpreting a single poll or survey by aggregating it with results from other pollsters. This, in theory, can lead to more representative results, particularly when the additional polls use different data collection methods.
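As a rough illustration of aggregation, the sketch below pools three polls by weighting each one by its sample size. The pollster names and figures are made up, and the approach assumes the polls are independent simple random samples of the same population taken at about the same time. Real poll aggregators use more elaborate adjustments, and pooling cannot remove a bias that all the polls share.

```python
from math import sqrt

# Hypothetical polls: (pollster, sample size, share for candidate A).
polls = [
    ("Pollster X", 1200, 0.48),
    ("Pollster Y", 800, 0.52),
    ("Pollster Z", 2000, 0.50),
]


def pooled_estimate(polls: list) -> tuple:
    """Sample-size-weighted average share and the combined sample size."""
    total_n = sum(n for _, n, _ in polls)
    pooled = sum(n * share for _, n, share in polls) / total_n
    return pooled, total_n


pooled, total_n = pooled_estimate(polls)
# Approximate 95% margin of error under the assumptions stated above.
moe = 1.96 * sqrt(pooled * (1 - pooled) / total_n)
print(f"Pooled share: {pooled:.1%} +/- {moe:.1%} (n = {total_n})")
```

The pooled figure is less noisy than any single poll because it rests on more respondents, which is the main reason looking at several pollsters together is more informative than reacting to one headline number.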

Third, survey timing matters, because candidate electability is quite volatile over time. It can be affected by particular issues or events. For instance, polling before a candidate debate may differ from polling after it. With updated polling results, presidential candidates might realign their campaign strategy to focus on their weaker points.

Photo by petr sidorov on Unsplash

Why do some polls fail to predict the actual result?

One great example is the 2016 US election between Hillary Clinton and Donald Trump. As discussed by the Pew Research Center, many people were surprised by the actual results, given that the polling had consistently shown Clinton defeating Trump. Most projections put Clinton’s chance of winning at 70% to 99%.

Many were confused by those results, and experts suggested three possible explanations. The first is non-response bias. This occurs when certain groups of people consistently don’t participate in surveys, even when efforts are made to reach everyone. Some hard-to-reach groups, such as less educated Trump supporters, might have avoided polls out of frustration and distrust of institutions. This could have left a pro-Trump segment of the population underrepresented in the survey results relative to its actual share.

The second is popularly known as the “shy Trumper” hypothesis. It suggests that supporting Trump was seen as socially undesirable, so some of his supporters did not want to admit their support in surveys. This concept is similar to the “Bradley effect”, where voters were thought to hide their intention not to vote for a Black candidate, resulting in an unexpected outcome in the 1982 California gubernatorial election.

The last possible explanation concerns how pollsters figure out who will vote. Pollsters build models to estimate who will turn out and what the overall turnout will be. This is tricky, and even small changes in assumptions can shift predictions a lot. It is possible that the actual voters, especially in the Midwest and Rust Belt states that surprised everyone, were not the ones the pollsters expected. The usual models also account for enthusiasm, and a lack of excitement, especially among Democrats in 2016, might have thrown off this part of the measurement.
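The sensitivity to turnout assumptions can be sketched with a toy calculation. Everything below is hypothetical: the two groups, their preferences, and both turnout scenarios are invented numbers, and real likely-voter models are considerably more detailed than this.

```python
# Hypothetical electorate with two equally sized groups.
groups = {
    # group: (share of registered voters, support for candidate A)
    "group 1": (0.50, 0.58),
    "group 2": (0.50, 0.43),
}


def projected_share(groups: dict, turnout: dict) -> float:
    """Candidate A's share among people who actually vote."""
    votes_total = sum(share * turnout[g] for g, (share, _) in groups.items())
    votes_for_a = sum(
        share * turnout[g] * support for g, (share, support) in groups.items()
    )
    return votes_for_a / votes_total


# The pollster's model assumes both groups turn out at the same rate.
assumed_turnout = {"group 1": 0.60, "group 2": 0.60}
# On election day, group 1 is less enthusiastic and group 2 more so.
actual_turnout = {"group 1": 0.52, "group 2": 0.66}

print(f"Projected: {projected_share(groups, assumed_turnout):.1%}")
print(f"Actual:    {projected_share(groups, actual_turnout):.1%}")
```

Nobody changes their mind between the two scenarios. Only the turnout rates move by a few points, yet candidate A goes from a narrow projected win to a narrow loss, which is the kind of miss a turnout model can produce even when the underlying preference data are accurate.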

Interestingly, a later examination of the 2016 electorate showed a stark difference between the characteristics of voters and nonvoters. In short, nonvoters tended to be younger, less educated, less wealthy, and nonwhite compared to those who actually voted. Additionally, nonvoters leaned more toward the Democratic side, which worked to Trump’s advantage in the actual election.

The 2016 US election offers valuable lessons about the challenges of gauging public opinion. The unexpected outcome revealed potential biases, such as non-response bias, where certain groups, like less educated Trump supporters, were underrepresented because they avoided surveys. The “shy Trumper” hypothesis suggests that the reluctance of some Trump supporters to openly express their preferences may also have been hard for pollsters to capture. Finally, the difficulty of accurately predicting voter turnout and enthusiasm levels highlighted the complexity of building models for election forecasting.

Photo by Mika Baumeister on Unsplash

In a nutshell

Conducting accurate electability polling is a persistent challenge for pollsters because of inherent statistical issues such as selection bias, non-response bias, and the difficulty of identifying who will actually turn out. Additionally, results can be skewed, inadvertently or intentionally, by how questions are framed and how the data are visualized. To maintain quality and accuracy, it is crucial for pollsters to present their methodology and disclaimers transparently alongside their results. Including the verbatim questions alongside the statistical outcomes is preferable, so that the public can see exactly what was asked.

As members of the general public, we need to exercise caution when interpreting polling results. Beyond scrutinizing the methodological disclaimers and the data collection process, comparing results from multiple pollsters is strongly recommended to get a more comprehensive understanding of the political landscape.

Cheers,

Ega Kurnia Yazid

I welcome feedback and constructive criticism. To discuss further you can comment on each post, or you can reach me through any social media platforms you can find on my webpage.
