Why polls predicted Hillary Clinton to win
Below, we present some supplementary information that we could not include in our research note on the 2016 US presidential election:
- The AAPOR report (PDF) is available at: http://www.aapor.org/getattachment/Education-Resources/Reports/AAPOR-2016-Election-Polling-Report.pdf.aspx
- The predictions by 538: https://projects.fivethirtyeight.com/2016-election-forecast/
- All certified 2016 US election results used here were compiled by David Wasserman and can be found in a spreadsheet: https://docs.google.com/spreadsheets/d/133Eb4qQmOxNvtesw2hdVns073R68EZx4SfCnP4IGQf8
- There are two ways to calculate the number of samples with Trump or Clinton as the winner when samples are drawn from a multinomial distribution (i.e., a distribution over three outcomes, with probabilities PT for Trump, PC for Clinton and Po for others). The first is to approximate the multinomial sampling distribution with a normal sampling distribution (see: http://www.pmean.com/04/MultinomialProportions.html). To find the number of samples in which Trump wins, that is PT > PC, one calculates the area under the assumed normal probability density function beyond the point PT = PC, applying a continuity correction. The second is to randomly draw many samples from the observed multinomial distribution and count the number of samples with Trump or Clinton as the winner. Because this does not require assuming a sampling distribution a priori, we chose the latter option, drawing 1 million random samples per state in R (see below).
- The scripts in R (zipped) that were used in our research note about the 2016 US presidential election: R scripts Trump vs Clinton (zip, 4 kB)
- Some extra tables with calculations used in the note: Tables 1-3 (pdf, 164 kB)
- Short YouTube summary: https://www.youtube.com/watch?v=HpLoxfKioUc&t=41s
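
The two approaches described in the list above (a normal approximation with continuity correction, and repeated random draws from the multinomial distribution) can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the R scripts used for the research note. The function names, the poll size, the example proportions and the default of 2,000 simulated samples (rather than 1 million) are all assumptions chosen for illustration.

```python
import random
from collections import Counter
from statistics import NormalDist


def simulate_win_shares(p_trump, p_clinton, poll_size, n_samples=2000, seed=1):
    """Draw n_samples polls from a three-outcome multinomial and count winners.

    Hypothetical helper: each simulated poll has poll_size respondents who
    choose Trump ("T"), Clinton ("C") or another candidate ("O").
    """
    rng = random.Random(seed)
    p_other = max(0.0, 1.0 - p_trump - p_clinton)
    wins = Counter()
    for _ in range(n_samples):
        votes = Counter(
            rng.choices("TCO", weights=[p_trump, p_clinton, p_other], k=poll_size)
        )
        if votes["T"] > votes["C"]:
            wins["trump"] += 1
        elif votes["C"] > votes["T"]:
            wins["clinton"] += 1
        else:
            wins["tie"] += 1
    # Convert counts to shares of all simulated polls.
    return {name: wins[name] / n_samples for name in ("trump", "clinton", "tie")}


def normal_approx_trump_win(p_trump, p_clinton, poll_size):
    """Normal approximation to P(Trump count > Clinton count) in one poll.

    The count difference D = X_T - X_C has mean n * (pT - pC) and variance
    n * (pT + pC - (pT - pC)**2). Because D is an integer, P(D > 0) equals
    P(D >= 1), approximated here by the normal area above 0.5
    (continuity correction).
    """
    mean_diff = poll_size * (p_trump - p_clinton)
    var_diff = poll_size * (p_trump + p_clinton - (p_trump - p_clinton) ** 2)
    return 1.0 - NormalDist(mean_diff, var_diff ** 0.5).cdf(0.5)


if __name__ == "__main__":
    # Illustrative proportions only, not taken from any actual state poll.
    print(simulate_win_shares(0.48, 0.46, poll_size=1000))
    print(normal_approx_trump_win(0.48, 0.46, poll_size=1000))
```

With a poll size of this order, the two results should lie close to each other; the simulation has the advantage of not relying on the normality assumption, at the cost of more computation.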