Over the last few years I’ve cataloged 1,055 public polls pertaining to the 2016 presidential general election into the most detailed database of its kind in existence. For a poll to warrant inclusion in our database, it must be conducted at the state level; national polls and 50-state polls are excluded. Each poll must also meet basic transparency criteria; the three main requirements are sample size, sample type and sample dates. Additional methodological information, like question wording and order, is gathered if provided. Each poll’s results are mirrored and preserved forever. Demographic sub-sample results are also tabulated. The culmination of this data is presented below on the eve of the election.
Our projections use only the sampling end date and each candidate’s value; the demographic information, the methodology and everything else have no bearing on the projections and are gathered to enable advanced filtering and complex analysis. A local regression is used when the quantity of polls is sufficient, and a least-squares regression otherwise. More information about these technical aspects is available in our methodology series.
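As a rough illustration of the approach described above, here is a minimal Python sketch that projects a candidate's value on a target day from (end date, value) pairs. It is not our production code: the threshold `MIN_POLLS_FOR_LOCAL`, the `span` parameter, the tricube weighting and the function names are all assumptions made for the example, and the methodology series remains the reference for the actual details.

```python
from typing import List, Sequence, Tuple

# Hypothetical cutoff for "sufficient" polls; the real threshold is not stated here.
MIN_POLLS_FOR_LOCAL = 10


def weighted_line_fit(xs: Sequence[float], ys: Sequence[float],
                      ws: Sequence[float]) -> Tuple[float, float]:
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    if sxx == 0:
        # All weighted polls share one end date: fall back to their mean.
        return my, 0.0
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def project(days: Sequence[float], values: Sequence[float],
            target_day: float, span: float = 0.75) -> float:
    """Estimate a candidate's value on target_day from poll end dates."""
    n = len(days)
    if n == 0 or n != len(values):
        raise ValueError("need one value per poll and at least one poll")
    if n < MIN_POLLS_FOR_LOCAL:
        # Too few polls: ordinary least-squares line through all of them.
        weights: List[float] = [1.0] * n
    else:
        # Local linear regression: tricube weights over the nearest polls.
        k = max(2, int(round(span * n)))
        h = sorted(abs(d - target_day) for d in days)[k - 1]
        if h == 0:
            weights = [1.0] * n
        else:
            weights = [max(0.0, 1 - (abs(d - target_day) / h) ** 3) ** 3
                       for d in days]
        if sum(weights) == 0:
            weights = [1.0] * n
    a, b = weighted_line_fit(days, values, weights)
    return a + b * target_day
```

For example, `project([0, 1, 2], [40.0, 41.0, 42.0], 3)` takes the least-squares branch because only three polls are available, while a state with twenty polls would take the local-regression branch.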
The purpose of polling is to inform, not to predict. Our projections are meant to represent an aggregate view of state-level public polling, a sort of polling-implied outcome, and are not intended to be predictive. If the polling is correct, the election's outcome should match our projections.
The projected outcome in each region, as derived from polling using the applicable regression, is presented below for each candidate and demographic, followed by our final electoral map. The smaller gray percentages depict the likelihood that the given candidate wins that row based on available information.
The electoral college representation of the projections above is shown below. The District of Columbia was not polled a single time during the entire cycle, so it is gray and unassigned due to a lack of data. It will almost certainly vote for Clinton, however, which places the count at 317 for Clinton to 221 for Trump.
I’ve also collected data for the Iowa and Wisconsin Senate elections; their projections are presented below and follow the same guidelines as the presidential contests.
When the election’s over, we’ll continue to use polling to explain and learn from its outcome. I principally want to compute a Brier score for our probabilities to assess the baseline accuracy of using polling alone to predict election outcomes.
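For readers unfamiliar with it, the Brier score is the mean squared difference between each forecast probability and the actual outcome (1 if the event happened, 0 if not); lower is better, and 0 is a perfect score. A minimal Python sketch follows. The function name and the probabilities in the usage note are made up for illustration and are not our published figures.

```python
from typing import Sequence


def brier_score(probabilities: Sequence[float],
                outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("need one outcome per probability, and at least one")
    return sum((p - o) ** 2
               for p, o in zip(probabilities, outcomes)) / len(probabilities)
```

As a usage example, hypothetical win probabilities of 0.9 and 0.2 for one candidate in two states, where that candidate won the first and lost the second, would be scored with `brier_score([0.9, 0.2], [1, 0])`. A forecaster who always says 50% scores 0.25, which makes a convenient baseline.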