Morning Consult asked more than 18,000 registered voters nationally throughout August 2016 who they would support in general election presidential matchups. We used a statistical technique called multilevel regression and poststratification (MRP) to construct state-level estimates from the national survey data. Overall, Hillary Clinton would top Donald Trump 321-195 in electoral votes to clinch the White House if the election were held today.
We build state-level estimates from our national survey data using a statistical technique known as multilevel regression and poststratification (MRP). MRP is widely used in industry and in academia, and MRP estimates of state-level and congressional-district-level public opinion have generally been shown to outperform simple state-by-state breakdowns of national polls, especially when there are few respondents in smaller geographic areas.
Responses to the general election vote choice question are modeled via multilevel regression as a function of both individual-level and state-level variables. Our models use age, gender, and education as individual-level predictor variables. For our state-level variables, we chose variables that may influence state-level vote choice, such as the percent change in state gross domestic product (GDP), state unemployment rates, state median household income, and state-level outcomes from the 2012 presidential election. We also include a decay parameter that gives more weight to recent interviews than to interviews conducted in early August.
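The exact form of the decay parameter is not spelled out here. As a rough illustration only, one common choice is an exponential decay in the age of the interview. The function name `recency_weight`, the 10-day half-life, and the August 31 end date below are all assumptions made for this sketch, not the values used in our models.

```python
from datetime import date


def recency_weight(interview_date, end_date, half_life_days=10.0):
    """Hypothetical recency weight: 1.0 on end_date, halving every half_life_days."""
    age_days = (end_date - interview_date).days
    if age_days < 0:
        raise ValueError("interview_date is after end_date")
    return 0.5 ** (age_days / half_life_days)


# Assumed end of the field period, for illustration only.
end = date(2016, 8, 31)
for d in (date(2016, 8, 1), date(2016, 8, 21), date(2016, 8, 31)):
    print(d.isoformat(), round(recency_weight(d, end), 3))
```

Under these assumed settings, an interview from August 1 counts one-eighth as much as one from August 31. Weights like these would then be passed to the regression as observation weights.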
In the next step, we compute each state's estimate as a weighted sum of the model's predictions for every demographic-geographic type in that state. Specifically, we poststratify the predictions from our regression models using population counts by age, education, and gender taken from the 5-year estimates of the adult citizen population in the 2013 American Community Survey (ACS). These variables were chosen because poststratification requires true population values for the individual-level variables and their interactions (e.g., males 50+ with a college degree), which are available in the ACS.
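The poststratification step can be sketched in a few lines. This is a minimal illustration, assuming the regression has already produced a predicted support level for each cell; the function name `poststratify`, the cell labels, and all of the numbers are hypothetical and are not taken from our data or from the ACS.

```python
from collections import defaultdict


def poststratify(cell_predictions, cell_counts):
    """Population-weighted average of cell-level predictions within each state.

    Both arguments are dicts keyed by (state, age, gender, education).
    """
    weighted = defaultdict(float)
    totals = defaultdict(float)
    for cell, predicted_support in cell_predictions.items():
        state = cell[0]
        population = cell_counts[cell]
        weighted[state] += population * predicted_support
        totals[state] += population
    return {state: weighted[state] / totals[state] for state in totals}


# Hypothetical model predictions and population counts for two cells.
preds = {
    ("OH", "18-49", "F", "college"): 0.60,
    ("OH", "50+", "M", "no college"): 0.40,
}
counts = {
    ("OH", "18-49", "F", "college"): 1000,
    ("OH", "50+", "M", "no college"): 3000,
}
estimates = poststratify(preds, counts)
print(estimates)
```

Because the second cell is three times as large as the first, the state estimate is pulled toward its prediction rather than toward the simple average of the two cells.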
Standard errors for our estimates were calculated by drawing 50 bootstrap samples with replacement from our full national dataset (n = 18,000+) for each hypothetical matchup and then examining the resulting empirical distribution of estimates for each state.
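The bootstrap procedure can be sketched as follows. To keep the example short, a raw state-level share stands in for the full estimate; in the procedure described above, the MRP model would be refit on each resample. The names `bootstrap_se` and `raw_state_share`, the seed, and the toy respondents are all assumptions made for illustration.

```python
import random
import statistics


def raw_state_share(sample):
    """Stand-in estimator: share of respondents in each state backing the candidate."""
    counts = {}
    for state, supports in sample:
        yes, total = counts.get(state, (0, 0))
        counts[state] = (yes + supports, total + 1)
    return {state: yes / total for state, (yes, total) in counts.items()}


def bootstrap_se(respondents, estimator, n_boot=50, seed=0):
    """Standard deviation of each state's estimate across bootstrap resamples."""
    rng = random.Random(seed)
    n = len(respondents)
    by_state = {}
    for _ in range(n_boot):
        # Resample respondents with replacement from the full dataset.
        sample = [respondents[rng.randrange(n)] for _ in range(n)]
        for state, value in estimator(sample).items():
            by_state.setdefault(state, []).append(value)
    # A state needs at least two replicate estimates to have a spread.
    return {s: statistics.stdev(v) for s, v in by_state.items() if len(v) > 1}


# Hypothetical respondents: (state, 1 if they support the candidate else 0).
respondents = (
    [("OH", 1)] * 110 + [("OH", 0)] * 90 + [("PA", 1)] * 120 + [("PA", 0)] * 80
)
se = bootstrap_se(respondents, raw_state_share)
print(se)
```

Resampling from the national dataset, rather than within each state, lets the number of respondents per state vary across replicates, which matches the description of sampling from the full national file.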