A Appendix A: Methodology

A.1 YouGov sampling and weights

YouGov interviewed 2,387 respondents, who were then matched down to a sample of 2,000 to produce the final dataset. The respondents were matched to a sampling frame on gender, age, race, and education. The frame was constructed by stratified sampling from the full 2016 American Community Survey (ACS) one-year sample, with selection within strata by weighted sampling with replacement (using the person weights on the public use file).

The matched cases were weighted to the sampling frame using propensity scores. The matched cases and the frame were combined and a logistic regression was estimated for inclusion in the frame. The propensity score function included age, gender, race/ethnicity, years of education, and geographic region. The propensity scores were grouped into deciles of the estimated propensity score in the frame and post-stratified according to these deciles.
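
The decile step described above can be illustrated with a minimal sketch. It assumes the propensity scores have already been estimated (the logistic regression itself is not shown), and the function names `decile_cutpoints` and `decile_weights` are hypothetical, not part of YouGov's procedure or code. The quantile rule used here is one simple choice among several.

```python
from bisect import bisect_right
from collections import Counter


def decile_cutpoints(frame_scores):
    """Return nine interior cut points that split the frame scores into deciles."""
    ordered = sorted(frame_scores)
    n = len(ordered)
    # Simple empirical quantiles: the score found at each 10% step of the sorted frame.
    return [ordered[(k * n) // 10] for k in range(1, 10)]


def decile_weights(frame_scores, sample_scores):
    """Weight each sample case by (frame share) / (sample share) of its decile."""
    cuts = decile_cutpoints(frame_scores)
    # bisect_right maps a score to a decile index from 0 to 9.
    frame_bins = Counter(bisect_right(cuts, s) for s in frame_scores)
    sample_bins = [bisect_right(cuts, s) for s in sample_scores]
    sample_counts = Counter(sample_bins)
    n_frame, n_sample = len(frame_scores), len(sample_scores)
    weights = []
    for b in sample_bins:
        frame_share = frame_bins[b] / n_frame
        sample_share = sample_counts[b] / n_sample
        weights.append(frame_share / sample_share)
    return weights
```

When every frame decile contains at least one sample case, the resulting weights sum to the sample size and each decile carries the same weighted share in the sample as in the frame.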

The weights were then post-stratified on 2016 U.S. presidential vote choice and on a four-way stratification of gender, age (four categories), race (four categories), and education (four categories) to produce the final weight.
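
As a rough illustration of this kind of adjustment, the sketch below rescales existing weights so that each cell's weighted share matches a target share. The function name `poststratify`, the cell labels, and the target shares in the usage note are all made up for illustration. The source does not give the actual population targets or say how the two post-stratification steps were combined.

```python
from collections import defaultdict


def poststratify(weights, cells, target_shares):
    """Rescale weights so each cell's weighted share equals its target share."""
    total = sum(weights)
    # Sum the current weights within each cell.
    cell_totals = defaultdict(float)
    for w, c in zip(weights, cells):
        cell_totals[c] += w
    adjusted = []
    for w, c in zip(weights, cells):
        current_share = cell_totals[c] / total
        # Multiply by the ratio of target share to current weighted share.
        adjusted.append(w * target_shares[c] / current_share)
    return adjusted
```

For example, `poststratify([1, 1, 1, 1], ["a", "a", "a", "b"], {"a": 0.5, "b": 0.5})` shrinks the weights in the over-represented cell `"a"` and raises the weight in cell `"b"`, while keeping the total weight unchanged.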

A.2 Demographic subgroups

We use the following demographic subgroups in our analysis:

  • Age group as defined by Pew Research Center: Millennial/post-Millennial adults (born after 1980; ages 18-37 in 2018), Gen Xers (born 1965-1980; ages 38-53 in 2018), Baby Boomers (born 1946-1964; ages 54-72 in 2018), Silents/Greatest Generation (born 1945 and earlier; ages 73 and over in 2018)
  • Gender: male, female
  • Race: white, non-white
  • Level of education: graduated from high school or less, some college (including two-year college), graduated from a four-year college or more
  • Employment status: employed (full- or part-time), not employed
  • Annual household income: less than $30,000, $30,000-70,000, $70,000-100,000, more than $100,000, prefer not to say
  • Political party identification: Democrats (includes those who lean Democrat), Republicans (includes those who lean Republican), Independents/Others
  • Religion: Christian, follow other religions, non-religious
  • Identifies as a born-again Christian: yes, no
  • Completed a computer science or engineering degree in undergraduate or graduate school: yes, no
  • Has computer science or programming experience: yes, no

We report the unweighted sample sizes of the demographic subgroups in Table A.1.

Table A.1: Size of demographic subgroups

| Demographic subgroup | Unweighted sample size |
|---|---|
| Age 18-37 | 702 |
| Age 38-53 | 506 |
| Age 54-72 | 616 |
| Age 73 and older | 176 |
| Female | 1048 |
| Male | 952 |
| White | 1289 |
| Non-white | 711 |
| HS or less | 742 |
| Some college | 645 |
| College+ | 613 |
| Not employed | 1036 |
| Employed (full- or part-time) | 964 |
| Income less than $30K | 531 |
| Income $30-70K | 626 |
| Income $70-100K | 240 |
| Income more than $100K | 300 |
| Prefer not to say income | 303 |
| Republican | 470 |
| Democrat | 699 |
| Independent/Other | 831 |
| Christian | 1061 |
| No religious affiliation | 718 |
| Other religion | 221 |
| Not born-again Christian | 1443 |
| Born-again Christian | 557 |
| No CS or engineering degree | 1805 |
| CS or engineering degree | 195 |
| No CS or programming experience | 1265 |
| CS or programming experience | 735 |

A.3 Analysis

We pre-registered the analysis of this survey on Open Science Framework. Pre-registration increases research transparency by requiring researchers to specify their analysis before analyzing the data (Nosek et al. 2018). Doing so guards against \(p\)-hacking, in which researchers misuse data analysis to produce statistically significant results where no real effect exists.

Unless otherwise specified, we performed the following procedure:

  • Survey weights provided by YouGov were used in our primary analysis. For transparency, Appendix B contains the unweighted topline results, including raw frequencies.

  • For estimates of summary statistics or coefficients, “don’t know” or missing responses were re-coded to the weighted overall mean, unconditional on treatment conditions. Almost all questions had a “don’t know” option. If more than 10% of the variable’s values were “don’t know” or missing, we included a (standardized) dummy variable for “don’t know”/missing in the analysis. For survey experiment questions, we compared “don’t know”/missing rates across experimental conditions. Our decision was informed by the Standard Operating Procedures for Don Green’s Lab at Columbia University (Lin and Green 2016).

  • Heteroscedasticity-consistent standard errors were used to generate the margins of error at the 95% confidence level. We report cluster-robust standard errors whenever there is clustering by respondent. In figures, each error bar shows the 95% confidence interval. Each confidence ellipse shows the 95% confidence region of the bivariate means, assuming the two variables follow a multivariate normal distribution.

  • In regression tables, * denotes \(p<0.05\), ** denotes \(p<0.01\), and *** denotes \(p<0.001\).
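
The handling of “don’t know” and missing responses in the list above can be sketched as follows. This is a minimal illustration, not the authors' replication code: the function name `recode_missing` is hypothetical, missing responses are represented as `None`, the 10% rule is assumed to apply to the unweighted share of values, and the standardization of the dummy variable is left out because the source does not say how it was done.

```python
def recode_missing(values, weights, threshold=0.10):
    """Replace None with the weighted mean of the observed values.

    Returns (recoded_values, missing_dummy). The dummy is None when the
    share of missing responses does not exceed the threshold.
    """
    observed = [(v, w) for v, w in zip(values, weights) if v is not None]
    weight_sum = sum(w for _, w in observed)
    # Weighted overall mean, computed without regard to treatment condition.
    weighted_mean = sum(v * w for v, w in observed) / weight_sum
    recoded = [weighted_mean if v is None else v for v in values]
    # Assumption: the 10% rule uses the unweighted share of missing values.
    missing_share = sum(v is None for v in values) / len(values)
    dummy = None
    if missing_share > threshold:
        dummy = [1 if v is None else 0 for v in values]
    return recoded, dummy
```

With `values = [1, None, 3, 2]` and `weights = [1, 1, 2, 1]`, the missing response is replaced by the weighted mean of the three observed values, and a 0/1 indicator is returned because a quarter of the values are missing.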

A.4 Data sharing

Our survey data, as well as the R and Markdown code that produced this report, are publicly available through the Harvard Dataverse. Below is the citation for the replication data:

Zhang, Baobao; Dafoe, Allan, 2019, “Replication Data for: Artificial Intelligence: American Attitudes and Trends (January 9, 2019)”, https://doi.org/10.7910/DVN/SGFRYA, Harvard Dataverse, V1, UNF:6:XASNQjh6L8LmDwZVrXw4Iw==


Lin, Winston, and Donald P Green. 2016. “Standard Operating Procedures: A Safety Net for Pre-Analysis Plans.” PS: Political Science & Politics 49 (3): 495–500.

Nosek, Brian A, Charles R Ebersole, Alexander C DeHaven, and David T Mellor. 2018. “The Preregistration Revolution.” Proceedings of the National Academy of Sciences 115 (11): 2600–2606.