UPDATE (12/22/2015): Polling company Morning Consult just released a similar mode-effect study, this one also including interactive voice response (IVR) alongside live call center and online modes. You can scroll down to read the summary in the LA Times.
Back in April we wrote why the Hearing Industries Association's commissioning of AZ Marketing Research and Online Survey Solutions produced the steaming pile of manure called MarkeTrak 9, which used online surveying instead of the 60,000-home National Family Opinion panel employed for MarkeTrak I through VIII. We followed up here the same day with a highly relevant Pew Research Center study on Older Adults and Technology showing how "seniors continue to lag in tech adoption," and again in June on how UNT Prof. Amyn Amlani elegantly smashed MarkeTrak IX to pieces in Assessing the Validity of MarkeTrak IX Adoption Rates. While performing unrelated research on biases introduced by political polling methodology, we stumbled across the extensive and highly relevant Pew study From Telephone to the Web: The Challenge of Mode of Interview Effects in Public Opinion Polls, which quantifies the "mode effect": different survey methods yielding different results. For political purposes some of these mode variations are minor; but as you'll see in our excerpts, the variations on subjects related to hearing healthcare can be significant.
Although this article pertains directly to the spoilage of MarkeTrak 9 (or, more charitably put, to how comparing MarkeTrak VIII to MarkeTrak 9 is like comparing apples to oranges), the topics discussed here, such as the biases of interviewer-asked questions vs. self-administered surveys, apply not only to audiologic research but also right down to the "shop floor" for patient satisfaction surveys such as the Glasgow Hearing Aid Benefit Profile and the Client Oriented Scale of Improvement (COSI™). In addition, we interject pertinent comments, with appropriate references, on speech perception vs. mode, age, and cognition/working memory, as these "non-standard" issues also apply to hearing healthcare, further complicating matters.
- HIA’s MarkeTrak IX Elegantly Smashed to Pieces by Amyn Amlani (June 9, 2015)
- More On The Defective HIA MarkeTrak 9 “Study” (April 17, 2015)
- HIA’s MarkeTrak 9: A Steaming Pile Of Manure (April 17, 2015)
Here are excerpts with our inline [comments in blue] of From Telephone to the Web: The Challenge of Mode of Interview Effects in Public Opinion Polls (Pew Research Center, May 13, 2015):
Among the most striking trends in the field of survey research in the past two decades is the shift from interviewer-administered to self-administered surveys. Fueled by the growth of the internet, self-administration as a survey mode presents a mixture of opportunities and challenges to the field. Self-administered surveys tend to be less expensive and to provide ways of asking questions that are difficult or impossible to ask in an interviewer-administered survey.
But the results from self-administered and interviewer-administered surveys are sometimes different. This difference is called a mode effect, a difference in responses to a survey question attributable to the mode in which the question is administered. [emphasis added] Among the issues this raises are how to track trends in responses over time when the mode of interview has changed and how to handle the inconsistencies when combining data gathered using different modes.
Using its nationally representative American Trends Panel, Pew Research Center conducted a large-scale experiment that tested the effects of the mode of survey interview – in this case, a telephone survey with an interviewer vs. a self-administered survey on the Web – on results from a set of 60 questions like those commonly asked by the center’s research programs. This report describes the effort to catalog and evaluate mode effects in public opinion surveys.
The study finds that differences in responses by survey mode are fairly common, but typically not large, with a mean difference of 5.5 percentage points and a median difference of five points across the 60 questions. The differences range in size from 0 to 18 percentage points. The results are based on 3,003 respondents who were randomly assigned to either the phone or Web mode and interviewed July 7 — Aug. 4, 2014 for this study.
Where differences occurred, they were especially large on three broad types of questions: Items that asked the respondent to assess the quality of their family and social life produced differences of 18 and 14 percentage points, respectively, with those interviewed on the phone reporting higher levels of satisfaction than those who completed the survey on the Web.
Questions about societal discrimination against several different groups also produced large differences, with telephone respondents more apt than Web respondents to say that gays and lesbians, Hispanics and blacks face a lot of discrimination. However, there was no significant mode difference in responses to the question of whether women face a lot of discrimination.
Web respondents were far more likely than those interviewed on the phone to give various political figures a “very unfavorable” rating, a tendency that was concentrated among members of the opposite party of each figure rated.
Statistically significant mode effects also were observed on several other questions. Telephone respondents were more likely than those interviewed on the Web to say they often talked with their neighbors, to rate their communities as an “excellent” place to live and to rate their own health as “excellent.” Web respondents were more likely than phone respondents to report being unable to afford food or needed medical care at some point in the past twelve months.
One important concern about mode effects is that they do not always affect all respondents in the same way. Certain kinds of respondents may be more vulnerable than others to the effect of the mode of interview. In some instances, this may be a consequence of cognitive factors; for example, well-educated respondents may be better able than those with less education to comprehend written questions. [Cognitive factors also come into play with working memory (more here), age, duration & severity of the hearing loss, and as Frank Lin has documented, the linkage to dementia.] In other instances, the sensitivity of a question may be greater for certain respondents than for others; for example, mode effects on questions about financial difficulties may be much larger among low income individuals — the people most likely to experience such troubles.1
Despite these sometimes substantial differences, the study found that many commonly used survey questions evidence no mode effect. Reports about various personal activities performed “yesterday” – such as getting news from a newspaper, on television or on the radio; calling a friend or relative; writing or receiving a letter; or getting some type of exercise – showed no significant differences by mode of interview. And most questions about religious affiliation, belief and practice yielded similar results on the Web and on the phone, though Web respondents were somewhat more likely than those interviewed on the telephone to say that they “seldom” or “never” attended religious services.
About the Study
This study was conducted using Pew Research Center’s nationally representative American Trends Panel (ATP). Panelists who normally take their surveys on the Web were randomly assigned to either the phone mode (N=1,494 completed by phone) or the Web mode (N=1,509 completed on the Web). Each set of respondents was independently weighted to be representative of the U.S. public in an effort to ensure that any differences observed between the groups were a result only of mode of interview effects. Mode differences for each question in the study were measured by comparing answers given by the Web and phone groups using a commonly reported category of each question in the study or the category that shows the largest mode difference — whichever is larger.
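The independent weighting step described above can be sketched as simple post-stratification: each mode group is reweighted so its demographic mix matches population targets, leaving mode of interview as the only systematic difference. A minimal illustration (the single age category and the target shares below are hypothetical, not Pew's actual weighting scheme, which uses many variables):

```python
# Post-stratification sketch: reweight respondents so each demographic
# cell matches its (hypothetical) share of the U.S. adult population.
from collections import Counter

def poststratify(respondents, targets):
    """respondents: list of category labels, one per respondent.
    targets: dict mapping category -> population share (shares sum to 1.0).
    Returns dict mapping category -> weight applied to respondents in it."""
    counts = Counter(respondents)
    n = len(respondents)
    # weight = (population share) / (sample share)
    return {cat: targets[cat] / (counts[cat] / n) for cat in counts}

# Hypothetical phone-mode sample in which older adults are over-represented
phone_sample = ["65+"] * 400 + ["under65"] * 600
targets = {"65+": 0.20, "under65": 0.80}  # hypothetical population shares

weights = poststratify(phone_sample, targets)
# 65+ respondents are down-weighted (0.20 / 0.40 = 0.5);
# under-65 respondents are up-weighted (0.80 / 0.60 ≈ 1.33)
```

The same procedure applied separately to the phone and Web groups is what lets any remaining gap between the two sets of answers be attributed to mode rather than to who happened to land in each group.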
Why Mode of Interview Effects Occur
The experience of being interviewed by another person differs from completing a survey online or on paper. For example, an interviewer can help respondents stay focused and may be able to provide clarification or encouragement at difficult junctures during the interview.
But the social interaction inherent in a telephone or in-person interview may also exert subtle pressures on respondents that affect how they answer questions. Respondents may feel a need to present themselves in a more positive light to an interviewer, leading to an overstatement of socially desirable behaviors and attitudes and an understatement of opinions and behaviors they fear would elicit disapproval from another person. Previous research has shown that respondents understate such activities as drug and alcohol use and overstate activities like donating to charity or helping other people. This phenomenon is often referred to as “social desirability bias.” These effects may be stronger among certain types of people than others, introducing additional bias into the results.2
Most of the largest mode differences observed in this study are observed on questions where social desirability bias could play a role in the responses. Of the 21 items showing a difference by mode of at least seven percentage points, seven involve ratings of political figures (and very negative ratings are less prevalent for all seven items on the phone than on the Web), four involve questions about intimate personal issues including life satisfaction, health status and financial troubles (with positive responses more common on the phone across all of them) [Emphasis added] and three relate to perceptions of discrimination against minority groups (with phone respondents more likely to say there is discrimination against each group). [We believe these differences may also apply to how others perceive their hearing loss, and also how others perceive them wearing hearing aids (cosmetics).] Two other questions that fit within this framework are talking with neighbors and attending religious services. Phone respondents were 11 points more likely than Web respondents to say they talked with neighbors at least a few times a week. [This 11 point delta may also apply with respect to the impact of hearing loss affecting interactions with neighbors, with respect to the loss causing isolation.] Web respondents were seven points more likely than phone respondents to say that they seldom or never attend religious services.
But not all questions that touch on potentially sensitive topics or involve behaviors that are socially desirable or undesirable exhibited mode effects. For example, there was no significant mode difference in how people rated their own personal happiness; or in the percentages of people who said they had done volunteer work in the past year, called a friend or relative yesterday just to talk, or visited with family or friends yesterday. There also were no differences by mode in the shares of people who are religiously unaffiliated, think that a person must believe in God in order to be moral or say that religion is very important in their life.
In addition, there are other sources of mode difference apart from social desirability. Because surveys require cognitive processing of words and phrases to understand a question and choose an appropriate response option, the channel in which the question and options are communicated can also affect responses [Emphasis added — See above]. A complicated question with many different response options may be very difficult to comprehend when someone hears it on the phone, but easier to process when read online or on paper. Because they are easier to remember, the last response option read by an interviewer may be favored by respondents — a phenomenon called the “recency effect.” This effect is less prevalent in a self-administered survey, where respondents can see all of the response options in a glance or can go back and re-read a question on their own.3
One question in the survey was lengthy and somewhat complicated and could have posed a greater challenge to phone than Web respondents: an item that asked respondents to place themselves in one or more of a set of racial or ethnic categories. [Once again, this goes back to amplification of the effects on cognition vs age-related hearing loss.] This item was modeled on a new question under review by the U.S. Census that, for the first time, includes Hispanic origin as an option along with the more traditional race categories such as white, black or African American, Asian or Asian American. Yet respondents on the phone and the Web gave nearly identical answers. [However, many hearing impaired people will expend more effort due to poor speech perception scores: Differences between telephone and written responses among the hearing impaired cohort is a topic ripe for research.]
The implicit time pressure in an interviewer-administered survey can affect a respondent’s willingness to engage in the amount of thought necessary to recall facts or past events, leading to different answers than would be obtained if no interviewer were involved. And, of course, the absence of an interviewer might make it more likely that some respondents on the Web or on paper decide to speed through a questionnaire in order to finish more quickly, thus providing lower-quality data.
In general, we found little evidence that cognitive processes of these sorts created mode differences in responses. [But, as we pointed out above, this may not be the case with a hearing impaired cohort.] That may reflect the fact that the questions chosen for this study are drawn from well-tested items designed for telephone surveys, and thus do not reflect the kinds of burdensome items that have previously been shown to create mode effects. It’s also possible that the panelists, having participated in one large telephone survey with Pew Research Center (the polarization study that was used to recruit the American Trends Panel) and – for the vast majority – at least one previous panel wave, are savvier survey participants than others and thus are less vulnerable to mode effects than a fresh cross-sectional sample would be.
This report presents the study’s findings in a series of sections organized by the topic of the survey questions used. Following a discussion of the study’s conclusions, there is a detailed methodological description of the study. A table presenting all of the items sorted by the size of the mode differences follows. At the end is a complete topline showing all questions and response categories.
Scope of the Mode Differences
This study is composed of mode of interview comparisons across 60 different questions covering a range of subjects and question formats [But we're going to pare it down to topics related to hearing healthcare]. Topics include politics, religion, social relationships, daily activities, personal health, interpersonal trust and others. Question formats ranged from simple categorical items (rating one's community as excellent, good, only fair, poor), to yes/no items (read a newspaper yesterday), to completely open-ended items (how many doctor visits in the past 12 months), to 0-100 rating scales (rate the Democratic and Republican leaders in Congress).
Responses to all but four of the 60 items showed at least some numerical difference by mode, and the median difference across all items was 5 percentage points. The largest difference was 18 points, and there were eight items with differences of at least 10 points. But most of the 24 non-zero differences smaller than 5 percentage points are not statistically significant, and thus could have occurred by chance as a result of sampling error.
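The "could have occurred by chance" judgment above is the standard two-proportion significance test. With roughly 1,500 respondents per mode group, a 5-point gap near the middle of the scale clears the 95% threshold, while smaller gaps often do not; a quick sketch (the 55% vs. 50% figures are a hypothetical example, and this simple formula ignores the design effects of weighting, which make real survey tests more conservative):

```python
import math

def two_prop_z(p1, n1, p2, n2):
    """z statistic for the difference between two independent proportions."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

# Hypothetical 5-point mode difference near 50%, using the study's group sizes
z = two_prop_z(0.55, 1494, 0.50, 1509)
significant = abs(z) > 1.96  # 95% confidence threshold; here z ≈ 2.75
```

Run the same calculation with a 2- or 3-point gap and z drops near or below 1.96, which is why most of the sub-5-point differences in the study are written off as sampling error.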
The following sections of the report provide details about the presence or absence of mode differences across all of the measures included in the study. Each section includes an analysis of the overall results and an examination of the kinds of people most likely to be affected by the mode of interview. In general, only those differences that are statistically significant are highlighted, except in a few instances where there was a strong expectation of a mode effect and none was found.
Sizeable Mode Effects in Political Measures
Some of the largest mode differences in the study are seen in the ratings of political figures. Public views —on both sides of the aisle— are considerably more negative when expressed via the Web than over the phone. The mode differences are even larger when looking at the ratings of political figures by respondents of the opposite political party. [Most of this section has been struck]
The effects of the interview mode on ratings of political figures also appear in a different question format — the “feeling thermometer” scale that varies from 0 to 100. As with the verbal scales, more people express highly negative views on the Web than on the phone. [Emphasis added: This effect may have extra importance with hearing healthcare surveying.] Asked to rate party leaders in Congress on a 0 to 100 scale, 44% of Web respondents give Republican leaders in Congress between 0 and 33, the “cold” or negative end of the scale. When asked on the phone, 37% gave responses in this range. That 7-percentage point difference is the same as with Democratic leaders in Congress (32% on the phone, 39% on the Web). As with ratings of specific political figures, mode differences on these scales are much larger among members of the opposite party of the group being rated.
The use of a numerical scale highlights another difference between Web and phone: Phone respondents are more likely than Web respondents to select an answer closer to the midpoint of the scale (between 34 and 66). [Emphasis added] When rating Democratic leaders in Congress, 36% of Web respondents selected a number in the middle range, compared with 45% of phone respondents…[Balance of section on politics struck]
Measures of [Racial and Gender] Discrimination Significantly Affected by Mode
Considerable mode differences were observed on questions about societal discrimination against several groups including gays and lesbians, Hispanics and blacks, with phone respondents more likely than Web respondents to say these groups faced “a lot” of discrimination. [This goes to how others perceive their hearing loss, hearing aids; and also importantly surveys of teens & parents on bullying and teasing.] But the impact of the mode of interview varied by the race and ethnicity of the respondents.
Originally we omitted this entire section on race & gender discrimination; however, the concept of "linked fate" in the Black community may also come into play in hearing healthcare, as there are indeed differences in the perceptions of Blacks toward the culturally Deaf; thus we struck only the sections on LGBT & Hispanic discrimination. There may indeed be differences vis-à-vis hearing loss in these two additional cohorts, but they are poorly documented.
Author Theodore Johnson explains the “linked fate” paradigm:
Though their individual experiences differ, race plays a significant role in how all Blacks are perceived and treated by society, as University of Chicago professor Michael Dawson explained in 1994. Dawson argued that race binds black voters together with the belief that one’s success is contingent on the success of the group as a whole — an idea colloquially known as “linked fate.” That belief motivates African Americans to subordinate personal policy preferences and individual economic interests to the civil liberties of the overall group.
The Pew study continues…
[Parenthetically, because political & other polling can be easily skewed, the first thing we do as polling consumers is go to the bottom of the report to look at the sample size, margin of error (MOE), and methodology; sometimes this is buried. The necessity is vividly illustrated in this political season: on the Democrat side there are three candidates with well-defined polling percentages, two in the middle quartile and one within the MOE. On the GOP side there are 14 candidates as of this writing, several down in the MOE, especially in polls with sample sizes under about 1040; this pair of polls in evenly-split Florida used sample sizes well over 2400 to assure a small MOE. What's more, because 220 of the 538 Electoral votes are "locked in" deep blue states such as California, New York, Massachusetts, etc., we discount so-called "national polls" completely: they are easily skewed by this factor as well as by their typically small sample sizes, usually around 500 respondents; i.e., we believe the true MOE is much higher than reported. For these reasons we decided to leave the methodology section intact, so the reader can learn about the myriad factors involved in both consuming and producing a world-class survey.]
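The sample-size figures in the aside above follow directly from the textbook margin-of-error formula, MOE = z·√(p(1−p)/n). A quick sketch at the worst case p = 0.5 (this assumes simple random sampling; design effects from weighting push the true MOE higher, which is the aside's point about reported MOEs being optimistic):

```python
import math

def moe(n, p=0.5, z=1.96):
    """95% margin of error, in percentage points, for a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n) * 100

for n in (500, 1040, 2400):
    print(f"n={n}: ±{moe(n):.1f} points")
# n=500  -> ±4.4 points
# n=1040 -> ±3.0 points
# n=2400 -> ±2.0 points
```

So a candidate polling at 3% in a 500-person national poll is indistinguishable from one polling at zero, while the 2400-plus-person Florida polls can separate candidates about two points apart.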
When asked about discrimination against blacks, more phone respondents (54%) than Web respondents (44%) said this group faced a lot of discrimination. This pattern was clear among whites, where 50% on the phone and just 37% on the Web said blacks face a lot of discrimination. But among blacks, the pattern is reversed: 71% of black respondents interviewed by phone say they face “a lot” of discrimination, while on the Web 86% do so.
Unlike the items about other minority groups, there is no significant mode difference in responses to the question about women. Exploration of key demographic subgroups showed similar answers on the Web and on the phone, suggesting that social desirability may not influence responses to this question. [Interesting…]
Happiness and Life Satisfaction Higher Among Phone Respondents
Sizeable mode differences were observed on questions measuring satisfaction with family and social life. [Emphasis added] Among phone survey respondents, 62% said they were “very satisfied” with their family life; among Web respondents, just 44% said this. Asked about their social life, 43% of phone respondents said they were very satisfied, while just 29% of Web respondents said this. These sizeable differences are evident among most social and demographic groups in the survey.
The mode differences on satisfaction with social life are smallest among people who, in a different question, say they are "very happy" with their life and larger among those who say they are "pretty happy" or "not too happy." However, answers to the happiness question itself do not vary by survey mode.
Volunteering, Community Involvement and Social Trust
Because neighborliness and civic engagement are considered virtuous behaviors by many people, it would not be surprising to see more reports of these activities in a phone interview than on the Web. Mode differences were observed on some but not all such measures. Respondents on the phone reported more interaction with neighbors than they did on the Web. [Emphasis added] Similarly, phone respondents were more likely to say they worked in the community to fix a problem or improve a condition, but were not more likely to report engaging in volunteering through an organization. Where mode differences appear, they tend to be larger among higher income individuals. [Emphasis added: Hearing aid adoption rates somewhat vary due to the cost.]
Asked how frequently they talk to their neighbors in a typical month, 58% of phone respondents report talking to their neighbors "every day" or "a few times a week"; on the Web, 47% report doing so. The mode difference among higher income respondents (those making more than $75,000 a year) is 15 percentage points.
A modest mode effect is observed on a question asking about working with other people in the neighborhood to fix a problem or improve a condition: 38% on the phone report doing so, compared with 33% on the Web. The mode difference was 10 points among higher income respondents.
One other aspect of social capital is social trust, but the study finds no significant mode difference in responses to a standard question about trusting other people.
The Impact of Mode on Reports of Financial Circumstances
A series of questions on personal financial conditions uncover further differences in response by mode. Web respondents are more likely than those on the phone to say that in the past year they have had trouble paying for food for their family and that they were not able to see a doctor when they needed to because of cost. The effect is strongest among those with lower incomes. [Emphasis added]
In a related item, Web survey respondents are somewhat more likely than telephone survey respondents to say that in the past year they needed to see a doctor [replace “see a doctor” with “purchase hearing aids”] but were not able to because of cost (22% in the phone survey, 28% on the Web survey) [And this difference may be more due to hearing aids being a big ticket item.]. Among non-whites, the mode gap is 17 percentage points (40% on the Web, 23% on the phone). Among whites, there is no difference (22% on the web, 21% on the phone). This question illustrates how a mode effect could lead to an incorrect conclusion: A phone interview would suggest that whites and non-whites have similar access to a doctor, while the web interview shows a sizeable difference in access [Emphasis added].
The mode effect is particularly evident among those who say (in a separate question) that they have not seen a doctor at all in the past year. Among phone respondents who report that they have not visited a doctor in the past year, 23% say they have not seen a doctor because of cost; among Web respondents, 46% say this. By contrast, no mode effect is apparent among people who said they have been to the doctor at least once in the past year.
Use of Internet and Technology
[This section builds on More on HIA's Defective MarkeTrak 9 "Study", which discusses income-bifurcated internet access issues among older adults.]
Questions about internet usage and technology may be particularly sensitive to survey mode if one of the modes is the internet itself. Using the internet to take a survey may bring to mind thoughts and ideas about technology use that might not arise when taking a survey by phone, simply because the context is directly related to the subject matter. [Emphasis added] It is also possible that people who are more likely to respond to a survey request when it comes via Web than via phone are different with respect to their technology use, and these differences may not be corrected by weighting. [Emphasis added] [The converse may also be true among the hearing impaired cohort due to speech perception difficulties unrelated to cognition as discussed above: Whereas a normal-hearing person has little or no difficulty conversing on the phone, a hearing impaired person may be bothered by using it, skewing the results.]
The share of respondents who reported using the internet every day was not significantly different in the Web and phone groups (84% vs. 82% respectively). But among daily internet users, the difference in regularity of use was significant, with 36% of the Web group indicating that they use the internet constantly, compared with 28% of the phone group [and this effect may be, pardon the pun, amplified among a hearing impaired study cohort]. An examination of our panelists’ responses to questions about technology use in previous waves suggests that part of this 8-percentage point difference is attributable to differences in the type of people who were more likely to respond to the Web survey, while part is due to the mode of the interview itself.
We compared the results among respondents who indicated in Wave 1 of the panel (March-April 2014) that they use one of several social networks several times a day.4 Across both frequent and less frequent social media users, the percentage of Web respondents reporting constant internet use is 5 percentage points higher than for phone respondents. Although frequency of social media use is not a perfect proxy for constant internet use, the fact that the mode difference is identical for frequent and less frequent social media users suggests that people with similar internet usage habits answer the question in different ways depending on the mode in which it is asked.
On the other hand, exploring this previously collected data also reveals that 40% of the Web-mode respondents are frequent social media users, compared with only 30% of phone respondents. This means that the overall mode difference in the percentage reporting constant internet usage is likely a function of both the way respondents answer the question and true differences in internet usage between the two groups.
All of the participants in this experiment were enrolled in the panel for several months prior to the mode study, but they varied considerably in how many previous waves they had responded to. Those who had been regular participants were more apt to respond to this study’s invitation if they were assigned to the Web group than if they were assigned to the phone group (perhaps because that was more familiar to them). Less regular participants were the opposite: They were more likely to respond if assigned to the phone group (perhaps because they are less comfortable on the Web). In short, those who responded on the Web may be more Web savvy than their counterparts in the phone group.5 Altogether, this serves to demonstrate the difficulties inherent in conducting survey research on topics where the mode of data collection may be related to both survey participation and measurement.
For other technology related items, the effects are smaller. About half (54%) of Web respondents reported playing a game on a computer, mobile device or video game console the previous day, compared with 48% of phone respondents. Web and phone respondents were statistically equivalent in their reporting of worries about computers and technology being used to invade privacy (26% of Web respondents say they worry "a lot" vs. 22% on the phone), sending an email or a text to a friend or relative the previous day (79% for both) and use of a social networking site the previous day (69% on Web vs. 66% on phone).
Mode Effects for Autobiographical and Factual Knowledge
Answering a survey question requires respondents to recall certain kinds of relevant information and to use this information to help formulate an answer. The mode of the interview can affect this process of recall in different ways, making it easier or more difficult for respondents to perform the necessary search of memory or by providing the motivation to conduct a thorough search. For example, an interviewer may be able to encourage respondents to make the effort to recall if they read a newspaper the previous day. At the same time, the respondents on the telephone may feel some pressure to answer quickly, so as not to keep the interviewer waiting. On the Web, respondents are free to take longer to think about the answer to a question. Time to think can be particularly important for questions that may require respondents to recall autobiographical or factual information from memory.6
One question that might have been vulnerable to the interviewer's presence asked for an estimate of the number of times in the past 12 months that the respondent had seen a doctor or other health care professional. Although the distribution of the answers was nearly identical on the Web and the phone, a follow-up question found interesting differences in how respondents arrived at their answers. Offered four options for how they came up with their answer, by a margin of 15% to 7%, more phone than Web respondents said that they estimated the number "based on a general impression." This difference, though modest in size, could indicate that some phone respondents are more likely to take the easiest possible route to an answer in order to save time. Alternatively, it could reflect a recency effect in that this option was the last of the four to be read to respondents. [Sections on factual knowledge and cheating struck.]
This study has examined the impact of survey mode on a wide range of public opinion questions drawn from those commonly asked by Pew Research Center and related organizations. While many of the differences discovered between modes are modest, some are sizeable. And many of the differences are consistent with the theory that respondents are more likely to give answers that paint themselves or their communities in a positive light, or less likely to portray themselves negatively, when they are interacting with an interviewer. This appears to be the case with the questions presenting the largest differences in the study — satisfaction with family and social life, as well as questions about the ability to pay for food and medical care. [Emphasis added] The fact that telephone respondents consistently exhibit more socially desirable reporting is consistent with a large body of preexisting literature on the topic. For most of these and other differences described here, there is no way to determine whether the telephone or the Web responses are more accurate, though previous research examining questions for which the true value is known has found that self-administered surveys generally elicit more accurate information than interviewer-administered surveys.8
We did see evidence that reports of frequent internet use may be inflated in Web surveys relative to phone surveys, as well as indications that heavy internet users are more prevalent in the Web sample. Although responses to other questions about technology use were largely consistent across modes, researchers should be aware of the potential for differences due to both nonresponse and measurement error when studying these kinds of items.
Yet while significant mode effects are seen on a variety of measures, an equal number displayed only small or non-existent mode differences. Many of the items asking about concrete events, characteristics or attributes did not appear affected by the mode of interview. These included questions about passport and driver’s license ownership, race and religious affiliation, as well as most questions about specific activities engaged in “yesterday.”
What then should survey designers do when deciding among modes of data collection? This study suggests that there may be advantages to self-administered data collection via the Web, particularly if the survey seeks to measure socially desirable or sensitive topics. The willingness of respondents to express more negative attitudes about their personal lives or toward political figures could reflect a greater level of candidness, although we have no way of knowing which set of answers is more consistent with actual behavior outside of the survey context. [Emphasis added]
That being said, this study can only speak to the possible effects of mode choice on measurement error, which is only one of many possible sources of error that can affect survey quality. Great pains were taken to ensure that the experimental groups were equivalent, and the sample comes from a pre-recruited, probability-based Web panel. Even in this carefully controlled scenario, we found that respondents who had ignored all previous survey requests were more likely to respond when they were contacted over the phone.
Even with declining response rates, telephone surveys continue to provide access to survey samples that are broadly representative of the general public. Many members of the general public still lack reliable access to the internet, making coverage a concern in practice. [But this may be skewed in different directions by hearing loss vs phone, and internet use vs income, which itself is bifurcated by age.] Random Digit Dial (RDD) phone surveys have been found to perform better than many probability-based Web surveys at including financially struggling individuals, those with low levels of education and linguistic minorities. Researchers should carefully consider the tradeoffs between measurement error on the one hand and coverage and nonresponse error on the other. Studies using both Web and telephone components – so-called mixed mode studies – may become more common, and many researchers believe that self-administration via the internet will eventually become the standard method of survey research. Pew Research Center and other scholars are currently developing methods for combining data collected from different modes so that disruption to long-standing trend data is minimized. [Emphasis added.]
The mode study was conducted using the Pew Research Center’s American Trends Panel, a probability-based, nationally representative panel of U.S. adults living in households. Respondents who self-identify as internet users (representing 89% of U.S. adults) participate in the panel via monthly self-administered Web surveys, and those who do not use the internet participate via telephone or mail. The panel is managed by Abt SRBI.
All current members of the American Trends Panel were originally recruited from the 2014 Political Polarization and Typology Survey, a large (n=10,013) national landline and cellphone random digit dial (RDD) survey conducted January 23-March 16, 2014 in English and Spanish. At the end of that survey, respondents were invited to join the panel. The invitation was extended to all respondents who use the internet (from any location) and a random subsample of respondents who do not use the internet.10
Data in this report are drawn from the July wave of the panel, which was conducted July 7-August 4, 2014 among 3,351 respondents. In this study, 50% of panelists who typically take their panel surveys via the Web were randomly assigned to take the survey via the Web mode, resulting in 1,509 Web-mode completed interviews. The remaining 50% of the Web panelists were assigned to take the survey via a telephone interview (phone mode), resulting in 1,494 experimental phone-mode completed interviews. The remaining 348 interviews were completed by non-internet panelists typically interviewed by mail. These non-experimental, phone-mode respondents are not considered in the analysis of the experiment in this report but were interviewed to calculate separate general population estimates from the data in this wave of the panel.
As outlined above, all Web panelists were included in the mode study experiment. Those with a mailing address on file were mailed a pre-notification letter, customized for their treatment group (Web vs. phone mode). The letter explained that the next monthly panel wave was a special study, and that we were attempting to obtain the highest level of participation possible. As such, respondents would be given an extra $5 for completing the study beyond their usual incentive amount of $5 or $10, depending on their incentive group. All incentives were contingent upon completing the mode study survey. The letter explained to the Web-mode panelists that an email invitation would be arriving in their inbox between July 14 and 15. The phone-mode panelists were told the survey was being conducted via telephone for this month only and that they would hear from an interviewer in the next few days. All Web panelists were also sent a pre-notification email, customized for their treatment group. This email contained the same information as the pre-notification letter sent in the mail.
Next, panelists assigned to the Web-mode treatment were sent a standard invitation email. This was followed by up to four reminder emails for nonrespondents. Panelists assigned to the phone-mode treatment were called up to 10 times. A message was left on the first call if a voicemail or answering machine was reached. No refusal conversion was attempted on soft refusals, so as not to antagonize panelists we hoped to retain for future panel waves. After completion of the survey, respondents were sent the incentive amount referenced in their survey materials via check or Amazon gift card, according to their preference.
The ATP data were weighted in a multi-step process that begins with a base weight incorporating the respondents’ original survey selection probability and the fact that some panelists were subsampled for invitation to the panel. Next, an adjustment was made for the fact that the propensity to join the panel varied across different groups in the sample. The final step in the weighting uses an iterative technique that matches gender, age, education, race, Hispanic origin and region to parameters from the U.S. Census Bureau’s 2012 American Community Survey. Population density is weighted to match the 2010 U.S. Decennial Census. Telephone service is weighted to estimates of telephone coverage for 2014 that were projected from the July-December 2013 National Health Interview Survey. It also adjusts for party affiliation using an average of the three most recent Pew Research Center general public telephone surveys, and for internet use using as a parameter a measure from the 2014 Survey of Political Polarization. Note that for the mode study, separate weights were computed for the Web respondents, the experimental phone respondents, all phone respondents (experimental and non) and the total sample. Neither the Web respondent weight nor the experimental phone respondent weight included the internet usage parameter in the raking as all respondents in these groups are internet users. Sampling errors and statistical tests of significance take into account the effect of weighting. The Hispanic sample in the American Trends Panel is predominantly native born and English speaking.
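The “iterative technique” Pew describes here is commonly known as raking, or iterative proportional fitting: unit weights are repeatedly scaled so that each weighted margin in turn matches its population target until everything converges. A minimal sketch of the idea (the respondents, dimensions and targets below are invented for illustration, not Pew’s actual data):

```python
def rake(rows, targets, iters=50):
    """Raking (iterative proportional fitting).

    rows: list of dicts, each with category values plus a weight 'w'.
    targets: {dimension: {category: population share}}.
    Each pass rescales weights so one dimension's weighted margin
    matches its target; cycling over dimensions converges toward
    weights satisfying all margins simultaneously."""
    for _ in range(iters):
        for dim, shares in targets.items():
            # current weighted total for each category on this dimension
            cur = {}
            for r in rows:
                cur[r[dim]] = cur.get(r[dim], 0.0) + r["w"]
            wsum = sum(cur.values())
            for r in rows:
                r["w"] *= (shares[r[dim]] * wsum) / cur[r[dim]]
    return rows

# Toy sample: men 50+ are overrepresented relative to the targets.
rows = [
    {"sex": "F", "age": "18-49", "w": 1.0},
    {"sex": "F", "age": "50+",   "w": 1.0},
    {"sex": "M", "age": "18-49", "w": 1.0},
    {"sex": "M", "age": "50+",   "w": 1.0},
    {"sex": "M", "age": "50+",   "w": 1.0},
]
targets = {"sex": {"F": 0.52, "M": 0.48},
           "age": {"18-49": 0.55, "50+": 0.45}}
rake(rows, targets)
wsum = sum(r["w"] for r in rows)
f_share = sum(r["w"] for r in rows if r["sex"] == "F") / wsum
```

After raking, the weighted share of women lands on the 52% target even though they are only 40% of the raw sample; Pew’s production weighting does the same thing across many more dimensions at once.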
The following table shows the unweighted sample sizes and the error attributable to sampling that would be expected at the 95% level of confidence for different groups in the survey:
Sample sizes and sampling errors for other subgroups are available upon request.
In addition to sampling error, one should bear in mind that question wording and practical difficulties in conducting surveys can introduce error or bias into the findings of opinion polls.
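For readers who want to sanity-check figures like these, the textbook maximum sampling error at 95% confidence for a simple random sample is 1.96·√(p(1−p)/n) evaluated at p = 0.5; note this simple formula omits the design effect from weighting, so Pew’s published errors will run somewhat larger. A quick back-of-the-envelope for the group sizes reported below:

```python
import math

def moe95(n, p=0.5):
    """Maximum margin of error at 95% confidence for a simple random
    sample of size n. Ignores the design effect from weighting, so
    actual panel sampling errors are somewhat larger."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

moe_web = moe95(1509)    # Web group: ~2.5 percentage points
moe_phone = moe95(1494)  # experimental phone group: ~2.5 points
```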
The Web component of the July wave had a response rate of 64% (1,509 responses among 2,345 individuals sampled from the panel); the experimental phone component had a response rate of 63% (1,494 responses among 2,366 individuals sampled from the panel); the total phone component (experimental and non) had a response rate of 63% (1,842 responses among 2,927 Web-based and non-Web individuals sampled from the panel). Taking account of the response rate for the 2014 survey on political polarization (10.6%), the cumulative response rate for the July ATP wave is 3.7%.
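The per-wave rates above are simple completes-over-sampled ratios, and the cumulative rate compounds the response rates of every stage from the original RDD recruitment onward. Checking the reported figures, and backing out what the 3.7% implies about the later stages (our arithmetic, not Pew’s):

```python
# Per-wave response rate = completed interviews / individuals sampled.
rr_web = 1509 / 2345        # Web component -> ~64%
rr_phone_exp = 1494 / 2366  # experimental phone component -> ~63%
rr_phone_all = 1842 / 2927  # all phone (experimental + non) -> ~63%

# Cumulative response rate ~= recruitment-survey rate (10.6%) times the
# combined rate of the later stages (agreeing to join the panel, staying
# in it, and completing this wave). The reported 3.7% implies the later
# stages multiply out to roughly 35%.
implied_later_stages = 0.037 / 0.106
```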
Assessing the Equivalence of the Web and Phone Groups
This study was primarily designed to evaluate differences in the way people respond to questions in different modes. In order to isolate the effects of the mode of interview itself, it is essential that comparisons between the Web and telephone groups are not confounded by systematic differences in the composition of each group. Although panel members were randomly assigned to each of the groups, if one mode or the other disproportionately represents people with particular characteristics, then differences in the response patterns may be due to nonresponse rather than the measurement itself. Because all of the panel members were recruited from the same large telephone survey, we know a great deal about both the panel members who responded and those who did not. This includes their demographic characteristics as well as their partisan and ideological leanings. In this section, we will take a look at how different subgroups responded in each mode.
The overall completion rates for each mode were nearly identical, with 64% of panelists assigned to the Web group completing the survey, compared with 63% of panelists assigned to take the survey by phone. With several notable exceptions, response was largely consistent within demographic subgroups. Web and phone response did not differ significantly by age, education, marital status, income or census region. Completion varied somewhat by mode with respect to sex, race and whether the respondent lives in an urban, suburban or rural location. Women were 6 percentage points more likely to complete the survey on the Web than on the phone, whereas the completion rate for men did not differ significantly between modes.
Non-Hispanic whites and Hispanics appear slightly more likely to respond in the Web group than the phone group. Non-Hispanic blacks showed the most pronounced difference, with a 56% completion rate in the phone group, compared with a 42% completion rate on the Web. Whereas urban panelists were equally likely to respond in either mode, suburban response was 6 percentage points higher on the Web, while response in rural areas was 10 percentage points higher by phone. Because the Web and phone samples are independently weighted to national parameters on all of the variables just described, the minor differences in the composition of the two groups that resulted from differential response propensities are corrected.
Because much of the Pew Research Center’s work relates to politics and public policy, the effects of survey design decisions on the partisan and ideological makeup of our samples is of particular importance. We do see some evidence of a mode preference along ideological lines. Panelists who identified themselves as very conservative in the recruitment survey were 11 percentage points more likely to respond if they were in the phone group. On the other end of the scale, panelists who identified as very liberal are 4 percentage points more likely to respond when in the Web group. The pattern is similar but not identical for partisanship.11 Here, Republicans and independents who lean Republican are only 2 percentage points more likely to respond in the phone group, while Democrats and those who lean Democratic are 2 percentage points more likely to respond by Web. The largest effect is found among independents who do not lean toward either party, who are 7 percentage points more likely to complete the survey in the Web group.
Despite these differences within groups in the likelihood of response, the overall demographic and partisan distributions among respondents in each mode group are remarkably similar. Prior to weighting, women make up 51% of the Web group and 49% of the phone group. Although non-Hispanic blacks were significantly more likely to complete the survey by phone, the proportion in the phone group is only 3 percentage points higher than in the Web group (9% and 6% respectively). The percentage of respondents with only a high school education or less is 4 points higher in the phone group than in the Web group. The phone group is slightly more rural, more conservative (37% very or somewhat conservative on the phone vs. 32% on the Web) and has a higher proportion of Republicans and Republican leaners than the Web group. After the groups are weighted to account for nonresponse, these differences are all largely reduced.
One sizeable difference in response that is not completely adjusted for in weighting is the regularity with which panelists responded to prior surveys. The completion rate for panelists who had responded to all three prior waves was 97% for the Web group, compared with 83% for the phone group. In the Web group, the completion rate for panelists who had missed one or more waves was 32%, compared with 44% for the phone group. This is consistent with the notion that despite all of these panelists having access to the internet, some people are easier to reach and recruit by phone than Web. After weighting, 29% of the Web group had missed one or more prior waves, compared with 43% in the phone group.
Despite this difference, the demographic distributions of the two experimental groups remain quite similar. Moreover, we repeated several of our analyses while controlling for the effects of response to previous waves, and our findings with and without these controls were very similar. The sole exception involved questions on internet use and technology. The Web-mode respondents were more likely to report using the internet “constantly” than the phone respondents, possibly because people who are more frequent internet users are also more likely to respond to a Web-based survey invitation. The telephone sample brought in respondents who are less frequent internet users and therefore less likely to respond to a Web-based survey invitation. In more technical terms, there appears to be a strong, direct association between the response mechanism and outcomes pertaining to frequency of internet use.
- For example, in a survey of university alumni, individuals were much more likely to rate a question as sensitive if their answer would place them in a socially undesirable category. See Kreuter, Frauke, Stanley Presser and Roger Tourangeau. 2008. “Social Desirability Bias in CATI, IVR, and Web Surveys: The Effects of Mode and Question Sensitivity.” Public Opinion Quarterly. [Omitted from the excerpted text];
- For an overview of the effects of interviewer presence on answers to sensitive questions, see Tourangeau, Roger and Ting Yan. 2007. “Sensitive Questions in Surveys.” Psychological Bulletin;
- Krosnick, Jon A. and Duane F. Alwin. 1987. “An Evaluation of a Cognitive Theory of Response-Order Effects in Survey Measurement.” Public Opinion Quarterly;
- Wave 1 of the American Trends Panel was conducted from March 19 to April 29, 2014. Respondents were asked how often they use Facebook, Twitter, Google Plus, YouTube and LinkedIn. [Omitted from the excerpted text];
- See the Methodological Appendix for a discussion of patterns in the completion rates for different subgroups;
- See Tourangeau, Roger, Lance J. Rips and Kenneth Rasinski. 2000. “The Psychology of Survey Response.” Cambridge University Press, chapter 3, for an explanation of how memory and recall function as part of the survey response process;
- Prior, Markus, and Arthur Lupia. 2008. “Money, Time, and Political Knowledge: Distinguishing Quick Recall and Political Learning Skills.” American Journal of Political Science, pages 169-183. Prior and Lupia describe this process and suggest that when respondents have more time, the survey is not only measuring stored knowledge but also the ability to search for and obtain relevant information. [Omitted from the excerpted text];
- For example, Kreuter, Presser and Tourangeau were able to compare respondent self-reports to administrative records on a number of sensitive items pertaining to academic performance such as failing classes or being placed on academic probation. Respondents who belonged to the sensitive category (e.g., having failed a class) were more likely to falsely deny it when surveyed by an interviewer over the phone than if the survey was self-administered via the Web. See Kreuter, Frauke, Stanley Presser and Roger Tourangeau. 2008. “Social Desirability Bias in CATI, IVR, and Web Surveys: The Effects of Mode and Question Sensitivity.” Public Opinion Quarterly.
- Ye, Cong, Jenna Fulton and Roger Tourangeau. 2011. “More Positive or More Extreme? A Meta-Analysis of Mode Differences in Response Choice.” Public Opinion Quarterly. [Omitted from the excerpted text];
- When data collection for Pew Research Center’s 2014 Political Polarization and Typology Survey began, non-internet users were subsampled at a rate of 25%, but a decision was made shortly thereafter to invite all non-internet users to join. In total, 83% of non-internet users were invited to join the panel. [Omitted from the excerpted text];
- This section refers to party identification as it was measured in Pew Research Center’s 2014 Political Polarization and Typology Survey from which the American Trends Panel was recruited. Party identification was asked again as part of the mode study, but that data is not available for nonrespondents to the mode study.
About This Report
This report is a collaborative effort based on the input and analysis of the following individuals:
Scott Keeter, Director, Survey Research
Kyley McGeeney, Research Methodologist
Ruth Igielnik, Research Analyst
Andrew Mercer, Research Methodologist
Nancy Mathiowetz, University of Wisconsin-Milwaukee
Claudia Deane, Vice President, Research
Cary Funk, Associate Director, Research
Jeff Gottfried, Research Associate
Courtney Kennedy, Abt SRBI
Charles DiSogra, Abt SRBI
Chintan Turakhia, Abt SRBI
Nick Bertoni, Abt SRBI
Molly Caldaro, Abt SRBI
Marci Schalk, Abt SRBI
UPDATE (12/22/2015): Polling company Morning Consult just released their similar mode effect study titled Why does Donald Trump perform better in online versus live telephone polling? (PDF). This 13-page report is interesting as it also looks at Interactive Voice Response (IVR) as well as phone and online surveying, with about 800 respondents in each of the three categories. The reason we’re adding this update is the mode effect for IVR surveying, which some vendors provide to clinics and dispensers for patient satisfaction surveys.
From the LA Times:
Polls may actually underestimate Trump’s support, study finds
Donald Trump leads the GOP presidential field in polls of Republican voters nationally and in most early-voting states, but some surveys may actually be understating his support, a new study suggests.
The analysis, by Morning Consult, a polling and market research company, looked at an odd occurrence that has cropped up repeatedly this year: Trump generally has done better in online polls than in surveys done by phone.
The firm conducted an experiment aimed at understanding why that happens and which polls are more accurate — online surveys that have tended to show Trump with support of nearly four-in-10 GOP voters or the telephone surveys that have typically shown him with the backing of one-third or fewer.
Their results suggest that the higher figure probably provides the more accurate measure. Some significant number of Trump supporters, especially those with college educations, are “less likely to say that they support him when they’re talking to a live human” than when they are in the “anonymous environment” of an online survey, said the firm’s polling director, Kyle Dropp.
With Trump dominating political debates in both parties, gauging his level of support has become a crucial puzzle. The Morning Consult study provides one piece of the solution, although many other uncertainties remain.
Among the complicating factors is this: The gap between online and telephone surveys has narrowed significantly in surveys taken in the last few weeks. That could suggest that Republicans who were reluctant to admit to backing Trump in the past have become more willing to do so recently. [This dynamic element was not addressed in the Pew study above.]
Still, the Morning Consult experiment sheds considerable light on an issue that has puzzled pollsters for months.
The firm polled 2,397 potential Republican voters earlier this month, randomly assigning them to one of three different methods — a traditional telephone survey with live interviewers calling landlines and cellphones, an online survey and an interactive dialing technique that calls people by telephone and asks them to respond to recorded questions by hitting buttons on their phone. [Emphasis added.]
We’ll pause here to discuss IVR for patient satisfaction surveys: as we discussed above, telephone surveying of the hearing-impaired community involves additional factors which must be addressed:
- General age-related cognitive losses;
- Working memory issues (more here);
- Additional cognitive losses from long-term untreated hearing loss;
- Poor transmitted audio quality;
- Poor speech production by the announcer;
- Additional “cognitive overload” due to poor phone speech perception;
- Phone captioning errors on CaptionCall or CapTel phones.
By randomly assigning people to the three different approaches and running all at the same time, the researchers hoped to eliminate factors that might cause results to vary from one poll to another.
The experiment confirmed that “voters are about six points more likely to support Trump when they’re taking the poll online than when they’re talking to a live interviewer,” said Dropp.
The most telling part of the experiment, however, was that not all types of people responded the same way. Among blue-collar Republicans, who have formed the core of Trump’s support, the polls were about the same regardless of method. But among college-educated Republicans, a significant difference appeared, with Trump scoring 9 points better in the online poll.
The most likely explanation for that education gap, Dropp and his colleagues believe, is a well-known problem known as social-desirability bias — the tendency of people to not want to confess unpopular views to a pollster.
Blue-collar voters don’t feel embarrassed about supporting Trump, who is very popular in their communities, the pollsters suggested. But many college-educated Republicans may hesitate to admit their attraction to Trump, the experiment indicates.
In a public setting such as the Iowa caucuses, where people identify their candidate preference in front of friends and neighbors, that same social-desirability bias may hold sway.
But in most primaries, where voters cast a secret ballot, the study’s finding suggests that anonymous online surveys — the ones that typically show Trump with a larger lead — provide the more accurate measure of his backing.
“It’s our sense that a lot of polls are under-reporting Trump’s overall support,” Dropp said.
- Pew produces excellent polling, but it can sometimes yield scary results;
- Check out the “methodology” in this brand new (12/20/15) CBS/YouGov poll — You’ll run screaming from the room!
- Adequate sample variance in audiologic research is a pet peeve of ours: oftentimes we read hearing aid and (especially) CI speech perception research where, thanks to shoddy study design, the scores are so clustered at or near the ceiling that the “rationalized arcsine transform” is needed to linearize the proportions and pry them apart. Although we have no connection to this website, this particular article on the subject is spot-on.
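For readers who haven’t met it, that transform is Studebaker’s (1985) rationalized arcsine units (RAU): it stretches out the compressed ends of the proportion-correct scale while leaving midrange scores roughly at their raw percentage values. A minimal sketch (our implementation from the published formula, not code from the article linked above):

```python
import math

def rau(correct, total):
    """Studebaker's (1985) rationalized arcsine transform.

    Proportion-correct scores are compressed near 0% and 100%;
    RAU stretches those regions back out. Midrange scores stay
    near their raw percentage, while floor/ceiling scores fall
    outside 0-100, restoring usable variance for analysis."""
    t = (math.asin(math.sqrt(correct / (total + 1)))
         + math.asin(math.sqrt((correct + 1) / (total + 1))))
    return (146.0 / math.pi) * t - 23.0

mid = rau(13, 26)      # 50% correct stays at 50.0 RAU
ceiling = rau(25, 26)  # ~96% correct maps well above 100 RAU
```

Note how a near-ceiling score is pushed beyond 100 RAU while a chance-level score is pushed below 0, which is exactly the “prying apart” of clustered scores described above.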