CoolData blog

18 November 2010

Survey says … beware, beware!

Filed under: Alumni, skeptics, Surveying. Posted by kevinmacdonell @ 4:45 pm

I love survey data. But sometimes we get confused about what it’s really telling us. I don’t claim to be an expert on surveying, but today I want to talk about one of the main ways I think we’re led astray. In brief: Surveys would seem to give us facts, or “the truth”. They don’t. Surveys reveal attitudes.

In higher education, surveying is essential for benchmarking constituent engagement: it identifies programs that are underperforming, as well as programs that are doing well and would therefore be risky to change. Making intelligent, data-driven decisions in these areas can strengthen programming, enhance engagement, and ultimately increase giving. And there’s no doubt that the act of responding to a survey, the engagement score that might result, and the responses to individual questions or groups of questions are all predictive of giving. I have found this myself in my own predictive modeling at two universities.

But let’s not get carried away. Survey data can be a valuable source of predictor variables, but it’s a huge leap from making that admission to saying that survey data trumps everything.

I know of at least one vendor working in the survey world who does make that leap. This vendor believes surveying is THE singular best way to predict giving, and that survey responses have it all over the regular practice of predictive modeling using variables mined from a database. Such “archival” data provides “mere correlates” of engagement. Survey data provides the real goods.

I see the allure. Why would we put any stock in some weak correlation between the presence of an email address and giving, when we can just ask them how they feel about giving to XYZ University?


I have incorporated survey data in my own models, data that came from two wide-ranging, professionally-designed, Likert-type surveys of alumni engagement. Survey data is great because it’s fresh, independent of giving, and revealing of attitudes. It is also extremely biased in favour of highly-engaged alumni, and is completely disconnected from reality when it comes to gathering facts as opposed to attitudinal data.

Let me demonstrate the unreliability of survey data with regard to facts. Here are a few examples of statements and responses (one non-Likert), gathered from surveys of two institutions:

  • “I try to donate every year” — 946 individuals answered “agree” or “strongly agree” — but 12.3% of those 946 had no lifetime giving.
  • “I support XYZ University regularly” — 1,001 individuals answered “agree” or “strongly agree” — but 18.7% of them had no lifetime giving.
  • “Have you ever made a charitable gift to XYZ University (Y/N)?” — 1,690 individuals said “Yes” — but 8.1% of them had no lifetime giving.
  • “I support XYZ University to the best of my capacity” — 1,498 individuals answered “agree” or “strongly agree” — but 39.6% of them had no lifetime giving!

And, even stranger:

  • “I try to donate every year” — 1,371 answered “disagree” or “strongly disagree” — but 27.7% of those respondents were in fact donors!
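For readers who want to run the same kind of cross-check on their own data, here is a minimal Python sketch. It is not the method used to produce the figures above; the respondent IDs, answers, and gift amounts are invented toy data, and the function name `no_giving_share` is hypothetical. The second half simply converts the percentages quoted above into approximate head counts. Because those percentages are rounded, the counts are estimates.

```python
from typing import Iterable, Mapping


def no_giving_share(respondent_ids: Iterable[str],
                    lifetime_giving: Mapping[str, float]) -> float:
    """Fraction of respondents whose recorded lifetime giving is zero or missing."""
    ids = list(respondent_ids)
    if not ids:
        raise ValueError("no respondents to check")
    non_donors = sum(1 for rid in ids if lifetime_giving.get(rid, 0.0) <= 0)
    return non_donors / len(ids)


# Hypothetical toy data: survey answers to "I try to donate every year"
# and lifetime giving pulled from the database (A5 has no giving record).
survey = {
    "A1": "agree",
    "A2": "strongly agree",
    "A3": "disagree",
    "A4": "agree",
    "A5": "strongly disagree",
}
giving = {"A1": 250.0, "A2": 0.0, "A3": 40.0, "A4": 1200.0}

# Respondents who claim to give, checked against what the database records.
agreers = [rid for rid, answer in survey.items()
           if answer in {"agree", "strongly agree"}]
share = no_giving_share(agreers, giving)
print(f"{share:.1%} of agreeing respondents have no lifetime giving")

# Approximate head counts implied by the rounded percentages in the post:
# (statement, number of respondents, share contradicted by the database)
published = [
    ("try to donate every year (agree)", 946, 0.123),
    ("support regularly (agree)", 1001, 0.187),
    ("ever made a gift (yes)", 1690, 0.081),
    ("support to best of capacity (agree)", 1498, 0.396),
    ("try to donate every year (disagree)", 1371, 0.277),
]
implied_counts = {label: round(n * pct) for label, n, pct in published}
for label, count in implied_counts.items():
    print(f"{label}: roughly {count} respondents")
```

The check treats a missing giving record the same as zero giving, which is an assumption; a database that records pledges or soft credits separately would need a different rule.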

Frankly, if I asked survey-takers how many children they have, I wouldn’t trust the answers.

This disconnect from reality actually works in my favour when I am creating predictive models, because I have some assurance that the responses to these questions are not just a proxy for ‘giving’, but rather a far more complicated thing that has to do with attitude, not facts. But in no model I’ve created has survey data (even carefully-selected survey data strongly correlated with giving) EVER been more predictive than the types of data most commonly used in predictive models: notably age/class year, the presence/absence of certain contact information, marital status, employment information, and so on.

For the purposes of identifying weaknesses or strengths in constituent engagement, survey data is king. For predicting giving in its various forms, survey data and engagement scores are just more variables to test and work into the model, nothing more and nothing less, and certainly not something magical or superior to the data that institutions already have in their databases waiting to be mined. I respect the work that people are doing to investigate causation in connection with giving. But when they criticize the work of data miners as “merely” dealing in correlation, well, that I have a problem with.


1 Comment »

  1. I had a quantitative analysis professor who told us that most organizations’ surveys were little more than customer satisfaction polls of people who already tended to have a favorable view. If you weren’t getting 80% approval in the top 2 boxes it meant you had a huge PR problem but not necessarily any more than that.

    I often find that organizations do little to keep respondents from taking a survey more than once. If someone sends me an online link, I always try to take it a second time just to see if they have taken measures to prevent duplication, and I inform them if I can skew the results.

    I am often surprised to find how many organizations don’t even anticipate this even when they are asking about contentious issues. I know of examples where one person (or a small group of people) purposely skewed the results just to get the organization to change procedures/policies to suit him/them.

    Comment by artem1s — 19 November 2010 @ 12:18 pm
