CoolData blog

4 November 2013

Census Zip Code data versus internal data as predictors of alumni giving

Guest post by Peter Wylie and John Sammis

Thanks to data available via the 2010 US Census, for any educational institution that provides us zip codes for the alums in its advancement database, we can compute such things as the median income and the median house value of the zip code in which the alum lives.
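For readers who want to try this kind of match on their own file, here is a minimal Python sketch of the lookup step: it reduces each alum's zip code to five digits and attaches the zip-level medians. The field names (`zip`, `median_income`, `median_house_value`) and the shape of the lookup table are assumptions made for illustration, not the layout of any particular Census file or of our own software.

```python
def normalize_zip(raw):
    """Reduce a ZIP or ZIP+4 value to a 5-digit string, or None if unusable."""
    digits = "".join(ch for ch in str(raw) if ch.isdigit())
    if len(digits) in (5, 9):
        return digits[:5]            # plain ZIP, or ZIP+4 with the suffix dropped
    if 0 < len(digits) < 5:
        return digits.zfill(5)       # restore leading zeros lost in a spreadsheet
    return None                      # empty or ambiguous length


def attach_census(alumni, census_by_zip):
    """Return copies of the alumni records with zip-level medians attached.

    `census_by_zip` is a hypothetical dict keyed by 5-digit zip code.
    Records with no matching zip get None for both medians.
    """
    enriched = []
    for rec in alumni:
        zip5 = normalize_zip(rec.get("zip", ""))
        census = census_by_zip.get(zip5, {})
        enriched.append({
            **rec,
            "zip5": zip5,
            "median_income": census.get("median_income"),
            "median_house_value": census.get("median_house_value"),
        })
    return enriched
```

Records that fail to match are kept rather than dropped, so you can see how many alums have no usable zip code before you start modeling.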

Now, we tend to focus on internal data rather than external data. For a very long time the two of us have been harping on something that may be getting a bit tiresome: the overemphasis on finding outside wealth data in major giving, and the underemphasis on looking at internal data. Our problem has been that we’ve never had a solid way to systematically compare these two sources of data as they relate to the prediction of giving in higher education.

John Sammis has done a yeoman’s job of finding a very reasonably priced source for this Census data as well as building some add-ons to our statistical software package – add-ons that allow us to manipulate the data in interesting ways. All this has happened within the last six months or so, and I’ve been having a ball playing around with this data, getting John’s opinions on what I’ve done, and then playing with the data some more.

The data for this piece come from four private, small to medium-sized higher education institutions in the eastern half of the United States. We’ll show you a smidgeon of the things we’ve uncovered. We hope you’ll find it interesting, and we hope you’ll decide to do some playing of your own.

Download the full, printer-friendly PDF of our study here (free, no registration required): Census ZIP data Wylie & Sammis.

20 May 2010

External data project: “Do Not Call” as a predictor

Filed under: External data, Phonathon, Predictor variables. Posted by kevinmacdonell @ 7:40 am

Here’s a little variable-creation project which shouldn’t cost much and might yield new insights into the behaviour of your database constituents, especially in connection with propensity to give over the phone: Have they registered their home phone numbers with the Do Not Call list?

A number of US states and some countries such as Canada and the United Kingdom have created these registries for citizens who want to avoid being solicited at home by telemarketers. Here in Canada, all a person has to do to register is go to a website and enter their phone number(s). Commercial telemarketers are prohibited from calling any numbers on the list; violations bring stiff penalties. (If anyone can catch them.)

Charities such as my employer are exempt from the ban, so it doesn’t affect the phonathon program (although we must adhere to the stated preferences of our alumni and maintain an internal do-not-call list). But it stands to reason that there might be some connection between not wanting a call from a telemarketer, and not wanting any phone solicitation whatsoever, including from us. If being registered with the DNCL is negatively correlated with phone-solicited giving, we might gain a useful predictive variable about the people who have not already taken the step of adding themselves to our internally-maintained exclusion list.

In Canada, organizations may sign up to access this list of banned phone numbers. (Visit National DNCL page.) If they are businesses seeking to solicit Canadians by phone, they have no choice: They have to sign up and pay for the lists in order to exclude them from their calling efforts. But there’s nothing to stop exempt charities from signing up as well for the purposes of research.

So that’s what I did. In January 2009, I registered our university as a user of the Do Not Call service and downloaded six lists of phone numbers. Each list, covering a single area code, cost $55. I chose the codes that captured the primary geographic areas where our alumni live. The top six area codes covered 80% of living alumni. After that it would have been a case of diminishing returns: I would have had to purchase seven more area codes to get to 90%. For this experiment, I was good with 80%.
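If you want to work out your own cut-off before buying lists, the coverage arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes your home phone numbers are already stored as 10-digit strings, and the function names are my own invention.

```python
from collections import Counter


def area_code_coverage(phone_numbers):
    """Rank area codes by number of alumni and track cumulative coverage.

    Returns a list of (area_code, count, cumulative_share) tuples,
    largest area code first. Numbers that are not 10 digits are ignored.
    """
    counts = Counter(
        p[:3] for p in phone_numbers if len(p) == 10 and p.isdigit()
    )
    total = sum(counts.values())
    rows, running = [], 0
    for code, n in counts.most_common():
        running += n
        rows.append((code, n, running / total))
    return rows


def codes_needed(rows, target_share):
    """How many of the top area codes are needed to reach target_share?"""
    for position, (_, _, cumulative) in enumerate(rows, start=1):
        if cumulative >= target_share:
            return position
    return None  # target not reachable with the numbers on file
```

Calling `codes_needed(rows, 0.80)` and then `codes_needed(rows, 0.90)` shows how many extra lists the last ten percentage points would cost you.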

What you get is just a list of phone numbers, but these files are huge: several megabytes in some cases, even in compressed format. If you purchase more than one file, open the smallest one in Excel to inspect the data. In order to match up these numbers with the home phone numbers in your database, you’ll need to ensure that both sets are formatted in exactly the same way (for example, no dashes and the full 10 digits).
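Here is one way the clean-up might look in Python, as a rough sketch rather than a recipe. It assumes North American numbers, strips everything that is not a digit, and drops a leading country code of 1; the function name is hypothetical.

```python
import re


def normalize_phone(raw):
    """Return a bare 10-digit phone string, or None if the value is unusable."""
    digits = re.sub(r"\D", "", str(raw))      # remove dashes, spaces, brackets
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]                   # drop the leading country code
    return digits if len(digits) == 10 else None
```

Anything that does not come out as exactly 10 digits (a seven-digit local number, or a number with an extension tacked on) returns `None`, so it is worth counting those before you trust your match rate.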

That done, you can simply bring in the file as a new variable in your model. (If you’re using Data Desk, follow these directions for adding a new variable to an existing data set.) You may need to create a variable name, as the file you’ve downloaded might not have a column label, in which case the first phone number will be used as the name by default. Code your matches as ‘1’ and everyone else as ‘0’, and test the results against ‘Giving’, or whatever your predicted value happens to be. Keep in mind that your list is specific to a geographic region; while you’re testing for a Do Not Call effect, you will want to exclude records outside the country or state you’re studying.
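If you work outside Data Desk, the same coding step can be sketched in Python. The field names (`home_phone`, `phonathon_giving`) and both function names are assumptions for illustration, and the phone numbers are assumed to be already reduced to 10-digit strings. Comparing donor rates between the two groups is only a quick first look, not a substitute for testing the variable in your model.

```python
def flag_dnc(records, dnc_numbers, purchased_area_codes):
    """Add a 0/1 'dnc' flag to records inside the purchased area codes.

    Records with no home phone, or with an area code we did not buy a
    list for, are left out, since we cannot know their registry status.
    """
    dnc = set(dnc_numbers)                    # set gives fast membership checks
    flagged = []
    for rec in records:
        phone = rec.get("home_phone")
        if not phone or phone[:3] not in purchased_area_codes:
            continue
        flagged.append({**rec, "dnc": 1 if phone in dnc else 0})
    return flagged


def donor_rate_by_flag(flagged, giving_field="phonathon_giving"):
    """Share of records with any giving, split by the 'dnc' flag."""
    totals = {0: [0, 0], 1: [0, 0]}           # flag -> [donors, records]
    for rec in flagged:
        bucket = totals[rec["dnc"]]
        bucket[1] += 1
        if (rec.get(giving_field) or 0) > 0:
            bucket[0] += 1
    return {
        flag: (donors / n if n else None)
        for flag, (donors, n) in totals.items()
    }
```

Swapping `giving_field` for a lifetime giving column lets you run the same comparison against a different predicted value.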

I’ve tested against both lifetime giving AND phonathon-only giving, and got some interesting results, which I’ll write about later. Give it a try.
