CoolData blog

9 June 2011

Young alumni are a whole different animal

Filed under: Alumni, Annual Giving, Model building, Predictor variables — kevinmacdonell @ 12:23 pm

My Phonathon program hires about thirty students a year. These are mature, reliable employees whom I’d recommend to any prospective future employer. They’re also, well, young. When I was in university, many of them hadn’t even been born.

So, yeah, they’re different from me. They’re different in terms of girth, taste in music and facility with pop-culture references. And they’re different in the data.

Grads who are just beginning their careers as alumni will lack most of the engagement-related attributes we usually rely on for predictive models: event attendance, volunteer activity, employment updates, a business phone. Therefore, variables that relate to their recent student experience are likely to loom larger for them than for their older counterparts. At the same time, recent grads tend to have a richer variety of data in their records, as database usage has increased across the enterprise through the years.

These two differences mark young alumni as a distinct population: One, differences in the distribution of variables that all alumni share, and two, the existence of variables that only younger alumni can have.

It makes me wonder why I’m still lumping young alumni in with older alumni in my predictive models. You might recall that a while ago I was bragging about how well my Phonathon model worked to predict propensity to give in response to phone solicitation. I also mentioned that, unfortunately, the model under-performed in predicting acquisition of young donors.

Okay, it didn’t under-perform — it failed. I concluded that young alumni need their own, separate model.

Where do we draw the line for “young alumni”? One possibility is that we go with our program’s definition of young alums — for me, that’s anyone who has earned a degree in any of the past three years and is under 35. Others might use graduates of the last decade.

This might be fine, but keep in mind that the training sample in a predictive model doesn’t have to follow the strict definition of the population that the appeal is targeting. We need a critical mass of donors in our sample population in order to train the model, so we might be more successful if we drew a larger, more loosely defined sample. Our sample will include some alumni who are slightly older than the alumni who will get the “young alum” appeal. That’s okay, because they’re in the sample for only one reason: training the model.
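To make the distinction concrete, here is a minimal Python sketch of drawing a loose training sample alongside a strict appeal list. The `Alum` record, its field names, and both functions are hypothetical, and the six records are invented toy data. The cutoffs are only examples: “past three years” is read here as at most three years since graduation, and the looser sample uses ten years and an age of 35 or younger.

```python
from dataclasses import dataclass


@dataclass
class Alum:
    alum_id: int
    grad_year: int
    age: int
    is_donor: bool


def appeal_target(alumni, current_year, max_years_out=3, max_age=35):
    """Strict program definition (assumed): recent degree and under max_age."""
    return [
        a for a in alumni
        if 0 <= current_year - a.grad_year <= max_years_out and a.age < max_age
    ]


def training_sample(alumni, current_year, max_years_out=10, max_age=35):
    """Looser definition used only to train the model."""
    return [
        a for a in alumni
        if 0 <= current_year - a.grad_year <= max_years_out and a.age <= max_age
    ]


# Invented toy records, for illustration only.
alumni = [
    Alum(1, 2011, 22, False),
    Alum(2, 2009, 24, True),
    Alum(3, 2005, 28, True),
    Alum(4, 2002, 31, True),
    Alum(5, 1995, 38, True),
    Alum(6, 2010, 41, False),
]

target = appeal_target(alumni, current_year=2011)
sample = training_sample(alumni, current_year=2011)

print("appeal list:", [a.alum_id for a in target],
      "donors:", sum(a.is_donor for a in target))
print("training sample:", [a.alum_id for a in sample],
      "donors:", sum(a.is_donor for a in sample))
```

In this toy data the strict appeal list contains one donor while the looser sample contains three, which is the reason for widening the sample before training.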

However you draw the line, the distinction rests on the answer to this question: Is the data that describes one group different from the data that describes another? They may all be alumni, but can they also be thought of as separate populations, in terms of the data that was collected on them?

If you audit the data in certain tables, you might be able to find an “information bump”. That’s what I call the approximate year in which an institution started collecting and storing a lot more information on incoming students. In the data I’m familiar with, that bump has occurred in the last 10 to 15 years.
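One rough way to look for such a bump is to compute, for each entry year, the share of records in which a given field is filled in, and then look for the biggest year-over-year increase. The sketch below assumes each record is a dictionary with a hypothetical `entry_year` key; the `citizenship` field and all of the counts are invented for illustration.

```python
from collections import defaultdict


def fill_rate_by_year(records, field, year_key="entry_year"):
    """Share of records per year in which `field` is populated."""
    filled = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        year = rec[year_key]
        total[year] += 1
        if rec.get(field) not in (None, ""):
            filled[year] += 1
    return {year: filled[year] / total[year] for year in sorted(total)}


def largest_jump(rates):
    """Year with the biggest increase in fill rate over the previous year."""
    years = sorted(rates)
    best_year, best_gain = None, 0.0
    for prev, curr in zip(years, years[1:]):
        gain = rates[curr] - rates[prev]
        if gain > best_gain:
            best_year, best_gain = curr, gain
    return best_year, best_gain


# Invented toy data: (entry year, records with the field filled, total records).
toy_counts = [(1994, 0, 4), (1995, 1, 4), (1996, 1, 4), (1997, 4, 4), (1998, 4, 4)]
records = [
    {"entry_year": year, "citizenship": "CA" if i < n_filled else None}
    for year, n_filled, n_total in toy_counts
    for i in range(n_total)
]

rates = fill_rate_by_year(records, "citizenship")
bump_year, bump_gain = largest_jump(rates)
print(rates)
print("largest jump:", bump_year, bump_gain)
```

A single field can be noisy, so it is probably worth repeating the audit across several fields and looking for a year that keeps turning up.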

One of the most noticeable areas where data recording has increased is in personal information. Nowadays you can find Social Security Number (or in Canada, Social Insurance Number), religion, ethnicity, next-of-kin information, citizenship, driver’s license status, even eye and hair colour. Auditing these fields will tell you when data collection was ramped up, but probably won’t yield many useful predictors as they don’t have much to do with engagement. Certain types of personal information may also be off limits to you.

Investigate personal information if you can, but be sure to look around for other, more relevant data. Some examples:

  • Whether they lived in residence — If you don’t have direct access to this, the answer might be lurking in the alum’s past address data.
  • Athletics involvement — Count of activities, or a yes/no indicator.
  • Club and society activities — Count of activities, or a yes/no indicator.
  • Greek society membership — Yes/no.
  • Whether they were transfer students or received all of their degree credits from your institution
  • Whether they were employed on campus while a student
  • Whether they were recipients of awards, prizes, scholarships or bursaries
  • Whether they signed up for Email for Life, or otherwise kept their university email address or other university login active — In my data, more than 98% of the most recent grad class has an active university login. That drops to about 84% for the grad class of 2010, then 38% for 2009. The percentages continue to fall gradually from there. This attrition effect might hide the fact that retaining a student login past graduation is a strong indicator of affinity. I will write more on this topic in a future post.
  • Online community membership or activity
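Several of the items above come down to either a count of activities or a yes/no indicator. As a minimal sketch, assuming student activities are available as simple (alum ID, activity type) rows, both can be derived in one pass. The function name, the row layout, and the IDs are hypothetical.

```python
from collections import Counter


def activity_features(alum_ids, activity_rows, kind):
    """Count and yes/no flag per alum for one type of student activity."""
    counts = Counter(
        alum_id for alum_id, row_kind in activity_rows if row_kind == kind
    )
    # Counter returns 0 for alumni with no rows of this type.
    return {
        alum_id: {"count": counts[alum_id], "flag": int(counts[alum_id] > 0)}
        for alum_id in alum_ids
    }


# Invented toy rows: (alum ID, activity type).
alum_ids = [101, 102, 103]
activity_rows = [
    (101, "athletics"),
    (101, "athletics"),
    (101, "club"),
    (103, "club"),
]

athletics = activity_features(alum_ids, activity_rows, "athletics")
clubs = activity_features(alum_ids, activity_rows, "club")
print(athletics)
print(clubs)
```

Alumni with no rows at all still get a zero count and a zero flag, so every record in the sample has a value for the variable.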

Oh, and don’t ignore the usual variables, such as marital status! In any conventional predictive model I’ve ever worked on, having a marital status of “single” in the database was a strong negative predictor of giving. But when I reduced my sample to graduates from the past ten years who were no older than 35, I was surprised to see that predictor turn into a strong positive. Although married alumni were still more likely to give, the “singles” were right behind them — and far ahead of the alumni for whom the marital status was missing. In my new model, I will use both “married” and “single” as predictors. Although the marrieds are more likely to be donors, there are relatively few of them; being coded single in our database could well prove to be a leading predictor of giving. (You will need to know, of course, why some alums are coded and others not. I’m still investigating.)
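A quick check of this kind can be sketched as follows, treating a missing marital status as its own category rather than dropping it. The function names are hypothetical, and the records are invented toy data chosen only to mimic the pattern described above; they are not real results.

```python
def donor_rate_by_status(records):
    """Donor rate per marital status; missing values form their own group."""
    totals, donors = {}, {}
    for status, is_donor in records:
        key = status if status else "missing"
        totals[key] = totals.get(key, 0) + 1
        donors[key] = donors.get(key, 0) + int(is_donor)
    return {key: donors[key] / totals[key] for key in totals}


def marital_indicators(status):
    """Two 0/1 predictors; a missing status scores 0 on both."""
    return {
        "is_married": int(status == "married"),
        "is_single": int(status == "single"),
    }


# Invented toy data: (marital status, is a donor).
records = (
    [("married", True)] * 1 + [("married", False)] * 1
    + [("single", True)] * 2 + [("single", False)] * 3
    + [(None, True)] * 1 + [(None, False)] * 4
)

rates = donor_rate_by_status(records)
print(rates)
print(marital_indicators("single"), marital_indicators(None))
```

Using two separate indicators leaves the uncoded alumni as the baseline group, so the model can weigh “married” and “single” independently against them.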

When September rolls around, I’ll be another three months older, and there’s nothing I can do about that. At least I’ll know my hard-working callers will be well-focused, talking to the recent grads who are most ready to make their very first gift to the Annual Fund.



  1. Great post Kevin! I think “young alumni” (say those under the age of 35) are a very interesting subset, both from a data perspective, and from a behavioral perspective.

    I do agree with your assessment, that from a data profile point of view, older alumni have great depth but a narrow array of fields. Younger alumni don’t have as much depth to their data, but a wider array of data points that are actively monitored/entered into the database. This can make for some complicated pan-generational comparisons.

    I also think you see some different behaviors in respect to phone use in general among younger alumni (read: younger generations). Many have their phones almost fused to their hands, but do not like actually talking (many prefer text). Overall I think phonathon participation is highest with older constituents, whereas a text campaign or email solicitation gets more engagement among younger populations.

    There are also some counterintuitive phone # “data forces” at play. Studies have shown that younger people are much more liberal with sharing information like their phone number (or posting things on Facebook), but many do not have landlines, and cell phones are not publicly searchable records. So they are more likely to give you the number if you ask, but it’s harder to find their number from a broader acquisition or research perspective.

    Good stuff…look forward to future posts on young alumni/giving behavior.

    Comment by Alexander Oftelie — 9 June 2011 @ 5:31 pm

    • Thanks for your comment … Yes, and there’s a big difference between having a phone number and being able to get to actually speak with someone, particularly someone with a mobile instead of a landline. I will be posting some new research shortly which deals specifically with the mounting problem of low contact rates.

      Comment by kevinmacdonell — 15 June 2011 @ 1:08 pm

  2. […] final note: This post follows a previous one called Young alumni are a whole different animal, which was about building models dedicated to predicting young-alumni giving. Variables such as the […]

    Pingback by Special variables for predicting young alumni giving « CoolData blog — 14 June 2011 @ 8:04 am

  3. What is your target variable?

    Comment by Justin Brasfield — 15 June 2011 @ 12:39 pm

    • I tested these predictors against a couple of target variables, one being lifetime giving (continuous) and the other a binary variable (is or is not a donor). The model I will eventually build based on these investigations will almost certainly be a logistic regression model with the ‘is a donor’ binary target.

      Comment by kevinmacdonell — 15 June 2011 @ 1:04 pm
