CoolData blog

2 May 2013

New twists on inferring age from first name

Filed under: Analytics, Coolness, Data Desk, Fun — kevinmacdonell @ 6:14 am

Not quite three years ago I blogged about a technique for estimating the age of your database constituents when you don’t have any relevant data such as birth date or class year. It was based on the idea that many first names are typically “young” or “old.” I expanded on the topic in a followup post: Putting an age-guessing trick to the test. Until now, I’ve never had a reason to guess someone’s age — alumni data is pretty well supplied in that department. This very month, though, I have not one but two major modeling projects to work on that involve constituents with very little age data present. I’ve worked out a few improvements to the technique which I will share today.

First, here’s the gist of the basic idea. Picture two women, named Freda and Katelyn. Do you imagine one of them as older than the other? I’m guessing you do. From your own experience, you know that a lot of young women and girls are named Katelyn, and that few if any older women are. Even if you aren’t sure about Freda, you would probably guess she’s older. If you plug these names into babynamewizard.com, you’ll see that Freda was a very popular baby name in the early 1900s, but fell out of the Top 1000 list sometime in the 1980s. On the other hand, Katelyn didn’t enter the Top 1000 until the 1970s and is still popular.

To make use of this information you need to turn it into data. You need to acquire a lot of data on the frequency of first names and how young or old they tend to be. If you work for a university or other school, you’re probably in luck: You might have a lot of birth dates for your alumni or, failing that, you have class years which in most cases will be a good proxy for age. This will be the source you’ll use for guessing the age of everyone else in your database — friends, parents and other person constituents — who don’t have ages. If you have a donor database that contains no age data, you might be able to source age-by-first name data somewhere else.

Back to Freda and Katelyn … when I query our database I find that the average age of constituents named Freda is 69, while the average age for Katelyn is 25. For the purpose of building a model, for anyone named Freda without an age, I will just assume she is 69, and for anyone named Katelyn, 25. It’s as simple as creating a table with two columns (First name and Average age), and matching this to your data file via First Name. My table has more than 13,500 unique first names. Some of these are single initials, and not every person goes by their first name, but that doesn’t necessarily invalidate the average age associated with them.
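Here is a minimal sketch of that two-column lookup in Python, in case you don't work in a query tool. The sample records, the function names and the upper-casing of names are my assumptions for illustration, not part of the original method.

```python
from collections import defaultdict

# Hypothetical records with known ages (e.g. alumni): (first_name, age).
known = [("Freda", 71), ("Freda", 67), ("Katelyn", 24), ("Katelyn", 26)]

def average_age_by_name(records):
    """Return {first_name: average age} from (first_name, age) pairs."""
    totals = defaultdict(lambda: [0, 0])  # name -> [sum of ages, count]
    for name, age in records:
        key = name.strip().upper()  # normalise case and stray spaces
        totals[key][0] += age
        totals[key][1] += 1
    return {name: s / n for name, (s, n) in totals.items()}

def lookup_age(first_name, table):
    """Return the average age for a first name, or None when there is no match."""
    return table.get(first_name.strip().upper())

age_table = average_age_by_name(known)
print(lookup_age("Freda", age_table))    # 69.0
print(lookup_age("katelyn", age_table))  # 25.0
print(lookup_age("Zebedee", age_table))  # None
```

Returning None for an unmatched name keeps "no estimate" distinct from a real value, so you can decide later what to plug in.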

I’ve tested this method, and it’s an improvement over plugging missing values with an all-database average or median age. For a data set that has no age data at all, it should provide new information that wasn’t there before — information that is probably correlated with behaviours such as giving.

Now here’s a new wrinkle.

In my first post on this subject, I noted that some of the youngest names in our database are “gender flips.” Some of the more recent popular names used to be associated with the opposite gender decades ago. This seems to be most prevalent with young female names: Ainslie, Isadore, Sydney, Shelly, Brooke. It’s harder to find examples going in the other direction, but there are a few, some of them perhaps having to do with differences in ethnic origin: Kori, Dian, Karen, Shaune, Mina, Marian. In my data I have close to 600 first names that belong to members of both sexes. When I calculate average age by First Name separately for each sex, some names end up with the exact same age for male and female. These names have an androgynous quality to them: Lyndsay, Riley, Jayme, Jesse, Jody. At the other extreme are the names that have definitely flipped gender, which I’ve already given examples of … one of the largest differences being for Ainslie. The average male named Ainslie is 54 years older than the average female of the same name. (In my data, that is.)

These differences suggest an improvement to our age-inferring method: Matching on not just First Name, but Sex as well. Although only 600 of my names are double-gendered, they include many popular names, so that they actually represent almost one-quarter of all constituents.
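As a rough sketch, averaging by First Name and Sex together might look like this in Python. The rows are made-up values chosen to mirror the Ainslie example, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical (first_name, sex, age) rows; the ages are invented for illustration.
rows = [
    ("Ainslie", "M", 80), ("Ainslie", "M", 76),
    ("Ainslie", "F", 23), ("Ainslie", "F", 25),
    ("Jody", "M", 45), ("Jody", "F", 45),
]

def average_age_by_name_and_sex(records):
    """Return {(first_name, sex): average age}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for name, sex, age in records:
        key = (name.upper(), sex)
        sums[key] += age
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

def male_female_gap(table, name):
    """Male average minus female average, or None if either is missing."""
    m = table.get((name.upper(), "M"))
    f = table.get((name.upper(), "F"))
    if m is None or f is None:
        return None
    return m - f

by_name_sex = average_age_by_name_and_sex(rows)
print(male_female_gap(by_name_sex, "Ainslie"))  # 54.0
print(male_female_gap(by_name_sex, "Jody"))     # 0.0
```

A gap near zero suggests an androgynous name; a large gap suggests a name that has flipped gender over the decades.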

Now here’s another wrinkle.

When we’re dealing with constituents who aren’t alumni, we may be missing certain personal information such as Sex. If we plan to match on Sex as well as First Name, we’ve got a problem. If Name Prefix is present, we can infer Sex from whether it’s Mr., Ms., etc., but unless the person doing the data entry was having an off day, this shouldn’t be an avenue available to us: Sex should already be filled in. (If you know it’s “Mrs.,” then why not put in F for Sex?) For those records that have no Sex recorded (or have a Sex of ‘N’), we need to make a guess. To do so, we return to our First Names query and the Sex data we do have.

In my list of 600 first names that are double-gendered, not many are actually androgynous. We have females named John and Peter, and we have males named Mary and Laura, but we all know that given any one person named John, chances are we’re talking about a male person. Mary is probably female. These may be coding errors or they may be genuine, but in any case we can use majority usage to help us decide. We’ll sometimes get it wrong — there are indeed boys named Sue — but if you have 7,000 Johns in your database and only five of them are female, then let’s assume (just for the convenience of data mining*) that all Johns are male.

So: Query your database to retrieve every first name that has a Sex code, and count up the instances of each. The default sex for each first name is decided by the highest count, male or female. To get a single variable for this, I subtract the number of females from the number of males for each first name. Since the result is positive for mostly male names and negative for mostly female names, I call it a “Maleness Score,” but you can do the reverse and call it a Femaleness Score if you wish! Results of zero are considered ties, or ‘N’.
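The counting step could be sketched like this in Python. The sample counts and function names are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical (first_name, sex_code) pairs for records that have a Sex code.
coded = (
    [("John", "M")] * 7 + [("John", "F")]
    + [("Mary", "F")] * 5
    + [("Riley", "M"), ("Riley", "F")]
)

def maleness_scores(records):
    """Return {first_name: count of males minus count of females}."""
    scores = Counter()
    for name, sex in records:
        if sex == "M":
            scores[name.upper()] += 1
        elif sex == "F":
            scores[name.upper()] -= 1
    return dict(scores)

def default_sex(score):
    """Map a Maleness Score to a default sex code; zero is a tie ('N')."""
    if score > 0:
        return "M"
    if score < 0:
        return "F"
    return "N"

scores = maleness_scores(coded)
for name in ("JOHN", "MARY", "RILEY"):
    print(name, scores[name], default_sex(scores[name]))
```

Only the sign of the score matters for picking a default. The size of the score tells you how much evidence is behind the guess, which you may want to keep as a variable in its own right.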

At this point we’ve introduced a bit of circularity. For any person missing Age and Sex, first we have to guess their sex based on the majority code assigned to that person’s first name, and then go back to the same data to grab the Age that matches up with Name and Sex. Clearly we are going to get it very wrong for a lot of records. You can’t expect these guesses to hold up as well as true age data. Overall, though, there should be some signal in all that noise … if your model believes that “Edgar” is male and 72 years of age, and that “Brittany” is female and 26, well, that’s not unreasonable and it’s probably not far from the truth.

How do we put this all together? I build my models in Data Desk, so I need to get all these elements into my data file as individual variables. You can do this any way that works for you, but I use our database querying software (Hyperion Brio). I import the data into Brio as locally-saved tab-delimited files and join them up as you see below. The left table is my modeling data (or at least the part of it that holds First Name), and the two tables on the right hold the name-specific ages and sexes from all the database records that have this information available. I left-join each of these tables on the First Name field.

[Image: age_tables]

When I process the query, I get one row per ID with the fields from the left-hand table, plus the fields I need from the two tables on the right: the so-called Maleness Score, Female Avg Age by FName, Male Avg Age by Fname, and N Avg Age by Fname. I can now paste these as new variables into Data Desk. I still have work to do, though: I do have a small amount of “real” age data that I don’t want to overwrite, and not every First Name has a match in the alumni database. I have to figure out what I have, what I don’t have, and what I’m going to do to get a real or estimated age plugged in for every single record. I write an expression called Age Estimated to choose an age based on a hierarchical set of IF statements. The text of my expression is below; I will explain it in plain English following the expression.

if len('AGE')>0 then 'AGE'

else if textof('SEX')="M" and len('M avg age by Fname')>0 then 'M avg age by Fname'
else if textof('SEX')="M" and len('N avg age by Fname')>0 then 'N avg age by Fname'
else if textof('SEX')="M" and len('F avg age by Fname')>0 then 'F avg age by Fname'

else if textof('SEX')="F" and len('F avg age by Fname')>0 then 'F avg age by Fname'
else if textof('SEX')="F" and len('N avg age by Fname')>0 then 'N avg age by Fname'
else if textof('SEX')="F" and len('M avg age by Fname')>0 then 'M avg age by Fname'

else if textof('SEX')="N" and 'Maleness score'>0 and len('M avg age by Fname')>0 then 'M avg age by Fname'
else if textof('SEX')="N" and 'Maleness score'<0 and len('F avg age by Fname')>0 then 'F avg age by Fname'
else if textof('SEX')="N" and 'Maleness score'=0 and len('N avg age by Fname')>0 then 'N avg age by Fname'

else if len('N avg age by Fname')>0 then 'N avg age by Fname'
else if len('F avg age by Fname')>0 then 'F avg age by Fname'
else if len('M avg age by Fname')>0 then 'M avg age by Fname'

else 49

Okay … here’s what the expression actually does, going block by block through the statements:

  1. If Age is already present, then use that — done.
  2. Otherwise, if Sex is male, and the average male age is available, then use that. If there’s no average male age, then use the ‘N’ age, and if that’s not available, use the female average age … we can hope it’s better than no age at all.
  3. Otherwise if Sex is female, and the average female age is available, then use that. Again, go with any other age that’s available.
  4. Otherwise if Sex is ‘N’, and the Fname is likely male (according to the so-called Maleness Score), then use the male average age, if it’s available. Or if the first name is probably female, use the female average age. Or if the name is tied male-female, use the ‘N’ average age.
  5. Otherwise, as it appears we don’t have anything much to go on, just use any available average age associated with that first name: ‘N’, female, or male.
  6. And finally, if all else fails (which it does for about 6% of my file, or 7,000 records), just plug in the average age of every constituent in the database who has an age, which in our case is 49. This number will vary depending on the composition of your actual data file — if it’s all Parents, for example, then calculate the average of Parents’ known ages, excluding other constituent types.
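For readers who don't use Data Desk, the six steps above can be sketched as a Python function. This is my loose translation, not the original expression: None stands in for a missing value, and the argument names are hypothetical.

```python
def estimate_age(age, sex, maleness, m_avg, f_avg, n_avg, default=49):
    """Pick a real or estimated age; None means the value is missing."""
    def first_available(*candidates):
        for value in candidates:
            if value is not None:
                return value
        return None

    # Step 1: never overwrite a real age.
    if age is not None:
        return age

    # Steps 2 to 4: choose the preferred average for the known or guessed sex.
    if sex == "M":
        preferred = first_available(m_avg, n_avg, f_avg)
    elif sex == "F":
        preferred = first_available(f_avg, n_avg, m_avg)
    elif sex == "N" and maleness is not None and maleness > 0:
        preferred = m_avg
    elif sex == "N" and maleness is not None and maleness < 0:
        preferred = f_avg
    elif sex == "N" and maleness == 0:
        preferred = n_avg
    else:
        preferred = None
    if preferred is not None:
        return preferred

    # Step 5: any average for this first name. Step 6: overall average.
    fallback = first_available(n_avg, f_avg, m_avg)
    return fallback if fallback is not None else default

print(estimate_age(None, "N", -5, m_avg=70, f_avg=26, n_avg=None))  # 26
```

The default of 49 is the all-database average from my file; swap in the average for whatever population you are modeling.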

When I bin the cases into 20 roughly equal groups by Estimated Age, I see that the percentage of cases that have some giving history starts very low (about 3 percent for the youngest group), rises rapidly to more than 10 percent, and then gradually rises to almost 18 percent for the oldest group. That’s heading in the right direction at least. As well, being in the oldest 5% is also very highly correlated with Lifetime Giving, which is what we would expect from a donor data set containing true ages.
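If you want to run the same check on your own file, here is one way to sketch the binning in Python. The function name and the toy data are assumptions; the toy example uses 4 bins rather than 20 only to keep the output short.

```python
def donor_rate_by_bin(cases, n_bins=20):
    """cases: list of (estimated_age, has_given) pairs.
    Returns one donor percentage per bin, youngest bin first."""
    ordered = sorted(cases, key=lambda case: case[0])
    n = len(ordered)
    rates = []
    for b in range(n_bins):
        # Integer arithmetic gives bins of roughly equal size.
        group = ordered[b * n // n_bins:(b + 1) * n // n_bins]
        if not group:
            rates.append(None)  # more bins than cases
            continue
        donors = sum(1 for _, has_given in group if has_given)
        rates.append(100.0 * donors / len(group))
    return rates

# Toy data: ages 20 to 59, where only those aged 50 and up have given.
toy = [(age, age >= 50) for age in range(20, 60)]
print(donor_rate_by_bin(toy, n_bins=4))  # [0.0, 0.0, 0.0, 100.0]
```

A steadily rising rate across the bins is the pattern you would hope to see if the estimated ages carry real signal.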

[Image: est_age_vingt]

This is a bit of work, and probably the gain will be marginal a lot of the time. Data on real interactions that showed evidence of engagement would be superior to age-guessing, but when data is scarce a bit of added lift can’t hurt. If you’re concerned about introducing too much noise, then build models with and without Estimated Age, and evaluate them against each other. If your software offers multiple imputation for missing data as a feature, try checking that out … what I’m doing here is just a very manual, single-value form of imputation: calculating one plausible value for each missing age based on the values of other variables, where true multiple imputation would generate several. Be careful, though: A good predictor of Age happens to be Lifetime Giving, and if your aim is to predict Giving, I should think there’s a risk your model will suffer from feedback.

* One final note …

Earlier on I mentioned assuming someone is male or female “just for the convenience of data mining.”  In our databases (and in a conventional, everyday sense too), we group people in various ways — sex, race, creed. But these categories are truly imperfect summaries of reality. (Some more imperfect than others!) A lot of human diversity is not captured in data, including things we formerly thought of as clear-cut. Sex seems conveniently binary, but in reality it is multi-category, or maybe it’s a continuous variable. (Or maybe it’s too complex for a single variable.) In real life I don’t assume that when someone in the Registrar’s Office enters ‘N’ for Sex that the student’s data is merely missing. Because the N category is still such a small slice of the population I might treat it as missing, or reapportion it to either Male or Female as I do here. But that’s strictly for predictive modeling. It’s not a statement about transgendered or differently gendered people nor an opinion about where they “belong.”

30 April 2013

Final thoughts on Phonathon donor acquisition

No, this is not the last time I’ll write about Phonathon, but after today I promise to give it a rest and talk about something else. I just wanted to round out my post on the waste I see happening in donor acquisition via phone programs with some recent findings of mine. Your mileage may vary, or “YMMV” as they say on the listservs, so as usual don’t just accept what I say. I suggest questions that you might ask of your own data — nothing more.

I’ve been doing a thorough analysis of our acquisition efforts this past year. (The technical term for this is a WTHH analysis … as in “What The Heck Happened??”) I found that getting high phone contact rates seemed to be linked with making a sufficient number of call attempts per prospect. For us, any fewer than three attempts per prospect is too few to acquire new donors in any great number. In general, contact rates improve with call attempt numbers above three, and after that, the more the better.

“Whoa!”, I hear you protest. “Didn’t you just say in your first post that it makes no sense to have a set number of call attempts for all prospects?”

You’re right — I did. It doesn’t make sense to have a limit. But it might make sense to have a minimum.

To get anything from an acquisition segment, more calling is better. However, by “call more” I don’t mean call more people. I mean make more calls per prospect. The RIGHT prospects. Call the right people, and eventually many or most of them will pick up the phone. Call the wrong people, and you can ring them up 20, 30, 50 times and you won’t make a dent. That’s why I think there’s no reason to set a maximum number of call attempts. If you’re calling the right people, then just keep calling.

What’s new here is that three attempts looks like a solid minimum. This is higher than what I see some people reporting on the listservs, and well beyond the capacity of many programs as they are currently run — the ones that call every single person with a phone number in the database. To attain the required amount of per-prospect effort, those schools would have to increase phone capacity (more students, more nights), or load fewer prospects. The latter option is the only one that makes sense.
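The trade-off between capacity and prospects loaded is simple arithmetic. Here is a small Python sketch; the function names, the scores and the capacity figures are hypothetical, and the default of three attempts is just the minimum that worked for us.

```python
def max_prospects(total_attempts, min_attempts=3):
    """How many prospects can be loaded if each must get min_attempts calls."""
    if min_attempts < 1:
        raise ValueError("min_attempts must be at least 1")
    return total_attempts // min_attempts

def trim_segment(scored_prospects, total_attempts, min_attempts=3):
    """Keep the highest-scoring prospects that capacity allows.
    scored_prospects: list of (prospect_id, score) pairs."""
    limit = max_prospects(total_attempts, min_attempts)
    ranked = sorted(scored_prospects, key=lambda p: p[1], reverse=True)
    return ranked[:limit]

print(max_prospects(12000))  # 4000
print(trim_segment([("a", 0.2), ("b", 0.9), ("c", 0.5)], total_attempts=6))
```

With room for 12,000 call attempts in a season, a three-attempt minimum means loading no more than 4,000 prospects.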

Reducing the number of people we’re trying to reach to acquire as new donors means using a predictive model or at least some basic data mining and scoring to figure out who is most likely to pick up the phone. I’ve built models that do that for two years now, and after evaluating their performance I can say that they work okay. Not super fantastic, but okay. I can live with okay … in the past five years our program has made close to one million call attempts. Even a marginal improvement in focus at that scale of activity makes a significant difference.

You don’t need to hack your acquisition segment in half today. I’m not saying that. To get new donors you still need lots and lots of prospects. Maybe someday you’ll be calling only a fraction of the people you once did, but there’s no reason you can’t take a gradual approach to getting more focused in the meantime. Trim things down a bit in the first year, evaluate the results, and fold what you learned into trimming a bit more the next year.

18 April 2013

A response to ‘What do we do about Phonathon?’

I had a thoughtful response to my blog post from earlier this week (What do we do about Phonathon?) from Paul Fleming, Database Manager at Walnut Hill School for the Arts in Natick, Massachusetts, about half an hour from downtown Boston. With Paul’s permission, I will quote from his email, and then offer my comments afterward:

I just wanted to share with you some of my experiences with Phonathon. I am the database manager of a 5-person Development department at a wonderful boarding high school called the Walnut Hill School for the Arts. Since we are a very small office, I have also been able to take on the role of the organizer of our Phonathon. It’s only been natural for me to combine the two to find analysis about the worth of this event, and I’m happy to say, for our own school, this event is amazingly worthwhile.

First of all, as far as cost vs. gain, this is one of the cheapest appeals we have. Our Phonathon callers are volunteer students who are making calls either because they have a strong interest in helping their school, or they want to be fed pizza instead of dining hall food (pizza: our biggest expense). This year we called 4 nights in the fall and 4 nights in the spring. So while it is an amazing source of stress during that week, there aren’t a ton of man-hours put into this event other than that. We still mail letters to a large portion of our alumni base a few times a year. Many of these alumni are long-shots who would not give in response to a mass appeal, but our team feels that the importance of the touch point outweighs the short-term inefficiencies that are inherent in this type of outreach.

Secondly, I have taken the time to prioritize each of the people who are selected to receive phone calls. As you stated in your article, I use things like recency and frequency of gifts, as well as other factors such as event participation or whether we have other details about their personal life (job info, etc). We do call a great deal of lapsed or nondonors, but if we find ourselves spread too thin, we make sure to use our time appropriately to maximize effectiveness with the time we have. Our school has roughly 4,400 living alumni, and we graduate about 100 wonderful, talented students a year. This season we were able to attempt phone calls to about 1,200 alumni in our 4 nights of calling. The higher-priority people received up to 3 phone calls, and the lower-priority people received just 1-2.

Lastly, I was lucky enough to start working at my job in a year in which there was no Phonathon. This gave me an amazing opportunity to test the idea that our missing donors would give through other avenues if they had no other way to do so. We did a great deal of mass appeals, indirect appeals (alumni magazine and e-newsletters), and as many personalized emails and phone calls as we could handle in our 5-person team. Here are the most basic of our findings:

In FY11 (our only non-Phonathon year), 12% of our donors were repeat donors. We reached about 11% participation, our lowest ever. In FY12 (the year Phonathon returned):

  • 27% of our donors were new/recovered donors, a 14% increase from the previous year.
  • We reached 14% overall alumni participation.
  • Of the 27% of donors who were considered new/recovered, 44% gave through Phonathon.
  • The total amount of donors we had gained from FY11 to FY12 was about the same number of people who gave through the Phonathon.
  • In FY13 (still in progress, so we’ll see how this actually plays out), 35% of the previously-recovered donors who gave again gave in response to less work-intensive mass mailing appeals, showing that some of these Phonathon donors can, in fact, be converted and (hopefully) cultivated long-term.

In general, I think your article was right on point. Large universities with a for-pay, ongoing Phonathon program should take a look and see whether their efforts should be spent elsewhere. I just wanted to share with you my successes here and the ways in which our school has been able to maintain a legitimate, cost-effective way to increase our participation rate and maintain the quality of our alumni database.

Paul’s description of his program reminds me there are plenty of institutions out there who don’t have big, automated, and data-intensive calling programs gobbling up money. What really gets my attention is that Walnut Hill uses alumni affinity factors (event attendance, employment info) to prioritize calling to get the job done on a tight schedule and with a minimum of expense. This small-scale data mining effort is an example for the rest of us who have a lot of inefficiency in our programs due to a lack of focus.

The first predictive models I ever created were for a relatively small university Phonathon that was run with printed prospect cards and manual dialing — a very successful program, I might add. For those of you at smaller institutions wondering if data mining is possible only with massive databases, the answer is NO.

And finally, how wonderful it is that Walnut Hill can quantify exactly what Phonathon contributes in terms of new donors, and new donors who convert to mail-responsive renewals.

Bravo!

15 April 2013

What do we do about Phonathon?

Filed under: Alumni, Annual Giving, Phonathon — kevinmacdonell @ 5:41 am

I love Phonathon. I love what it does, and I love the data it produces. But sad to say, Phonathon may be the sick old man of fundraising. In fact some have taken its pulse and declared it dead.

A few weeks ago, a Director of Annual Giving named Audra Vaz posted this question to a listserv: “I’m writing to see if any institutions out there have transitioned away from their Phonathon program. If so, how did it affect your Annual Giving program?”

A number of people immediately came to the defence of Phonathon with assurances of the long-term value of calling programs. The responses went something like this: Get rid of Phonathon?? It’s a great point of connection between an institution and its alumni, particularly its younger alumni. It’s the best tool for donor acquisition. It’s a great way to update contact and employment information. Don’t do it!

Audra wasn’t satisfied. “As currently run, it’s expensive and ineffective,” she wrote of her program at Florida Atlantic University in Boca Raton. “It takes up 30% of my budget, brings in less than 2% of Annual Fund donations and only has a 20% ROI. I could use that money for building societies, personal solicitations, and direct mail which is much more effective for us. In a difficult budget year, I cannot be nostalgic and continue to justify the bleed for a program that most institutions do yet hardly any makes money off of. Seems like a bad business model to me.”

I can’t disagree with Audra. Anyone following fundraising listservs knows that, in general, contact rates and productivity are declining year after year. And out of the contacts it does manage to make, Phonathon generates scads of pledges that are never fulfilled, entailing the additional cost of reminder mailings and write-offs. There are those who say that Phonathon should be viewed as an investment and not an expense. I have been inclined to that view myself. The problem is that yes, it IS an expense, and not a small one. If Phonathons create value in all the other ways that the defenders say they do, then where are the numbers to prove it? Where’s the ROI? Audra had numbers; the defenders did not. At strategic planning time, numbers talk louder than opinions.

When I contacted Audra recently to get permission to use her name, she told me she has opted to keep her Phonathon program for now, but will market its services to other university divisions to turn it into a revenue generator (athletics and arts ticket sales, admissions welcome calls, invitations to events, and alumni membership renewals). That sounds like a good idea. I can think of a number of additional ways to keep Phonathon alive and relevant, but since this is a data-related blog I will focus on just two.

1. Stop calling everybody!

At many institutions, Phonathon is used as a mass-contact tool for indiscriminately soliciting anyone the Annual Fund believes might have a pulse. This approach is becoming less and less sustainable. The same question is asked repeatedly on the listservs: “How many times, on average, do you attempt to call alumni non-donors before you retire their call sheet?” And then people give their one-size-fits-all answers: five times, seven times, whatever times per record. Given how graduating classes have increased in size for most institutions, I am not surprised to read that some programs are stretched too thin to call very deeply. As one person wrote recently: “Because of time and resources constraints, we’re lucky to get two attempts in with nondonor/long lapsed alumni.”

I just don’t get it.

We know that people who have attended events are more likely to pick up the phone. We know that alumni who have shared their job title with us are more likely to pick up the phone. We know that alumni who have given us their email address are more likely to pick up the phone. So why in 2013 are schools still expending the same amount of energy on each of their prospective donors as if they were all exactly alike? They are NOT all alike, and these schools are wasting time and money.

If you’ve got automated calling software, you should be adding up the number of times you’ve successfully reached individual alumni over the years (regardless of the call result), and use that data to build predictive models for likelihood to answer the phone. If you don’t have that historical data, you should at least consider an engagement-based scoring system to focus your efforts on alumni who have demonstrated some of the usual signs of affinity: coming to events, sharing contact and employment information, having other family members who are alumni, volunteering, responding to surveys and so on.
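An engagement-based score can be as simple as counting signs of affinity. The Python sketch below assumes hypothetical flag names and gives each flag one point, which is an arbitrary starting weight, not a tested model.

```python
# Hypothetical affinity flags; one point each is an arbitrary starting weight.
AFFINITY_FLAGS = (
    "attended_event", "has_job_title", "has_email",
    "alumni_family", "volunteered", "answered_survey",
)

def engagement_score(record):
    """Count how many affinity flags are set on a record (a dict)."""
    return sum(1 for flag in AFFINITY_FLAGS if record.get(flag))

def rank_prospects(records):
    """Sort by score, highest first (ties keep their input order)."""
    return sorted(records, key=engagement_score, reverse=True)

prospects = [
    {"id": 1, "has_email": True},
    {"id": 2, "attended_event": True, "has_job_title": True, "has_email": True},
    {"id": 3},
]
print([p["id"] for p in rank_prospects(prospects)])  # [2, 1, 3]
```

Once you have call history, a proper predictive model for likelihood to answer should replace these equal weights.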

A phone contact propensity score (and related models such as donor acquisition likelihood) will allow you to make cuts to your program when and if the time comes. You can feel more confident that you’re trimming the bottom, cutting away the least productive slice of your program.

2. Think outside Phonathon!

Your phone program is a data generation machine, granting you a wide window view on the behaviours of your alumni and donors. I’m not talking just about address updates, as valuable as those are. You know how many times they’ve picked up the phone when they see your ID come up on the display, and you might also know how long they’ve spent on the phone with your student callers. This is not trivial information nor is it of interest only to Phonathon managers.

Relate this behavioural data to other desired behaviours: Are your current big donors characterized by picking up more often? Do your Planned Giving expectancies tend to have longer conversations on average? What about volunteering, mentoring, and other activities? Phone contact history is real, affinity-related data, delivered fresh to you daily, lifting the curtain on who likes you.

(When I say real data, I mean REAL. This is a record of what individuals have actually DONE, not what they’ve stated as a preference in a survey. This data doesn’t lie.)

A few closing thoughts. …

I said earlier that Phonathon has been used (or misused) as a mass-contact tool. Software and automation enable a hired team of students to make a staggering number of phone calls in a very short time. The bulk of long-lapsed and never-donors are approached by phone rather than mail: The cost of a single call attempt seems negligible, so Phonathon managers spread their acquisition efforts as thinly as possible, trying to turn over every last stone.

There’s something to be said about having adequate volume in order to generate new donors, but here’s the problem: The phone is no longer a mass-contact medium. In fact it’s well on its way to becoming a niche medium, handled by a whole new type of device. Some people answer the phone and respond positively to being approached that way, and for that reason phone will be important for as long as there are phones. But the masses are no longer answering.

These days some fundraisers think of email as their new mass-contact medium of choice. Again they must be thinking in terms of cost, since it hardly matters whether you’re sending 1,000 emails or 100,000 emails. And again they’re mistaken in thinking that email is practically free — they’re just not counting the full cost to the institution of the practice of spamming people.

The truth is, there is no reliable mass-contact medium anymore. If email (or phone, or social media) is a great fundraising channel, it’s not because it’s a seemingly cheap way to reach out to thousands of people. It’s a great fundraising channel when, and only when, it reaches out to the right people at the right time.

  1. Alumni and donors are not all the same. They are not defined by their age, address or other demographic groupings. They are individual human beings.
  2. They have preferred channels for communicating and giving.
  3. These preferences are revealed only through observation of past behaviours. Not through self-reporting, not through classification by age or donor status, not by any other indirect means.
  4. We cannot know the real preferences of everyone in our database. Therefore, we model on observed past behaviours to make intelligent guesses about the preferences we don’t already know.
  5. Our models are an improvement on current practice, but they are imperfect. All models are wrong; we will make them better. And we will keep Phonathon healthy and productive.

21 March 2013

The lopsided nature of alumni giving

Filed under: Alumni, Major Giving, Peter Wylie — kevinmacdonell @ 6:06 am

Guest post by Peter B. Wylie

(Printer-friendly PDF download of this post available here: Lopsided Nature of Alum Giving – Wylie)

Eight years ago I wrote a piece called “Sports, Fund Raising, and the 80/20 Rule.” It had to do with how most alumni giving in higher education comes from a very small group of former students. Nobody was shocked or awed by the article. The sotto voce response seemed to be, “Thanks, Pete. We got that. Tell us something we don’t know.” That’s okay. It’s like my jokes. A lot of ’em don’t get more than a polite laugh; some get stone silence.

Anyway, time passed and I started working closely with John Sammis. Just about every week we’d look at a new alumni database, and over and over, we’d see the same thing. The top one percent of alumni givers had donated more than the other ninety-nine percent.

Finally, I decided to take a closer look at the lifetime giving data from seven schools that I thought covered a wide spectrum of higher education institutions in North America. Once again, I saw this huge lopsided phenomenon where a small, small group of alums were accounting for a whopping portion of the giving in each school. That’s when I went ahead and put this piece together.

What makes this one any different from the previous piece? For one thing, I think it gives you a more granular look at the lopsidedness, sort of like Google Maps allows you to really focus in on the names of tiny streets in a huge city. But more importantly, for this one I asked several people in advancement whose opinions I respect to comment on the data. After I show you that data, I’ll summarize some of what they had to say, and I’ll add in some thoughts of my own. After that, if you have a chance, I’d love to hear what you think. (Commenting on this blog has been turned off, but feel free to send an email to kevin.macdonell@gmail.com.)

The Data

I mentioned above that I looked at data from seven schools. After some agonizing, I decided I would end up putting you to sleep if I showed you all seven. So I chopped it down to four. Believe me, four is enough to make the point.

Here’s how I’ve laid out the data:

  • For each of the four schools I ranked only the alumni givers (no other constituencies) into deciles (10 groups), centiles (100 groups), and milliles (1,000 groups), by total lifetime hard credit giving. (There is actually no such word as “milliles” in English; I have borrowed from the French.)
  • In the first table in each set I’ve included all the givers. In the second table I’ve included only the top ten percent of givers. And in the third table I’ve included only the top one percent of givers. (The chart following the third table graphically conveys some of the information included in the third table.)
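The grouping described above can be sketched in a few lines of Python. This is only an illustration, not the tool used to build the tables in this post: the function name giving_share_by_group is hypothetical, and the gift amounts are randomly generated stand-ins for real lifetime giving totals.

```python
import random


def giving_share_by_group(lifetime_giving, n_groups):
    """Rank givers by lifetime giving, split them into n_groups groups of
    (nearly) equal size, and return each group's share of total giving.

    The first share belongs to the smallest givers, the last to the largest.
    Assumes at least one giver has a positive lifetime total.
    """
    amounts = sorted(lifetime_giving)      # ascending: smallest givers first
    n = len(amounts)
    total = sum(amounts)
    shares = []
    for g in range(n_groups):
        lo = g * n // n_groups             # first index in this group
        hi = (g + 1) * n // n_groups       # one past the last index
        shares.append(sum(amounts[lo:hi]) / total)
    return shares


# Made-up, heavily skewed lifetime giving for 10,000 hypothetical alumni givers.
rng = random.Random(42)
givers = [round(rng.paretovariate(1.1) * 25, 2) for _ in range(10_000)]

deciles = giving_share_by_group(givers, 10)       # 10 groups
centiles = giving_share_by_group(givers, 100)     # 100 groups
milliles = giving_share_by_group(givers, 1000)    # 1,000 groups

print(f"Top decile share of giving:  {deciles[-1]:.1%}")
print(f"Top centile share of giving: {centiles[-1]:.1%}")
print(f"Top millile share of giving: {milliles[-1]:.1%}")
```

In this sketch, deciles corresponds to the first table in each set, centiles[-10:] to the second (the top ten percent of givers in one percent groups), and milliles[-10:] to the third (the top one percent in tenth of a percent groups).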

To make sure all this is clear, let’s go through the data for School A. Take a look at Table 1. It shows the lifetime giving for all alumni donors at the school divided into ten equal size groups called deciles. Notice that the alums in decile 10 account for over 95% of that giving. Conversely, the alums in decile 1 account for two tenths of one percent of the giving.

Table 1: Amount and Percentage of Total Lifetime Giving in School A for all Alumni by Giving Decile

table1

Moving on to Table 2. Here we’re looking at only the top decile of alumni givers divided into one percent groups. What jumps out from this table is that the top one percent of all givers account for more than 80% of alumni lifetime giving. That’s more than four times as much as the remaining 99% of alumni givers combined.

Table 2: Amount and Percentage of Total Lifetime Giving at School A for Top Ten Percent of Alumni Donors

table2

If that’s not lopsided enough for you, let’s look at Table 3 where the top one percent of alumni givers is divided up into what I’ve called milliles. That is, tenth of a percent groups. And lo and behold, the top one tenth of one percent of alumni donors account for more than 60% of alumni lifetime giving. Figure 1 shows the same information in a bit more dramatic way than does the table.

Table 3: Amount and Percentage of Total Lifetime Giving at School A for Top One Percent of Alumni Donors

table3

figure1

What I’d recommend is that you go through the same kinds of tables and charts laid out below for Schools B, C, and D. Go as fast or as slowly as you’d like. Being somewhat impatient, I would focus on Figures 2-4. I think that’s where the real punch in these data resides.

Table 4: Amount and Percentage of Total Lifetime Giving in School B for all Alumni by Giving Decile

table4

Table 5: Amount and Percentage of Total Lifetime Giving at School B for Top Ten Percent of Alumni Donors

table5

Table 6: Amount and Percentage of Total Lifetime Giving at School B for Top One Percent of Alumni Donors

table6

figure2

Table 7: Amount and Percentage of Total Lifetime Giving in School C for all Alumni by Giving Decile

table7

Table 8: Amount and Percentage of Total Lifetime Giving at School C for Top Ten Percent of Alumni Donors

table8

Table 9: Amount and Percentage of Total Lifetime Giving at School C for Top One Percent of Alumni Donors

table9

figure3

Table 10: Amount and Percentage of Total Lifetime Giving in School D for all Alumni by Giving Decile

table10

Table 11: Amount and Percentage of Total Lifetime Giving at School D for Top Ten Percent of Alumni Donors

table11

Table 12: Amount and Percentage of Total Lifetime Giving at School D for Top One Percent of Alumni Donors

table12

figure4

When I boil down to its essence what you’ve just looked at for these three schools, here’s what I see:

  • In School B over half of the total giving is accounted for by three tenths of one percent of the givers.
  • In School C we have pretty much the same situation as we have in School B.
  • In School D over 60% of the total giving is accounted for by two tenths of one percent of the givers.

What Some People in Advancement Have to Say about All This

Over the years I’ve gotten to know a number of thoughtful/idea-oriented folks in advancement. I asked several of them to comment on the data you’ve just seen. To protect the feelings of the people I didn’t ask, I’ll keep the commenters anonymous. They know who they are, and they know how much I appreciate their input.

Here are a few of the many helpful observations they made:

Most of the big money in campaigns and other advancement efforts does not come from alumni. I’m a bit embarrassed to admit that I had forgotten this fact. CASE puts out plenty of literature that confirms this. It is “friends” who carry the big load in higher education fundraising. At least two of the commenters pointed out that we could look at that fact as a sad commentary on the hundreds and hundreds of thousands of alums who give little or nothing to their alma maters. However, both felt it was better to look at these meager givers as an untapped resource that we have to do a better job of reaching.

The data we see here reflect the distribution of wealth in society. The commenter said, “There simply are very few people who have large amounts of disposable wealth and a whole lot of hard working folks who are just trying to participate in making a difference.” I like this comment; it jibes with my sense of the reality out there.

“It is easier (and more comfortable) to work with donors rather than prospective donors.” The commenter went on to say: “The wealthier the constituency the more you can get away with this approach because you have enough people who can make mega-gifts and that enables you to avoid building the middle of the gift pyramid.” This is very consistent with what some other commenters had to say about donors in the middle of the pyramid — donors who don’t get enough attention from the major giving folks in advancement.

Most people in advancement ARE aware of the lopsidedness. All of the commenters said they felt people in advancement were well aware of the lopsided phenomenon, perhaps not to the level of granularity displayed in this piece. But well aware, nonetheless.

What you see in this piece underestimates the skew because it doesn’t include non-givers. I was hoping that none of the commenters would bring up this fact because I had not (and still have not) come up with a clear, simple way to convey what the commenter had pointed out. But let’s see if I can give you an example. Look at Figure 4. It shows that one tenth of one percent of alumni givers account for over 48% of total alumni giving. However, let’s imagine that half of the solicitable alumni in this school have given nothing at all. Okay, if we now double the base to include all alums, not just alum givers, then what happens to the percentage size of that top one tenth of one percent of givers? It’s no longer one tenth of one percent; it’s now one twentieth of one percent of all alums. And because the non-givers add nothing to the total, that smaller slice still accounts for the same 48%-plus of the giving. If you’re confused, let’s ask someone else reading this thing to explain it. I’m spinning my wheels.
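The re-basing in that example is just one multiplication, which a short Python sketch can make concrete. The 50% giver rate is the imagined figure from the paragraph above, not a number from any of the four schools, and the function name rebase_to_all_alumni is made up for illustration.

```python
from fractions import Fraction


def rebase_to_all_alumni(slice_of_givers, giver_rate):
    """Express a slice of givers as a slice of ALL alumni.

    slice_of_givers: fraction of givers in the top group (e.g. 1/1000)
    giver_rate:      fraction of all alumni who have ever given (assumed)
    """
    return slice_of_givers * giver_rate


top_slice_of_givers = Fraction(1, 1000)   # one tenth of one percent of givers
giver_rate = Fraction(1, 2)               # imagined: half of alumni have given

top_slice_of_all = rebase_to_all_alumni(top_slice_of_givers, giver_rate)

print(f"Slice of givers:      {float(top_slice_of_givers):.3%}")
print(f"Slice of all alumni:  {float(top_slice_of_all):.3%}")
# The share of total giving does not change, because non-givers contribute
# nothing to the total; only the size of the slice shrinks.
```

Under that assumption the same donors make up one twentieth of one percent of all alumni while still accounting for the same share of the dollars, which is why leaving out non-givers understates the skew.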

One More Thought from Me

But here’s a thought that I’ve had for a long time. When I look at the incredible skewness that we see in the top one percent of alumni donors, I say, “WHY?!” Is the difference between the top millile and the bottom millile in that top one percent simply a function of capacity to give? Maybe it is, but I’d like to know. And then I say, call me crazy, LET’S FIND OUT! Not with some online survey. That won’t cut it. Let’s hire a first-rate survey research team to go out and interview these folks (we’re not talking a lot of people here). Would that cost some money to go out and get these answers? Yes, and it would be worth every penny of it. The potential funding sources I’ve talked to yawn at the idea. But I’ll certainly never let go of it.

As always, let us know what you think.

20 February 2013

The ‘analytic’ investment

Filed under: Analytics, Data — Tags: , — kevinmacdonell @ 10:49 am

Everyone’s talking about predictive analytics, Big Data, yadda yadda. The good news is, many institutions and organizations in our sector are indeed making investments in analytics and inching towards becoming data-driven. I have to wonder, though, how much of current investment is based on hype, and how much is going to fall away when data is no longer a hot thing.

Becoming a data-driven organization is a journey, not a destination. Forward progress is not inevitable, and it is possible for an office, a department or an institution to slip backward on the path, even when it seems they’ve “arrived”. In order for analytics to mature from a cutting-edge “nice-to-have” into a regular part of operations, the enterprise needs to be aware of what analytics returns to the bottom line.

In my view, current investments in analytics are often done for reasons that are well-intentioned but vague: It seems to be the right thing to do these days … we see others doing it, so we feel we need to as well … we have an agenda for innovation and this fits the bill … and so on. I’m glad to see the investment, but not every promising innovation gets to stick around. Demonstrating ability to generate revenue — either through savings or through identifying new sources of revenue — will carry the day in the long run.

As I write this, I hear the jangle of railway bells at the level crossing in the early-morning dark outside my hotel room on the city’s downtown waterfront. I’m in Seattle today to attend the DRIVE 2013 conference, hosted by the University of Washington. I’ll be speaking on this topic — the “analytic” investment — later today. I have to admit to having struggled with making the session relevant for this group. For one, they don’t need convincing that making the investment is worth it. And second, if they think that I and my employer have figured out how to calculate the return on investment for analytics programs, they may be in for a disappointment. We have not.

In fact, when it comes right down to it, I like to spend my day working on cool things, interesting problems that face our department, and not so much on stuff that sounds like accounting (“ROI”). I’m betting many of the attendees of my session feel the same way. So I’ll be asking them to stop thinking about how they can get their managers, directors and vice presidents to understand the language of data and analytics. They’ll be far more successful if they try to speak the language their bosses respond to: Return on investment.

I may be a little short on answers for you, but I do have some pretty good questions.
