What is the average age of new expectancies, at the time they became known to your organization?
What is the size of your general prospect pool?
Last August, about two months into a data analyst position with a university’s development division, I was tasked with building a predictive model for the Office of Gift Planning (OGP). The OGP wanted a tool to help them focus on the constituents who are most likely to make a planned gift. I wanted to identify a few hundred of the best planned giving prospects whom I could prioritize by their probability of donating. After a bit of preliminary research, I chose two criteria for the study population: 1) being 50 years of age or older and 2) inclusion in a recent wealth screening. This generated a file of 133,000 records; 582 of them were planned gift donors. I’ve worked with files larger than this and did not expect a problem. However, that turned out to be a mistake because the planned gift donors, who exhibited the target behavior, comprised only 0.4% of the population, a proportion so small that the behavior can be considered rare. I’ll explain more about that later; first I want to describe the project as it developed.
I decided to use logistic regression with the dependent variable being either “made a planned gift” or “has not made a planned gift”. I cleaned the data and identified some strong relationships between the variables. After trying several combinations for the regression model, I had one with a Nagelkerke of .24, which is relatively good. (Nagelkerke is a pseudo R squared; it can be loosely interpreted as the proportion of variability in the dependent variable that is accounted for by the model’s independent variables.) However, when I applied the algorithm to the study population, only 31 constituents without a planned gift and only 11 planned giving donors were identified as having a probability of giving of .5 or greater. When I lowered the probability threshold to .2 or greater, 105 non-planned givers and 52 planned gift donors fell into this range. This was still disappointing.
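For readers who want to see how a Nagelkerke value is put together, here is a minimal sketch. It assumes you already have the log-likelihood of an intercept-only model and of the fitted model from whatever statistics package you use; the function name and argument names are illustrative and are not taken from the original project.

```python
import math

def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke pseudo R-squared from two log-likelihoods.

    ll_null  : log-likelihood of the intercept-only model
    ll_model : log-likelihood of the fitted model
    n        : number of records used to fit both models
    """
    # Cox & Snell R-squared: 1 - (L_null / L_model) ** (2 / n)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Largest value Cox & Snell can reach on this data set
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    # Nagelkerke rescales Cox & Snell so the maximum is 1
    return cox_snell / max_cox_snell

# Hypothetical log-likelihoods, for illustration only
print(round(nagelkerke_r2(ll_null=-100.0, ll_model=-80.0, n=200), 3))
```

The rescaling step is the reason Nagelkerke can reach 1 while the Cox & Snell measure on which it is based cannot.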
Desperate to identify more new potential prospects, I explored more criteria to narrow the study population and built three successive models. For the purpose of the follow-up exploratory research and this article, I re-built all four models using the same independent variables to easily compare their outcomes. Here’s a summary of the four models:
Models B, C, and D are all subsets of the original data set. Each model has advantages and disadvantages, and I was uncertain how to evaluate them against one another. For example, each additional filtering criterion resulted in losing part of the target population, meaning that I systematically eliminated constituents with characteristics that are in fact associated with making a planned gift. I scored everyone who was identified with a probability of .2 or greater in any of the models by the number of models in which they were identified. I’m not unhappy with that solution, but since then I’ve been learning about better methods for targeting rare behavior.
If the OGP was interested only in prioritizing the prospects already in their pool of potential planned giving donors, model D would serve their need. However, we wanted to identify the best potential planned giving prospects within the database. If we want to uncover untapped potential in an ever-growing database, we need to explore methods on how to target rare behavior. This seems especially important in our field where 1) donating, in general, is somewhat rare and 2) donating really generous gifts is rarer. Better methods of targeting rare behavior will also be useful for modeling for special initiatives and unique kinds of gifts.
As I’ve been learning, logistic regression suffers from small sample bias when the target behavior is rare relative to the study population. This helps explain why applying the algorithm to the original population resulted in very few new prospects, even though the model had a decent Nagelkerke of .24. Some analysts suggest using alternative sampling methods when the target behavior comprises less than 5% of the study population. (See endnote.) Knowing that the planned gift donors in my original project comprised only 0.4% of the population, I decided to experiment with two new approaches.
In both of the exploratory models, I created the study population size so planned gift donors would comprise 5 percent. First, I generated a study population by including all 582 of the planned gift donors and a random selection of 11,060 non-planned-gift constituents (model E). Then, I applied the algorithm from that population to the entire non-planned-gift population of 132,418. In the second approach (model F), the planned gift population was randomly split into two equal size groups of 291. I also randomly selected 5,530 non-planned-gift constituents. To build the regression model, I combined one of the planned gift donor groups (of 291) with 5,530 non-planned-gift constituents. I then tested the algorithm on the holdout sample (the other planned giving group of 291 with 5,530 non-planned-gift constituents). Finally, I applied the algorithm to the entire original population of 133,000. Here are the results:
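The two sampling schemes can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used in the project: the function names are hypothetical, records are represented only by their IDs, and for model F the sketch assumes the build and holdout groups each get their own, non-overlapping draw of non-donors (the text does not say whether the same 5,530 were reused). The helper computes the count needed for an exact 5 percent share, which lands within a couple of records of the 11,060 and 5,530 quoted above.

```python
import random

def nonevents_needed(n_events, target_share):
    """Non-event records needed so that events make up target_share of the sample."""
    return round(n_events * (1.0 - target_share) / target_share)

def sample_model_e(donor_ids, non_donor_ids, target_share=0.05, seed=1):
    """Keep every donor and add a random draw of non-donors (model E style)."""
    rng = random.Random(seed)
    donors = list(donor_ids)
    k = nonevents_needed(len(donors), target_share)
    return donors + rng.sample(list(non_donor_ids), k)

def sample_model_f(donor_ids, non_donor_ids, target_share=0.05, seed=1):
    """Split donors in half; pair each half with its own non-donor draw (model F style)."""
    rng = random.Random(seed)
    donors = list(donor_ids)
    rng.shuffle(donors)
    half = len(donors) // 2
    k = nonevents_needed(half, target_share)
    # Assumption: build and holdout non-donors are disjoint
    drawn = rng.sample(list(non_donor_ids), 2 * k)
    build = donors[:half] + drawn[:k]
    holdout = donors[half:2 * half] + drawn[k:]
    return build, holdout
```

Fixing the seed makes the draw repeatable, which matters if you want to rebuild the same study population later.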
Using the same independent variables as in models A through D, model E had a Nagelkerke of .39 and model F .38, which helps substantiate that the independent variables are useful predictors for planned giving. Models E and F were more effective at predicting the planned givers (129 and 123 respectively with a probability of giving greater than or equal to .5) compared to model A (11), i.e. more than ten times as many. The sampling techniques have both advantages and a disadvantage. The disadvantage is that reducing the non-planned-gift population causes it to lose some of its variability and complexity. The advantages, in both models E and F, are that 1) the target population maintains its complexity, 2) new prospects are not limited by characteristic selection (the additional criteria that I used to reduce the population in models B, C, and D), which increases the likelihood of identifying constituents who were previously not on the OGP’s radar, and 3) the effects of the small sample bias seem to be reduced.
It’s important to note that I displayed the measures (Nagelkerke and estimated probabilities) from the exploratory models and populations purely for comparison purposes. Because the study population is manipulated in the exploratory methods, the probability of giving should not be directly interpreted as actual probabilities. However, they can be used to prioritize those with the highest probabilities and that will serve our need.
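To make the point about interpretation concrete, here is a small sketch of a standard odds rescaling. It is not something done in this project. It assumes every planned gift donor was kept, that non-donors were sampled purely at random, and that the model is otherwise well specified; under those assumptions the sample odds are inflated by the inverse of the fraction of non-donors retained. The function name is hypothetical.

```python
def rescale_probability(p_sample, nonevent_keep_fraction):
    """Map a probability from a down-sampled model back toward the population scale.

    p_sample               : probability estimated on the down-sampled data (0 <= p < 1)
    nonevent_keep_fraction : share of non-events retained in the sample
    """
    if not 0.0 <= p_sample < 1.0:
        raise ValueError("p_sample must be in [0, 1)")
    odds_sample = p_sample / (1.0 - p_sample)
    # Down-sampling non-events inflates the odds by 1 / keep_fraction
    odds_population = odds_sample * nonevent_keep_fraction
    return odds_population / (1.0 + odds_population)

# Model E kept 11,060 of the 132,418 non-planned-gift constituents
keep = 11060 / 132418
scores = [0.2, 0.5, 0.8]
rescaled = [rescale_probability(p, keep) for p in scores]
# The values shrink, but their order does not change
print(rescaled == sorted(rescaled))
```

Because the transformation is monotonic, the ranking of prospects is the same before and after rescaling, which is why the unadjusted scores remain usable for prioritization.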
To explore another comparison between models A and F, I ranked all 133,000 records in each. I then sorted all the records in model F in descending order. I took the top 1,000 records from model F and then ran a correlation between their ranks in model A and their ranks in model F; the correlation is .282, meaning there is a substantial difference between the two rankings.
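The comparison just described might look something like the sketch below. The assumptions are mine: scores are held in dictionaries keyed by record ID, rank 1 means the highest score, ties are broken by sort order, and the correlation is an ordinary Pearson correlation computed on the ranks. The text does not say which correlation statistic was actually used.

```python
import math

def rank_desc(scores):
    """Return {record_id: rank}, where rank 1 is the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {rec_id: pos + 1 for pos, rec_id in enumerate(ordered)}

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def top_k_rank_correlation(scores_a, scores_f, k=1000):
    """Correlate model A and model F ranks for the top k records of model F."""
    ranks_a, ranks_f = rank_desc(scores_a), rank_desc(scores_f)
    top_ids = sorted(ranks_f, key=ranks_f.get)[:k]
    return pearson([ranks_a[i] for i in top_ids],
                   [ranks_f[i] for i in top_ids])
```

Restricting the comparison to the top of one model's list focuses on the records a gift officer would actually review, rather than on the long tail where both models agree that the probability is low.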
Over the last several months, Peter Wylie, Higher Education Consultant and Contractor, and I have been exchanging ideas on this topic. I thank him for his insight, suggestions, and encouragement to share my findings with our colleagues.
It would be helpful to learn about the methods you’ve used to target rare behavior. We could feel more confident about using alternative methods if repeat efforts produced similar outcomes. Furthermore, I did not have a chance to evaluate the prospecting performance of these models, so if you have used a method for targeting rare behavior and have had an opportunity to assess its effectiveness, I am very interested in learning about that. I welcome ideas, feedback, examples from your research, and questions in regard to this work. Please feel free to contact me at firstname.lastname@example.org.
The ideas for these alternative approaches are adapted from the following articles:
Kelly Heinrich has been conducting quantitative research and analysis in higher education development for two and a half years. She has recently accepted a position as Assistant Director of Prospect Management and Analytics with Stanford University that will begin in June 2013.
A few years ago I met with an experienced Planned Giving professional who had done very well over the years without any help from predictive modeling, and was doing me the courtesy of hearing my ideas. I showed this person a series of charts. Each chart showed a variable and its association with the condition of being a current Planned Giving expectancy. The ultimate goal would have been to consolidate these predictors together as a score, in order to discover new expectancies in that school’s alumni database. The conventional factors of giving history and donor loyalty are important, I conceded, but other engagement-related factors are also very predictive: student activities, alumni involvement, number of degrees, event attendance, and so on.
This person listened politely and was genuinely interested. And then I went too far.
One of my charts showed that there was a strong association between being a Planned Giving expectancy and having a single initial in the First Name field. I noted that, for some unexplained reason, having a preference for a name like “S. John Doe” seemed to be associated with a higher propensity to make a bequest. I thought that was cool.
The response was a laugh. A good-natured laugh, but still — a laugh. “That sounds like astrology!”
I had mistaken polite interest for a slam-dunk, and in my enthusiasm went too far out on a limb. I may have inadvertently caused the minting of a new data-mining skeptic. (Eventually, the professional retired after completing a successful career in Planned Giving, and having managed to avoid hearing much more about predictive modeling.)
At the time, I had hastened to explain that what we were looking at were correlations — loose, non-causal relationships among various characteristics, some of them non-intuitive or, as in this case, seemingly nonsensical. I also explained that the linkage was probably due to other variables (age and sex being prime candidates). Just because it’s without explanation doesn’t mean it’s not useful. But I suppose the damage was done. You win some, you lose some.
Although some of the power (and fun) of predictive modeling rests on the sometimes non-intuitive and unexplained nature of predictor variables, I now think it’s best to frame any presentation to a general audience in terms of what they think of as “common sense”. Limiting, yes. But safer. Unless you think your listener is really picking up what you’re laying down, keep it simple, keep it intuitive, and keep it grounded.
So much for sell jobs. Let’s get back to the data … What ABOUT that “first-initial” variable? Does it really mean anything, or is it just noise? Is it astrology?
I’ve got this data set in front of me: all alumni with at least some giving in the past ten years. I see that 1.2% of all donors have a first initial at the front of their name. When I look at the subset of the records that are current Planned Giving expectancies, I see that 4.6% have a single-initial first name. In other words, Planned Giving expectancies are almost four times as likely as donors in general to have a name that starts with a single initial. The data file is fairly large (more than 17,000 records) and the difference is statistically significant.
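One common way to check a difference like this is a two-proportion z-test. The sketch below is illustrative only: the counts are round numbers chosen to roughly match the percentages quoted above (the exact counts are not given), and the text does not say which significance test was actually run.

```python
import math

def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled proportion under the hypothesis of no difference
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical counts: 23 of 500 expectancies vs. 181 of 16,500 other donors
z, p_value = two_proportion_z(23, 500, 181, 16500)
print(z > 0, p_value < 0.0001)
```

A small p-value says only that the gap is unlikely to be sampling noise; it says nothing about why the gap exists.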
What can explain this? When I think of a person whose first name is an initial and who tends to go by their middle name, the image that comes to mind is that of an elderly male with a higher than average income — like a retired judge, say. For each of the variables Age and Male, there is in fact a small positive association with having a one-character first name. Yet, when I account for both ‘Age’ and ‘Male’ in a regression analysis, the condition of having a leading initial is still significant and still has explanatory power for being a Planned Giving expectancy.
I can’t think of any other underlying reasons for the connection with Planned Giving. Even when I continue to add more and more independent variables to the regression, this strange predictor hangs in there, as sturdy as ever. So, it’s certainly interesting, and I usually at least look at it while building models.
On the other hand … perhaps there is some justification for the verdict of “astrology” (that is, “nonsense”). The data set I have here may be large, but the number of Planned Giving expectancies is less than 500 — and 4.6% of 500 is not very many records. Regardless of whether p ≤ 0.0001, it could still be just one of those things. I’ve also learned that complex models are not better than simple ones, particularly when trying to predict something hard like Planned Giving propensity. A quirky variable that suggests no potential causal pathway makes me wary of the possibility of overfitting the noise in my data and missing the signal.
Maybe it’s useful, maybe it’s not. Either way, whether I call it “cool” or not will depend on who I’m talking to.
According to conventional wisdom, the best Planned Giving prospects are donors who have consistently given small Annual Fund gifts over a long period of time. Rather than assume this is true always and everywhere, I think we should put the “loyal donor” rule of thumb to the test in the environment of our own data.
Here’s what I did recently. I picked a group of current Planned Giving expectancies, and pulled their giving totals for the 20 fiscal years prior to their identification. To select the group, I chose everyone identified as an expectancy in the year 2003 or later, so the years of giving that I pulled were 1983 to 2002. I also limited the group to people who are now at least 50 years old. This ensured that everyone in the group was probably old enough to have participated in the Annual Fund during any of those years if they chose.
I didn’t look at how much they gave in any given year, only whether they gave. Expectancies who gave in 20 out of 20 years received a “score” of 20. Someone who had given in 10 years out of 20 got a score of 10, and so on. Non-donors were scored as zero.
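Here is a minimal sketch of that scoring, assuming each person's giving is available as a list of the fiscal years in which a gift was recorded. The function names and the input layout are illustrative, not a description of any particular database. The second helper tallies the share of the group at each score, which is the quantity a bar chart of the scores would display.

```python
from collections import Counter

def years_of_giving_score(gift_years, start_year=1983, end_year=2002):
    """Count the distinct fiscal years in the window with at least one gift."""
    window = range(start_year, end_year + 1)
    return len({year for year in gift_years if year in window})

def score_distribution(scores):
    """Percentage of the group at each score, keyed by score."""
    counts = Counter(scores)
    total = len(scores)
    return {score: 100.0 * n / total for score, n in sorted(counts.items())}

# Hypothetical giving histories, for illustration only
group = {
    "A": [1983, 1983, 1990, 2005],   # two distinct years inside the window
    "B": list(range(1983, 2003)),    # gave every year
    "C": [],                         # non-donor
}
scores = [years_of_giving_score(years) for years in group.values()]
print(scores, score_distribution(scores))
```

Multiple gifts in one year count once, and gifts outside the window are ignored, so the score reflects consistency rather than amount.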
Then I made a bar chart of their scores. The height of the bars corresponds to the percentage of the group that falls into each number of years of giving in that 20-year span.
What does this chart tell us? It’s clear these expectancies are indeed very loyal donors. A little under half of them have some giving in at least 10 of the 20 years. That’s wonderful.
I am struck that 15% of them have no giving at all. On the other hand, the proportion of alumni over 50 who are NOT Planned Giving expectancies and have no giving is 61%, so the expectancies compare well against them.
Here’s the same chart, but with all alumni over 50 who are not current expectancies:
Big difference! The scale is totally different, due to the disproportionate number of non-donors in this group. As a percentage of alumni, very loyal donors are scarce. Let’s look at it another way, excluding non-donors from both groups: In the chart below, the expectancy donors have giving in twice as many years as non-expectancy donors, on average:
No wonder, then, we’ve been told to focus on loyal Annual Fund donors in order to identify new prospects for Planned Giving. The connection is undeniable.
A couple of things interfere with the clarity of this picture, however. Have another look at the first chart above. Although all of these people are old enough to have contributed in every year since 1983, a significant percentage of them have given in only a handful of years. For example, 6 percent of current expectancies have giving in only ONE of the 20 years. They share that distinction with 10,000 alumni who are not expectancies.
In other words, if years of giving was your only metric for proactively identifying prospects, and no expectancies came in “over the transom,” so to speak, that 6 percent of the group would never be discovered. There are just too many individuals at the lower end of years-of-giving to get focused in any practical way. Donor loyalty is therefore a great predictor of Planned Giving potential, but it does not define the profile of a Planned Giving donor.
If donor data does not contain all the answers, where can you look? I have a few ideas.
Using the same group of current expectancies (age 50 or older, and identified in 2003 or later), I pulled some other characteristics from the database to test as predictors. I was careful to select data that existed before 2003, i.e. that pre-dated the identification of the individuals as expectancies.
Here’s a great one: Student activities. Participation in varsity sports, campus clubs and student government is coded in the database, and the chart below compares the proportions of the two groups who have at least one such activity code in their records.
Interesting, eh? Now, maybe ten or 15 years ago there was a big push on to solicit former athletes for Planned Giving, and that’s why they’re well represented in the current crop of expectancies — but I doubt that very much. The evidence indicates that student experience is a big factor even for decisions taken many years later. This is a great example of how even the oldest data is valuable in the present day.
Here’s another one: Alumni who hold more than one degree. The proportions on both sides are high, because I counted degrees from ANY university (we have that information in our database), and we have many graduate and professional degree holders. The chart would seem to indicate that expectancies are more likely to hold multiple degrees than non-expectancies. A little more digging would tell us whether a particular profession (doctors or lawyers, for example) is heavily represented in the expectancy group.
Here’s another one, for the presence of a Faculty or Staff code, which indicates whether someone is or at one time was employed by the university. This code is not uniformly applied (it does not directly correspond to actual employment or even HR data), so it’s not perfect, but as a rough indicator it works fine for data mining.
Next up is one of my very favourite predictors for Planned Giving potential: event attendance. I’ve seen this elsewhere, and it holds true here as well. Showing up at any kind of reunion or alumni-related event is highly predictive. I got a little lazy when I calculated this variable because I did not exclude events attended in 2003 or later; I would expect the percentages to change a bit, but probably not by much. I DID exclude attendance at any kind of donor-recognition event — if only donors are invited, attendance is merely a proxy for donor status.
I could do this for a dozen more variables, but you get the point. There are all sorts of additional indicators of Planned Giving potential sitting in your database. As well, my predictors are not necessarily your predictors. It’s up to you to do a little digging and find them.
From here, we could have niggling arguments about whether some of these predictors are really better than ‘donor loyalty’, or are even statistically significant, and so on. But if you are currently trying to identify prospects solely by identifying loyal donors, allow me to suggest this improvement in your methods: Devise a simple scoring system that gives one point for ‘donor loyalty’ (however you wish to define that — I’ve defined it as giving in at least 10 years out of 20), and one point for each of the other predictors that strike you as particularly powerful. Using the predictors I’ve presented here, my score would be calculated like so:
Loyal donor (0/1) + Student activity (0/1) + Multiple degrees (0/1) + Faculty or Staff (0/1) + Event attendance (0/1) = Maximum PG score of 5.
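As a rough sketch, the score could be computed like this. The record layout and field names are assumptions made for the example; the loyalty cut-off of 10 years out of 20 is the definition given above.

```python
from dataclasses import dataclass

@dataclass
class Alum:
    years_of_giving: int        # years with any giving, out of 20
    has_student_activity: bool  # at least one student activity code
    degree_count: int           # degrees from any university
    is_faculty_or_staff: bool   # Faculty or Staff code present
    attended_event: bool        # attended a non-donor-recognition event

def pg_score(alum: Alum, loyal_years: int = 10) -> int:
    """One point per indicator; the maximum score is 5."""
    indicators = [
        alum.years_of_giving >= loyal_years,  # loyal donor
        alum.has_student_activity,            # student activity
        alum.degree_count > 1,                # multiple degrees
        alum.is_faculty_or_staff,             # faculty or staff
        alum.attended_event,                  # event attendance
    ]
    return sum(indicators)

# Hypothetical record, for illustration only
print(pg_score(Alum(12, True, 2, False, True)))
```

Because every indicator is worth exactly one point, swapping in your own predictors only means changing the list.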
What happens when I apply this model to our database? Out of more than 30,000 living and addressable alumni over the age of 50 who are not already expectancies, only 89 have a perfect score of 5 out of 5. That’s a very manageable, high-quality list of individuals to provide for review by a Planned Giving Officer.
This model is far from the last word in data mining for Planned Giving, and it has some severe limitations. For example, focusing on these 89 individuals might essentially result in a campaign based on retired professors in the Faculty of Medicine! Your expectancies are not going to be one homogeneous group, so you’ll want to identify other clusters for solicitation. As well, almost 700 individuals in our database would have a score of 4 out of 5, so things get out of hand quickly when you have too few score levels.
Otherwise, it’s pretty nifty. This score is easy to understand, not terribly difficult to calculate, and is a useful departure from any single-minded focus on donor loyalty.
Back in September I read a blog post by Jonathan Grapsas about the possible connection between gifts made in memory and Planned Giving expectancies. (The link between in memory and legacies?) He writes, “People are making a gift in memory of someone they care about. They are in that head space.”
I was delighted to discover that memorial gifts are identified with a code in our database, so in I went in search of a connection in our own data. I found that of any alum who has ever made a gift, only 2.1% have ever made a memorial gift. But of all current Planned Giving expectancies who are donors, 9.8% have done so.
Now, I haven’t dug deep: These could be gifts tied to the donor’s own planned gift and which came after the commitment was made. But what if this turns out to be a real difference in giving behaviour and a predictor for Planned Giving?
Up to this point, I haven’t thought of in-memory gifts as an indicator of affinity. The donor is motivated by the desire to honour a friend or loved one, not any identification with your mission or nostalgia for alma mater. In other words, it seems probable that the gift given in memory is typically not up for renewal. To discover that the behaviour may be associated with an especially elusive class of donor is exciting.
There is no need to simply accept the conventional wisdom that the best Planned Giving prospects are the ones who have consistently given small amounts to the annual fund over a long period of time. This behaviour is certainly a predictor for bequests, but it does not typify them.