CoolData blog

19 April 2015

Planned Giving prospect identification, driven by data

Filed under: Planned Giving, Prospect identification — kevinmacdonell @ 6:28 pm

I’m looking forward to giving two presentations in my home city in connection with this week’s national conference of the Canadian Association of Gift Planners (CAGP). In theory I’ll be talking about data-driven prospect identification for Planned Giving … “in theory” because my primary aim isn’t to provide a how-to for analyzing data.


Rather, I will urge fundraisers to seek “data partners” in their organizations (finding that person who is closest to the data) and to pose some good questions. There’s a lot of value hidden in your data, and you can’t realize this value alone: You’ve got to work closely with your colleagues in Advancement Services or with any researcher, analyst, or IT person who can get you what you need. And you have to be able to tell that person what you’re looking for.


For a shop that’s done little or no analysis of their data, I would start with these two basic questions:


  1. What is the average age of new expectancies, at the time they became known to your organization?
  2. What is the size of your general prospect pool?
The answer to the first question might suggest that more active prospect identification is required, of the type more often associated with major-gift fundraising. If the average age is 75 or older, I have to think that earlier identification of bequest intentions would benefit donor and cause alike, by allowing for a longer period for the conversation to mature and for the relationship to develop.
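A data partner could answer the first question with a few lines of code. What follows is only a sketch in Python: the field names `birth_year` and `year_identified` and the three sample records are invented for illustration, and your own database will store these things differently.

```python
from statistics import mean

# Hypothetical extract: one record per expectancy (field names are assumptions).
expectancies = [
    {"id": 1, "birth_year": 1940, "year_identified": 2012},
    {"id": 2, "birth_year": 1951, "year_identified": 2014},
    {"id": 3, "birth_year": 1936, "year_identified": 2010},
]

def age_at_identification(record):
    """Approximate age in the year the expectancy became known."""
    return record["year_identified"] - record["birth_year"]

ages = [age_at_identification(r) for r in expectancies]
average_age = mean(ages)
print(f"Average age at identification: {average_age:.1f}")
```

If the real figure comes out at 75 or older, that is the signal to look harder at earlier identification.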


The answer to the second question gives an indication of the potential that exists in the database — but also the challenge of zeroing in on the few people (the top 100, say) in that universe of prospects who are most likely to accept a personal visit. Again, I’m talking about high-touch fundraising — more like Major Gifts, less like Annual Fund.


As Planned Giving professionals get comfortable asking questions of the data, the quality of the questions should improve. Ideally, the analyses will move from one-off projects to an ongoing process of gathering insights and then applying them. Along those lines, I will be giving attendees at both presentations a taste of how some simple targeting via data mining might work. As Peter Wylie and I wrote in our book, “Score!”, data mining for Planned Giving is primarily about improving the odds of success.


I said that I’m giving two presentations. Actually, it’s the same presentation, for two audiences. The first talk will be for a higher ed audience in advance of the conference, and the second will be for a more general nonprofit audience attending the conference proper. I expect the questions and conversations to differ significantly, and I also expect some of my assertions about Planned Giving fundraising to be challenged. Should be interesting!


Since you’ve read this far, you might be interested in downloading the handout I’ve prepared for these talks: Data-driven prospect ID for Planned Giving. There’s nothing like being there in person for the conversation we’re going to have, but this discussion paper does cover most of what I’ll be talking about.


If you’re visiting Halifax for the conference, welcome! I look forward to meeting with you.



7 January 2015

New finds in old models


When you build a predictive model, you can never be sure it’s any good until it’s too late. Deploying a mediocre model isn’t the worst mistake you can make, though. The worst mistake would be to build a second mediocre model because you haven’t learned anything from the failure of the first.


Performance against a holdout data set for validation is not a reliable indicator of actual performance after deployment. Validation may help you decide which of two or more competing models to use, or it may provide reassurance that your one model isn’t total junk. It’s not proof of anything, though. Those lovely predictors, highly correlated with the outcome, could be fooling you. There are no guarantees they’re predictive of results over the year to come.


In the end, the only real evidence of a model’s worth is how it performs on real results. The problem is, those results happen in the future. So what is one to do?


I’ve long been fascinated with Planned Giving likelihood. Making a bequest seems like the ultimate gesture of institutional affinity (ultimate in every sense). On the plus side, that kind of affinity ought to be clearly evidenced in behaviours such as event attendance, giving, volunteering and so on. On the negative side, Planned Giving interest is uncommon enough that comparing expectancies with non-expectancies will sometimes lead to false predictors based on sparse data. For this reason, my goal of building a reliable model for predicting Planned Giving likelihood has been elusive.


Given that a validation data set taken from the same time period as the training data can produce misleading correlations, I wondered whether I could do one better: That is, be able to draw my holdout sample not from data of the same time period as that used to build the model, but from the future.


As it turned out, yes, I could.


Every year I save my regression analyses as Data Desk files. Although I assess the performance of the output scores, I don’t often go back to the model files themselves. However, they’re there as a document of how I approached modelling problems in the past. As a side benefit, each file is also a snapshot of the alumni population at that point in time. These data sets may consist of a hundred or more candidate predictor variables — a well-rounded picture.


My thinking went like this: Every old model file represents data from the past. If I pretend that this snapshot is really the present, then in order to have access to knowledge of the future, all I have to do is look at today’s data stored in the database.


For example, for this blog post, I reached back two years to a model I created in Data Desk for predicting likelihood to upgrade to the Leadership level in Annual Giving. I wasn’t interested in the model itself. Rather, I wanted to examine the underlying variables I had to work with at the time. This model had been an ambitious undertaking, with some 170 variables prepared for analysis. Many of course were transformations of variables or combinations of interacting variables. Among all those variables was one indicating whether a case was a current Planned Giving expectancy or not, at that point in time.


In this snapshot of the database from two years ago, some of the cases that were not expectancies would have become so since then. In other words, I now had the best of both worlds. I had a comprehensive set of potential predictors as they existed two years ago, AND access to the hitherto unknowable future: The identities of the people who had become expectancies after the predictors had been frozen in time.


As I said, my old model was not intended to predict Planned Giving inclination. So I built a new model, using “Is an Expectancy” (0/1) as the target variable. I trained the regression model on the two-year-old expectancy data — I didn’t even look at the new expectancies while building the model. No: I used those new expectancies as my validation data set.


“Validation” might be too strong a word, given that there were only 80 or so new cases. That’s a lot of bequest intentions, for sure, but in terms of data it’s a drop in the bucket compared with the number of cases being scored. Let’s call it a test data set. I used this test set to help me analyze the model, in a couple of ways.


First I looked at how new expectancies were scored by the model I had just built. The chart below shows their distribution by score decile. Slightly more than 50% of new expectancies were in the top decile. This looks pretty good — keeping in mind that this is what actual performance would have looked like had I really built this model two years ago (which I could have):




(Even better, looking at percentiles, most of the expectancies in that top 10% are concentrated nicely in the top few percentiles.)
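For anyone who wants to reproduce this kind of decile check outside Data Desk, here is a rough sketch in Python. It is not the procedure I actually used; the function names and the 20-case example are invented, and ties in scores are broken arbitrarily by sort order.

```python
from collections import Counter

def decile_of(rank, n):
    """Map a 0-based rank (0 = highest score) to a decile: 10 = top, 1 = bottom."""
    return 10 - (rank * 10) // n

def decile_distribution(scores, new_expectancy_ids):
    """Share of new expectancies that fall in each score decile.

    scores: dict of case id -> model score
    new_expectancy_ids: set of ids that became expectancies after the snapshot
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    counts = Counter(
        decile_of(rank, n)
        for rank, case_id in enumerate(ranked)
        if case_id in new_expectancy_ids
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {d: counts[d] / total for d in range(10, 0, -1)}

# Invented example: 20 scored cases, four of which later became expectancies.
scores = {case_id: 100 - case_id for case_id in range(20)}
new_expectancy_ids = {0, 1, 5, 19}
distribution = decile_distribution(scores, new_expectancy_ids)
for decile, share in distribution.items():
    print(f"Decile {decile:2d}: {share:.0%}")
```

Swapping the `10` for `100` in `decile_of` would give the percentile view mentioned above.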


But I didn’t stop there. It is also evident that almost half of new expectancies fell outside the top 10 percent of scores, so clearly there was room for improvement. My next step was to examine the individual predictors I had used in the model. These were of course the predictors most highly correlated with being an expectancy. They were roughly the following:
  • Year person’s personal information in the database was last updated
  • Number of events attended
  • Age
  • Year of first gift
  • Number of alumni activities
  • Indicated “likely to donate” on 2009 alumni survey
  • Total giving in last five years (log transformed)
  • Combined length of name Prefix + Suffix


I ranked the correlation of each of these with the 0/1 indicator meaning “new expectancy,” and found that most of the predictors were still fine, although they changed their order in the rank correlation. Donor likelihood (from survey) and recent giving were more important, and alumni activities and how recently a person’s record was updated were less important.


This was interesting and useful, but what was even more useful was looking at the correlations between ALL potential predictors and the state of being a new expectancy. A number of predictors that would have been too far down the ranked list to consider using two years ago were suddenly looking much better. In particular, many variables related to participation in alumni surveys bubbled closer to the top as potentially significant.
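Ranking every candidate predictor against the “new expectancy” flag is simple enough to sketch. The Python below is an illustration only: the variable names and the six-row data set are made up, and it uses a plain Pearson correlation against the 0/1 flag, which may not match what your stats package reports by default.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def rank_predictors(columns, target):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    correlations = {name: pearson(values, target) for name, values in columns.items()}
    return sorted(correlations.items(), key=lambda item: abs(item[1]), reverse=True)

# Invented data: 1 = became an expectancy after the snapshot was taken.
new_expectancy = [0, 0, 0, 1, 1, 0]
candidate_predictors = {
    "events_attended": [0, 1, 0, 3, 2, 1],
    "survey_likely_to_donate": [0, 0, 0, 1, 1, 0],
    "constant_flag": [1, 1, 1, 1, 1, 1],
}
ranking = rank_predictors(candidate_predictors, new_expectancy)
for name, r in ranking:
    print(f"{name:28s} {r:+.3f}")
```

Running the same ranking against the old target and the new one, side by side, is what shows which variables moved up or down the list.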


This exercise suggests a way to proceed with iterative, yearly improvements to some of your standard models:
  • Dig up an old model from a year or more ago.
  • Query the database for new cases that represent the target variable, and merge them with the old datafile.
  • Assess how your model performed or, if you created more than one model, see which model would have performed best. (You should be doing this anyway.)
  • Go a layer deeper, by studying the variables that went into those models — the data “as it was” — to see which variables had correlations that tricked you into believing they were predictive, and which variables truly held predictive power but may have been overlooked.
  • Apply what you learn to the next iteration of the model. Leave out the variables with spurious correlations, and give special consideration to variables that may have been underestimated before.
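The merge step in that list might look something like this in Python. Treat it as a sketch under assumptions: the field names `id` and `was_expectancy` are hypothetical, and in practice the current expectancy IDs would come from a database query rather than a hard-coded set.

```python
def flag_new_expectancies(old_snapshot, current_expectancy_ids):
    """Add a 0/1 'new_expectancy' flag to each case in an old model file.

    old_snapshot: list of dicts with 'id' and 'was_expectancy' (0/1), frozen
        as they were when the old model was built
    current_expectancy_ids: set of ids that are expectancies in today's data
    """
    merged = []
    for row in old_snapshot:
        became_expectancy = (
            row["id"] in current_expectancy_ids and not row["was_expectancy"]
        )
        merged.append({**row, "new_expectancy": int(became_expectancy)})
    return merged

# Invented example: case 1 was already an expectancy, case 2 became one later.
old_snapshot = [
    {"id": 1, "was_expectancy": 1, "events_attended": 4},
    {"id": 2, "was_expectancy": 0, "events_attended": 2},
    {"id": 3, "was_expectancy": 0, "events_attended": 0},
]
merged = flag_new_expectancies(old_snapshot, current_expectancy_ids={1, 2})
print([row["new_expectancy"] for row in merged])
```

The old predictors stay untouched; only the new target column is added, which keeps the “data as it was” intact for the later steps.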

28 May 2013

Targeting rare behavior

Filed under: Planned Giving, regression — kevinmacdonell @ 5:21 am

Guest post by Kelly Heinrich, Assistant Director of Prospect Management and Analytics, Stanford University

Last August, about two months into a data analyst position with a university’s development division, I was tasked with building a predictive model for the Office of Gift Planning (OGP). The OGP wanted a tool to help them focus on the constituents who are most likely to make a planned gift. I wanted to identify a few hundred of the best planned giving prospects, whom I could prioritize by their probability of donating. After a bit of preliminary research, I chose two criteria for the study population: 1) 50 years of age and older and 2) inclusion in a recent wealth screening. This generated a file of 133,000 records; 582 of them were planned gift donors. I’ve worked with files larger than this and did not expect a problem. However, that turned out to be a mistake because the planned gift donors, who exhibited the target behavior, comprised 0.4% of the population, a proportion so small it can be considered rare. I’ll explain more about that later; first I want to describe the project as it developed.

I decided to use logistic regression with the dependent variable being either “made a planned gift” or “has not made a planned gift”. I cleaned the data and identified some strong relationships between the variables. After trying several combinations for the regression model, I had one with a Nagelkerke of .24, which is relatively good. (Nagelkerke is like a pseudo R squared; it can be loosely interpreted as the proportion of the variability of the dependent variable that is accounted for by the model’s independent variables.) However, when I applied the algorithm to the study population, only 31 constituents without a planned gift and only 11 planned giving donors were identified as having a probability of giving of .5 or greater. I lowered the probability threshold of giving to .2 or greater and 105 non-planned givers and 52 planned gift donors fell into this range. This was still disappointing.
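For readers curious about the arithmetic behind Nagelkerke, the sketch below uses the standard definition: the Cox and Snell R squared divided by its maximum possible value. It is written in Python for illustration and is not the output of the software I used. The counts (582 donors among 133,000 records) come from the project, but the model log-likelihood is a made-up placeholder.

```python
from math import exp, log

def null_log_likelihood(n_events, n):
    """Log-likelihood of an intercept-only model for a 0/1 outcome."""
    p = n_events / n
    return n_events * log(p) + (n - n_events) * log(1 - p)

def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo R squared from the null and fitted log-likelihoods."""
    cox_snell = 1 - exp(2 * (ll_null - ll_model) / n)
    max_cox_snell = 1 - exp(2 * ll_null / n)
    return cox_snell / max_cox_snell

n_records = 133_000
n_donors = 582
ll_null = null_log_likelihood(n_donors, n_records)

# Placeholder only: a real value would come from the fitted regression.
hypothetical_ll_model = 0.8 * ll_null

r2 = nagelkerke_r2(ll_null, hypothetical_ll_model, n_records)
print(f"Null log-likelihood: {ll_null:.1f}")
print(f"Nagelkerke R squared (placeholder model): {r2:.3f}")
```

A model no better than the intercept alone scores 0, and a model that fits perfectly scores 1, which is what makes the measure easier to read than the raw Cox and Snell value.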

Desperate to identify more new potential prospects, I explored more criteria to narrow the study population and built three successive models. For the purpose of the follow-up exploratory research and this article, I re-built all four models using the same independent variables to easily compare their outcomes. Here’s a summary of the four models:


Models B, C, and D are all subsets of the original data set. Each model has its advantages and disadvantages, and I was uncertain how to evaluate them against one another. For example, each additional filtering criterion resulted in losing part of the target population, meaning that I systematically eliminated constituents with characteristics that are in fact associated with making a planned gift. I scored everyone who was identified with a probability of .2 or greater in any of the models by the number of models in which they were identified. I’m not unhappy with that solution, but since then I’ve been learning about better methods for targeting rare behavior.

If the OGP were interested only in prioritizing the prospects already in their pool of potential planned giving donors, model D would serve their need. However, we wanted to identify the best potential planned giving prospects within the database. If we want to uncover untapped potential in an ever-growing database, we need to explore methods for targeting rare behavior. This seems especially important in our field where 1) donating, in general, is somewhat rare and 2) donating really generous gifts is rarer. Better methods of targeting rare behavior will also be useful for modeling for special initiatives and unique kinds of gifts.

As I’ve been learning, logistic regression suffers from small sample bias when the target behavior is rare, relative to the study population. This helps explain why applying the algorithm to the original population resulted in very few new prospects–even though the model had a decent Nagelkerke of .24. Some analysts suggest using alternative sampling methods when the target behavior comprises less than 5% of the study. (See endnote.) Knowing that the planned gift donors in my original project comprised only 0.4% of the population, I decided to experiment with two new approaches.

In both of the exploratory models, I created the study population size so planned gift donors would comprise 5 percent. First, I generated a study population by including all 582 of the planned gift donors and a random selection of 11,060 non-planned-gift constituents (model E). Then, I applied the algorithm from that population to the entire non-planned-gift population of 132,418. In the second approach (model F), the planned gift population was randomly split into two equal size groups of 291. I also randomly selected 5,530 non-planned-gift constituents. To build the regression model, I combined one of the planned gift donor groups (of 291) with 5,530 non-planned-gift constituents. I then tested the algorithm on the holdout sample (the other planned giving group of 291 with 5,530 non-planned-gift constituents). Finally, I applied the algorithm to the entire original population of 133,000. Here are the results:


Using the same independent variables as in models A through D, model E had a Nagelkerke of .39 and model F .38, which helps substantiate that the independent variables are useful predictors for planned giving. Models E and F were more effective at predicting the planned givers (129 and 123 respectively with a probability of giving greater than or equal to .5) compared to model A (11), i.e. more than ten times as many. The sampling techniques have advantages and disadvantages. The disadvantage is that the reduced non-planned-gift population loses some of its variability and complexity. The advantages, in both models E and F, are that 1) the target population maintains its complexity, 2) new prospects are not limited by characteristic selection (the additional criteria that I used to reduce the population in models B, C, and D), which increases the likelihood of identifying constituents who were previously not on the OGP’s radar, and 3) the effects of the sample bias seem to be reduced.

It’s important to note that I displayed the measures (Nagelkerke and estimated probabilities) from the exploratory models and populations purely for comparison purposes. Because the study population is manipulated in the exploratory methods, the probability of giving should not be directly interpreted as actual probabilities. However, they can be used to prioritize those with the highest probabilities and that will serve our need.
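The sampling step for models E and F can be sketched in a few lines of Python. This is an illustration under assumptions, not the code used in the project: the function names, the integer IDs, and the fixed random seed are all invented. The population sizes (582 donors, 132,418 non-donors) are the ones reported above. An exact 5 percent share works out to 11,058 non-donors; the study used the rounded figure of 11,060.

```python
import random

def build_rare_event_sample(donor_ids, non_donor_ids, target_share=0.05, seed=0):
    """Keep every donor and draw enough random non-donors so that donors
    make up about target_share of the combined study population."""
    n_non_donors = round(len(donor_ids) * (1 - target_share) / target_share)
    n_non_donors = min(n_non_donors, len(non_donor_ids))
    rng = random.Random(seed)
    return list(donor_ids), rng.sample(non_donor_ids, n_non_donors)

def split_in_half(ids, seed=0):
    """Randomly split ids into two groups, for a build set and a holdout set."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    middle = len(shuffled) // 2
    return shuffled[:middle], shuffled[middle:]

# Hypothetical ids standing in for the 133,000 records in the study file.
donor_ids = list(range(582))
non_donor_ids = list(range(582, 133_000))

# Model E style: all donors plus a random draw of non-donors.
donors, sampled_non_donors = build_rare_event_sample(donor_ids, non_donor_ids)
share = len(donors) / (len(donors) + len(sampled_non_donors))
print(f"{len(sampled_non_donors):,} non-donors sampled; donor share = {share:.1%}")

# Model F style: half the donors for building, half held out for testing.
build_donors, holdout_donors = split_in_half(donor_ids)
print(len(build_donors), len(holdout_donors))
```

Because the non-donors are deliberately thinned out, any probability estimated from a sample like this is inflated relative to the full population, which is why the scores are best used for ranking only.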

To explore another comparison between models A and F, I ranked all 133,000 records in each. I then sorted all the records in model F in descending order. I took the top 1,000 records from model F and ran a correlation between their rank in model A and their rank in model F; the correlation is .282, meaning the two models order these records quite differently.
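A comparison like this one could be scripted as follows. This Python sketch is an approximation of the idea rather than the exact procedure: the function names are hypothetical, tied scores are ranked in arbitrary order, and the correlation is a plain Pearson correlation computed on the two sets of ranks.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def ranks_from_scores(scores):
    """Rank 1 goes to the highest score. Ties are not averaged."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {case_id: position + 1 for position, case_id in enumerate(ordered)}

def top_k_rank_correlation(scores_a, scores_f, k):
    """Correlate model A ranks with model F ranks for model F's top k records."""
    rank_a = ranks_from_scores(scores_a)
    rank_f = ranks_from_scores(scores_f)
    top_ids = sorted(rank_f, key=rank_f.get)[:k]
    return pearson([rank_a[i] for i in top_ids], [rank_f[i] for i in top_ids])

# Invented scores for 50 records: identical models, then opposite models.
scores_f = {i: float(i) for i in range(50)}
same_order = top_k_rank_correlation(dict(scores_f), scores_f, k=10)
reverse_order = top_k_rank_correlation({i: -s for i, s in scores_f.items()}, scores_f, k=10)
print(same_order, reverse_order)
```

A value near 1 would mean the two models agree on who belongs at the top of the list; a value as low as .282 says they mostly do not.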

Over the last several months, Peter Wylie, Higher Education Consultant and Contractor, and I have been exchanging ideas on this topic. I thank him for his insight, suggestions, and encouragement to share my findings with our colleagues.

It would be helpful to learn about the methods you’ve used to target rare behavior. We could feel more confident about using alternative methods if repeat efforts produced similar outcomes. Furthermore, I did not have a chance to evaluate the prospecting performance of these models, so if you have used a method for targeting rare behavior and have had an opportunity to assess its effectiveness, I am very interested in learning about that. I welcome ideas, feedback, examples from your research, and questions in regard to this work. Please feel free to contact me at


The ideas for these alternative approaches are adapted from the following articles:

Kelly Heinrich has been conducting quantitative research and analysis in higher education development for two and a half years. She has recently accepted a position as Assistant Director of Prospect Management and Analytics with Stanford University that will begin in June 2013.

15 January 2013

The cautionary tale of Mr. S. John Doe

A few years ago I met with an experienced Planned Giving professional who had done very well over the years without any help from predictive modeling, and was doing me the courtesy of hearing my ideas. I showed this person a series of charts. Each chart showed a variable and its association with the condition of being a current Planned Giving expectancy. The ultimate goal would have been to consolidate these predictors together as a score, in order to discover new expectancies in that school’s alumni database. The conventional factors of giving history and donor loyalty are important, I conceded, but other engagement-related factors are also very predictive: student activities, alumni involvement, number of degrees, event attendance, and so on.

This person listened politely and was genuinely interested. And then I went too far.

One of my charts showed that there was a strong association between being a Planned Giving expectancy and having a single initial in the First Name field. I noted that, for some unexplained reason, having a preference for a name like “S. John Doe” seemed to be associated with a higher propensity to make a bequest. I thought that was cool.

The response was a laugh. A good-natured laugh, but still — a laugh. “That sounds like astrology!”

I had mistaken polite interest for a slam-dunk, and in my enthusiasm went too far out on a limb. I may have inadvertently caused the minting of a new data-mining skeptic. (Eventually, the professional retired after completing a successful career in Planned Giving, and having managed to avoid hearing much more about predictive modeling.)

At the time, I had hastened to explain that what we were looking at were correlations — loose, non-causal relationships among various characteristics, some of them non-intuitive or, as in this case, seemingly nonsensical. I also explained that the linkage was probably due to other variables (age and sex being prime candidates). Just because it’s without explanation doesn’t mean it’s not useful. But I suppose the damage was done. You win some, you lose some.

Although some of the power (and fun) of predictive modeling rests on the sometimes non-intuitive and unexplained nature of predictor variables, I now think it’s best to frame any presentation to a general audience in terms of what they think of as “common sense”. Limiting, yes. But safer. Unless you think your listener is really picking up what you’re laying down, keep it simple, keep it intuitive, and keep it grounded.

So much for sell jobs. Let’s get back to the data … What ABOUT that “first-initial” variable? Does it really mean anything, or is it just noise? Is it astrology?

I’ve got this data set in front of me: all alumni with at least some giving in the past ten years. I see that 1.2% of all donors have a first initial at the front of their name. When I look at the subset of the records that are current Planned Giving expectancies, I see that 4.6% have a single-initial first name. In other words, Planned Giving expectancies are almost four times as likely as all other donors to have a name that starts with a single initial. The data file is fairly large (more than 17,000 records) and the difference is statistically significant.
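One common way to check a difference like this is a two-proportion z-test, sketched below in Python. I am not claiming this is the test behind the result above. The counts are hypothetical, chosen only to be roughly consistent with the percentages quoted (22 of 480 expectancies, 182 of 16,520 other donors); substitute the real counts from your own file.

```python
from math import erfc, sqrt

def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    Returns (z, p_value), using the pooled proportion for the standard error.
    """
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail area of the normal curve
    return z, p_value

# Hypothetical counts, not the actual data.
z, p_value = two_proportion_z_test(hits_a=22, n_a=480, hits_b=182, n_b=16_520)
print(f"z = {z:.2f}, p = {p_value:.2g}")
```

Note how few records carry the whole result: with counts like these, the first group contains only about two dozen single-initial names.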

What can explain this? When I think of a person whose first name is an initial and who tends to go by their middle name, the image that comes to mind is that of an elderly male with a higher than average income — like a retired judge, say. For each of the variables Age and Male, there is in fact a small positive association with having a one-character first name. Yet, when I account for both ‘Age’ and ‘Male’ in a regression analysis, the condition of having a leading initial is still significant and still has explanatory power for being a Planned Giving expectancy.

I can’t think of any other underlying reasons for the connection with Planned Giving. Even when I continue to add more and more independent variables to the regression, this strange predictor hangs in there, as sturdy as ever. So, it’s certainly interesting, and I usually at least look at it while building models.

On the other hand … perhaps there is some justification for the verdict of “astrology” (that is, “nonsense”). The data set I have here may be large, but the number of Planned Giving expectancies is less than 500 — and 4.6% of 500 is not very many records. Regardless of whether p ≤ 0.0001, it could still be just one of those things. I’ve also learned that complex models are not better than simple ones, particularly when trying to predict something hard like Planned Giving propensity. A quirky variable that suggests no potential causal pathway makes me wary of the possibility of overfitting the noise in my data and missing the signal.

Maybe it’s useful, maybe it’s not. Either way, whether I call it “cool” or not will depend on who I’m talking to.

28 February 2011

Look beyond loyal donors to find Planned Giving prospects

Filed under: Alumni, Planned Giving, predictive modeling, Predictor variables — kevinmacdonell @ 9:17 am

According to conventional wisdom, the best Planned Giving prospects are donors who have consistently given small Annual Fund gifts over a long period of time. Rather than assume this is true always and everywhere, I think we should put the “loyal donor” rule of thumb to the test in the environment of our own data.

Here’s what I did recently. I picked a group of current Planned Giving expectancies, and pulled their giving totals for the 20 fiscal years prior to their identification. To select the group, I chose everyone identified as an expectancy in the year 2003 or later, so the years of giving that I pulled were 1983 to 2002. I also limited the group to people who are now at least 50 years old. This ensured that everyone in the group was probably old enough to have participated in the Annual Fund during any of those years if they chose.

I didn’t look at how much they gave in any given year, only whether they gave. Expectancies who gave in 20 out of 20 years received a “score” of 20. Someone who had given in 10 years out of 20 got a score of 10, and so on. Non-donors were scored as zero.

Then I made a bar chart of their scores. The height of the bars corresponds to the percentage of the group that falls into each number of years of giving in that 20-year span.
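The years-of-giving score and the percentages behind the bars are easy to calculate. Here is a minimal Python sketch, assuming each person’s gifts are available as a list of fiscal years; the function names and the four example donors are made up.

```python
def years_of_giving_score(gift_years, first_year=1983, last_year=2002):
    """Count the distinct fiscal years in the window with at least one gift."""
    return len({year for year in gift_years if first_year <= year <= last_year})

def score_distribution(scores, n_years=20):
    """Percentage of the group at each score from 0 to n_years."""
    total = len(scores)
    return {
        score: 100 * sum(1 for s in scores if s == score) / total
        for score in range(n_years + 1)
    }

# Invented gift histories: fiscal years in which each person made a gift.
gift_histories = {
    "A": [1983, 1983, 1990, 2003, 1975],  # two years count; 2003 and 1975 fall outside
    "B": [],                              # non-donor, scored as zero
    "C": list(range(1983, 2003)),         # gave in all 20 years
    "D": [1999],
}
scores = [years_of_giving_score(years) for years in gift_histories.values()]
distribution = score_distribution(scores)
print(scores)
print({score: pct for score, pct in distribution.items() if pct > 0})
```

Only whether a gift was made in a year matters here, not the amount or the number of gifts in that year.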

What does this chart tell us? It’s clear these expectancies are indeed very loyal donors. A little under half of them have some giving in at least 10 of the 20 years. That’s wonderful.

I am struck that 15% of them have no giving at all. On the other hand, the proportion of alumni over 50 who are NOT Planned Giving expectancies and have no giving is 61%, so the expectancies compare well against them.

Here’s the same chart, but with all alumni over 50 who are not current expectancies:

Big difference! The scale is totally different, due to the disproportionate number of non-donors in this group. As a percentage of alumni, very loyal donors are scarce. Let’s look at it another way, excluding non-donors from both groups: In the chart below, the expectancy donors have giving in twice as many years as non-expectancy donors, on average:

No wonder, then, we’ve been told to focus on loyal Annual Fund donors in order to identify new prospects for Planned Giving. The connection is undeniable.

A couple of things interfere with the clarity of this picture, however. Have another look at the first chart above. Although all of these people are old enough to have contributed in every year since 1983, a significant percentage of them have given in only a handful of years. For example, 6 percent of current expectancies have giving in only ONE of the 20 years. They share that distinction with 10,000 alumni who are not expectancies.

In other words, if years of giving was your only metric for proactively identifying prospects, and no expectancies came in “over the transom,” so to speak, that 6 percent of the group would never be discovered. There are just too many individuals at the lower end of years-of-giving to get focused in any practical way. Donor loyalty is therefore a great predictor of Planned Giving potential, but it does not define the profile of a Planned Giving donor.

If donor data does not contain all the answers, where can you look? I have a few ideas.

Using the same group of current expectancies (age 50 or older, and identified in 2003 or later), I pulled some other characteristics from the database to test as predictors. I was careful to select data that existed before 2003, i.e. that pre-dated the identification of the individuals as expectancies.

Here’s a great one: Student activities. Participation in varsity sports, campus clubs and student government is coded in the database, and the chart below compares the proportions of the two groups who have at least one such activity code in their records.

Interesting, eh? Now, maybe ten or 15 years ago there was a big push on to solicit former athletes for Planned Giving, and that’s why they’re well represented in the current crop of expectancies — but I doubt that very much. The evidence indicates that student experience is a big factor even for decisions taken many years later. This is a great example of how even the oldest data is valuable in the present day.

Here’s another one: Alumni who hold more than one degree. The proportions on both sides are high, because I counted degrees from ANY university (we have that information in our database), and we have many graduate and professional degree holders. The chart would seem to indicate that expectancies are more likely to hold multiple degrees than non-expectancies. A little more digging would tell us whether a particular profession (doctors or lawyers, for example) is heavily represented among the expectancies group.

Here’s another one, for the presence of a Faculty or Staff code, which indicates whether someone is or at one time was employed by the university. This code is not uniformly applied (it does not directly correspond to actual employment or even HR data), so it’s not perfect, but as a rough indicator it works fine for data mining.

Next up is one of my very favourite predictors for Planned Giving potential: event attendance. I’ve seen this elsewhere, and it holds true here as well. Showing up at any kind of reunion or alumni-related event is highly predictive. I got a little lazy when I calculated this variable because I did not exclude events attended in 2003 or later; I would expect the percentages to change a bit, but probably not by much. I DID exclude attendance at any kind of donor-recognition event — if only donors are invited, attendance is merely a proxy for donor status.

I could do this for a dozen more variables, but you get the point. There are all sorts of additional indicators of Planned Giving potential sitting in your database. As well, my predictors are not necessarily your predictors. It’s up to you to do a little digging and find them.

From here, we could have niggling arguments about whether some of these predictors are really better than ‘donor loyalty’, or are even statistically significant, and so on. But if you are currently trying to identify prospects solely by identifying loyal donors, allow me to suggest this improvement in your methods: Devise a simple scoring system that gives one point for ‘donor loyalty’ (however you wish to define that — I’ve defined it as giving in at least 10 years out of 20), and one point for each of the other predictors that strike you as particularly powerful. Using the predictors I’ve presented here, my score would be calculated like so:

Loyal donor (0/1) + Student activity (0/1) + Multiple degrees (0/1) + Faculty or Staff (0/1) + Event attendance (0/1) = Maximum PG score of 5.

What happens when I apply this model to our database? Out of more than 30,000 living and addressable alumni over the age of 50 who are not already expectancies, only 89 have a perfect score of 5 out of 5. That’s a very manageable, high-quality list of individuals to provide for review by a Planned Giving Officer.
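If you want to try the same thing, the score takes only a few lines of Python. This is a sketch, not the query I ran: the field names are hypothetical stand-ins for whatever flags you build from your own database, and the three records are invented.

```python
# Hypothetical 0/1 flags, one per predictor in the score.
PREDICTORS = (
    "loyal_donor",
    "student_activity",
    "multiple_degrees",
    "faculty_or_staff",
    "event_attendance",
)

def is_loyal_donor(years_with_giving, threshold=10):
    """Loyal donor as defined in this post: giving in at least 10 of 20 years."""
    return int(years_with_giving >= threshold)

def pg_score(record):
    """One point for each predictor that is present; the maximum is 5."""
    return sum(1 for name in PREDICTORS if record.get(name))

# Invented records for three alumni.
alumni = [
    {"id": 101, "loyal_donor": is_loyal_donor(14), "student_activity": 1,
     "multiple_degrees": 1, "faculty_or_staff": 1, "event_attendance": 1},
    {"id": 102, "loyal_donor": is_loyal_donor(3), "student_activity": 1,
     "multiple_degrees": 0, "faculty_or_staff": 0, "event_attendance": 1},
    {"id": 103, "loyal_donor": is_loyal_donor(0)},
]
perfect_scores = [a["id"] for a in alumni if pg_score(a) == len(PREDICTORS)]
print([pg_score(a) for a in alumni], perfect_scores)
```

Adding or dropping a predictor is a one-line change to `PREDICTORS`, which makes it easy to test your own variables in place of mine.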

This model is far from the last word in data mining for Planned Giving, and it has some severe limitations. For example, focusing on these 89 individuals might essentially result in a campaign based on retired professors in the Faculty of Medicine! Your expectancies are not going to be one homogeneous group, so you’ll want to identify other clusters for solicitation. As well, almost 700 individuals in our database would have a score of 4 out of 5, so things get out of hand quickly when you have too few score levels.

Otherwise, it’s pretty nifty. This score is easy to understand, not terribly difficult to calculate, and is a useful departure from any single-minded focus on donor loyalty.

10 December 2010

In-memory gifts and Planned Giving potential

Filed under: Planned Giving, Predictor variables — kevinmacdonell @ 2:16 pm

Back in September I read a blog post by Jonathan Grapsas about the possible connection between gifts made in memory and Planned Giving expectancies. (The link between in memory and legacies?) He writes, “People are making a gift in memory of someone they care about. They are in that head space.”

I was delighted to discover that memorial gifts are identified with a code in our database, so in I went in search of a connection in our own data. I found that of all alumni who have ever made a gift, only 2.1% have ever made a memorial gift. But of all current Planned Giving expectancies who are donors, 9.8% have done so.

Now, I haven’t dug deep: These could be gifts tied to the donor’s own planned gift and which came after the commitment was made. But what if this turns out to be a real difference in giving behaviour and a predictor for Planned Giving?

Up to this point, I haven’t thought of in-memory gifts as an indicator of affinity. The donor is motivated by the desire to honour a friend or loved one, not any identification with your mission or nostalgia for alma mater. In other words, it seems probable that the gift given in memory is typically not up for renewal. To discover that the behaviour may be associated with an especially elusive class of donor is exciting.

There is no need to simply accept the conventional wisdom that the best Planned Giving prospects are the ones who have consistently given small amounts to the annual fund over a long period of time. This behaviour is certainly a predictor for bequests, but it does not typify them.
