# CoolData blog

## 28 May 2013

### Targeting rare behavior

Filed under: Planned Giving, regression - kevinmacdonell @ 5:21 am

## Guest post by Kelly Heinrich, Assistant Director of Prospect Management and Analytics, Stanford University

Last August, about two months into a data analyst position with a university’s development division, I was tasked with building a predictive model for the Office of Gift Planning (OGP). The OGP wanted a tool to help them focus on the constituents who are most likely to make a planned gift. I wanted to identify a few hundred of the best planned giving prospects, whom I could prioritize by their probability of donating. After a bit of preliminary research, I chose two criteria for the study population: 1) 50 years of age and older and 2) inclusion in a recent wealth screening. This generated a file of 133,000 records; 582 of them were planned gift donors. I’ve worked with files larger than this and did not expect a problem. However, that turned out to be a mistake because the planned gift donors, who exhibited the target behavior, comprised only 0.4% of the population, a proportion so small it can be considered rare. I’ll explain more about that later; first I want to describe the project as it developed.

I decided to use logistic regression with the dependent variable being either “made a planned gift” or “has not made a planned gift”. I cleaned the data and identified some strong relationships between the variables. After trying several combinations for the regression model, I had one with a Nagelkerke of .24, which is relatively good. (Nagelkerke is like a pseudo R squared; it can be loosely interpreted as the variability of the dependent variable that is accounted for by the model’s independent variables.) However, when I applied the algorithm to the study population, only 31 constituents without a planned gift and only 11 planned giving donors were identified as having a probability of giving of .5 or greater. I lowered the probability threshold of giving to .2 or greater and 105 non-planned givers and 52 planned gift donors fell into this range. This was still disappointing.
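For readers who want the definition behind that parenthetical, Nagelkerke's measure is commonly written as a rescaled Cox and Snell R squared. The symbols are assumptions introduced here for illustration and do not come from the original analysis: $L_0$ is the likelihood of the intercept-only model, $L_M$ is the likelihood of the fitted model, and $n$ is the number of records.

$$
R^2_{\mathrm{CS}} = 1 - \left(\frac{L_0}{L_M}\right)^{2/n},
\qquad
R^2_{\mathrm{N}} = \frac{R^2_{\mathrm{CS}}}{1 - L_0^{\,2/n}}
$$

Dividing by the largest value the Cox and Snell measure can reach lets the statistic range from 0 to 1, which is why it is read loosely like an R squared.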

Desperate to identify more new potential prospects, I explored more criteria to narrow the study population and built three successive models. For the purpose of the follow-up exploratory research and this article, I re-built all four models using the same independent variables to easily compare their outcomes. Here’s a summary of the four models:

Models B, C, and D are all subsets of the original data set. Each model has advantages and disadvantages to it and I was uncertain how to evaluate them against one another. For example, each additional filtering criterion resulted in losing part of the target population, meaning that I systematically eliminated constituents with characteristics that are in fact associated with making a planned gift. I scored everyone who was identified with a probability of .2 or greater in any of the models by the number of models in which they were identified. I’m not unhappy with that solution, but since then I’ve been learning about better methods for targeting rare behavior.

If the OGP were interested only in prioritizing the prospects already in their pool of potential planned giving donors, model D would serve their need. However, we wanted to identify the best potential planned giving prospects within the database. If we want to uncover untapped potential in an ever-growing database, we need to explore methods for targeting rare behavior. This seems especially important in our field where 1) donating, in general, is somewhat rare and 2) donating really generous gifts is rarer. Better methods of targeting rare behavior will also be useful for modeling for special initiatives and unique kinds of gifts.

As I’ve been learning, logistic regression suffers from small sample bias when the target behavior is rare, relative to the study population. This helps explain why applying the algorithm to the original population resulted in very few new prospects–even though the model had a decent Nagelkerke of .24. Some analysts suggest using alternative sampling methods when the target behavior comprises less than 5% of the study. (See endnote.) Knowing that the planned gift donors in my original project comprised only 0.4% of the population, I decided to experiment with two new approaches.

In both of the exploratory models, I created the study population size so planned gift donors would comprise 5 percent. First, I generated a study population by including all 582 of the planned gift donors and a random selection of 11,060 non-planned-gift constituents (model E). Then, I applied the algorithm from that population to the entire non-planned-gift population of 132,418. In the second approach (model F), the planned gift population was randomly split into two equal size groups of 291. I also randomly selected 5,530 non-planned-gift constituents. To build the regression model, I combined one of the planned gift donor groups (of 291) with 5,530 non-planned-gift constituents. I then tested the algorithm on the holdout sample (the other planned giving group of 291 with 5,530 non-planned-gift constituents). Finally, I applied the algorithm to the entire original population of 133,000. Here are the results:
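The arithmetic behind the 5 percent target can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used for models E and F: the function names are hypothetical, and it assumes simple random sampling without replacement. For 582 donors the exact answer is 11,058 non-donors, which is close to the 11,060 used above.

```python
import random


def non_events_needed(n_events, target_rate):
    """How many non-event records make events equal target_rate of the sample."""
    return round(n_events * (1 - target_rate) / target_rate)


def event_rate(n_events, n_non_events):
    """Share of the combined file that exhibits the target behavior."""
    return n_events / (n_events + n_non_events)


def build_sample(event_ids, non_event_ids, target_rate, seed=0):
    """Keep every event and draw a random subset of non-events (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    k = non_events_needed(len(event_ids), target_rate)
    return list(event_ids) + rng.sample(list(non_event_ids), k)


# Counts taken from the post
print(non_events_needed(582, 0.05))   # exact size for a model E style sample
print(non_events_needed(291, 0.05))   # exact size for each half in model F
print(event_rate(582, 11_060))        # rate produced by the counts in the post
```

Keeping every donor and sampling only the non-donors is what preserves the full variety of the target group.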

Using the same independent variables as in models A through D, model E had a Nagelkerke of .39 and model F .38, which helps substantiate that the independent variables are useful predictors for planned giving. Models E and F were more effective at predicting the planned givers (129 and 123 respectively with a probability of giving greater than or equal to .5) compared to model A (11), i.e. more than ten times as many. The sampling techniques have both advantages and disadvantages. The disadvantage is that the reduced non-planned-gift population loses some of its variability and complexity. The advantages, in both models E and F, are that 1) the target population maintains its complexity, 2) new prospects are not limited by characteristic selection (the additional criteria that I used to reduce the population in models B, C, and D), which increases the likelihood of identifying constituents who were previously not on the OGP’s radar, and 3) the effects of the small sample bias seem to be reduced.

It’s important to note that I displayed the measures (Nagelkerke and estimated probabilities) from the exploratory models and populations purely for comparison purposes. Because the study population is manipulated in the exploratory methods, the probability of giving should not be directly interpreted as actual probabilities. However, they can be used to prioritize those with the highest probabilities and that will serve our need.
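One way to see why the scores are inflated but still usable for ranking is the standard odds adjustment for a sample that keeps all events and only some non-events (often called prior correction). The sketch below is an assumption added for illustration; nothing in this project applied it, and the function name is hypothetical. It rescales a probability estimated on the 5 percent file back toward the 0.4 percent population rate.

```python
def correct_probability(p_sample, sample_rate, population_rate):
    """Rescale a probability from a downsampled file to the full population.

    Assumes 0 < p_sample < 1 and that only the non-events were sampled.
    """
    sample_odds = p_sample / (1 - p_sample)
    # Ratio of population odds of the event to sample odds of the event
    adjust = (population_rate / (1 - population_rate)) / (
        sample_rate / (1 - sample_rate)
    )
    odds = sample_odds * adjust
    return odds / (1 + odds)


# Rates taken from the counts in the post
population_rate = 582 / 133_000
sample_rate = 582 / (582 + 11_060)

for p in (0.2, 0.5, 0.8):
    print(p, correct_probability(p, sample_rate, population_rate))
```

Because the adjustment multiplies every record's odds by the same constant, the ordering of records does not change, which is consistent with using the scores only to prioritize.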

To explore another comparison between models A and F, I ranked all 133,000 records by their score in each model. I then sorted the records by their model F rank, took the top 1,000, and ran a correlation between their model A ranks and their model F ranks. The correlation is .282, meaning the two models order these records substantially differently.
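The rank comparison can be sketched in plain Python as follows. This is a hedged illustration rather than the original procedure: the function names are hypothetical, ties are broken by file order for simplicity, and the correlation is an ordinary Pearson correlation computed on the two sets of ranks.

```python
def ranks(scores):
    """Rank 1 is the highest score; ties keep their original order (a simplification)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def top_n_rank_correlation(scores_a, scores_f, n):
    """Correlate model A ranks with model F ranks for model F's top n records."""
    rank_a, rank_f = ranks(scores_a), ranks(scores_f)
    top = sorted(range(len(scores_f)), key=lambda i: rank_f[i])[:n]
    return pearson([rank_a[i] for i in top], [rank_f[i] for i in top])
```

With the real files, `scores_a` and `scores_f` would be the two models' scores in the same record order and `n` would be 1,000.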

Over the last several months, Peter Wylie, Higher Education Consultant and Contractor, and I have been exchanging ideas on this topic. I thank him for his insight, suggestions, and encouragement to share my findings with our colleagues.

It would be helpful to learn about the methods you’ve used to target rare behavior. We could feel more confident about using alternative methods if repeat efforts produced similar outcomes. Furthermore, I did not have a chance to evaluate the prospecting performance of these models, so if you have used a method for targeting rare behavior and have had an opportunity to assess its effectiveness, I am very interested in learning about that. I welcome ideas, feedback, examples from your research, and questions in regard to this work. Please feel free to contact me at heinrichkellyl@gmail.com.

Endnotes

The ideas for these alternative approaches are adapted from the following articles:

Kelly Heinrich has been conducting quantitative research and analysis in higher education development for two and a half years. She has recently accepted a position as Assistant Director of Prospect Management and Analytics with Stanford University that will begin in June 2013.

## 15 January 2013

### The cautionary tale of Mr. S. John Doe

A few years ago I met with an experienced Planned Giving professional who had done very well over the years without any help from predictive modeling, and was doing me the courtesy of hearing my ideas. I showed this person a series of charts. Each chart showed a variable and its association with the condition of being a current Planned Giving expectancy. The ultimate goal would have been to consolidate these predictors together as a score, in order to discover new expectancies in that school’s alumni database. The conventional factors of giving history and donor loyalty are important, I conceded, but other engagement-related factors are also very predictive: student activities, alumni involvement, number of degrees, event attendance, and so on.

This person listened politely and was genuinely interested. And then I went too far.

One of my charts showed that there was a strong association between being a Planned Giving expectancy and having a single initial in the First Name field. I noted that, for some unexplained reason, having a preference for a name like “S. John Doe” seemed to be associated with a higher propensity to make a bequest. I thought that was cool.

The response was a laugh. A good-natured laugh, but still — a laugh. “That sounds like astrology!”

I had mistaken polite interest for a slam-dunk, and in my enthusiasm went too far out on a limb. I may have inadvertently caused the minting of a new data-mining skeptic. (Eventually, the professional retired after completing a successful career in Planned Giving, and having managed to avoid hearing much more about predictive modeling.)

At the time, I had hastened to explain that what we were looking at were correlations — loose, non-causal relationships among various characteristics, some of them non-intuitive or, as in this case, seemingly nonsensical. I also explained that the linkage was probably due to other variables (age and sex being prime candidates). Just because it’s without explanation doesn’t mean it’s not useful. But I suppose the damage was done. You win some, you lose some.

Although some of the power (and fun) of predictive modeling rests on the sometimes non-intuitive and unexplained nature of predictor variables, I now think it’s best to frame any presentation to a general audience in terms of what they think of as “common sense”. Limiting, yes. But safer. Unless you think your listener is really picking up what you’re laying down, keep it simple, keep it intuitive, and keep it grounded.

So much for sell jobs. Let’s get back to the data … What ABOUT that “first-initial” variable? Does it really mean anything, or is it just noise? Is it astrology?

I’ve got this data set in front of me: all alumni with at least some giving in the past ten years. I see that 1.2% of all donors have a first initial at the front of their name. When I look at the subset of the records that are current Planned Giving expectancies, I see that 4.6% have a single-initial first name. In other words, Planned Giving expectancies are almost four times as likely as donors in general to have a name that starts with a single initial. The data file is fairly large (more than 17,000 records) and the difference is statistically significant.

What can explain this? When I think of a person whose first name is an initial and who tends to go by their middle name, the image that comes to mind is that of an elderly male with a higher than average income — like a retired judge, say. For each of the variables Age and Male, there is in fact a small positive association with having a one-character first name. Yet, when I account for both ‘Age’ and ‘Male’ in a regression analysis, the condition of having a leading initial is still significant and still has explanatory power for being a Planned Giving expectancy.

I can’t think of any other underlying reasons for the connection with Planned Giving. Even when I continue to add more and more independent variables to the regression, this strange predictor hangs in there, as sturdy as ever. So, it’s certainly interesting, and I usually at least look at it while building models.

On the other hand … perhaps there is some justification for the verdict of “astrology” (that is, “nonsense”). The data set I have here may be large, but the number of Planned Giving expectancies is fewer than 500, and 4.6% of 500 is only 23 records. Regardless of whether p ≤ 0.0001, it could still be just one of those things. I’ve also learned that complex models are not better than simple ones, particularly when trying to predict something hard like Planned Giving propensity. A quirky variable that suggests no potential causal pathway makes me wary of the possibility of overfitting the noise in my data and missing the signal.

Maybe it’s useful, maybe it’s not. Either way, whether I call it “cool” or not will depend on who I’m talking to.

## 28 February 2011

### Look beyond loyal donors to find Planned Giving prospects

Filed under: Alumni, Planned Giving, predictive modeling, Predictor variables - kevinmacdonell @ 9:17 am

According to conventional wisdom, the best Planned Giving prospects are donors who have consistently given small Annual Fund gifts over a long period of time. Rather than assume this is true always and everywhere, I think we should put the “loyal donor” rule of thumb to the test in the environment of our own data.

Here’s what I did recently. I picked a group of current Planned Giving expectancies, and pulled their giving totals for the 20 fiscal years prior to their identification. To select the group, I chose everyone identified as an expectancy in the year 2003 or later, so the years of giving that I pulled were 1983 to 2002. I also limited the group to people who are now at least 50 years old. This ensured that everyone in the group was probably old enough to have participated in the Annual Fund during any of those years if they chose.

I didn’t look at how much they gave in any given year, only whether they gave. Expectancies who gave in 20 out of 20 years received a “score” of 20. Someone who had given in 10 years out of 20 got a score of 10, and so on. Non-donors were scored as zero.
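As a rough sketch, the years-of-giving score could be computed like this in Python. The data layout is an assumption for illustration: each person's giving is a hypothetical dictionary mapping fiscal year to total amount given, and the function name is made up.

```python
def years_of_giving_score(giving_by_year, first_year=1983, last_year=2002):
    """Count the fiscal years in the window with any giving; amounts are ignored."""
    return sum(
        1
        for year in range(first_year, last_year + 1)
        if giving_by_year.get(year, 0) > 0
    )


# Hypothetical donor: gifts in 1983 and 2002, nothing recorded in 1990,
# and a 2005 gift that falls outside the 20-year window
example = {1983: 50.0, 1990: 0.0, 2002: 10.0, 2005: 100.0}
print(years_of_giving_score(example))
```

A non-donor has an empty dictionary and therefore scores zero, matching the rule above.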

Then I made a bar chart of their scores. The height of the bars corresponds to the percentage of the group that falls into each number of years of giving in that 20-year span.

What does this chart tell us? It’s clear these expectancies are indeed very loyal donors. A little under half of them have some giving in at least 10 of the 20 years. That’s wonderful.

I am struck that 15% of them have no giving at all. On the other hand, the proportion of alumni over 50 who are NOT Planned Giving expectancies and have no giving is 61%, so the expectancies compare well against them.

Here’s the same chart, but with all alumni over 50 who are not current expectancies:

Big difference! The scale is totally different, due to the disproportionate number of non-donors in this group. As a percentage of alumni, very loyal donors are scarce. Let’s look at it another way, excluding non-donors from both groups: In the chart below, the expectancy donors have giving in twice as many years as non-expectancy donors, on average:

No wonder, then, we’ve been told to focus on loyal Annual Fund donors in order to identify new prospects for Planned Giving. The connection is undeniable.

A couple of things interfere with the clarity of this picture, however. Have another look at the first chart above. Although all of these people are old enough to have contributed in every year since 1983, a significant percentage of them have given in only a handful of years. For example, 6 percent of current expectancies have giving in only ONE of the 20 years. They share that distinction with 10,000 alumni who are not expectancies.

In other words, if years of giving was your only metric for proactively identifying prospects, and no expectancies came in “over the transom,” so to speak, that 6 percent of the group would never be discovered. There are just too many individuals at the lower end of years-of-giving to focus on in any practical way. Donor loyalty is therefore a great predictor of Planned Giving potential, but it does not define the profile of a Planned Giving donor.

If donor data does not contain all the answers, where can you look? I have a few ideas.

Using the same group of current expectancies (age 50 or older, and identified in 2003 or later), I pulled some other characteristics from the database to test as predictors. I was careful to select data that existed before 2003, i.e. that pre-dated the identification of the individuals as expectancies.

Here’s a great one: Student activities. Participation in varsity sports, campus clubs and student government is coded in the database, and the chart below compares the proportions of the two groups who have at least one such activity code in their records.

Interesting, eh? Now, maybe ten or 15 years ago there was a big push on to solicit former athletes for Planned Giving, and that’s why they’re well represented in the current crop of expectancies — but I doubt that very much. The evidence indicates that student experience is a big factor even for decisions taken many years later. This is a great example of how even the oldest data is valuable in the present day.

Here’s another one: Alumni who hold more than one degree. The proportions on both sides are high, because I counted degrees from ANY university (we have that information in our database), and we have many graduate and professional degree holders. The chart would seem to indicate that expectancies are more likely to hold multiple degrees than non-expectancies. A little more digging would tell us whether a particular profession (doctors or lawyers, for example) is heavily represented among the expectancies group.

Here’s another one, for the presence of a Faculty or Staff code, which indicates whether someone is or at one time was employed by the university. This code is not uniformly applied (it does not directly correspond to actual employment or even HR data), so it’s not perfect, but as a rough indicator it works fine for data mining.

Next up is one of my very favourite predictors for Planned Giving potential: event attendance. I’ve seen this elsewhere, and it holds true here as well. Showing up at any kind of reunion or alumni-related event is highly predictive. I got a little lazy when I calculated this variable because I did not exclude events attended in 2003 or later; I would expect the percentages to change a bit, but probably not by much. I DID exclude attendance at any kind of donor-recognition event — if only donors are invited, attendance is merely a proxy for donor status.

I could do this for a dozen more variables, but you get the point. There are all sorts of additional indicators of Planned Giving potential sitting in your database. As well, my predictors are not necessarily your predictors. It’s up to you to do a little digging and find them.

From here, we could have niggling arguments about whether some of these predictors are really better than ‘donor loyalty’, or are even statistically significant, and so on. But if you are currently trying to identify prospects solely by identifying loyal donors, allow me to suggest this improvement in your methods: Devise a simple scoring system that gives one point for ‘donor loyalty’ (however you wish to define that — I’ve defined it as giving in at least 10 years out of 20), and one point for each of the other predictors that strike you as particularly powerful. Using the predictors I’ve presented here, my score would be calculated like so:

Loyal donor (0/1) + Student activity (0/1) + Multiple degrees (0/1) + Faculty or Staff (0/1) + Event attendance (0/1) = Maximum PG score of 5.
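Here is a minimal Python sketch of that additive score. The field names are hypothetical stand-ins for whatever flags exist in your own database, and each flag is assumed to be already coded as 0/1 (or True/False).

```python
from collections import Counter

# Hypothetical flag names, one per predictor in the formula above
FLAGS = (
    "loyal_donor",        # e.g. giving in at least 10 of 20 years
    "student_activity",
    "multiple_degrees",
    "faculty_or_staff",
    "event_attendance",
)


def pg_score(record):
    """One point for each flag that is present and truthy; the maximum is 5."""
    return sum(1 for flag in FLAGS if record.get(flag))


def score_distribution(records):
    """Number of records at each score level, to check that lists stay manageable."""
    return Counter(pg_score(r) for r in records)


# Hypothetical records for illustration only
alumni = [
    {"loyal_donor": 1, "student_activity": 1, "multiple_degrees": 1,
     "faculty_or_staff": 1, "event_attendance": 1},
    {"loyal_donor": 1, "event_attendance": 1},
    {},
]
print(score_distribution(alumni))
```

Looking at the distribution before handing over a list shows how many records sit at each level.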

What happens when I apply this model to our database? Out of more than 30,000 living and addressable alumni over the age of 50 who are not already expectancies, only 89 have a perfect score of 5 out of 5. That’s a very manageable, high-quality list of individuals to provide for review by a Planned Giving Officer.

This model is far from the last word in data mining for Planned Giving, and it has some severe limitations. For example, focusing on these 89 individuals might essentially result in a campaign based on retired professors in the Faculty of Medicine! Your expectancies are not going to be one homogeneous group, so you’ll want to identify other clusters for solicitation. As well, almost 700 individuals in our database would have a score of 4 out of 5, so things get out of hand quickly when you have too few score levels.

Otherwise, it’s pretty nifty. This score is easy to understand, not terribly difficult to calculate, and is a useful departure from any single-minded focus on donor loyalty.

## 10 December 2010

### In-memory gifts and Planned Giving potential

Filed under: Planned Giving, Predictor variables - kevinmacdonell @ 2:16 pm

Back in September I read a blog post by Jonathan Grapsas about the possible connection between gifts made in memory and Planned Giving expectancies. (The link between in memory and legacies?) He writes, “People are making a gift in memory of someone they care about. They are in that head space.”

I was delighted to discover that memorial gifts are identified with a code in our database, so in I went in search of a connection in our own data. I found that of any alum who has ever made a gift, only 2.1% have ever made a memorial gift. But of all current Planned Giving expectancies who are donors, 9.8% have done so.

Now, I haven’t dug deep: These could be gifts tied to the donor’s own planned gift and which came after the commitment was made. But what if this turns out to be a real difference in giving behaviour and a predictor for Planned Giving?

Up to this point, I haven’t thought of in-memory gifts as an indicator of affinity. The donor is motivated by the desire to honour a friend or loved one, not any identification with your mission or nostalgia for alma mater. In other words, it seems probable that the gift given in memory is typically not up for renewal. To discover that the behaviour may be associated with an especially elusive class of donor is exciting.

There is no need to simply accept the conventional wisdom that the best Planned Giving prospects are the ones who have consistently given small amounts to the annual fund over a long period of time. This behaviour is certainly a predictor for bequests, but it does not typify them.

## 5 August 2010

### Perception vs. reality on the Number 80 bus

Filed under: Planned Giving, Statistics — kevinmacdonell @ 11:26 am


Do you ride the bus back and forth to work? I do. Some days it’s a quick trip, and other days it just goes on forever. There’s this one stop where the driver will park the bus and just sit there, as the minutes tick by. How dare she. Doesn’t she know I’m in a hurry?

I have some flexibility in office hours, at least during the summer, so I set out to pick the best times to travel. I wanted to know: Which buses on the Number 80 route were most catchable (i.e., had a very predictable time of arrival at the stop closest to my house), were fastest and most reliable (i.e., exhibited the least variability in travel times) and were least full (so I wouldn’t have to stand the whole way).

I was sure that there was some optimal combination of these three, but I couldn’t figure it out just by riding the bus. There didn’t seem to be any discernable pattern to my experience. I did not believe it was random, so there was one conclusion: It’s a data problem.

So I’ve been collecting data on my bus rides, and I’ve just had a look at it. What I found out had less to do with the bus route than with the nature of perceived reality. What you think is going on isn’t necessarily what’s actually happening. (And yes, I’ll bring this back to fundraising.)

I record the time I sit down, and the time I land on the sidewalk at my destination. I note the day of the week (maybe Mondays are quicker rides than Fridays) and the month (maybe buses are less full during the summer months when people are on vacation). I also note how full the bus is (on a scale of 1 to 5), and whether I have to stand (0/1). And finally, I make note of outliers due to “disruptive events” (unusually long construction delays, mechanical failure, etc.)

No one but a geek would do this. But it takes only a few seconds — and if you’re interested in statistics, collecting your own data can be instructive in itself.

I haven’t collected enough data points on the Number 80 bus to reveal all its secrets, but I learned enough to know that I have no sense of elapsed time. Leaving out one extreme outlier, my average trip duration (in either direction) is 38 minutes. So how much do individual trips vary from 38 minutes? Well, 79% of all trips vary from the average by three minutes or less. Three whole minutes! Allow just one more minute of variance, and 90% of trips fit in that window.
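The "within three minutes of the average" figure is easy to reproduce for any list of trip times. The sketch below uses made-up durations, not my actual log, and the function name is hypothetical.

```python
from statistics import mean


def share_within(durations, tolerance):
    """Fraction of trips whose duration is within `tolerance` minutes of the mean."""
    average = mean(durations)
    close = sum(1 for d in durations if abs(d - average) <= tolerance)
    return close / len(durations)


# Hypothetical trip durations in minutes, with the extreme outlier already removed
trips = [35, 38, 38, 38, 41]
print(mean(trips))
print(share_within(trips, 2))
print(share_within(trips, 3))
```

Widening the tolerance one minute at a time shows how quickly the window captures nearly every trip.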

All other patterns related to duration are pretty subtle: Late-morning rush hour buses, and the 4:45 p.m. bus tend to have the largest variance from the mean, the first because it’s a quicker trip, the second because it’s longer. The trip home is longer than the morning commute by only about one minute, on average. Tuesdays tend to bring slightly longer trips than any other day of the week — Tuesdays also have the highest average “fullness factor”.

But really, I can hop on any Number 80 bus and expect to get to my destination in 38 minutes, give or take a couple of minutes. That’s a far cry from how I perceive my commuting time: Some quick rides, some unbearably long ones. In fact, they’re all about the same. The bus driver is not trying to drive me crazy by parking the bus in mid-trip; she’s ahead of schedule and needs to readjust so commuters farther down the line don’t miss their bus.

If we can get simple things wrong, think of all the other assumptions we make about complex stuff, assumptions that could be either confirmed or discarded via a little intelligent measuring and analysis. According to what people widely believe about Planned Giving, you can go into your database right now and skim off the top alumni by years of giving and frequency of giving, and call them your top Planned Giving prospects. Your consistent donors are your best prospects, right?

Not necessarily. In fact, in one school’s data, I determined that if all their current, known Planned Giving expectancies were hidden in the database like needles in a haystack, and one were only allowed to use these patterns of past giving to find them again, they would miss two-thirds of them!

We are not wrong to have beliefs about how stuff works, but we are wrong in clinging to beliefs when the answers are waiting there in the data. The point is not that past giving is or isn’t a determinant of Planned Giving potential for your institution — the point is that you can find that out.

## 1 April 2010

### Does “no children” really mean Planned Giving potential?

Filed under: Planned Giving, Predictor variables, Surveying - kevinmacdonell @ 11:29 am

I gave a presentation to fundraising professionals and other nonprofit types recently, and I spent a little time discussing my work with predicting Planned Giving potential. One of the attendees asked if I was aware of a recent study that found that the most significant predictor for Planned Giving was the absence of children.

I had, and in my (not very coherent) response I said something to the effect that although this was interesting, I had reservations about taking an observation based on other institutions’ populations and applying it to ours. I would prefer to test it, I said. (I believe that someone else’s valid observation about their own data is only an assumption when applied blindly to mine.) And then I said that we don’t have the data to begin with.

But as I was talking, a thought occurred to me: Yes, in fact we DO have child data! I had even used that data in my PG model, but it had never occurred to me to study it very closely.

Back in the spring of 2009, our school conducted an extensive online survey of alumni as part of a national benchmarking study of alumni engagement. One of the core questions (supplied by the study firm, Engagement Analysis Inc.) asked specifically about likelihood to consider a bequest. Another question, which we added ourselves, asked respondents how many children they had under the age of 18. (We had a purpose in asking about “under 18”, and it wasn’t Planned Giving. Had I specifically been seeking a PG predictor, I would not have qualified the statement. Presumably the positive “childless effect” is explained by the lack of need to divide an estate up among children, regardless of their age.)

Our response rate was very high, and quite representative of our alumni population. Standing there in the midst of my presentation, I realized I had enough information to test the ‘childless’ theory in the environment of our own data.

The chart below shows survey responses to the PG question on the horizontal axis. The question was actually a scale statement which indicated that the responder was very likely to leave a bequest to our institution. Possible answers ranged from 1 to 6, with a one meaning “strongly disagree” and a six meaning “strongly agree”. If the respondent did not answer the question, I coded it as zero so it would show up on my chart.

In the chart, each group of respondents (i.e., each vertical bar) is segmented according to their answer on the “children” question. Notice the relative size of the blue segments, the responders who have no children under 18. For the proportion of this segment, there is a difference of approximately ten percentage points between the “strongly agree” group and the “strongly disagree” group.

In other words, childless alumni in our survey data set ARE more receptive to considering Planned Giving.

I said earlier that the survey response was representative of our alumni population. Therefore, many of the responders are far too young to be considered prospects. So I made another chart, which shows only alumni in the older half of the population: Class year 1990 and earlier. The difference between these two charts will seem subtle because they’re busy-looking, so let me point it out to you: Now the gap between the “strongly disagree” and the “strongly agree” for people with no kids has widened to 15 percentage points. This is a vote of confidence in favour of using “number of children” as a predictor of PG receptivity.

But here’s a question: Can you use child data to segment your prospect pool, and thereby avoid having to engage in predictive modeling? My answer is “No.” In both of the charts above, a majority of respondents answered “no children”, regardless of their attitude to Planned Giving. Yes, there’s a difference among the groups, but although it is significant, it is not definitive.

Others may quibble, saying that the data is suspect because we only asked about children under 18. But I really think this predictor is a lot like certain other conventional predictors, the ones related to frequency and consistency of giving: Alone, they are not powerful enough to isolate your best PG prospects. Only when you combine them with the full universe of other proven predictors in your database (event attendance, marital status, etc.) will you end up with something truly useful.
