CoolData blog

7 January 2015

New finds in old models


When you build a predictive model, you can never be sure it’s any good until it’s too late. Deploying a mediocre model isn’t the worst mistake you can make, though. The worst mistake would be to build a second mediocre model because you haven’t learned anything from the failure of the first.


Performance against a holdout data set for validation is not a reliable indicator of actual performance after deployment. Validation may help you decide which of two or more competing models to use, or it may provide reassurance that your one model isn’t total junk. It’s not proof of anything, though. Those lovely predictors, highly correlated with the outcome, could be fooling you. There are no guarantees they’re predictive of results over the year to come.


In the end, the only real evidence of a model’s worth is how it performs on real results. The problem is, those results happen in the future. So what is one to do?


I’ve long been fascinated with Planned Giving likelihood. Making a bequest seems like the ultimate gesture of institutional affinity (ultimate in every sense). On the plus side, that kind of affinity ought to be clearly evidenced in behaviours such as event attendance, giving, volunteering and so on. On the negative side, Planned Giving interest is uncommon enough that comparing expectancies with non-expectancies will sometimes lead to false predictors based on sparse data. For this reason, my goal of building a reliable model for predicting Planned Giving likelihood has been elusive.


Given that a validation data set drawn from the same time period as the training data can produce misleading correlations, I wondered whether I could do one better: that is, draw my holdout sample not from the same time period as the data used to build the model, but from the future.


As it turned out, yes, I could.


Every year I save my regression analyses as Data Desk files. Although I assess the performance of the output scores, I don’t often go back to the model files themselves. However, they’re there as a document of how I approached modelling problems in the past. As a side benefit, each file is also a snapshot of the alumni population at that point in time. These data sets may consist of a hundred or more candidate predictor variables — a well-rounded picture.


My thinking went like this: Every old model file represents data from the past. If I pretend that this snapshot is really the present, then in order to have access to knowledge of the future, all I have to do is look at today’s data stored in the database.


For example, for this blog post, I reached back two years to a model I created in Data Desk for predicting likelihood to upgrade to the Leadership level in Annual Giving. I wasn’t interested in the model itself. Rather, I wanted to examine the underlying variables I had to work with at the time. This model had been an ambitious undertaking, with some 170 variables prepared for analysis. Many of course were transformations of variables or combinations of interacting variables. Among all those variables was one indicating whether a case was a current Planned Giving expectancy or not, at that point in time.


In this snapshot of the database from two years ago, some of the cases that were not expectancies would have become so since then. In other words, I now had the best of both worlds. I had a comprehensive set of potential predictors as they existed two years ago, AND access to the hitherto unknowable future: The identities of the people who had become expectancies after the predictors had been frozen in time.
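Recovering this "future" amounts to a simple merge. Here is a minimal sketch in pandas, with hypothetical IDs and column names (a real implementation would key on your database's constituent ID):

```python
import pandas as pd

# Two-year-old model file: predictors and the expectancy flag as they were then.
snapshot = pd.DataFrame({
    "id": [101, 102, 103],
    "events_attended": [4, 0, 2],
    "was_expectancy": [0, 0, 1],
})

# Fresh query against today's database: who is an expectancy now?
current = pd.DataFrame({
    "id": [101, 102, 103],
    "is_expectancy_now": [1, 0, 1],
})

merged = snapshot.merge(current, on="id")

# "New expectancy" = not flagged in the snapshot, flagged today.
# These cases become the test set; the old predictors stay frozen in time.
merged["new_expectancy"] = (
    (merged["was_expectancy"] == 0) & (merged["is_expectancy_now"] == 1)
).astype(int)
```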


As I said, my old model was not intended to predict Planned Giving inclination. So I built a new model, using “Is an Expectancy” (0/1) as the target variable. I trained the regression model on the two-year-old expectancy data — I didn’t even look at the new expectancies while building the model. No: I used those new expectancies as my validation data set.


“Validation” might be too strong a word, given that there were only 80 or so new cases. That’s a lot of bequest intentions, for sure, but in terms of data it’s a drop in the bucket compared with the number of cases being scored. Let’s call it a test data set. I used this test set to help me analyze the model, in a couple of ways.


First I looked at how new expectancies were scored by the model I had just built. The chart below shows their distribution by score decile. Slightly more than 50% of new expectancies were in the top decile. This looks pretty good — keeping in mind that this is what actual performance would have looked like had I really built this model two years ago (which I could have):




(Even better, looking at percentiles, most of the expectancies in that top 10% are concentrated nicely in the top few percentiles.)
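For anyone who wants to reproduce this kind of check, here is a minimal sketch in pandas, with made-up scores and flags, of binning scored records into deciles and counting where the new expectancies land (a real file would have thousands of rows):

```python
import pandas as pd

# Hypothetical scored records: model score plus a 0/1 flag for
# "became an expectancy after the snapshot was taken".
df = pd.DataFrame({
    "score":          [0.95, 0.91, 0.88, 0.72, 0.55, 0.40, 0.31, 0.22, 0.11, 0.05],
    "new_expectancy": [1,    1,    0,    1,    0,    0,    0,    0,    0,    0],
})

# Decile 10 = top 10% of scores. Ranking first avoids problems with tied scores.
df["decile"] = pd.qcut(df["score"].rank(method="first"), 10,
                       labels=list(range(1, 11)))

# How many new expectancies fall in each decile?
dist = df.groupby("decile", observed=False)["new_expectancy"].sum()
```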


But I didn’t stop there. It is also evident that almost half of new expectancies fell outside the top 10 percent of scores, so clearly there was room for improvement. My next step was to examine the individual predictors I had used in the model. These were of course the predictors most highly correlated with being an expectancy. They were roughly the following:
  • Year person’s personal information in the database was last updated
  • Number of events attended
  • Age
  • Year of first gift
  • Number of alumni activities
  • Indicated “likely to donate” on 2009 alumni survey
  • Total giving in last five years (log transformed)
  • Combined length of name Prefix + Suffix


I ranked the correlation of each of these with the 0/1 indicator meaning “new expectancy,” and found that most of the predictors were still fine, although their order in the ranking had shifted. Donor likelihood (from the survey) and recent giving were more important, while alumni activities and how recently a person’s record had been updated were less important.
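Re-ranking the predictors is straightforward once the new 0/1 outcome is on the file. A sketch with hypothetical columns and values; pandas' `corrwith` gives the correlation of every candidate predictor with the 0/1 target at once:

```python
import pandas as pd

# Hypothetical predictor columns and the 0/1 "new expectancy" outcome.
df = pd.DataFrame({
    "survey_donor_likely": [1, 1, 0, 0, 1, 0, 0, 0],
    "events_attended":     [3, 1, 2, 0, 4, 0, 1, 0],
    "record_last_updated": [2014, 2012, 2013, 2010, 2014, 2011, 2012, 2010],
    "new_expectancy":      [1, 1, 0, 0, 1, 0, 0, 0],
})

# Pearson correlation of each candidate predictor with the 0/1 target,
# ranked from strongest to weakest.
corrs = (
    df.drop(columns="new_expectancy")
      .corrwith(df["new_expectancy"])
      .sort_values(ascending=False)
)
```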


This was interesting and useful, but what was even more useful was looking at the correlations between ALL potential predictors and the state of being a new expectancy. A number of predictors that would have been too far down the ranked list to consider using two years ago were suddenly looking much better. In particular, many variables related to participation in alumni surveys bubbled closer to the top as potentially significant.


This exercise suggests a way to proceed with iterative, yearly improvements to some of your standard models:
  • Dig up an old model from a year or more ago.
  • Query the database for new cases that represent the target variable, and merge them with the old datafile.
  • Assess how your model performed or, if you created more than one model, see which model would have performed best. (You should be doing this anyway.)
  • Go a layer deeper, by studying the variables that went into those models — the data “as it was” — to see which variables had correlations that tricked you into believing they were predictive, and which variables truly held predictive power but may have been overlooked.
  • Apply what you learn to the next iteration of the model. Leave out the variables with spurious correlations, and give special consideration to variables that may have been underestimated before.

28 May 2013

Targeting rare behavior

Filed under: Planned Giving, regression — kevinmacdonell @ 5:21 am

Guest post by Kelly Heinrich, Assistant Director of Prospect Management and Analytics, Stanford University

Last August, about two months into a data analyst position with a university’s development division, I was tasked with building a predictive model for the Office of Gift Planning (OGP). The OGP wanted a tool to help them focus on the constituents who are most likely to make a planned gift. I wanted to identify a few hundred of the best planned giving prospects whom I could prioritize by probability of donating. After a bit of preliminary research, I chose two criteria for the study population: 1) age 50 or older, and 2) inclusion in a recent wealth screening. This generated a file of 133,000 records; 582 of them were planned gift donors. I’ve worked with files larger than this and did not expect a problem. That turned out to be a mistake, however, because the planned gift donors, who exhibited the target behavior, comprised only 0.4% of the population, a proportion so small it can be considered rare. I’ll explain more about that later; first I want to describe the project as it developed.

I decided to use logistic regression with the dependent variable being either “made a planned gift” or “has not made a planned gift”. I cleaned the data and identified some strong relationships between the variables. After trying several combinations for the regression model, I had one with a Nagelkerke of .24, which is relatively good. (Nagelkerke is like a pseudo R squared; it can be loosely interpreted as the variability of the dependent variable that is accounted for by the model’s independent variables.) However, when I applied the algorithm to the study population, only 31 constituents without a planned gift and only 11 planned giving donors were identified as having a probability of giving of .5 or greater. I lowered the probability threshold of giving to .2 or greater and 105 non-planned givers and 52 planned gift donors fell into this range. This was still disappointing.
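For reference, Nagelkerke's measure rescales the Cox & Snell pseudo R-squared so that its maximum possible value is 1. If your statistics package reports the null and fitted log-likelihoods, the computation is short; the log-likelihood values below are invented purely for illustration:

```python
import numpy as np

def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo R^2 from null/fitted log-likelihoods and sample size n."""
    # Cox & Snell pseudo R^2 ...
    cox_snell = 1.0 - np.exp((2.0 / n) * (ll_null - ll_model))
    # ... divided by its maximum attainable value, so the result can reach 1.
    max_cox_snell = 1.0 - np.exp((2.0 / n) * ll_null)
    return cox_snell / max_cox_snell

# Hypothetical log-likelihoods for illustration only:
r2 = nagelkerke_r2(ll_null=-2000.0, ll_model=-1700.0, n=11642)
```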

Desperate to identify more new potential prospects, I explored more criteria to narrow the study population and built three successive models. For the purpose of the follow-up exploratory research and this article, I re-built all four models using the same independent variables to easily compare their outcomes. Here’s a summary of the four models:


Models B, C, and D are all subsets of the original data set. Each model has advantages and disadvantages to it and I was uncertain how to evaluate them against one another. For example, each additional filtering criterion resulted in losing part of the target population, meaning that I systematically eliminated constituents with characteristics that are in fact associated with making a planned gift. I scored everyone who was identified with a probability of .2 or greater in any of the models by the number of models in which they were identified. I’m not unhappy with that solution, but since then I’ve been learning about better methods for targeting rare behavior.

If the OGP was interested only in prioritizing the prospects already in their pool of potential planned giving donors, model D would serve their need. However, we wanted to identify the best potential planned giving prospects within the database. If we want to uncover untapped potential in an ever-growing database, we need to explore methods on how to target rare behavior. This seems especially important in our field where 1) donating, in general, is somewhat rare and 2) donating really generous gifts is rarer. Better methods of targeting rare behavior will also be useful for modeling for special initiatives and unique kinds of gifts.

As I’ve been learning, logistic regression suffers from small sample bias when the target behavior is rare relative to the study population. This helps explain why applying the algorithm to the original population resulted in very few new prospects, even though the model had a decent Nagelkerke of .24. Some analysts suggest using alternative sampling methods when the target behavior comprises less than 5% of the study. (See endnote.) Knowing that the planned gift donors in my original project comprised only 0.4% of the population, I decided to experiment with two new approaches.

In both of the exploratory models, I sized the study population so that planned gift donors would comprise 5 percent of it. First, I generated a study population by including all 582 of the planned gift donors and a random selection of 11,060 non-planned-gift constituents (model E). Then, I applied the algorithm from that population to the entire non-planned-gift population of 132,418. In the second approach (model F), the planned gift population was randomly split into two equal-sized groups of 291. I also randomly selected 5,530 non-planned-gift constituents. To build the regression model, I combined one of the planned gift donor groups (of 291) with 5,530 non-planned-gift constituents. I then tested the algorithm on the holdout sample (the other planned giving group of 291 with 5,530 non-planned-gift constituents). Finally, I applied the algorithm to the entire original population of 133,000. Here are the results:


Using the same independent variables as in models A through D, model E had a Nagelkerke of .39 and model F .38, which helps substantiate that the independent variables are useful predictors for planned giving. Models E and F were more effective at predicting the planned givers (129 and 123 respectively with a probability of giving greater than or equal to .5) compared to model A (11), i.e. more than ten times as many. The sampling techniques have some advantages and disadvantages. The disadvantage is that by reducing the non-planned-gift population, it loses some of its variability and complexity. However, the advantage, in both models E and F, is that 1) the target population maintains its complexity, 2) new prospects are not limited by characteristic selection (the additional criteria that I used to reduce the population in models B, C, and D), which increases the likelihood of identifying constituents who were previously not on the OGP’s radar, and 3) the effects of the sample bias seem to be reduced.
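The sampling step behind models E and F is simple: keep every positive case, then draw just enough negatives that positives make up 5 percent of the training file. A sketch in numpy, using the record counts from the post (the exact negative count differs from the post's 11,060 only by rounding):

```python
import numpy as np

rng = np.random.default_rng(42)

n_pos = 582            # planned gift donors in the file
n_neg_total = 132_418  # non-planned-gift constituents
target_rate = 0.05     # desired share of positives in the training sample

# Negatives needed so positives are 5% of the sample: 582 / 0.05 - 582
n_neg_sample = round(n_pos / target_rate) - n_pos

# Draw that many negative-class record indices at random, without replacement.
sampled_neg = rng.choice(n_neg_total, size=n_neg_sample, replace=False)
```

The model is then fit on all 582 positives plus `sampled_neg`, and the resulting algorithm is applied back to the full population for scoring.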

It’s important to note that I displayed the measures (Nagelkerke and estimated probabilities) from the exploratory models and populations purely for comparison purposes. Because the study population is manipulated in the exploratory methods, the probability of giving should not be directly interpreted as actual probabilities. However, they can be used to prioritize those with the highest probabilities and that will serve our need.

To explore another comparison between models A and F, I ranked all 133,000 records in each model. I then sorted the records in descending order by their model F rank, took the top 1,000, and ran a correlation between their model A ranks and their model F ranks. The correlation was .282, meaning the two models order these records quite differently.
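A correlation computed between two sets of ranks is Spearman's rank correlation, which (when there are no ties) is just the Pearson correlation of the ranks. A self-contained sketch:

```python
import numpy as np

def spearman_rank_corr(a, b):
    """Spearman correlation for tie-free data: Pearson correlation of the ranks."""
    # Double argsort converts values to 0-based ranks when there are no ties.
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    # Center the ranks, then compute the Pearson correlation.
    ra -= ra.mean()
    rb -= rb.mean()
    return float(ra @ rb / np.sqrt((ra @ ra) * (rb @ rb)))
```

In practice `scipy.stats.spearmanr` does the same thing (and also handles ties).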

Over the last several months, Peter Wylie, Higher Education Consultant and Contractor, and I have been exchanging ideas on this topic. I thank him for his insight, suggestions, and encouragement to share my findings with our colleagues.

It would be helpful to learn about the methods you’ve used to target rare behavior. We could feel more confident about using alternative methods if repeat efforts produced similar outcomes. Furthermore, I did not have a chance to evaluate the prospecting performance of these models, so if you have used a method for targeting rare behavior and have had an opportunity to assess its effectiveness, I am very interested in learning about that. I welcome ideas, feedback, examples from your research, and questions in regard to this work. Please feel free to contact me at


The ideas for these alternative approaches are adapted from the following articles:

Kelly Heinrich has been conducting quantitative research and analysis in higher education development for two and a half years. She has recently accepted a position as Assistant Director of Prospect Management and Analytics with Stanford University that will begin in June 2013.

15 January 2013

The cautionary tale of Mr. S. John Doe

A few years ago I met with an experienced Planned Giving professional who had done very well over the years without any help from predictive modeling, and was doing me the courtesy of hearing my ideas. I showed this person a series of charts. Each chart showed a variable and its association with the condition of being a current Planned Giving expectancy. The ultimate goal would have been to consolidate these predictors together as a score, in order to discover new expectancies in that school’s alumni database. The conventional factors of giving history and donor loyalty are important, I conceded, but other engagement-related factors are also very predictive: student activities, alumni involvement, number of degrees, event attendance, and so on.

This person listened politely and was genuinely interested. And then I went too far.

One of my charts showed that there was a strong association between being a Planned Giving expectancy and having a single initial in the First Name field. I noted that, for some unexplained reason, having a preference for a name like “S. John Doe” seemed to be associated with a higher propensity to make a bequest. I thought that was cool.

The response was a laugh. A good-natured laugh, but still — a laugh. “That sounds like astrology!”

I had mistaken polite interest for a slam-dunk, and in my enthusiasm went too far out on a limb. I may have inadvertently caused the minting of a new data-mining skeptic. (Eventually, the professional retired after completing a successful career in Planned Giving, and having managed to avoid hearing much more about predictive modeling.)

At the time, I had hastened to explain that what we were looking at were correlations — loose, non-causal relationships among various characteristics, some of them non-intuitive or, as in this case, seemingly nonsensical. I also explained that the linkage was probably due to other variables (age and sex being prime candidates). Just because it’s without explanation doesn’t mean it’s not useful. But I suppose the damage was done. You win some, you lose some.

Although some of the power (and fun) of predictive modeling rests on the sometimes non-intuitive and unexplained nature of predictor variables, I now think it’s best to frame any presentation to a general audience in terms of what they think of as “common sense”. Limiting, yes. But safer. Unless you think your listener is really picking up what you’re laying down, keep it simple, keep it intuitive, and keep it grounded.

So much for sell jobs. Let’s get back to the data … What ABOUT that “first-initial” variable? Does it really mean anything, or is it just noise? Is it astrology?

I’ve got this data set in front of me — all alumni with at least some giving in the past ten years. I see that 1.2% of all donors have a first initial at the front of their name. When I look at the subset of the records that are current Planned Giving expectancies, I see that 4.6% have a single-initial first name. In other words, Planned Giving expectancies are almost four times as likely as all other donors to have a name that starts with a single initial. The data file is fairly large — more than 17,000 records — and the difference is statistically significant.
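One way to check significance here is a two-proportion z-test. The counts below are reconstructed from the percentages in the post (roughly 500 expectancies at 4.6% versus 16,500 other donors at 1.2%), so treat them as illustrative:

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: both groups share one underlying proportion."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion under the null hypothesis.
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# ~23 of 500 expectancies vs ~198 of 16,500 other donors with a single initial.
z = two_proportion_z(23, 500, 198, 16500)
```

A z beyond about 1.96 is significant at the 5% level; with these counts z comes out well above that.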

What can explain this? When I think of a person whose first name is an initial and who tends to go by their middle name, the image that comes to mind is that of an elderly male with a higher than average income — like a retired judge, say. For each of the variables Age and Male, there is in fact a small positive association with having a one-character first name. Yet, when I account for both ‘Age’ and ‘Male’ in a regression analysis, the condition of having a leading initial is still significant and still has explanatory power for being a Planned Giving expectancy.

I can’t think of any other underlying reasons for the connection with Planned Giving. Even when I continue to add more and more independent variables to the regression, this strange predictor hangs in there, as sturdy as ever. So, it’s certainly interesting, and I usually at least look at it while building models.

On the other hand … perhaps there is some justification for the verdict of “astrology” (that is, “nonsense”). The data set I have here may be large, but the number of Planned Giving expectancies is less than 500 — and 4.6% of 500 is not very many records. Regardless of whether p ≤ 0.0001, it could still be just one of those things. I’ve also learned that complex models are not necessarily better than simple ones, particularly when trying to predict something hard like Planned Giving propensity. A quirky variable that suggests no potential causal pathway makes me wary of overfitting the noise in my data and missing the signal.

Maybe it’s useful, maybe it’s not. Either way, whether I call it “cool” or not will depend on who I’m talking to.

28 February 2011

Look beyond loyal donors to find Planned Giving prospects

Filed under: Alumni, Planned Giving, predictive modeling, Predictor variables — kevinmacdonell @ 9:17 am

According to conventional wisdom, the best Planned Giving prospects are donors who have consistently given small Annual Fund gifts over a long period of time. Rather than assume this is true always and everywhere, I think we should put the “loyal donor” rule of thumb to the test in the environment of our own data.

Here’s what I did recently. I picked a group of current Planned Giving expectancies, and pulled their giving totals for the 20 fiscal years prior to their identification. To select the group, I chose everyone identified as an expectancy in the year 2003 or later, so the years of giving that I pulled were 1983 to 2002. I also limited the group to people who are now at least 50 years old. This ensured that everyone in the group was probably old enough to have participated in the Annual Fund during any of those years if they chose.

I didn’t look at how much they gave in any given year, only whether they gave. Expectancies who gave in 20 out of 20 years received a “score” of 20. Someone who had given in 10 years out of 20 got a score of 10, and so on. Non-donors were scored as zero.
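Computing this score amounts to counting distinct fiscal years with at least one gift per person. A sketch with a toy gift table (hypothetical IDs and years); non-donors get their zero via a reindex over the full constituent list:

```python
import pandas as pd

# Toy gift table: one row per gift. Real data would span fiscal years 1983-2002.
gifts = pd.DataFrame({
    "donor_id":    [1, 1, 1, 2, 2, 3],
    "fiscal_year": [1983, 1984, 1984, 1990, 1991, 2000],
})

# Score = number of distinct years with any giving (amounts don't matter).
years_score = gifts.groupby("donor_id")["fiscal_year"].nunique()

# Constituents with no gifts at all are scored zero.
all_ids = [1, 2, 3, 4]  # donor 4 never gave
years_score = years_score.reindex(all_ids, fill_value=0)
```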

Then I made a bar chart of their scores. The height of the bars corresponds to the percentage of the group that falls into each number of years of giving in that 20-year span.

What does this chart tell us? It’s clear these expectancies are indeed very loyal donors. A little under half of them have some giving in at least 10 of the 20 years. That’s wonderful.

I am struck that 15% of them have no giving at all. On the other hand, the proportion of alumni over 50 who are NOT Planned Giving expectancies and have no giving is 61%, so the expectancies compare well against them.

Here’s the same chart, but with all alumni over 50 who are not current expectancies:

Big difference! The scale is totally different, due to the disproportionate number of non-donors in this group. As a percentage of alumni, very loyal donors are scarce. Let’s look at it another way, excluding non-donors from both groups: In the chart below, the expectancy donors have giving in twice as many years as non-expectancy donors, on average:

No wonder, then, we’ve been told to focus on loyal Annual Fund donors in order to identify new prospects for Planned Giving. The connection is undeniable.

A couple of things interfere with the clarity of this picture, however. Have another look at the first chart above. Although all of these people are old enough to have contributed in every year since 1983, a significant percentage of them have given in only a handful of years. For example, 6 percent of current expectancies have giving in only ONE of the 20 years. They share that distinction with 10,000 alumni who are not expectancies.

In other words, if years of giving were your only metric for proactively identifying prospects, and no expectancies came in “over the transom,” so to speak, that 6 percent of the group would never be discovered. There are simply too many individuals at the lower end of the years-of-giving scale to focus on them in any practical way. Donor loyalty is therefore a great predictor of Planned Giving potential, but it does not define the profile of a Planned Giving donor.

If donor data does not contain all the answers, where can you look? I have a few ideas.

Using the same group of current expectancies (age 50 or older, and identified in 2003 or later), I pulled some other characteristics from the database to test as predictors. I was careful to select data that existed before 2003, i.e. that pre-dated the identification of the individuals as expectancies.

Here’s a great one: Student activities. Participation in varsity sports, campus clubs and student government is coded in the database, and the chart below compares the proportions of the two groups who have at least one such activity code in their records.

Interesting, eh? Now, maybe ten or 15 years ago there was a big push on to solicit former athletes for Planned Giving, and that’s why they’re well represented in the current crop of expectancies — but I doubt that very much. The evidence indicates that student experience is a big factor even for decisions taken many years later. This is a great example of how even the oldest data is valuable in the present day.

Here’s another one: Alumni who hold more than one degree. The proportions on both sides are high, because I counted degrees from ANY university (we have that information in our database), and we have many graduate and professional degree holders. The chart would seem to indicate that expectancies are more likely to hold multiple degrees than non-expectancies. A little more digging would tell us whether a particular profession (doctors or lawyers, for example) are heavily represented among the expectancies group.

Here’s another one, for the presence of a Faculty or Staff code, which indicates whether someone is or at one time was employed by the university. This code is not uniformly applied (it does not directly correspond to actual employment or even HR data), so it’s not perfect, but as a rough indicator it works fine for data mining.

Next up is one of my very favourite predictors for Planned Giving potential: event attendance. I’ve seen this elsewhere, and it holds true here as well. Showing up at any kind of reunion or alumni-related event is highly predictive. I got a little lazy when I calculated this variable because I did not exclude events attended in 2003 or later; I would expect the percentages to change a bit, but probably not by much. I DID exclude attendance at any kind of donor-recognition event — if only donors are invited, attendance is merely a proxy for donor status.

I could do this for a dozen more variables, but you get the point. There are all sorts of additional indicators of Planned Giving potential sitting in your database. As well, my predictors are not necessarily your predictors. It’s up to you to do a little digging and find them.

From here, we could have niggling arguments about whether some of these predictors are really better than ‘donor loyalty’, or are even statistically significant, and so on. But if you are currently trying to identify prospects solely by identifying loyal donors, allow me to suggest this improvement in your methods: Devise a simple scoring system that gives one point for ‘donor loyalty’ (however you wish to define that — I’ve defined it as giving in at least 10 years out of 20), and one point for each of the other predictors that strike you as particularly powerful. Using the predictors I’ve presented here, my score would be calculated like so:

Loyal donor (0/1) + Student activity (0/1) + Multiple degrees (0/1) + Faculty or Staff (0/1) + Event attendance (0/1) = Maximum PG score of 5.
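The score is a plain sum of 0/1 flags. A sketch in pandas with hypothetical column names; each flag would be derived from your own data as described above:

```python
import pandas as pd

flags = ["loyal_donor", "student_activity", "multiple_degrees",
         "faculty_staff", "event_attendance"]

# Hypothetical alumni file with one 0/1 column per predictor.
alumni = pd.DataFrame(
    [[1, 1, 1, 1, 1],
     [1, 0, 1, 0, 1],
     [0, 0, 0, 0, 0]],
    columns=flags,
)

# Maximum PG score of 5: one point per flag.
alumni["pg_score"] = alumni[flags].sum(axis=1)

# The review list: perfect scores only.
top_prospects = alumni[alumni["pg_score"] == 5]
```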

What happens when I apply this model to our database? Out of more than 30,000 living and addressable alumni over the age of 50 who are not already expectancies, only 89 have a perfect score of 5 out of 5. That’s a very manageable, high-quality list of individuals to provide for review by a Planned Giving Officer.

This model is far from the last word in data mining for Planned Giving, and it has some severe limitations. For example, focusing on these 89 individuals might essentially result in a campaign based on retired professors in the Faculty of Medicine! Your expectancies are not going to be one homogeneous group, so you’ll want to identify other clusters for solicitation. As well, almost 700 individuals in our database would have a score of 4 out of 5, so things get out of hand quickly when you have too few score levels.

Otherwise, it’s pretty nifty. This score is easy to understand, not terribly difficult to calculate, and is a useful departure from any single-minded focus on donor loyalty.

10 December 2010

In-memory gifts and Planned Giving potential

Filed under: Planned Giving, Predictor variables — kevinmacdonell @ 2:16 pm

Back in September I read a blog post by Jonathan Grapsas about the possible connection between gifts made in memory and Planned Giving expectancies. (The link between in memory and legacies?) He writes, “People are making a gift in memory of someone they care about. They are in that head space.”

I was delighted to discover that memorial gifts are identified with a code in our database, so in I went in search of a connection in our own data. I found that of any alum who has ever made a gift, only 2.1% have ever made a memorial gift. But of all current Planned Giving expectancies who are donors, 9.8% have done so.

Now, I haven’t dug deep: These could be gifts tied to the donor’s own planned gift and which came after the commitment was made. But what if this turns out to be a real difference in giving behaviour and a predictor for Planned Giving?

Up to this point, I haven’t thought of in-memory gifts as an indicator of affinity. The donor is motivated by the desire to honour a friend or loved one, not any identification with your mission or nostalgia for alma mater. In other words, it seems probable that the gift given in memory is typically not up for renewal. To discover that the behaviour may be associated with an especially elusive class of donor is exciting.

There is no need to simply accept the conventional wisdom that the best Planned Giving prospects are the ones who have consistently given small amounts to the annual fund over a long period of time. This behaviour is certainly a predictor for bequests, but it does not typify them.

5 August 2010

Perception vs. reality on the Number 80 bus

Filed under: Planned Giving, Statistics — kevinmacdonell @ 11:26 am


Do you ride the bus back and forth to work? I do. Some days it’s a quick trip, and other days it just goes on forever. There’s this one stop where the driver will park the bus and just sit there, as the minutes tick by. How dare she. Doesn’t she know I’m in a hurry?

I have some flexibility in office hours, at least during the summer, so I set out to pick the best times to travel. I wanted to know: Which buses on the Number 80 route were most catchable (i.e., had a very predictable time of arrival at the stop closest to my house), were fastest and most reliable (i.e., exhibited the least variability in travel times) and were least full (so I wouldn’t have to stand the whole way).

I was sure that there was some optimal combination of these three, but I couldn’t figure it out just by riding the bus. There didn’t seem to be any discernible pattern to my experience. I did not believe it was random, so there was only one conclusion: It’s a data problem.

So I’ve been collecting data on my bus rides, and I’ve just had a look at it. What I found out had less to do with the bus route than with the nature of perceived reality. What you think is going on isn’t necessarily what’s actually happening. (And yes, I’ll bring this back to fundraising.)

I record the time I sit down, and the time I land on the sidewalk at my destination. I note the day of the week (maybe Mondays are quicker rides than Fridays) and the month (maybe buses are less full during the summer months when people are on vacation). I also note how full the bus is (on a scale of 1 to 5), and whether I have to stand (0/1). And finally, I make note of outliers due to “disruptive events” (unusually long construction delays, mechanical failure, etc.)

No one but a geek would do this. But it takes only a few seconds — and if you’re interested in statistics, collecting your own data can be instructive in itself.

I haven’t collected enough data points on the Number 80 bus to reveal all its secrets, but I learned enough to know that I have no sense of elapsed time. Leaving out one extreme outlier, my average trip duration (in either direction) is 38 minutes. So how much do individual trips vary from 38 minutes? Well, 79% of all trips vary from the average by three minutes or less. Three whole minutes! Allow just one more minute of variance, and 90% of trips fit in that window.

All other patterns related to duration are pretty subtle: Late-morning rush hour buses, and the 4:45 p.m. bus tend to have the largest variance from the mean, the first because it’s a quicker trip, the second because it’s longer. The trip home is longer than the morning commute by only about one minute, on average. Tuesdays tend to bring slightly longer trips than any other day of the week — Tuesdays also have the highest average “fullness factor”.

But really, I can hop on any Number 80 bus and expect to get to my destination in 38 minutes, give or take a couple of minutes. That’s a far cry from how I perceive my commuting time: Some quick rides, some unbearably long ones. In fact, they’re all about the same. The bus driver is not trying to drive me crazy by parking the bus in mid-trip; she’s ahead of schedule and needs to readjust so commuters farther down the line don’t miss their bus.

If we can get simple things wrong, think of all the other assumptions we make about complex stuff, assumptions that could be either confirmed or discarded via a little intelligent measuring and analysis. According to what people widely believe about Planned Giving, you can go into your database right now and skim off the top alumni by years of giving and frequency of giving, and call them your top Planned Giving prospects. Your consistent donors are your best prospects, right?

Not necessarily. In fact, in one school’s data, I determined that if all their current, known Planned Giving expectancies were hidden in the database like needles in a haystack, and one were only allowed to use these patterns of past giving to find them again, they would miss two-thirds of them!

We are not wrong to have beliefs about how stuff works, but we are wrong in clinging to beliefs when the answers are waiting there in the data. The point is not that past giving is or isn’t a determinant of Planned Giving potential for your institution — the point is that you can find that out.

