CoolData blog

18 March 2010

My Planned Giving model growing pains

Filed under: Model building, Planned Giving, regression — kevinmacdonell @ 8:22 am

People stumbling on CoolData might assume that I think I’ve gathered unto myself some great corpus of data mining knowledge and that now I presume to dispense it via this blog, nugget by nugget.

Uh, well – not quite.

The reality is that I spend a lot of my time at work and at home surrounded by my books, struggling to get my arms around the concepts, and doing a good deal of head-scratching. Progress is slow, as only about ten percent of my work hours are actually spent on data mining. Questions from CoolData readers are cause for anxiety more than anything else. (Questions are welcome, of course, but sometimes advice would be better.)

As a consequence, I proceed with caution when it comes to building models for my institution. I don’t have a great deal of time for testing and tweaking, and I steer clear of creating predictive score sets that cannot be deployed with a high level of confidence.

This caution has not prevented me from having some doubts about the model I created last year for our Planned Giving program, however.

This model sorted all of our alumni over a certain age into percentile ranks according to their propensity to engage with our institution in a planned giving agreement. Our Planned Giving Officer is currently focused on the individuals in the 97th percentile and up. Naturally, whenever a new commitment (verbal or written) comes across the transom (unsolicited, as I think PG gifts often are), the first thing I do is check the individual’s percentile score.
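The percentile ranking itself is straightforward to reproduce. Here is a minimal sketch (pandas assumed; the IDs, scores, and column names are all hypothetical) of binning raw model scores into percentile ranks and pulling out the 97-and-up segment:

```python
import pandas as pd

# Hypothetical raw propensity scores from a fitted model.
alumni = pd.DataFrame({
    "id": [101, 102, 103, 104, 105],
    "raw_score": [0.12, 0.87, 0.45, 0.93, 0.30],
})

# rank(pct=True) returns each record's percentile position on (0, 1];
# scaling by 100 puts the ranks on the familiar 1-100 scale.
alumni["percentile"] = (alumni["raw_score"].rank(pct=True) * 100).astype(int)

# The Planned Giving Officer's current focus: 97th percentile and up.
focus = alumni[alumni["percentile"] >= 97]
```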

A majority of the new expectancies are in the 90s, which is good, and most of those are 97 and up, which is better. When I look at the Annual Giving model scores for these same individuals, however, I see that the AG scores do a better job of predicting the Planned Giving donors than the PG scores do. That strikes me as a bit odd.

Planned Giving being a slowly-evolving process, there aren’t enough examples of new commitments to properly evaluate the model, to my satisfaction at least. But when model-building time comes around again in July and August, I’ll be making some changes.

The central issue I faced was that current commitments numbered only a little over 100. That’s not a lot of historical data to model on. I asked around for advice. One key piece of advice was to cut down on the size of the prospect pool by excluding all alumni younger than our youngest current commitment. Done.
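That exclusion amounts to a one-line filter. A sketch, assuming a pandas table with hypothetical age and commitment fields:

```python
import pandas as pd

alumni = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "age": [28, 45, 62, 71],
    "has_commitment": [False, False, True, True],
})

# Age of the youngest person with a current commitment...
youngest = alumni.loc[alumni["has_commitment"], "age"].min()

# ...and the trimmed prospect pool: no one younger than that.
prospect_pool = alumni[alumni["age"] >= youngest]
```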

My primary interest, though, was to somehow legitimately boost the number of examples of PG donors, in order to beef up the dependent variable in a regression analysis.

Some institutions, I learned, tried to do this by digging into data on deceased planned giving donors, going back five or ten years. (I hope I do not strain decorum with the verb I’ve selected.) Normally we model only on living individuals, but having access to more examples of this type of donor has proven helpful for some. Unfortunately, on investigation I found that the technical issues involved made it prohibitively time-consuming: For various reasons, I would have had to perform many separate queries of the database in order to get at this data and merge it with that of the living population.

As luck would have it, though, around this time we received all the data from a huge, wide-ranging survey of alumni engagement we had conducted that March. One of the scale statements was specifically focused on attitudes towards leaving a bequest to our institution. The survey was non-anonymous, and a lot of positive responders to this statement were in our target age range. Bingo – I had a whole new group of “PG-oriented” individuals to add to my dependent variable. The PG model would be trained not only on current commitments, but on alumni who claimed to be receptive to the idea of planned giving.

In addition, I had the identities of a number of alumni who had attended information sessions on estate planning organized by our Planned Giving Officer.

I think all was well up to that point. What I did after that may have led to trouble.

I thought to myself, these PG-oriented people are not all of the same “value”. Surely a written gift commitment is “worth more” than a mere online survey response clicked on in haste. So I structured my dependent variable to look like this, using completely subjective ideas of what “value” ought to be assigned to each type of person:

  • Answered “agree” to the PG statement in survey: 1 point
  • Answered “strongly agree” to the PG statement in survey: 2 points
  • Attended an estate planning session: 3 points
  • Has made a verbal PG commitment: 6 points
  • Has a written commitment in place: 8 points

Everyone else in the database was assigned a zero. And then I used multiple regression to create the model.
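For illustration, here is roughly what that weighted dependent variable looks like in code. This is a sketch only: the predictor columns are hypothetical, and the ordinary least squares fit is done with NumPy rather than whatever stats package one would actually use.

```python
import numpy as np
import pandas as pd

def pg_value(row):
    """Subjective point values from the list above; the strongest earmark wins."""
    if row["written_commitment"]:
        return 8
    if row["verbal_commitment"]:
        return 6
    if row["attended_estate_session"]:
        return 3
    if row["survey_pg_response"] == "strongly agree":
        return 2
    if row["survey_pg_response"] == "agree":
        return 1
    return 0

# Hypothetical predictors; a real model would use many more variables.
alumni = pd.DataFrame({
    "years_of_giving":         [0, 2, 5, 12, 20, 25],
    "event_attendance":        [0, 1, 0, 3, 5, 4],
    "written_commitment":      [False, False, False, False, False, True],
    "verbal_commitment":       [False, False, False, False, True, False],
    "attended_estate_session": [False, False, True, True, False, False],
    "survey_pg_response":      [None, "agree", None, None, None, None],
})
alumni["dv"] = alumni.apply(pg_value, axis=1)

# Multiple (OLS) regression of the weighted DV on the predictors.
X = np.column_stack([
    np.ones(len(alumni)),
    alumni["years_of_giving"],
    alumni["event_attendance"],
])
coefs, *_ = np.linalg.lstsq(X, alumni["dv"].to_numpy(dtype=float), rcond=None)
```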

This summer, I think I will tone down the cleverness with my DV.

First of all, everyone with a pro-PG orientation (if I can put it that way) will be coded “1”. Everyone else will be coded “0”, and I will try using logistic regression instead of multiple regression, as more appropriate for a binary DV.
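A sketch of the binary version, again with hypothetical data, and with a hand-rolled gradient fit standing in for a proper stats package's logistic regression:

```python
import numpy as np

# Hypothetical predictors, and the binary DV:
# 1 = any PG earmark (commitment, session, or survey response), 0 = everyone else.
years  = np.array([0.0, 2.0, 5.0, 12.0, 20.0, 25.0])
events = np.array([0.0, 1.0, 0.0, 3.0, 5.0, 4.0])
y      = np.array([0, 0, 0, 1, 1, 1])

# Standardize the predictors and add an intercept column.
feats = np.column_stack([years, events])
feats = (feats - feats.mean(axis=0)) / feats.std(axis=0)
X = np.column_stack([np.ones(len(y)), feats])

# Plain gradient ascent on the log-likelihood of a logistic model.
w = np.zeros(X.shape[1])
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w += 0.1 * X.T @ (y - p) / len(y)

# The fitted probabilities are what get binned into percentile scores.
probs = 1.0 / (1.0 + np.exp(-X @ w))
```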

Going back to the original model, it occurs to me that my method was based on a general misconception of what I was up to. In creating these “levels of desirability,” I ignored the role of the Planned Giving Officer. My job, as I see it now, is to deliver up the segment of alumni that has the highest probability of receptivity to planned giving. It’s the PGO’s task to engage with the merely interested and elevate them to verbal, then written, agreements. In that sense, the survey-responder and the final written commitment could very well be equivalent in “value”.

The point is, it’s not in my power to make that evaluation. Therefore, this year, everyone with the earmarks of planned giving about them will get the same value: 1. I hope that results in a more statistically defensible method.

(I should add here that although I recognize my model could be improved, I remain convinced that even a flawed predictive model is superior to any assumption-based segmentation strategy. I’ve flogged that dead horse elsewhere.)

In the meantime, your advice is always appreciated.




  1. Hi Kevin,

    Your scoring system is really interesting. I think we might try something similar.

    For our PG model dependent variable, we selected all of the estates that left $25,000 or more and that gave to us during their lifetime (because we are only going to be scoring people currently on the file). This left us with about 880 people after we excluded those who joined our PG society while alive. We did this because we wanted to test how these donors scored. Happily, they scored highly! Let me know if I’m being unclear at all.

    When thinking about adding slightly different populations to our dependent variable to beef it up, the worry is that we are muddying it up at the same time. We sometimes look into how different the populations are from each other in terms of key variables before pushing them together to form one dependent variable.


    Comment by Michelle Paladino — 19 March 2010 @ 9:01 am

    • Hi Michelle,

      Interesting: We seem to be starting from different positions on this, then trading places! I wish I had enough data to do as you did: Hold out the ones who joined the PG society while alive, to test the model. The really tricky thing with Planned Giving is that the right people have to score VERY highly in order to get approached at all, unless there are plenty of development officers working the planned-giving file. This clearly highlights the need for PGOs and data miners to work together so that everyone understands what’s being delivered by the model: A segment of the population that is most likely to contain PG-positive people – NOT a list of sure winners. As for picking the right statistical tool, I’m beginning to learn that it’s necessary to take more than one approach, and then see which works best for one’s data.

      Comment by kevinmacdonell — 21 March 2010 @ 11:51 am

  2. Kevin,
    Your work is always interesting and much appreciated, although I don’t always follow you into the woods very far.

    You make an excellent point about the work of the planned giving officer or development officer. Your predictive modeling is the first step and development officers everywhere know they must still qualify prospects and that cultivation can take a long time before a gift emerges. But focusing resources, especially these days, is critical, and tracking all touches and moves becomes more and more important over time.

    Thank you,


    Comment by Virginia — 19 March 2010 @ 8:55 pm

    • Hi Virginia,

      I agree. Everyone involved in this process needs to understand where data mining “fits”. It’s a preliminary, first-cut process, not the final word. Focusing resources, as you say, is the name of the game. This is why I think that although data mining is new to a lot of fundraisers, it is not a disruptive technology. I see it fitting in quite comfortably with traditional fundraising, with its focus on building relationships and “people asking people”. A predictive model doesn’t displace anything, except perhaps certain ideas about segmenting the prospect pool based on rules of thumb that may be suspect. No one on the front lines of fundraising should feel particularly threatened by data mining.

      Comment by kevinmacdonell — 21 March 2010 @ 11:58 am

  3. I do believe people are becoming more comfortable thanks to leaders like you and Peter Wylie. It’s been talked about a lot in the last few years, especially at Annual Giving meetings. Data mining is what alerted folks to the fact that loyalty is a better predictor than size of gift.

    Now, it’s time to move this concept into the other areas. I’ll be staying tuned in!

    All the best, Virginia

    Comment by Virginia Ikkanda-Suddith — 21 March 2010 @ 7:53 pm
