CoolData blog

22 January 2010

Four mistakes I have made

Filed under: Pitfalls, skeptics — kevinmacdonell @ 1:28 pm

Is your predictive model an Edsel? Build it right, then sell it right! (Photo by dok1 on Flickr, used via Creative Commons licence.)

There are technical errors, and then there are conceptual errors. I can’t identify all the technical issues you may encounter while data mining, but it’s useful to identify a few conceptual errors: mistakes that may prove damaging to your efforts to win acceptance for your models and have them applied constructively in your organization. In this blog I always write about my own experience, so the examples of stupidity you’ll read about in today’s post are all mine.

Mistake No. 1: Using score sets to predict things they weren’t designed for.

When I began creating predictive scores, I frequently referred to them as “affinity” scores. That’s how I described them to colleagues, both to make the idea accessible and because I really believed that a high score indicated a high level of affinity with our institution. Then one day I tried to use the scores to predict which class years would be most likely to attend their Homecoming milestone reunion, and thereby predict whether attendance for the upcoming reunion year would go up or down. Based on the central tendency of the scores of each class, I predicted a drop in attendance. I circulated a paper explaining my prediction and felt rather brilliant. Fortunately, I was proven wrong. That year we set a new attendance record.

My dependent variable in these early models was Lifetime Giving; therefore, the model predicted propensity to give – nothing more, nothing less. If you want to predict event attendance, build an event-attendance model. If you want to gauge alumni affinity, build a survey, or participate in an alumni engagement benchmarking study. (In Canada, check out Engagement Analysis Inc.) Be cautious, too, about making bold predictions; why give skeptics more ammunition? If you want to feel brilliant, keep it to yourself!

Lesson: Don’t be too clever.

Mistake No. 2: Using a general model to predict a specific behaviour.

This is closely related to the first mistake. By ‘general model’ I mean one in which the dependent variable is simply Lifetime Giving. I call these models ‘general’ because they make no distinction among the various types of giving (annual, major, planned) nor among preferred channel (by mail, by phone, and for some, online). Building a general model is itself not a mistake: It will work quite well for segmenting your alumni for the Annual Fund, for example, and if this is your first model it might be best not to get too exotic with how you define your dependent variable (thereby introducing new forms of error).
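
To make the idea concrete, here is a minimal sketch of what a "general" model looks like in code. The field names and data are invented for illustration (the post doesn't specify a data layout), and I'm assuming a simple logistic regression; the point is only that the dependent variable collapses all giving into one number:

```python
# Minimal sketch of a "general" propensity model: the dependent
# variable is derived from total lifetime giving, with no split by
# gift type or channel. All field names and values are hypothetical.
from sklearn.linear_model import LogisticRegression

alumni = [
    # (years_since_grad, event_attended, email_on_file, lifetime_giving)
    (25, 1, 1, 5000.0),
    (10, 0, 1, 50.0),
    (3,  0, 0, 0.0),
    (30, 1, 0, 1200.0),
    (7,  1, 1, 0.0),
    (15, 0, 0, 0.0),
]

X = [row[:3] for row in alumni]
y = [1 if row[3] > 0 else 0 for row in alumni]  # donor vs. non-donor

model = LogisticRegression().fit(X, y)
scores = model.predict_proba(X)[:, 1]  # propensity-to-give scores
```

Whatever those scores capture, it is propensity to give in the aggregate, which is exactly why they shouldn't be read as event-attendance or affinity scores.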

Just be prepared to make refinements. Two years running, our calling program used a score set from a general model, which actually worked fairly well, except for one thing: A lot of top-scoring alumni were hanging up on our student callers. This phenomenon was very noticeable, and it was enough for some to say that the model was worthless. An analysis of hang-ups confirmed that the problem existed (yes, we track hang-ups in the database). But the analysis also showed that a lot of these hanger-uppers were good donors. The top scorers were very likely to give, but a lot of them didn’t care to receive a call from a student. (And for some reason had not already requested to be solicited by mail only.)

The fix was a new predictive model aimed specifically at the calling program, with a dependent variable composed solely of dollars received via telephone appeals. Fewer hang-ups, happier callers, happier donors.
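
The change amounts to rebuilding the dependent variable from phone-channel gifts only. A hypothetical sketch (invented gift records and channel codes, since the post doesn't show its database schema):

```python
# Sketch of the fix: derive the dependent variable from telephone
# gifts only, instead of all lifetime giving. Records are invented.
gifts = [
    {"id": 1, "channel": "phone", "amount": 100.0},
    {"id": 1, "channel": "mail",  "amount": 250.0},
    {"id": 2, "channel": "mail",  "amount": 75.0},
    {"id": 3, "channel": "phone", "amount": 20.0},
]

# Total giving received via telephone appeals, per person.
phone_giving = {}
for g in gifts:
    if g["channel"] == "phone":
        phone_giving[g["id"]] = phone_giving.get(g["id"], 0.0) + g["amount"]

# New DV for the calling-program model: 1 if the person has ever
# given by phone, 0 otherwise. Person 2 (mail-only) becomes a 0.
dv = {i: 1 if phone_giving.get(i, 0.0) > 0 else 0
      for i in {g["id"] for g in gifts}}
```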

Lesson: Know what you’re predicting.

Mistake No. 3: Assuming that people will ‘get it’.

If you were able to show your fundraising colleagues that high-scoring segments of the alumni population give a lot more than the others, and that low-scoring segments give little or nothing, you’d think your work was done. Alas, no. Don’t assume that you’ll simply be able to hand off the data, because if data mining is not yet part of your institution’s culture, it’s more than likely your findings will be under-used. You’ve got to sell it.

Ensure that your end-users know what to do with their scores. Be prepared to make suggestions for applications. (Is the goal cost-cutting through reducing the solicitation effort, or is it growth in number of donors, or is it pushing existing donors to higher levels of giving?) In fact, before you even begin you should have some sense of what would really be in demand at your institution, and then try to satisfy that demand. The Annual Fund is a good place to start, but you might find that there’s a more pressing need for prospect identification in Planned Giving.

At the other end, you’ll need to understand how your colleagues implemented their scores in order to do any follow-up analysis of the effectiveness of your model. For example, if you plan to analyze the results of the Annual Fund telephone campaign, you’ll need to know exactly who was called and who wasn’t, before you can compare scores against giving.
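
That restriction can be sketched in a few lines. This is an invented illustration (hypothetical field names and numbers): the key move is filtering to the records that were actually called before comparing giving across score bands.

```python
# Follow-up analysis sketch: restrict to people who were actually
# called, then compare average gift by score decile. Data invented.
records = [
    {"score_decile": 10, "called": True,  "gift": 100.0},
    {"score_decile": 10, "called": True,  "gift": 0.0},
    {"score_decile": 5,  "called": True,  "gift": 25.0},
    {"score_decile": 5,  "called": False, "gift": 0.0},  # excluded: never called
    {"score_decile": 1,  "called": True,  "gift": 0.0},
]

called = [r for r in records if r["called"]]

by_decile = {}
for r in called:
    by_decile.setdefault(r["score_decile"], []).append(r["gift"])

avg_gift = {d: sum(g) / len(g) for d, g in by_decile.items()}
```

Without the `called` filter, the uncalled record would drag down the decile-5 average and the model would look worse than it performed.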

Lesson: Communicate.

Mistake No. 4: Showing people the mental sausage.

A few years ago I used to follow a great website called 43folders.com, created by Merlin Mann. His boss and friend said to him one day, “Y’know, Merlin, we’re really satisfied with the actual work you do, but is there any way you could do it without showing so much … I don’t know … mental sausage?”

Data mining and predictive modeling and cool data stuff are all exercises in discovery. When we discover something new, our natural urge is to share. In the past, I tended to share the wrong way: I would carefully reveal my discovery as if the process were unfolding in real time. These expositions (usually in the form of a Word document emailed around) would usually be rather long. The central message would often be buried in detail which someone not inhabiting my head would regard as extraneous.

Don’t expect people to follow your plot: They’re too busy. They need the back of a cereal box, and you’re sending them Proust. You need to make your point, back it up with the minimum amount of verbiage acceptable, incorporate visuals judiciously, and get the hell out.

Learn to use the charting options available in Excel or some other software to get your point across as effectively as possible. Offer to explain it face-to-face. Offer to present on it.

Lesson: Learn how to sell.

4 Comments »

  1. Thanks – was just in a meeting 2 days ago and a group suggested using our major gift donor predictive score as a possible way to address a question pertaining to annual fund donors. I felt awful trying to back-pedal out of that situation. We don’t have a model that would address the question at hand… like you indicated, I at least know that the outcomes of the models we do have are only pertinent in that singular context.

    Comment by Diane — 22 January 2010 @ 2:35 pm

    • Yikes. I understand that at some institutions, the problem is that people think a predictive model is magical and can do anything. I’m not in that situation, but the two extreme scenarios (hard-nosed skepticism and unquestioning worship) are probably equally frustrating. Communication, education, constant reminding – those seem to be necessary to address misconceptions.

      Comment by kevinmacdonell — 22 January 2010 @ 4:02 pm

  2. […] a couple of posts about mistakes I’ve made in data mining and predictive modelling. (See Four mistakes I have made and When your predictive model sucks.) Today I’m pleased to point out a brand new […]

    Pingback by More mistakes I’ve made « CoolData blog — 26 January 2012 @ 1:38 pm

  3. One thing about working in smaller institutions (for me) is that there are not many other people with a similar skillset whom you can discuss things with and learn from…

    Does that happen in your organisation?

    Comment by cnukaus — 3 March 2012 @ 8:35 pm

