There are technical errors, and then there are conceptual errors. I can’t identify all the technical issues you may encounter while data mining, but it’s useful to identify a few conceptual errors: mistakes that may prove damaging to your efforts to win acceptance for your models and have them applied constructively in your organization. In this blog I always write about my own experience, so the examples of stupidity you’ll read about in today’s post are all mine.
Mistake No. 1: Using score sets to predict things they weren’t designed for.
When I began creating predictive scores, I frequently referred to them as “affinity” scores. That’s how I described them to colleagues, both to make the idea accessible and because I really believed that a high score indicated a high level of affinity with our institution. Then one day I tried to use the scores to predict which class years would be most likely to attend their Homecoming milestone reunion, and from that to forecast whether attendance for the upcoming reunion year would go up or down. Based on the central tendency of the scores of each class, I predicted a drop in attendance. I circulated a paper explaining my prediction and felt rather brilliant. Fortunately, I was proven wrong. That year we set a new attendance record.
My dependent variable in these early models was Lifetime Giving; therefore, the model predicted propensity to give – nothing more, nothing less. If you want to predict event attendance, build an event-attendance model. If you want to gauge alumni affinity, build a survey, or participate in an alumni engagement benchmarking study. (In Canada, check out Engagement Analysis Inc.) Be cautious, too, about making bold predictions; why give skeptics more ammunition? If you want to feel brilliant, keep it to yourself!
Lesson: Don’t be too clever.
Mistake No. 2: Using a general model to predict a specific behaviour.
This is closely related to the first mistake. By ‘general model’ I mean one in which the dependent variable is simply Lifetime Giving. I call these models ‘general’ because they make no distinction among the various types of giving (annual, major, planned) or among preferred channels (by mail, by phone, and for some, online). Building a general model is not itself a mistake: It will work quite well for segmenting your alumni for the Annual Fund, for example, and if this is your first model it might be best not to get too exotic with how you define your dependent variable (and thereby introduce new forms of error).
Just be prepared to make refinements. Two years running, our calling program used a score set from a general model, which actually worked fairly well, except for one thing: A lot of top-scoring alumni were hanging up on our student callers. This phenomenon was very noticeable, and it was enough for some to say that the model was worthless. An analysis of hang-ups confirmed that the problem existed (yes, we track hang-ups in the database). But the analysis also showed that a lot of these hanger-uppers were good donors. The top scorers were very likely to give, but a lot of them didn’t care to receive a call from a student. (And for some reason had not already requested to be solicited by mail only.)
The fix was a new predictive model aimed specifically at the calling program, with a dependent variable composed solely of dollars received via telephone appeals. Fewer hang-ups, happier callers, happier donors.
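Most of that refinement lives in how you define the dependent variable, not in the modeling itself. Here is a minimal sketch in Python with pandas, using entirely hypothetical column names (`alum_id`, `channel`, `amount`), of deriving a phone-only giving target from a gift-transaction table before fitting whatever model you prefer:

```python
import pandas as pd

# Hypothetical gift-transaction table; column names are illustrative only.
gifts = pd.DataFrame({
    "alum_id": [1, 1, 2, 3, 3, 4],
    "channel": ["phone", "mail", "mail", "phone", "phone", "online"],
    "amount":  [50, 100, 25, 75, 40, 60],
})

# The general model's target: lifetime giving across all channels.
lifetime = gifts.groupby("alum_id")["amount"].sum()

# The channel-specific target: dollars received via telephone appeals only.
# Alumni with no phone giving get an explicit zero, not a missing row.
phone_giving = (
    gifts[gifts["channel"] == "phone"]
    .groupby("alum_id")["amount"].sum()
    .reindex(lifetime.index, fill_value=0)
)

print(lifetime.to_dict())      # {1: 150, 2: 25, 3: 115, 4: 60}
print(phone_giving.to_dict())  # {1: 50, 2: 0, 3: 115, 4: 0}
```

Note how alumnus 4 looks like a decent prospect on lifetime giving but has given nothing by phone; the general model would score that person for the calling program, the channel-specific one would not.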
Lesson: Know what you’re predicting.
Mistake No. 3: Assuming that people will ‘get it’.
If you were able to show your fundraising colleagues that high-scoring segments of the alumni population give a lot more than the others, and that low-scoring segments give little or nothing, you’d think your work was done. Alas, no. Don’t assume that you’ll simply be able to hand off the data, because if data mining is not yet part of your institution’s culture, it’s more than likely your findings will be under-used. You’ve got to sell it.
Ensure that your end-users know what to do with their scores. Be prepared to make suggestions for applications. (Is the goal cost-cutting through reducing the solicitation effort, or is it growth in number of donors, or is it pushing existing donors to higher levels of giving?) In fact, before you even begin you should have some sense of what would really be in demand at your institution, and then try to satisfy that demand. The Annual Fund is a good place to start, but you might find that there’s a more pressing need for prospect identification in Planned Giving.
At the other end, you’ll need to understand how your colleagues implemented their scores in order to do any follow-up analysis of the effectiveness of your model. For example, if you plan to analyze the results of the Annual Fund telephone campaign, you’ll need to know exactly who was called and who wasn’t, before you can compare scores against giving.
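That follow-up analysis can be as simple as a decile table, provided you first restrict it to the people who were actually called. A sketch, again with made-up column names and fabricated data purely for illustration, of comparing score deciles against giving among called alumni only:

```python
import numpy as np
import pandas as pd

# Fabricated scored file for illustration; in practice 'was_called' and
# 'gift' would come from your campaign and gift records.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "score": rng.uniform(0, 100, 1000),
    "was_called": rng.random(1000) < 0.6,
})
df["gift"] = np.where(rng.random(1000) < df["score"] / 200, 50, 0)

# Exclude the never-called: they had no chance to give by phone.
called = df[df["was_called"]].copy()

# Decile 1 = highest scores, decile 10 = lowest.
called["decile"] = pd.qcut(called["score"], 10, labels=range(10, 0, -1))

summary = called.groupby("decile", observed=True)["gift"].agg(["mean", "count"])
print(summary)  # average gift and headcount per score decile
```

If you skip the `was_called` filter, the low deciles look artificially barren and the model looks better than it is; the comparison is only fair among people who received the same treatment.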
Mistake No. 4: Showing people the mental sausage.
A few years ago I used to follow a great website called 43folders.com, created by Merlin Mann. His boss and friend said to him one day, “Y’know, Merlin, we’re really satisfied with the actual work you do, but is there any way you could do it without showing so much … I don’t know … mental sausage?”
Data mining, predictive modeling, and cool data stuff in general are all exercises in discovery. When we discover something new, our natural urge is to share. In the past, I tended to share the wrong way: I would carefully reveal my discovery as if the process were unfolding in real time. These expositions (usually in the form of a Word document emailed around) tended to run long, and the central message would often be buried in detail that anyone not inhabiting my head would regard as extraneous.
Don’t expect people to follow your plot: They’re too busy. They need the back of a cereal box, and you’re sending them Proust. You need to make your point, back it up with the minimum amount of verbiage acceptable, incorporate visuals judiciously, and get the hell out.
Learn to use the charting options available in Excel or some other software to get your point across as effectively as possible. Offer to explain it face-to-face. Offer to present on it.
Lesson: Learn how to sell.