Peter Wylie and I are each just back home, having presented at the fall conference of the Illinois chapter of the Association of Professional Researchers for Advancement (APRA-IL), hosted at Loyola University Chicago. (See photos, below!) Following an entertaining and fascinating look at the current and future state of predictive analytics presented by Josh Birkholz of Bentz Whaley Flessner, Peter and I gave a live demo of working with real data in Data Desk, with the assistance of Rush University Medical Center. We also drew names to give away a few copies of our book, Score! Data-Driven Success for Your Advancement Team.
We were impressed by the variety and quality of questions from attendees, in particular those having to do with stumbling blocks and barriers to progress. It was nice to be able to reassure people that when it comes to predictive modelling, some things aren’t worth worrying about.
Messy data, for example. Some databases, particularly those maintained by non higher ed nonprofits, have data integrity issues such as duplicate records. It would be a shame, we said, if data analysis were pushed to the back burner just because of a lack of purity in the data. Yes, work on improving data integrity — but don’t assume that you cannot derive valuable insights right now from your messy data.
And then the practice of predictive modelling itself … Oh, there is so much advice out there on the net, some of it highly technical and involving a hundred different advanced techniques. Anyone trying to learn on their own can get stymied, endlessly questioning whether what they’re doing is okay.
For them, our advice was this: In our field, you create value by ranking constituents according to their likelihood to engage in a behaviour of interest (giving, usually), which guides the spending of scarce resources where they will do the most good. You can accomplish this without the use of complex algorithms or arcane math. In fact, simpler models are often better models.
The workhorse tool for this task is multiple linear regression. A very good stand-in for regression is building a simple score using the techniques outlined in Peter’s book, Data Mining for Fundraisers. Sticking to the basics will work very well. Fussing with technical issues or striving for a high degree of accuracy are distractions that the beginner need not be overly concerned with.
If your shop’s current practice is to pick prospects or other targets by throwing darts, then even the crudest model will be an improvement. In many situations, simply performing better than random will be enough to create value. The bottom line: Just do it. Worry about perfection some other day.
If the decisions are high-stakes, if the model will be relied on to guide the deployment of scarce resources, then insert another step in the process. Go ahead and build the model, but don’t use it. Allow enough time of “business as usual” to elapse. Then, gather fresh examples of people who converted to donors, agreed to a bequest, or made a large gift — whatever the behaviour is you’ve tried to predict — and check their scores:
“Don’t worry, just do it” sounds like motivational advice, but it’s more than that. The fact is, there is only so much model validation you can do at the time you create the model. Sure, you can hold out a generous number of cases as a validation sample to test your scores with. But experience will show you that your scores will always pass the validation test just fine — and yet the model may still be worthless.
A holdout sample of data that is contemporaneous with that used to train the model is not the same as real results in the future. A better way to go might be to just use all your data to train the model (no holdout sample), which will result in a better model anyway, especially if you’re trying to predict something relatively uncommon like Planned Giving potential. Then, sit tight and observe how it does in production, or how it would have done in production if it had been deployed.
* A heartfelt thank you to APRA-IL and all who made our visit such a pleasure, especially Sabine Schuller (The Rotary Foundation), Katie Ingrao and Viviana Ramirez (Rush University Medical Center), Leigh Peterson Visaya (Loyola University Chicago), Beth Witherspoon (Elmhurst College), and Rodney P. Young, Jr. (DePaul University), who took the photos you see below. (See also: APRA IL Fall Conference Datapalooza.)
Click on any of these for a full-size image.
One morning some years ago, when I was a prospect researcher, I was sitting at my desk when I felt a stab of pain in my back. I’d never had serious back pain before, but this felt like a very strong muscle spasm, low down and to one side. I stood up and stretched a bit, hoping it would go away. It got worse — a lot worse.
I stepped out into the hallway, rigid with pain. Down the hall, standing by the photocopier waiting for her job to finish, was Bernardine. She had a perceptive eye for stuff, especially medical stuff. She glanced in my direction and said, “Kidney stone.”
An hour later I was laying on a hospital gurney getting a Toradol injection and waiting for an X-ray. It was indeed a kidney stone, and not a small one.
This post is not about my kidney stone. But it is a little bit about Bernardine. Like I said, she knew stuff. She diagnosed my condition from 40 feet away, and she was also the first person to suggest that I should present at a conference.
At that time, there were few notions that struck terror in my heart like the idea of talking in front of a roomful of people. I thought she was nuts. ME? No! I’d rather have another kidney stone.
But Bernardine had also given me my first copy of Peter Wylie’s little blue book, “Data Mining for Fundraisers.” With that, and the subsequent training I had in data mining, I was hooked — and she knew it. Eventually, my absorption with the topic and my enthusiasm to talk about it triumphed over my doubts. I had something I really wanted to tell people about, and the fear was something I needed to manage. Which I did.
To date I’ve done maybe nine or ten conference presentations. I am not a seasoned presenter, nor has public speaking become one of my strengths. But I do know this: Presenting stuff to my counterparts at other institutions has proven one of the best ways to understand what it is I’m doing. These were the few times I got to step back and grasp not only the “how” of my work, but the “why”.
This is why I recommend it to you. The effort of explaining a project you’ve worked on to a roomful of people you’re meeting for the first time HAS to force some deeper reflection than you’re used to. Never moving beyond the company of your co-workers means you’re always swimming in the same waters of unspoken assumptions. Creating a presentation forces you to step outside the fishbowl, to see things from the perspective of someone you don’t know. That’s powerful.
Yes, preparing a presentation is a lot of work, if you care about it enough. But presenting can change your relationship with your job and career, and through that it can change your life. It changed mine. Blogging also changed my life, and I think a lot more people should be blogging too. (A post for another day.) Speaking and writing have rewarded me with an interesting career and professional friendships with people far and wide. These opportunities are not for the exceptional few; they are open to everyone.
I mentioned earlier that Bernardine introduced me me Peter Wylie’s book. Back then I could never have predicted that one day he and I would co-author another book. But there it is. It gave me great pleasure to give credit to Bernardine in the acknowledgements; I put a copy in the mail to her just this week. (I also give credit to my former boss, Iain. He was the one who drove me to the hospital on the day of the kidney stone. That’s not why he’s in the acknowledgements, FYI.)
Back to presenting … Peter and I co-presented a workshop on data mining for prospect researchers at the APRA-Canada conference in Toronto in 2010. I’m very much looking forward to co-presenting with him again this coming October in Chicago. (APRA-Illinois Data Analytics Fall Conference … Josh Birkholz will also present, so I encourage you to consider attending.)
Today, playing the role of a Bernardine, I am thinking of who I ought to encourage to present at a conference. I have at least one person in mind, who has worked long and hard on a project that I know people will want to hear about. I also know that the very idea would make her vomit on her keyboard.
But I’ve been there, and I know she will be just fine.
(When I read this post by our university’s CIO on his internal blog, I thought “right on.” It’s not about predictive modelling, and CoolData is not about IT. But this message about taking responsibility for acquiring new skills hit the right note for me. Follow Dwight on Twitter at @cioDalhousieU — Kevin)
I recently recommended OneNote to a colleague. OneNote is a venerable note-taking and organizational tool that is part of the Microsoft Office suite. I spoke to the merits of the application and how useful and versatile a tool it is, particularly now that it is fully integrated with mobile devices through the cloud. I suggested that she look online and find some resources on how to use it.
Busy as she is, she asked her administrative assistant to look up how to use OneNote, who in turned called the HelpDesk looking for support. The Help Desk staff need to know a lot of information, but software expertise is not the type of thing they can and are able to provide deeper-level support. Unless they were to use the software on a day-in, day-out basis, how could they? As it was, the caller did not get the support she expected.
If that individual instead had gone to Google (or Bing, Yahoo, YouTube, whatever) and asked the question, they would have received a torrent of information. All she needs to understand is how to ask or phrase the question.
It occurs to me that we have provided support to our clients for so long, they have developed an unhealthy dependence on IT staff to answer all their issues. Meanwhile, the internet has developed a horde of information and with it, many talented individuals who simply like to share their knowledge. Is it all good information? Not always, but if you just do a little searching and modify your search terms, you’ll certainly find relevant information. Often times you’ll find some serendipitous learning as well.
We need to help our clients make this shift. Instead of answering their questions, coach them on how to ask questions in search engines. Give them a fish and they’ll eat for a day. Teach them to fish and they’ll eat heartily. And save the more unique technology questions for us.
P.S. I used to go to the bike store for repairs. I could do a lot of work on my bikes, but there were some things I just couldn’t do. But with a small fleet, that was getting expensive. I started looking up bike repair issues in YouTube and lo and behold, it’s all right there. I might have bought a tool or two, but I can darn near fix most things on the bikes. It just takes some patience and learning. There are some very talented bike mechanics who put out some excellent videos.
A few months ago I received an email from a prospect researcher working for a prominent theatre company. He wanted to learn how to do data mining and some basic predictive modeling, and asked me to suggest resources, courses, or people he could contact.
I didn’t respond to his email for several days. I didn’t really have that much to tell him — he had covered so many of the bases already. He’d read the book “Data Mining for Fund Raisers,” by Peter Wylie, as well as “Fundraising Analytics: Using Data to Guide Strategy,” by Joshua Birkholz. He follows this blog, and he keeps up with postings on the Prospect-DMM list. He had dug up and read articles on the topic in the newsletter published by his professional association (APRA). And he’d even taken two statistics course — those were a long time ago, but he had retained a basic understanding of the terms and concepts used in modeling.
He was already better prepared than I was when I started learning predictive modeling in earnest. But as it happened, I had a blog post in draft form (one of many — most never see the light of day) which was loosely about what elements a person needs to become a data analyst. I quoted a version of this paragraph in my response to him:
There are three required elements for pursuing data analysis. The first and most important is curiosity, and finding joy in discovery. The second is being shown how to do things, or having the initiative to find out how to do things. The third is a business need for the work.
My correspondent had the first element covered. As for the second element, I suggested to him that he was more than ready to obtain one-on-one training. All that was missing was defining the business need … that urgent question or problem that data analysis is suited for.
Any analysis project begins with formulating the right question. But that’s also an effective way to begin learning how to do data analysis in the first place. Knowing what your goal is brings relevance, urgency and focus to the activity of learning.
Reflect on your own learning experiences over the years: Your schooling, courses you’ve taken, books and manuals you’ve worked your way through. More than likely, this third element was mostly absent. When we were young, perhaps relevance was not the most important thing: We just had to absorb some foundational concepts, and that was that. Education can be tough, because there is no satisfying answer to the question, “What is the point of learning this?” The point might be real enough, but its reality belongs to a seemingly distant future.
Now that we’re older, learning is a completely different game, in good ways and bad. On the bad side, daily demands and mundane tasks squeeze out most opportunities for learning. Getting something done seems so much more concrete than developing our potential.
On the good side, now we have all kinds of purposes! We know what the point is. The problems we need to solve are not the contrived and abstract examples we encountered in textbooks. They are real and up close: We need to engage alumni, we need to raise more money, we need, we need, we need.
The key, then, is to harness your learning to one or more of these business needs. Formulate an urgent question, and engage in the struggle to answer it using data. Observe what happens then … Suddenly professional development isn’t such an open-ended activity that is easily put off by other things. When you ask for help, your questions are now specific and concrete, which is the best way to generate response on forums such as Prospect-DMM. When you turn to a book or an internet search, you’re looking for just one thing, not a general understanding.
You aren’t trying to learn it all. You’re just taking the next step toward answering your question. Acquiring skills and knowledge will be a natural byproduct of what should be a stimulating challenge. It’s the only way to learn.