CoolData blog

7 February 2012

Figuring out what those poll results mean

Filed under: Analytics, predictive analytics — kevinmacdonell @ 1:04 pm

Last week’s poll (Where’s your institution on the Culture of Analytics Ladder?) had all the things I dislike about quickie online polls, including the fact that the respondents were self-selected, and it was open to anyone who stumbled upon it, including people with no connection to the field.

I don’t bother with polls very often, but hey, it’s my blog and I’ll poll if I want to. People DO like polls, even if we should take the results with a grain of salt.

The poll is still open, so you can go back to view up-to-date results anytime, but here’s the breakdown of the 81 responses received as of today. My thoughts about the results continue below …

These results do not distinguish between non-profits and higher-education institutions, nor between small and large organizations. We don’t know if respondents are managers or staff, or even if they understand the question. So … take this for what it’s worth.

Fewer than a fifth of the people who viewed last week’s post opted to cast a vote. Of those who did, it does seem they were honest about their answers. I am not surprised that almost a third of respondents admitted that although they have data, no one in their organization knows how to analyze it. That is deeply unfortunate but it’s a fact.

Coming a near second are people who said that their analyst’s data insights support decisions only on an ad hoc basis. (Answer 6 includes organizations that may have contracted out for predictive model score sets, so these organizations may not even have analytics talent on staff.) And in third place are the people who said data analysis is a regular process, not ad hoc, but that the benefits are limited to just part of the organization.

What might be encouraging here is that almost three-quarters of respondents have the data they need and are somewhere along the path of becoming data-driven, even if they aren’t quite there yet. On the other hand, these are CoolData readers, so keep the bias in mind.

Then there are the eight people who claim that analytics has been wholly embraced by their organizations, from top to bottom and in all operational areas. I’d love for those people to come forward and identify themselves, because we could all learn from them. Email me!

1 February 2012

Where’s your institution on the Culture of Analytics Ladder?

Filed under: Fun, predictive analytics, Why predictive modeling? — kevinmacdonell @ 2:21 pm

I’m lying on the couch with a bad head cold, and there’s a mix of snow and rain in the forecast. Time to curl up with my laptop and a cup of tea. I’ve got a question for you!

Not long ago I asked you to give me examples of institutions you’re aware of that are shining examples of institution-wide data-driven decision making. I was grateful for the responses, but no single institution was named twice. A few people offered an opinion about how their own organizations size up, which I found interesting.

So let’s explore that a bit more with a quick and anonymous poll: Where do you think your non-profit organization or institution fits on the Culture of Analytics Ladder? (That’s CoAL for short … but no need to be so formal. I totally made this up under the influence of cold medication.) Don’t overthink it. Just pick whatever stage you feel your org or institution occupies.

The categories may seem a bit vague. If it’s any help, by “analysis” or “analytics” I am referring to the process of sifting through large quantities of data in search of patterns that lead to insights, primarily about your constituents. I am NOT referring to reporting. In fact I want you to ignore a lot of the day-to-day processes that involve data but are not really “analysis,” including: data entry, gift accounting, appeal segmentation, reporting on historical results, preparation of financials, and so on.

I am thinking more along the lines of modelling for the prediction of behaviours (which group of constituents is most likely to engage in such-and-so a behaviour?), prediction of future results (i.e., forecasting), open-ended exploration of constituent data in search of “clusters”, and any other variety of data work that would be called on to make a decision about what to do in the future, as opposed to documenting what happened in the past. I am uncertain whether A/B split testing fits my definition of analysis, but let’s be generous and say that it does.
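To make "which group of constituents is most likely to engage in a behaviour" a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two predictors (`has_email`, `attended_event`), the thirteen invented records, and the choice of logistic regression fitted by plain gradient descent. It is not a description of any particular institution's model, only one common way such scores can be produced.

```python
import math

def sigmoid(x):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def fit_logistic(rows, labels, steps=2000, rate=0.1):
    """Fit logistic regression by batch gradient descent.

    Returns a list of weights; weights[0] is the intercept.
    """
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grads = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))
            err = p - y  # gradient of the log-loss with respect to the linear term
            grads[0] += err
            for j, v in enumerate(x):
                grads[j + 1] += err * v
        weights = [w - rate * g / len(rows) for w, g in zip(weights, grads)]
    return weights

def score(weights, x):
    """Predicted probability of the behaviour for one constituent profile."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))

# Invented records: ((has_email, attended_event), gave_last_year)
records = [
    ((1, 1), 1), ((1, 1), 1), ((1, 1), 1), ((1, 1), 0),
    ((1, 0), 1), ((1, 0), 0), ((1, 0), 0),
    ((0, 1), 1), ((0, 1), 0),
    ((0, 0), 0), ((0, 0), 0), ((0, 0), 0), ((0, 0), 1),
]
rows = [features for features, _ in records]
labels = [gave for _, gave in records]

weights = fit_logistic(rows, labels)

# Score each possible profile and rank the groups from most to least likely.
profiles = [(1, 1), (1, 0), (0, 1), (0, 0)]
scores = {p: score(weights, p) for p in profiles}
ranked = sorted(profiles, key=scores.get, reverse=True)

for profile in ranked:
    print(profile, round(scores[profile], 2))
```

The ranked list is the point: the output is an ordering of constituent groups to guide a future decision, which is what separates this kind of work from a report on past results.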

A couple of other pointers:

  • If you work for, say, a large university advancement department and aren’t sure whether analytics is used in other departments such as student admissions or recruitment, then answer just for your department. Same thing if you work for a regional office of a large non-profit and aren’t sure about the big picture.
  • If you have little or no in-house expertise, but occasionally hire a vendor to produce predictive modelling scores, then you might answer “6” — but only if those scores are actually being well used.

Here we go.

8 November 2011

An opportunity not to screw up

Filed under: predictive analytics, Private sector — kevinmacdonell @ 1:18 pm

It’s taken me a while to catch my breath after my recent return from Seattle, where I attended the inaugural DRIVE Conference, hosted by the University of Washington on Oct 26-27. DRIVE stands for Data, Reporting, Information and Visualization Exchange, which gives you an idea how diverse the group of 80 or 90 attendees was. I had conversations with people working in IT/Info Management, annual giving, prospect research, reporting, advancement services, data analysis — a real cross-section of disciplines that rarely meet in one room even within their own institutions.

I found an interesting thread weaving through the career histories of the people I met, one that I haven’t encountered in Canada so much: A lot of these people came to the non-profit world from the for-profit sector. Some of them were squeezed out by the recession; some didn’t feel secure in their jobs and fled of their own accord.

Meeting people who used to work as analysts for banks and telecom companies, I asked myself, “Wow, is this not an amazing opportunity?”

Hear me out, and tell me if I’m wrong about this. I’d honestly like to know.

Downsized or not, these are people who have taken a pay cut to work with us. As the economy recovers, some of them will return to the private sector. But I’m optimistic we can retain a lot of them, because it might not hinge on paying them high salaries so much as paying them our attention.

What do nonprofits have in spades? Meaning. We have meaningful work to do. Anyone who has cares that extend beyond getting a paycheque derives happiness from knowing that their work has real results in people’s lives, regardless of the sector they work in.

But a warning: This is a time-limited offer! These people have had enough time to realize that the nonprofit sector is stuck in the 1950s. The “innovations” that excite us make them yawn. They are restless to make investments and changes that will enable organizations to be effective in carrying out their missions. And we’d better listen, or frustrate them into leaving. The stakes are rather high.

I’d like to make a bold prediction, a prediction not based on modelling or statistics. I think our sector is about to undergo a transformation which will bring more progress in data-driven decision making in the next five years than we’ve seen in the last twenty. Provided, that is, we do not flush this opportunity down the toilet.

6 October 2011

The emerging role of the (fundraising) analyst

Filed under: Data, skeptics, Why predictive modeling? — kevinmacdonell @ 12:44 pm

Effective fundraisers tell stories. When we communicate with prospective donors, we do well to evoke feelings and emotions, and go light on the facts. We may attempt to persuade with numbers and charts, but that will never work as well as one true and powerful story, conveyed in word and image.

But what about the stories we tell to ourselves? Humans need narratives to make sense of the world, but our inborn urge to order events as “this happened, then that happened” leads us into all kinds of unsupported or erroneous assumptions, often related to causation.

How many times have you heard assertions such as, “The way to reach young alumni donors is online, because that’s where they spend all their time”? Or, “We shouldn’t ask young alumni to give more than $20, because they have big student loans to pay.” Or, “There’s no need to look beyond loyal donors to find the best prospects for Planned Giving.” Or, “We should stop calling people for donations, because focus groups say they don’t like to get those calls.”

Such mini-narratives are all around us and they beguile us into believing them. Who knows whether they’re true or not? They might make intuitive sense, or they’re told to us by people with experience. Experts tell us stories like this. National donor surveys and reports on philanthropic trends tell stories, too. And we act on them, not because we know they’re true, but because we believe them.

Strictly speaking, none of them can be “true” in the sense that they apply everywhere and at all times. Making assertions about causation in connection with complex human behaviours such as philanthropy is suspect right from the start. Even when there is some truth, whose truth is it? Trend-watchers and experts who know nothing about your donors are going to lead you astray with their suppositions.

I’m reminded of the scene in the movie Moneyball, now playing in theatres, in which one grizzled baseball scout says a certain player must lack confidence “because his girlfriend is ugly.” We can hope that most received wisdom about philanthropy is not as prodigiously stupid, but the logic should be familiar. Billy Beane, general manager of the Oakland A’s, needed a new way of doing things, and so do we.

The antidote to being led astray is learning what’s actually true about your own donors and your own constituency. It’s a new world, folks: We’ve got the tools and the smarts to put any assertion to the test, in the environment of our own data. The age of basing decisions on fact instead of supposition has arrived.

No doubt some feel threatened by that. I imagine a time when something like observation-driven, experimental medicine started to break on the scene. Doctors treating mental illness by knocking holes in people’s skulls to let out the bad spirits must have resisted the tide. The witch-doctors, and the baseball scouts obsessed with ugly girlfriends, may have had a lot of experience, but does anyone miss them?

The role of the analyst is not to shut down our natural, story-telling selves. No. The role of the analyst is to treat every story as a hypothesis. Not in order to explode it necessarily, but to inject validity, context, and relevance. The role of the analyst, in short, is to help us tell better and better stories.
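Treating a story as a hypothesis can be sketched in a few lines of Python. The example below takes the "young alumni should be reached online" assertion and compares giving rates for two groups of young alumni, one solicited by phone and one online. The counts are invented for illustration, and the two-proportion z-test is just one reasonable choice of check, not a method prescribed anywhere in this post.

```python
import math

def two_proportion_test(gifts_a, asked_a, gifts_b, asked_b):
    """Compare two response rates with a two-proportion z-test.

    Returns (rate_a, rate_b, z, two_sided_p_value).
    """
    rate_a = gifts_a / asked_a
    rate_b = gifts_b / asked_b
    # Pooled rate under the hypothesis that both channels perform the same.
    pooled = (gifts_a + gifts_b) / (asked_a + asked_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / asked_a + 1 / asked_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value

# Hypothetical counts for young alumni, invented for this sketch.
phone_gifts, phone_asked = 120, 1000
online_gifts, online_asked = 90, 1000

rate_phone, rate_online, z, p_value = two_proportion_test(
    phone_gifts, phone_asked, online_gifts, online_asked
)

print(f"phone: {rate_phone:.1%}  online: {rate_online:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

A result like this does not settle the story on its own. Unless the two groups were assigned at random, the difference may reflect who was chosen for each channel rather than the channel itself. It does, however, replace "everyone knows" with a number from your own data that can be questioned and refined.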

+++++++++++++++++++++++++++++++++++++++++++++

This blog post is part of the Analytics Blogarama, in which bloggers writing on all aspects of the field offer their views on “The Emerging Role of the Analyst.” Follow the link (hosted by SmartData Collective) to read other viewpoints.

7 June 2011

Career advice, five cents

Filed under: Training / Professional Development — kevinmacdonell @ 11:53 am

I sometimes get asked for career advice. Sometimes I break down and give it. I’m not as eager to dispense advice as Lucy van Pelt at her “Psychiatric Help” booth in the old Peanuts comic by Charles Schulz. My rates are cheaper than hers, though.

The questions come from around the world, from people hoping to work in various data-intensive industries: What do I have to study? How long will it take? Can someone my age get hired in this field?

I don’t have specific answers. My own background in analytics is non-existent; I simply stumbled into a field that I discovered a passion for (data mining), via another field that I stumbled into (higher ed fundraising). The last time I took math was in high school, and I’ve never taken a course in statistics. My career is a patchwork and, although I wouldn’t do anything differently, my path is hardly a model to emulate.

If I am cut out for this work in any way, it is that I am diligent about learning new things that help me do better work, and I seem to have some affinity for data that I wasn’t aware of even just a few years ago. My CV may be lacking in credentials, but I’m lucky to have an employer that listens to what my good work has to say, and not whatever claims my credentials might be making.

So I don’t know much, but I know more today than I did yesterday, and I am good at explaining what I’ve learned to other people. Here are a few things I’ve come to know.

First, I doubt that age makes any difference. The growing demand for workers with data analysis skills may never be satisfied, so I would think you’d be marketable whatever age you are. Unless perhaps you’ve never heard of the Hinterwebs.

For someone taking first steps, this is an exciting time. The data analytics field is wide open — it’s not some kind of priesthood. There is a ton of knowledge-sharing going on via the Web, in publications and at conferences. Expose yourself to all of that.

I imagine that formal education in computing and programming (or statistics and advanced mathematics, or business, or database-related information technology) would be a big asset — if you’re young and prepared for several years of university, and are bent in any of these directions, then go for it. But don’t let yourself be steered into subject areas that are not of central interest to you. Analytics, it seems to me, is best pursued as complementary to work that interests you — as a means of doing great work in a new, insightful way.

That especially applies to older workers who are looking for a change from the work they’re doing now. You may not get to do analytics work for IBM without an advanced degree, but that doesn’t mean you don’t have plenty of options. Any industry, business or activity that generates reams of data related to human behaviour is a rich playground for the data miner.

But really, what activity these days doesn’t generate loads of data? I can’t think of a single area of human endeavour that does not (or cannot) benefit from gathering and analyzing relevant data. Which leads me to my final and primary piece of advice: If you want to work with data, then just do it. Look around you, where you are working right now. Seek out any sort of data-related problem or project you can find in your current employment, and learn just enough to make some progress. Any exposure to real-world data and its messy problems will be good experience. And who knows but that you might become a data pioneer in your specific area of employment?

Why wait?

6 May 2011

Wanted: More ways to learn predictive modeling

Filed under: Peter Wylie, Training / Professional Development — kevinmacdonell @ 5:11 am

I remember the first time I opened up the statistics software package I now use to build predictive models. I had read Peter Wylie’s book, Data Mining for Fundraisers, so I had the basic idea in my head, plus a dose of Peter’s larger-than-life enthusiasm. The next step was to download a trial version of Data Desk to see if I could apply what I’d read to some of my own data. But I was a long way off from knowing how to build my first model.

Here’s what I saw:

It was a tabula rasa. Much like my brain. Exciting things may come from these blank-slate moments, but not this time — I had no idea what to do first. I clicked on some of the menus, like the one below, which didn’t help. Even after loading my data, a simple paste operation from Excel, I was missing the “now do this” element.

So I did what many others have done with a stats package they’ve looked at for the first time — I closed and uninstalled it. (I’ve done the same with SPSS, Minitab and other programs.) I could have tinkered with it and made some progress on my own, but I had pressing work to do. Data mining was a personal interest, not a priority. It wasn’t the latest crisis du jour and therefore it wasn’t “work”.

I don’t blame the software. Help files and manuals can be quite good. But most good software is capable of doing a lot more than just the one task one seeks to carry out; the manual will be more general and comprehensive than required. Translating Peter’s straightforward method to precise steps in Data Desk required me to isolate those functions in the software, and I had no luck with that. As well, the manual was full of stats terms I was not familiar with.

Fortunately the story didn’t end there. Peter himself, aware of my interest, worked with me to show how I could get smart about using our data. Thus armed, I was able to convince my manager that we needed to invest in one-on-one training.

What did training accomplish that working on my own could not?

  • One, the training was couched in the language of fundraising, not statistics. Terms from statistics were introduced as needed, and selectively. A comprehensive understanding of stats was not the goal.
  • Two, it was specific to the software that I was actually using. This allowed every step to be as concrete as, “Next, click on the Manip menu and select …”. I was shown how to use the small set of software features that I really needed, and we ignored the rest.
  • Three, it was specific to my own data. I learned through the process of building a model for our own institution, with data pulled from our own database. It was the first time I had seen our alumni and donation data presented this way. If we had never proceeded to full-on data mining, I still would have learned a lot about our constituency.

Analytics is a popular topic of discussion at fundraising conferences, where everyone says the right things about predictive modeling and data-driven decision making. And yet, how many development offices are doing the work? Not as many as could be.

The bad news is, there is a skills shortage. The good news is, filling the shortage does not mean hiring analysts with advanced degrees in statistics (although, three cheers for you if you do). You or others in your office can do the work — but only if the barriers are removed.

What are the barriers? They are the flipside of the three strengths of one-on-one training:

  • One, many of the relevant books and online resources are couched in the language of statistics. Which elements of statistics are necessary to understand and which are optional is not made explicit. As well, there are numerous approaches to modeling, which confuses anyone trying to focus on the approach that works best for their application.
  • Two, the mechanics of modeling differ from software package to software package. A development office staff person looking for the exact set of steps to accomplish one specific task is not likely to find what they’re looking for.
  • Three, the would-be analyst needs to work with data from their own database and learn how to look at it in a whole new way. It helps if the teaching resource you’re using talks about data from an alumni or fundraising perspective, but even within that world, everyone’s data is different.

Any one of the three barriers may be surmountable on its own; it’s the fact that all three occur together that stops people in their tracks. That’s what happened to me in my tabula rasa moment. It’s like someone who’s never been in a kitchen before needing to cook a specific meal for which there is no recipe — because in the analytics kitchen, a recipe is not only specific to the desired dish (the outcome), but to the oven (software) and to the ingredients on hand (data). Any specific recipe would have to be adapted, which is too much to ask of the beginner cook. Conversely, any overall method that attempts to explain more than one dish, more than one brand of oven, and an endless variety of ingredients is too general to be called a recipe.

For these reasons, when people ask me how to get started in predictive modeling, I always steer them toward one-on-one training. Nothing else really works. Conference sessions can inspire, or lead to a new idea or two, but it stops there. Books are great, but there isn’t a single book that contains a step-by-step guide that covers more than a fraction of fundraising modeling situations. The Internet can be a wonderful resource, but much of what you’ll find is highly technical, doesn’t apply directly to our purposes, and is completely lacking a road map for the uninitiated.

Sadly, this blog has to be counted among the resources that don’t make the grade. I think CoolData does some things well: Addressing a gap, I have always used examples drawn from alumni, nonprofit and donor data; I’ve tried to string my ideas together in some kind of order (Guide to CoolData), and I’ve tried to stay focused on one outcome (behaviour prediction for segmentation, essentially) and one modeling technique (regression), instead of straying too often into other areas.

But I have not provided anything like a step-by-step guide that works for a majority of people who are interested in data mining but don’t know how to go about it. Not that I think it’s impossible. One-on-one training is superior to “book learning,” but I believe there ought to be options for other learning styles. A chef must learn the art in the presence of a master, but the rest of us have recipe books. While no one can deny the superiority of the former, the majority of us get by in the kitchen using the latter — and some dine very well thereby.

It would be an interesting challenge to come up with a way to convey how to do predictive modeling to a beginner in a way that balances the specificity of the recipe book with the endless variety of our real-world data kitchens. Such a product (whatever form it takes) might not be a substitute for training, but it could either augment training or at least get one started. Unlike this blog, it would probably not be free.

Well, it’s something to think about.
