CoolData blog

1 April 2015

Mind the data science gap

Filed under: Training / Professional Development. Posted by kevinmacdonell @ 8:10 pm

Being a forward-thinking lot, the data-obsessed among us are always pondering the best next step to take in professional development. There are more options every day, from a Data Science track on Coursera to new master’s degree programs in predictive analytics. I hear a lot of talk about acquiring skills in R, machine learning, and advanced modelling techniques.

All to the good, in general. What university or large non-profit wouldn’t benefit from having a highly trained, triple-threat chameleon with statistics, programming, and data analytics skills? I think it’s great that people are investing serious time and brain cells pursuing their passion for data analysis.

And yet, one has to wonder, are these advanced courses and tools helping drive bottom-line results across the sector? Are they helping people at nonprofits and university advancement offices do a better job of analyzing their data toward some useful end?

I have a few doubts. The institutions and causes that employ these enterprising learners may be fortunate to have them, but I would worry about retention. Wouldn’t these rock stars eventually feel constrained in the nonprofit or higher ed world? It’s a great place to apply one’s creativity, but aren’t the problems and applications one can address with data in our field relatively straightforward in comparison with other fields? (Tailoring medical treatment to an individual’s DNA, preventing terrorism or bank fraud, getting an American president elected?) And then there’s the pay.

Maybe I’m wrong to think so. Clearly there are talented people working in our sector who are here because they have found the perfect combination of passions. They want to be here.

Anyway — rock star retention is not my biggest concern.

I’m more concerned about the rest of us: people who want to make better use of data, but aren’t planning to learn way more than we need or are capable of. I’m concerned for a couple of reasons.

First, many of the professional development options available are pitched at a level too advanced to be practical for organizations that haven’t hired a full-time predictive analytics specialist. The majority of professionals working in the non-profit and higher-ed sectors are mainly interested in getting better at their jobs, whether that’s increasing dollars raised or boosting engagement among their communities. They don’t need to learn to code. They do need some basic, solid training options. I’m not sure these are easy to spot among all the competing offerings and (let’s be honest) the Big Data hype.

These people need support and appropriate training. There’s a place for scripting and machine learning, but let’s ensure we are already up to speed on means/medians, bar charts, basic scoring, correlation, and regression. Sexy? No. But useful, powerful, necessary. Relatively simple and manual techniques that are accessible to a range of advancement professionals, not just the highly technical, offer a high return on investment. It would be a shame if the majority were cowed into thinking that data analysis isn’t for them just because they don’t see what neural networks have to do with their day-to-day work.

My second concern is that some of the advanced tools of data science are deceptively easy to use. I read an article recently that stated that when it’s done really well, data science looks easy. That’s a problem. A machine-learning algorithm will spit out answers, but are they worth anything? (Maybe.) Does an analyst learn anything about their data by tweaking the knobs on a black box? (Probably not.) Is skipping over the inconvenience of manual data exploration detrimental to gaining valuable insights? (Yes!)

Don’t get me wrong — I think R, Python, and other tools are extremely useful for predictive modelling, although not for doing the modelling itself (not in my hands, at least). I use SQL and Python to automate the assembly of large data files to feed into Data Desk — it’s so nice to push a button and have the script merge together data from the database, from our phonathon database, from our broadcast email platform and other sources, as well as automatically create certain indicator variables, pivoting all kinds of categorical variables and handling missing data elegantly. Preparing this file using more manual methods would take days.
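For readers curious what that kind of push-button assembly might look like, here is a minimal sketch in plain Python. It is not my actual script: the three sample extracts, the field names (`calls_answered`, `emails_opened`, `region`) and the helper functions `merge_sources` and `add_indicators` are all hypothetical, made up for illustration. It merges sources on a shared ID, flags and fills missing values, and pivots one categorical field into 0/1 indicator columns.

```python
# Hypothetical extracts, each keyed by a shared constituent ID.
database_rows = [
    {"id": 1, "class_year": 1995, "region": "East"},
    {"id": 2, "class_year": 2003, "region": "West"},
    {"id": 3, "class_year": None, "region": None},
]
phonathon_rows = [
    {"id": 1, "calls_answered": 4},
    {"id": 3, "calls_answered": 0},
]
email_rows = [
    {"id": 2, "emails_opened": 12},
]


def merge_sources(base, *others):
    """Left-join every other source onto the base rows by 'id'."""
    merged = {row["id"]: dict(row) for row in base}
    for source in others:
        for row in source:
            if row["id"] in merged:
                merged[row["id"]].update(row)
    return list(merged.values())


def add_indicators(rows, numeric_fields, categorical_field):
    """Flag and fill missing numbers; pivot one categorical field to 0/1 columns."""
    categories = sorted(
        {r[categorical_field] for r in rows if r.get(categorical_field)}
    )
    for row in rows:
        for field in numeric_fields:
            missing = row.get(field) is None
            # Keep a record of the gap before filling it with zero.
            row[f"{field}_missing"] = int(missing)
            if missing:
                row[field] = 0
        value = row.pop(categorical_field, None)
        for category in categories:
            row[f"{categorical_field}_{category}"] = int(value == category)
    return rows


rows = merge_sources(database_rows, phonathon_rows, email_rows)
rows = add_indicators(rows, ["calls_answered", "emails_opened"], "region")

for row in rows:
    print(row)
```

Filling a missing count with zero while keeping a separate "missing" flag is only one choice among several, and whether it is the right one depends on the business question. The script cannot make that decision for you.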

But this doesn’t automate exploration of the data, it doesn’t remove the need to be careful about preparing data to answer the business question, and it does absolutely nothing to help define that business question. Rather than let a script grind unsupervised through the data to spit out a result seconds later without any subject-matter expertise being applied, the real work of building a model is still done manually, in Data Desk, and right now I doubt there is a better way.

When it comes to professional development, then, all I can say is, “to each their own.” There is no one best route. The important thing is to ensure that motivated professionals are matched to training that is a good fit with their aptitudes and with the real needs of the organization.

31 May 2014

Presenting at a conference: Why the pain is totally worth it

One morning some years ago, when I was a prospect researcher, I was sitting at my desk when I felt a stab of pain in my back. I’d never had serious back pain before, but this felt like a very strong muscle spasm, low down and to one side. I stood up and stretched a bit, hoping it would go away. It got worse — a lot worse.

I stepped out into the hallway, rigid with pain. Down the hall, standing by the photocopier waiting for her job to finish, was Bernardine. She had a perceptive eye for stuff, especially medical stuff. She glanced in my direction and said, “Kidney stone.”

An hour later I was lying on a hospital gurney getting a Toradol injection and waiting for an X-ray. It was indeed a kidney stone, and not a small one.

This post is not about my kidney stone. But it is a little bit about Bernardine. Like I said, she knew stuff. She diagnosed my condition from 40 feet away, and she was also the first person to suggest that I should present at a conference.

At that time, there were few notions that struck terror in my heart like the idea of talking in front of a roomful of people. I thought she was nuts. ME? No! I’d rather have another kidney stone.

But Bernardine had also given me my first copy of Peter Wylie’s little blue book, “Data Mining for Fundraisers.” With that, and the subsequent training I had in data mining, I was hooked — and she knew it. Eventually, my absorption with the topic and my enthusiasm to talk about it triumphed over my doubts. I had something I really wanted to tell people about, and the fear was something I needed to manage. Which I did.

To date I’ve done maybe nine or ten conference presentations. I am not a seasoned presenter, nor has public speaking become one of my strengths. But I do know this: Presenting stuff to my counterparts at other institutions has proven one of the best ways to understand what it is I’m doing. These were the few times I got to step back and grasp not only the “how” of my work, but the “why”.

This is why I recommend it to you. The effort of explaining a project you’ve worked on to a roomful of people you’re meeting for the first time HAS to force some deeper reflection than you’re used to. Never moving beyond the company of your co-workers means you’re always swimming in the same waters of unspoken assumptions. Creating a presentation forces you to step outside the fishbowl, to see things from the perspective of someone you don’t know. That’s powerful.

Yes, preparing a presentation is a lot of work, if you care about it enough. But presenting can change your relationship with your job and career, and through that it can change your life. It changed mine. Blogging also changed my life, and I think a lot more people should be blogging too. (A post for another day.) Speaking and writing have rewarded me with an interesting career and professional friendships with people far and wide. These opportunities are not for the exceptional few; they are open to everyone.

I mentioned earlier that Bernardine introduced me to Peter Wylie’s book. Back then I could never have predicted that one day he and I would co-author another book. But there it is. It gave me great pleasure to give credit to Bernardine in the acknowledgements; I put a copy in the mail to her just this week. (I also give credit to my former boss, Iain. He was the one who drove me to the hospital on the day of the kidney stone. That’s not why he’s in the acknowledgements, FYI.)

Back to presenting … Peter and I co-presented a workshop on data mining for prospect researchers at the APRA-Canada conference in Toronto in 2010. I’m very much looking forward to co-presenting with him again this coming October in Chicago. (APRA-Illinois Data Analytics Fall Conference … Josh Birkholz will also present, so I encourage you to consider attending.)

Today, playing the role of a Bernardine, I am thinking of who I ought to encourage to present at a conference. I have at least one person in mind, who has worked long and hard on a project that I know people will want to hear about. I also know that the very idea would make her vomit on her keyboard.

But I’ve been there, and I know she will be just fine.

2 December 2013

How to learn data analysis: Focus on the business

Filed under: Training / Professional Development. Posted by kevinmacdonell @ 6:17 am

A few months ago I received an email from a prospect researcher working for a prominent theatre company. He wanted to learn how to do data mining and some basic predictive modeling, and asked me to suggest resources, courses, or people he could contact. 

I didn’t respond to his email for several days. I didn’t really have that much to tell him, because he had covered so many of the bases already. He’d read the book “Data Mining for Fund Raisers,” by Peter Wylie, as well as “Fundraising Analytics: Using Data to Guide Strategy,” by Joshua Birkholz. He follows this blog, and he keeps up with postings on the Prospect-DMM list. He had dug up and read articles on the topic in the newsletter published by his professional association (APRA). And he’d even taken two statistics courses; those were a long time ago, but he had retained a basic understanding of the terms and concepts used in modeling.

He was already better prepared than I was when I started learning predictive modeling in earnest. But as it happened, I had a blog post in draft form (one of many — most never see the light of day) which was loosely about what elements a person needs to become a data analyst. I quoted a version of this paragraph in my response to him:

There are three required elements for pursuing data analysis. The first and most important is curiosity, and finding joy in discovery. The second is being shown how to do things, or having the initiative to find out how to do things. The third is a business need for the work.

My correspondent had the first element covered. As for the second element, I suggested to him that he was more than ready to obtain one-on-one training. All that was missing was defining the business need … that urgent question or problem that data analysis is suited for.

Any analysis project begins with formulating the right question. But that’s also an effective way to begin learning how to do data analysis in the first place. Knowing what your goal is brings relevance, urgency and focus to the activity of learning.

Reflect on your own learning experiences over the years: Your schooling, courses you’ve taken, books and manuals you’ve worked your way through. More than likely, this third element was mostly absent. When we were young, perhaps relevance was not the most important thing: We just had to absorb some foundational concepts, and that was that. Education can be tough, because there is no satisfying answer to the question, “What is the point of learning this?” The point might be real enough, but its reality belongs to a seemingly distant future.

Now that we’re older, learning is a completely different game, in good ways and bad. On the bad side, daily demands and mundane tasks squeeze out most opportunities for learning. Getting something done seems so much more concrete than developing our potential. 

On the good side, now we have all kinds of purposes! We know what the point is. The problems we need to solve are not the contrived and abstract examples we encountered in textbooks. They are real and up close: We need to engage alumni, we need to raise more money, we need, we need, we need.

The key, then, is to harness your learning to one or more of these business needs. Formulate an urgent question, and engage in the struggle to answer it using data. Observe what happens then … Suddenly professional development isn’t such an open-ended activity that is easily put off by other things. When you ask for help, your questions are now specific and concrete, which is the best way to generate response on forums such as Prospect-DMM. When you turn to a book or an internet search, you’re looking for just one thing, not a general understanding.

You aren’t trying to learn it all. You’re just taking the next step toward answering your question. Acquiring skills and knowledge will be a natural byproduct of what should be a stimulating challenge. It’s the only way to learn.

30 November 2012

Analytics conferences: Two problems, two antidotes

A significant issue for gaining data-related skills is finding the right method of sharing knowledge. No doubt conferences are part of the answer. They attract a lot of people with an interest in analytics, whose full-time job is currently non-analytical. That’s great. But I’m afraid that a lot of these people assume that attending a conference is about passively absorbing knowledge doled out by expert speakers. If that’s what you think, then you’re wasting your money, or somebody’s money.

There are two problems here. One is the passive-absorption thing. The other is a certain attitude towards the “expert”. Today I want to describe both problems, and prescribe a couple of conferences related to data and analytics which offer antidotes.

Problem One: “Just Tell Me What To Do”

You know the answer already: Knowledge can’t be passively absorbed. It is created, built up inside you, through engagement with an other (a teacher, a mentor, a book, whatever). We don’t get good ideas from other people like we catch a cold. We actively recognize an idea as good and re-create it for ourselves. This is work, and work creates friction — this is why good ideas don’t spread as quickly as mere viral entertainment, which passes through our hands quickly and leaves us unchanged. Sure, this can be exciting or pleasant work, but it requires active involvement. That’s pretty much true for anything you’d call education.

Antidote One: DRIVE

Ever wish you could attend a live TED event? Well, the DRIVE conference (Feb. 20-21 in Seattle) captures a bit of that flavour: Ideas are front and centre, not professions. Let me explain … Many or most conferences are of the “birds of a feather” variety: fundraisers talking to fundraisers, analysts talking to analysts, researchers talking to researchers, IT talking to IT. The DRIVE conference (which I have written about recently) is a diverse mix of people from all of those fields, but adds in speakers from whole other professional universes, such as a developmental molecular biologist and a major-league baseball scout.

Cool, right? But if you’re going to attend, then do the work: Listen and take notes, re-read your notes later, talk to people outside your own area of expertise, write and reflect during the plane ride home, spin off tangential ideas. Dream. Better: dream with a pencil and paper at the ready.

Problem Two: “You’re the Expert, So Teach Me Already”

People may assume that the person at the podium is an expert, that the presenter has got something the audience doesn’t, and that if it isn’t magically communicated in those 90 minutes then the session hasn’t lived up to its billing. Naturally, those people are going to leave dissatisfied, because that’s not how communicating about analytics works. If you’re setting up an artificial “me/expert” divide every time you sit down, you’re impeding your ability to be engaged as a conference participant.

Antidote Two: APRA Analytics Symposium

Every year, the Association of Professional Researchers for Advancement runs its Data Analytics Symposium in concert with its international conference. (This year it’s Aug 7-8 in Baltimore.) The Symposium is a great learning opportunity for all sorts of reasons, and yes, you’ll get to hear and meet experts in the field. One thing I really like about the Symposium is the  case-study “blitz” that offers the opportunity for colleagues to describe projects they are working on at their institutions. Presenters have just 20 or so minutes to present a project of their choice and take a few questions. Some experienced presenters have done these, but it’s also a super opportunity for people who have some analytics experience but are novice presenters. It’s a way to break through that artificial barrier without having to be up there for 90 minutes. If you have an idea, or would just like more information on the case studies, get in touch with me at kevin.macdonell@gmail.com, or with conference chair Audrey Geoffroy: ageoffroy@uff.ufl.edu. Slots are limited, so you must act quickly.

I present at conferences, but I assure you, I have never referred to myself as an “expert”. When I write a blog post, it’s just me sweating through a problem nearly in real time. If sometimes I sound like I knew my way through the terrain all along, you should know that my knowledge of the lay of the land came long after the first draft. I like to think the outlook of a beginner or an avid amateur might be an advantage when it comes to taking readers through an idea or analysis. It’s a voyage of discovery, not a to-do list. Experts have written for this blog, but they’re good because although they know their way around, every new topic or study or analysis is like starting out anew, even for them. The mind goes blank for a bit while one ponders the best way to explore the data — some of the most interesting explorations begin in confusion and uncertainty. When Peter Wylie calls me about an idea he has for a blog post, he doesn’t say, “Yeah, let’s pull out Regression Trick #47. You know the one. I’ll find some data to fit.” No — it’s always something fresh, and his deep curiosity is always evident.

So whichever way you’re facing when you’re in that conference room, remember that we are all on this road together. We’re at different places on the road, but we’re all traveling in the same direction.

18 October 2012

It’s your turn to DRIVE!

It’s been a full year since I attended the first DRIVE Conference in Seattle, and I’m pleased to let you know (if you don’t already) that a second one is on the way. DRIVE 2013 takes place February 20-21 at the Bell Harbor International Conference Center in Seattle, Washington, and is hosted by the University of Washington. Registration is now open!

I’ll be making the trip to DRIVE 2013, and I think you should, too. I’m there to speak, but I expect to get a whole lot more out of it than I give.

DRIVE stands for those most awesome and beautiful words “Data, Reporting, Information and Visualization Exchange.” It’s a gathering-place for the growing community of non-profit IT/data people seeking to bring new ideas and efficient processes and systems to their organizations. Whether you’re just joining the non-profit ranks or you’ve been in the sector a while, this is the place to explore the latest ideas in analytics, modeling, data, reporting, information and visualization with people who are of like mind but come from all sorts of different backgrounds.

It’s this diversity that really injects value into the “exchange” part of DRIVE: You’ll meet some fascinating people who will help you see data-driven performance through a whole new lens.

Especially this year … wow. There are fundraisers and report-writers and data miners – all great. But a developmental molecular biologist? And a major-league baseball scout? Yes!

On top of that, there’s an opportunity to sign up for some on-the-spot mentoring (either as a mentor or mentee) which will allow you to have a focused conversation on a topic of interest that goes beyond the merely social aspect of a conference. Check that out on the conference website.

A few speaker highlights:

DR. JOHN J. MEDINA, a developmental molecular biologist, has a lifelong fascination with how the mind reacts to and organizes information. He is the author of the New York Times bestseller “Brain Rules: 12 Principles for Surviving and Thriving at Work, Home, and School” — a provocative book that takes on the way our schools and work environments are designed.

ASHUTOSH NANDESHWAR, Associate Director of Analytics at the University of Michigan, will talk about how we can tackle the three biggest problems in fundraising using data science.

KARL R. HABERL of Principal BI will be presenting on the merits of powerful visualization. His presentation will introduce you to three innovative ‘compound charting techniques’ that provide new levels of insight to analysts and their audiences.

ANDREW PERCIVAL, an advanced scout with Major League Baseball’s Seattle Mariners. For his presentation, Andrew will be speaking about the use of data in the game of baseball. Come hear how an MLB scout turns massive data sets into information that is used by coaches and front-office personnel.

Oh yeah – and me, and a whole lot more. For more information on the other speakers and topics lined up so far, visit the DRIVE 2013 website.

19 March 2012

Symposium on Data Analytics is a must-attend

If you’re interested in working with data for the benefit of a non-profit organization or for institutional advancement in education, then you must make room in your calendar for the APRA Symposium on Data Analytics.

Kate Chamberlin of Memorial Sloan-Kettering Cancer Center recently posted the listserv message below which I am quoting in its entirety, with her blessing. Kate is Chair of this year’s Symposium, being held this summer in Minneapolis. I’ve attended a few of these symposiums (and presented at one), and I can tell you that they’re great. This is a conference where you can really learn, and meet the people who are doing cool stuff with data for their institutions and organizations.

Of particular interest are the Case Study sessions, which are brief (20-minute) presentations of analytics projects that your colleagues at other institutions have carried out. If you’ve worked on such a project, consider sharing! Contact information is included below.

Here’s Kate’s message:

Hello everyone!

Many of you may have noticed the fifth annual APRA Symposium on Data Analytics is definitely happening again this summer in conjunction with APRA’s International Conference in Minneapolis!  The dates are Wednesday and Thursday, August 1st and 2nd — some additional information is available here: http://www.aprahome.org/p/cm/ld/fid=72.

We don’t have the full schedule yet, but hopefully will within a week or so.  In the meantime, let me give you some preliminary details:

Wednesday morning the conference will open with a keynote from Rob Scott at MIT, who was instrumental in founding the Symposium, and has a bird’s-eye view of the history of analytics in fundraising, from the perspective of research, IT, front-line fundraising, and fundraising management.  Thursday morning, we will have the opportunity to join the larger conference to hear Penelope Burke, President of Cygnus Applied Research Inc., on Donor-Centered Fundraising.  http://www.aprahome.org/p/cm/ld/&fid=73

The fundamental track is intended as a two day introduction to analytics in fundraising, with the goal of giving participants a solid road map to approach their first project.  Topics will include: Various Variables: Data Preparation and Management for Successful Analytics, Walkthrough: Understanding the Problem and the Resources, Key Questions in Project Management, and Implementation.  Presenters will include Chuck McClenon at the University of Texas, James Cheng at Dana Farber Cancer Institute, Audrey Geoffroy at the University of Florida, and myself.  In addition, six short case studies from a variety of nonprofits will be presented in the fundamental track.

In the intermediate/advanced track, we will continue the focus on case study with nine short project presentations.  We will also have a presentation from Jeff Shuck of Event 360, who applies predictive modeling and segmentation to fundraising events and peer-to-peer fundraising programs.  Marianne Pelletier of Cornell University and Josh Birkholz of Bentz Whaley Flessner will present on constituent engagement.  Chuck McClenon of the University of Texas will lead a panel of practitioners to discuss the intricacies of collaborating with development IT.

Finally, we will have our usual faculty/committee panel to close the Symposium.  We will be asking our faculty, committee members, and a few guests to tell us about the one best idea they’ve heard recently in the area of development analytics, and follow up with a free-wheeling conversation including these ideas and any and all questions from the floor.

Last year we experimented with a case study format that gave us the opportunity to hear many of our colleagues present on projects they are working on at their institutions.  As you see above, with a few tweaks, we are continuing to set aside some time for case study this year.  If you’re planning to attend, I’m hoping some of you might have a project you’d be interested in presenting?  You will have 20 minutes to present a project of your choice and take a few questions.  Emma Hinke at Johns Hopkins has kindly agreed to handle the logistics of case studies for me, so if you have an idea, or would just like more information on the case studies, please be in touch with Emma at ehinke2@jhu.edu.  If we have a great flood of ideas, we may not be able to pack them all in, but wouldn’t that be a great problem to have?  Please send us your thoughts, and if we can’t manage them all this year, we’ll start a list for next year.

I do hope you will consider joining us — it’s the variety of attendees that makes the Symposium great.  I’ll let you know when we have the full schedule up on the Symposium web site.

Many thanks,

Kate Chamberlin
Chair, APRA Symposium on Data Analytics
Campaign Strategic Research Director, Memorial Sloan-Kettering Cancer Center
