18 January 2015
25 June 2014
27 December 2012
I’ve always tried to stay on-topic with CoolData content: If you subscribe, you know what you’re getting, and if you lose interest and unsubscribe, you know what you’re missing. But I’m on holiday, so I’m inclined to let content rules slip a bit. My wife and I are spending time with family on Cape Breton Island and in the Annapolis Valley in Nova Scotia. I’m less vigilant than usual about what I eat (more turkey, more sweets, more wine) and what I do (nothing, essentially). It is in this state of desuetude that I write this last blog post of the year.
Allow me to indulge by writing not about predictive analytics, but about CoolData itself, which has just turned three years old. That’s middle age for a blog, I figure. First I’ll go through some numbers, and then I’ll tell you about some things coming in the new year.
CoolData by the numbers
As of yesterday, CoolData has had 177,915 page views since it was launched. The number of visitors continues to grow gradually; 6,000 page views a month is the current average. These are page views, not unique visitors: WordPress has been informing me about unique visits only since early December. So far, each unique visitor averages 1.4 page views.
Visits have come from almost every country in the world, but of course most are from the United States. It is not unusual for my own country, Canada, to be edged out of second place by the UK, India or Australia on any given day. The top 20 or so countries since February 2012 are included in the WordPress-created graphic below.
These visitor numbers are not small, but I’m not pretending they’re impressive, either. My subject is rather niche. As well, many visitors aren’t really looking for CoolData. Half of my traffic comes from people stumbling in from Google and other search engines, and they’re looking for simple (or simplistic) explanations of statistical concepts. The most popular post by far is How high, R squared? — published in April 2010, it is still heavily visited every day by confused and desperate grad students from all corners of the globe. I don’t consider these people part of the CoolData “tribe”, if I can call it that.
The tribe — the readers I care most about — are typically the ones who have subscribed to receive updates. (There are also a lot of RSS subscribers — I don’t have as good a handle on those numbers.*) As of today, there are 680 subscribers — 48 subscribers via WordPress accounts, and 632 via email. This number has been growing very gradually over the past three years. I realize many people sign up for things they never return to (I do it all the time), but when an update goes out, I estimate that about half of my subscribers click through to the new post, which I find encouraging. They are far more likely to click through than my followers on Twitter (@kevinmacdonell).
Most readers visit during the work week (readership drops off dramatically on weekends), so not surprisingly most subscribers use their real work address rather than a free Gmail, Hotmail, or Yahoo account. From my own research, I know providing a work email is associated with higher levels of engagement, and “.edu” addresses alone (US-affiliated higher ed institutions) account for 293 subscribers. Another 101 addresses have the less restrictive top-level domain of “.org”. Among country-specific top-level domains, the top ones are Canada (.ca) with 46 and the United Kingdom (.uk) with 29. There are 142 “.com” addresses, and roughly half of them are Gmail, Yahoo or Hotmail. There are 443 unique domains in all, the top ones being uw.edu (University of Washington) and ubc.ca (University of British Columbia).
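A breakdown like this one is easy to reproduce from any exported subscriber list. Here is a minimal Python sketch of one way to do it. The function name and the sample addresses are hypothetical (invented for illustration, not real subscribers), and it simply treats the last label of each domain as the top-level domain.

```python
from collections import Counter


def tally_top_level_domains(addresses):
    """Count email addresses by the last label of their domain (e.g. 'edu', 'ca')."""
    counts = Counter()
    for address in addresses:
        # Everything after the last "@" is the domain.
        domain = address.rsplit("@", 1)[-1].strip().lower()
        # Everything after the last "." is the top-level domain.
        counts[domain.rsplit(".", 1)[-1]] += 1
    return counts


# Hypothetical sample addresses, for illustration only.
sample = [
    "reader1@example.edu",
    "reader2@example.edu",
    "reader3@example.ca",
    "reader4@example.com",
    "reader5@example.org",
    "reader6@example.ac.uk",
]

counts = tally_top_level_domains(sample)
unique_domains = {address.rsplit("@", 1)[-1].lower() for address in sample}

print(counts.most_common())   # top-level domains, most frequent first
print(len(unique_domains))    # number of distinct domains
```

The same tally, run on the full domain instead of its last label, would show which individual institutions have the most subscribers.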
Up to now I’ve been coy about answering questions about my stats, for no real reason. I figure I might as well come clean. I have long felt that there is more room for writing on this topic, so if knowing more about my readership encourages you to start your own blog, then I encourage you to make 2013 your year to step up. All it takes is a few minutes to sign up on WordPress or a similar free service, and start writing.
If you’re not up for creating your own blog, then consider writing a guest post for CoolData. Up to this point, guest posting has been by invitation only, but starting today I am open to receiving post ideas from anyone interested in writing on the topic of predictive analytics for nonprofit fundraising or higher education advancement (including alumni engagement). I plan to limit submitted guest posts to one per month. Multiple submissions are welcome, but submissions that are completely off-topic will not get a response. Email me at firstname.lastname@example.org to suggest/discuss your idea before you start writing.
No more comments
As I begin a new year, naturally I think of changes I’d like to make. For one, I will be taking a new approach to comments on posts. Only 514 comments have been contributed since December 2009, and 140 of those are mine. This is not a disappointment (I had no designs one way or the other), but the time has come to recognize the fact that CoolData has never been effective as a discussion forum. There have been a few good questions and observations made by commenters, but unfortunately too many comments are of the “drive-by” variety: brief one-off criticisms that require rebuttal but never advance the discussion or add enlightenment for beginning predictive modelers. The best questions, the most honest comments, and the most well-reasoned objections tend to come to me via private email.
For that reason, I am shutting off the ability to respond with public comments. There have been no nasty personal attacks, nor abusive language, nor anything I’ve felt forced to delete (aside from spam). Simply put, after three years of writing and editing this blog, I no longer feel the need to provide a platform for people whose main interest is something other than being part of a shared endeavour to learn, to grow, and to bring our institutions and organizations into the age of data. Responses, questions and critiques are always welcome via private email, and I may choose to gather the best responses for use in followup blog posts. Keep in mind, too, that the best forums for discussion are still the listservs (Prospect-dmm is the best example), and new conversations crop up every week in the many groups of interest you can find on social networking sites such as LinkedIn.
On a more positive note, 2013 will be the year that a new book, Score!, which I have co-written with Peter Wylie, will be published. I’ve said very little about it to date, in part because I won’t actually believe it until it’s in my hands. It’s a project with a long gestation … writing a book has nearly nothing in common with knocking off a blog post. However, I’m confident we’ll see it out sometime during the first half of the year.
That’s all for 2012. Best of luck in your data-related work in 2013!
* A regular reader who subscribes via RSS reminded me that I have given short shrift to the RSS crowd — I just don’t know how many subscribe via RSS. It is quite possible, then, that I am overestimating the number of email subscribers who click through to the post.
20 July 2012
When a bus holding 50 commuters is forced to wait for 30 seconds for a runner to catch it, the runner has saved the 15 minutes it will take to wait for the next bus, but the total cost in time is 50 people x 30 seconds, or 25 minutes. By this math, the driver is doing the world a favour by stepping on the gas and leaving the runner behind.
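For anyone who wants to check that arithmetic, here is a minimal Python sketch. The function name and its parameters are made up for illustration, and it assumes every person's minute counts equally, which is exactly the assumption the cold math makes.

```python
def bus_wait_tradeoff(passengers, delay_seconds, next_bus_minutes):
    """Compare the runner's time saved with the passengers' total time lost.

    Returns (minutes saved by the runner,
             total minutes lost by the passengers,
             net minutes lost overall).
    """
    saved = next_bus_minutes
    lost = passengers * delay_seconds / 60  # convert person-seconds to minutes
    return saved, lost, lost - saved


# The scenario from the paragraph above: 50 commuters, a 30-second wait,
# and 15 minutes until the next bus.
saved, lost, net = bus_wait_tradeoff(passengers=50, delay_seconds=30, next_bus_minutes=15)
print(f"Runner saves {saved} min; passengers lose {lost:.0f} min; net loss {net:.0f} min")
```

With these numbers the passengers give up 25 minutes so the runner can save 15, a net loss of 10 minutes.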
I was taking the Number 80 bus to work the other day when an elderly woman got on. As we were pulling away, she realized she had left her bag on a bench at the stop. She got the driver to halt and got off. She took quite a long time to make her way back, and the driver waited for her — I think she was surprised, but she appreciated it.
This driver chose not to apply the math. And I am glad he didn’t. I suspect most people on the bus felt the same way. We were mildly inconvenienced, but people are reassured when they see public examples of compassion. Yes, we still live among human beings. (Even if drivers are trained to sometimes be lenient, which I don’t think they are.)
When I advocate a data-driven approach to making decisions, I am speaking of specific scenarios, not an approach to life itself or a way to rid ourselves of experience and human wisdom.
Don’t expect me to conclude with twaddle such as “the most important things in life just can’t be measured” or “numbers aren’t everything.” (Blech!) The quantophobes among us are all too ready to embrace the half-truths in those statements and deliberately mistake them for the entire truth.
I prefer to say that there are small things, and there are big things. Don’t forget the big things while you are busy optimizing the small things. Efficiency with details may not always coincide with the greater good.
30 May 2012
Ten years ago, my wife and I created and co-hosted a jazz music program at a local radio station. (It was called Stop Time — yikes, hard to believe that website is still live.) One year, we attended a festival featuring all kinds of cool drummers from every conceivable genre of music, mainly to interview jazz drummer Paul Wertico. The keynote speaker was a highly energetic guy named Dom Famularo, and he delivered his address from behind (where else) a drum kit. The main thing that stuck with me was his message about achieving creativity and flexibility by intentionally making things harder for yourself.
There are two basic ways to grip a pair of drumsticks: The matched grip and the traditional grip. Normally a drummer picks one and uses that all the time. I forget which grip Dom grew up using, but for this performance he decided to change grips. It came at a cost: it was non-intuitive for him to play this way, and he dropped a stick at least once. Like the stumbling, uncomfortable process of learning a new language, changing grips forced him to be mindful of his playing and probably gave his brain a thorough workout.
Often my advice, and my own modus operandi, is to work smarter, not harder — to seek efficiencies or, if you prefer, to be creatively lazy. But the way to get there isn’t always choosing the option that is easiest right now.
- Developing a new process to automate a task that you perform over and over again is harder than simply going through the familiar motions, particularly if the task is an easy one, but doing so will free you to pursue more creative work.
- Taking the time to document a new process you are developing is not easy, and threatens to interrupt your flow — but it will greatly aid your learning, allow you to drop your work and pick it up again anytime from where you left off, and entails long-term benefits that you can’t anticipate now.
- Going to the front of the room to give a presentation is harder than sitting in the back row nursing critical thoughts. But you will have a lot more success spreading your ideas. In a related vein, it’s easier to nit-pick and criticize the ideas of others than to risk putting a few ideas of your own out there. But life rewards those who are always ready to take “safe” risks and learn from failure.
Try deliberately sabotaging your comfort once in a while, like Dom Famularo did, and see where it takes you.
8 May 2012
One day in late March I got on a plane from Toronto (where I attended Annual Fund benchmarking meetings hosted by Target Analytics) to Las Vegas (for the Sungard Higher Education Summit), and picked up the Toronto Globe & Mail. I scanned a section that offered some ephemera, including the startling news that my fellow countryman William Shatner had turned 81. Once I got over that shock, I read the Globe’s “Thought du jour,” a quote from Ralph Waldo Emerson.
Because I’m an admirer of Emerson, and because I figured I could appropriate his quote for my own selfish purposes, I scribbled it down:
“The world can never be learned by learning all its details.”
Emerson did not live in the age of big data. But in a way, the world he experienced, the world we all experience through our senses, IS big data. We don’t perceive our surroundings directly, but only through our brain’s interpretations of sense impressions. We navigate the world via mental models of our own creation. These models leave out nearly everything. They are not reality, any more than a map of a city is faithful to the reality of the city, or our memory of an event is faithful to the details of the event (which would overwhelm us every time it came to mind).
In our work with data, we measure things (or their proxies) in order to get a handle on them and in order to gain insight. We lose most of the detail in the process, but we need to in order to learn something. We build models based on general patterns. So as George E.P. Box said: All models are wrong, but some are useful.