CoolData blog

3 October 2016

Grad class size: predictive of giving, but a reality check, too


The idea came up in a conversation recently: Certain decades, it seems, produced graduates with reduced levels of alumni engagement and lower participation rates in the Annual Fund. Can we hope they will start giving when they get older, like alumni who have gone before? Or is this depressed engagement a product of their student experience — a more or less permanent condition that will keep them from ever volunteering or giving?


The answer is not perfectly clear, but what I have found with a bit of analysis can only add to the concern we all have about the end of “business as usual.”


For almost all universities, enrolments have risen dramatically over the decades since the end of the Second World War. As undergraduate class sizes ballooned, metrics such as the student-professor ratio emerged as important indicators of quality of education. It occurred to me to calculate the size of each grad-year cohort and include it as a variable in predictive models. For a student who graduated in 1930, that figure could be 500. For someone who graduated in 1995, it might be 3,000. (If you do this, remember to include now-deceased alumni in your count.) A rough generalization about the conditions under which a person received their degree, to be sure, but it was easy to query the database for this, and easy to test.
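If you want to try this yourself, here is a minimal sketch of the cohort count, assuming a hypothetical degrees table with one row per alum per degree; your table and column names will certainly differ.

 -- Count every graduate in each grad year, living or deceased,
 -- so the figure reflects the class as it actually was.
 -- Table and column names are hypothetical; adapt to your schema.
 select d.grad_year, count(distinct d.id_number) AS class_size
 from degrees d
 group by d.grad_year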


I pulled lifetime giving for 130,000 living alumni and log-transformed it before checking for a correlation with the size of graduating class. (The transformation: log of “lifetime giving plus 1.”) It turned out that lifetime giving has a strong inverse correlation with the size of an alum’s grad class, for that alum’s most recent degree (r = -0.338).
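If your database happens to be Oracle, the correlation itself can be computed without leaving SQL, via the CORR aggregate. A sketch, assuming a hypothetical alumni view that holds each alum’s lifetime giving and class-size count on one row:

 -- Pearson r between log-transformed lifetime giving and grad-class size.
 select corr(log(10, a.lifetime_giving + 1), a.class_size) AS r
 from alumni a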


This is not surprising. The larger the graduating class, the younger the alum. Nothing is as strongly correlated with lifetime giving as age, so much of the effect I was seeing was probably due to age. (The Pearson correlation of lifetime giving and age was 0.395.)


Indeed, in a multiple linear regression of lifetime giving (log-transformed) on age, adding “grad-class size” as a predictor variable does not improve model fit. The two predictors are not independent of each other: for age and grad-class size, r = -0.828!
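That sort of collinearity is easy to check before you build the model. Against the same hypothetical view as above:

 -- Pearson r between the two candidate predictors. A value near -1 or 1
 -- means they carry largely the same information.
 select corr(a.age, a.class_size) AS r_age_class
 from alumni a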


I wasn’t ready to give up on the idea, though. I considered my own graduation from university, and all the convocations I had attended in the past as an Advancement employee or a family member of a graduate. The room (or arena, as the case may be) was full of grads from a whole host of degree programs, most of whom had never met each other or attended any class in common. Enrolment growth has been far from even across faculties (or colleges or schools); the student experience in terms of class size and one-on-one access to professors probably differs greatly from program to program. At most universities, Arts or Science faculties have exploded in size, while Medicine or Law have probably not.


With that in mind, I calculated grad-class size differently, counting the size of each alum’s graduating cohort at the faculty (college) level. The correlation of this more granular count of grads with lifetime giving was not as negative (r = -0.283), but at the same time, it was less tied to age.
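The more granular count is just the cohort query from before, with faculty added to the grouping (hypothetical names again):

 -- Cohort size per grad year AND faculty. Swapping in degree program
 -- for faculty_code gives the still more granular count discussed below.
 select d.grad_year, d.faculty_code, count(distinct d.id_number) AS faculty_class_size
 from degrees d
 group by d.grad_year, d.faculty_code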


This time, when I created a regression of lifetime giving on age and then added grad-class size at the faculty level, both predictors were significant. Grad-class size gave a good boost to adjusted R squared.


I seemed to be on to something, so I pushed it further. Knowing that an undergrad’s experience is very different from that of a graduate student, I added “Number of Degrees” as a variable after age, and before grad-class size. All three predictors were significant, and all led to improvements in model fit.


Still on the trail of how class size might affect student experience, and alumni affinity and giving thereafter, I got more specific in my query, counting the number of graduates in each alum’s year of graduation and degree program. This variable was even less conflated with age, but despite that, it failed to provide any additional explanation for the variation in lifetime giving. There may be other counts that are more predictive, but the best I found was size of grad class at the faculty/college level.


If I were asked to speculate about the underlying cause, the narrative I’d come up with is that enrolments grew dramatically not only because there were more young people, but because universities in North America were attracting students who increasingly felt that a university degree was a rite of passage required for success in the job market. The relationship of student to university was changing, from that of a close-knit club of scholars, many of whom felt immensely grateful for the opportunity, to a much larger, less cohesive population with a more transactional view of their relationship with alma mater.


That attitude (“I paid x dollars for my piece of paper and so our business here is done”), and not so much the increasing numbers of students they shared the lecture halls with, could account for drops in philanthropic support. What that means for Annual Fund is that we can’t bank on the likelihood that a majority of alumni will become nostalgic when they reach the magic age of 50 or 60 and open their wallets as a consequence. Everything’s different now.


I don’t imagine this is news to anyone who’s been paying attention. But it’s interesting to see how this reality is reflected in the data. And it’s in the data that we will be able to find the alumni for whom university was not just a transaction. Our task today is not just to identify that valuable minority, but to understand them, communicate with them intelligently, connect with their interests and passions, and engage them in meaningful interactions with the institution.


31 August 2016

Phonathon call attempt limits: A reading roundup

Filed under: Annual Giving, Best practices, Phonathon — kevinmacdonell @ 2:49 pm


As September arrives, Annual Fund programs everywhere are gearing up for mailing and calling. Managers of phone programs are seeking advice on how best to proceed, and inevitably that includes asking about the optimal number of call attempts to make for each alum.


How many calls is too many? What’s ideal? Should it differ for LYBUNTs and SYBUNTs?


In my opinion, these are the wrong questions.


If your aim is to get someone on the phone, more calling is better. However, by “call more” I don’t mean call more people. I mean make more calls per prospect. The RIGHT prospects. Call the right people, and eventually many or most of them will pick up the phone. Call the wrong people, and you can ring them up 20, 30, 50 times and you won’t make a dent. That’s why I think there’s no reason to set a maximum number of call attempts. If you’re calling the right people, then just keep calling.


For Phonathon programs that are expensive or time-consuming (and potentially under threat of being cut), and for shops with some ability to make decisions informed by data, it doesn’t make sense to apply across-the-board limits. Much better to use predictive modeling to determine who’s most likely to pick up the phone, and to focus resources on those people.


Here are a number of pieces I’ve written or co-written on this topic:


Keep the phones ringing – but not all of them


Call attempt limits? You need propensity scores


How many times to keep calling?


Answering questions about “How many times to keep calling”


Final thoughts on Phonathon donor acquisition


2 August 2016

Data Down Under, and the real reason we measure alumni engagement

Filed under: Alumni, Dalhousie University, engagement, Training / Professional Development — kevinmacdonell @ 4:00 pm


I’ve given presentations here and there around Canada and the U.S., but I’ve never travelled THIS far. On Aug. 24, I will present a workshop in Sydney, Australia — a one-day master class for CASE Asia-Pacific on using data to measure alumni engagement. My wife and I will be taking some time to see some of that beautiful country, leaving in just a few days.


The workshop attendees will be alumni relations professionals from institutions large and small, and in the interest of keeping the audience’s needs in mind, I hope to convince them that measuring engagement is worth doing by talking about what’s in it for them.


This will be the easy part. Figuring out how to quantify engagement will allow them to demonstrate the value of their teams’ activity to the university, using language their senior leadership understands. Scoring can also help alumni teams better target segments based on varying levels of engagement, evaluate current alumni programming, and focus on activities that yield the greatest boost in engagement.


There is a related but larger context for this discussion, however. I am not certain that everyone will be keen to hear about it.


Here’s the situation. Everything in alumni relations is changing. Alumni populations are growing, the number of donors is decreasing, and traditional engagement methods are less effective. Friend-raising and “one size fits all” approaches to engagement are increasingly seen as unsustainable wastes of resources. (A Washington, DC-based consultancy, the Education Advisory Board, makes this point very well in an excerpt of a report, which you can download here: The Strategic Alumni Relations Enterprise.)


I don’t know so much about the Asia-Pacific region, but in North America university leaders are questioning the very purpose and value of typical alumni relations activities. In this scenario, engagement measurement is intended for more than producing a merely informational report or having something to brag about: Engagement measurement is really a tool that enables alumni relations to better align itself with the Advancement mission.


In place of “one size fits all,” alumni relations teams are under pressure to understand how to interact with alumni at different levels of engagement. Alumni who are somewhat engaged should be targeted with relevant programs and messages to bring them to the next level, while alumni who are at the lowest levels of engagement should not have significant resources directed at them.


Alumni at high levels of engagement, however, require special and customized treatment. They’re looking for deeper and more fulfilling experiences that involve furthering the mission of the institution itself. Think of guest lecturing, student recruitment, advisory board roles, and mentorship, career development and networking for students and new grads. Low-impact activities such as pub nights and other social events are a waste of the potential of this group and will fail to move them to continue contributing their time and money.


Think of what providing these quality experiences will entail. For one, alumni relations staff will have to collaborate with their colleagues in development, as well as in other offices across campus — enrolment management, career services, and academic offices. This will be a new thing, and perhaps not an easy thing, for alumni relations teams stuck in traditional friend-raising mode and working in isolation.


But it’s exactly through these strategic partnerships that alumni relations can prove its value to the whole institution and attract additional resources even in an environment where leaders are demanding to know the ROI of everything.


Along with better integration, a key element of this evolution will be robust engagement scoring. According to research conducted by the Education Advisory Board, alumni relations does the poorest job of any office on campus in providing hard data on its real contribution to the university’s mission. Too many of us are still stuck on tracking our activities instead of the results of those activities.


It doesn’t have to be that way, if the alumni team can effectively partner with other units in Advancement. For those of us on the data, reporting, and analysis side of the house, get ready: The alumni team is coming.


5 July 2016

A simple score you can probably build in Excel

Filed under: Excel, Peter Wylie, Predictive scores — kevinmacdonell @ 4:22 pm

Guest post by Peter B. Wylie


In the evolving world of analysis for higher ed and non-profits, it’s apparent that a gap is widening: Many well-resourced shops are acquiring analytics talent comfortable with statistics and programming, but many others are unable to make investments in specialized talent.


Today’s guest post is a paper by Peter Wylie that addresses the latter group, the ones at risk of being left behind. Download his paper here: Simple_Score_in_Excel_Wylie


In this piece he uses data from two schools to show you something you can try with your own data, building a very simple predictive score using nothing but Excel.


Data analysis ought to be accessible at some level to every organization, regardless of technical proficiency or tools. And in fact, shops that move too quickly to automate predictive scoring with black-box-like methods risk passing over the insights available to the exploratory analyst using more manual, time-consuming methods.


We hope you enjoy, and above all, that you try this with your own data. The download link again: Simple_Score_in_Excel_Wylie


13 June 2016

Nifty SQL regression to calculate donors’ giving trends

Filed under: Coolness, Predictor variables, regression, SQL — kevinmacdonell @ 8:28 pm


Here’s a nifty bit of SQL that calculates a best-fit line through a donor’s years of cash-in giving by fiscal year (ignoring years with no giving), and classifies that donor in terms of how steeply they are “rising” or “falling”.


I’ll show you the sample code, which you will obviously have to modify for your own database, and then talk a little bit about how I tested it. (I know this works in Oracle version 11g. Not sure about earlier versions, or other database systems.)


-- Note: "donor_id" is a placeholder for your constituent ID column.
with sums AS (
 select t1.donor_id, t1.fiscal_year, log(10, sum(t1.amount)) AS yr_sum
 from gifts t1
 group by t1.donor_id, t1.fiscal_year),

slopes AS (
 select distinct sums.donor_id,
 regr_slope(sums.yr_sum, sums.fiscal_year) OVER (partition by sums.donor_id) AS slope
 from sums)

select slopes.donor_id,
 case
 when slopes.slope is null then 'Null'
 when slopes.slope >= 0.1 then 'Steeply Rising'
 when slopes.slope >= 0.05 then 'Moderately Rising'
 when slopes.slope >= 0.01 then 'Slightly Rising'
 when slopes.slope > -0.01 then 'Flat'
 when slopes.slope > -0.05 then 'Slightly Falling'
 when slopes.slope > -0.1 then 'Moderately Falling'
 else 'Steeply Falling' end AS description

from slopes

That’s it. Not a lot of SQL, and it runs very quickly (for me). But does it actually tell us anything?


I devised a simple test. Adapting this query, I calculated the “slope of giving” for all donors over a five-year period in the past: FY 2007 to FY 2011. I wanted to see if this slope could predict whether, and by how much, a donor’s giving would rise or fall in the next five-year period: FY 2012 to FY 2016. (Note that the sum of a donor’s giving in each year is log-transformed, in order to better handle outlier donors with very large giving totals.)
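Restricting the slope to a historical window takes nothing more than a WHERE clause. A sketch of the FY 2007 to FY 2011 version, reusing the placeholder names from the query above:

 with sums AS (
  select g.donor_id, g.fiscal_year, log(10, sum(g.amount)) AS yr_sum
  from gifts g
  where g.fiscal_year between 2007 and 2011
  group by g.donor_id, g.fiscal_year)
 select s.donor_id,
  regr_slope(s.yr_sum, s.fiscal_year) AS slope  -- null if only one giving year
 from sums s
 group by s.donor_id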


I assembled a data file with each donor’s sum of cash giving for the first five-year period, the slope of their giving in that period, and the sum of their cash giving for the five-year period after that.


The first test was to see how the categories of slope, from Steeply Rising to Steeply Falling, translated into subsequent rises and falls. In Data Desk, I compared the two five-year periods. If the second period’s giving was greater than the first, I called that a “rise.” If it was less, I called it a “fall.” And if it was exactly the same, I called it “Same.”
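That labelling step could be done in SQL just as easily as in Data Desk. A sketch, assuming a hypothetical table holding the two period sums per donor:

 select p.donor_id,
  case when p.total_fy12_16 > p.total_fy07_11 then 'Rise'
       when p.total_fy12_16 < p.total_fy07_11 then 'Fall'
       else 'Same' end AS direction
 from period_totals p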


The table below summarizes the results. Note that these numbers are all percentages, and they sum horizontally across each row. (I will explain the colour highlighting later on.)




For Steeply Rising, 60.6% of donors actually FELL from the first period to the next. Only 37.8% rose, and just 1.6% stayed exactly the same. Not terribly impressive. Look at Steeply Falling, though: More than three-quarters actually did fall. That’s a better result, but then again, “Falling” dominates in every category; in the whole file, close to 70% of all donors reduced their giving in the next period. If a donor has no giving in the second five-year period, that’s zero dollars given, and it counts as a “Fall” — more on that aspect in just a sec.


(I’ve left out donors with a FY2007-11 slope of Null — they’re the ones who gave in only one year and therefore don’t have a “slope”.)


Let’s not give up just yet, however. The colour highlighting indicates how high each percentage value is in relation to those above and below it. For example, the highest percentages in the Falling column are found in the Slightly, Moderately, and especially Steeply Falling slope categories. The highest percentages in the Rising column are in the Slightly, Moderately, and Steeply Rising slope categories. And in the Same column, the Flat slope wins hands-down — as we would hope.


So a rising slope “sort of” predicts increased giving, and a falling slope “sort of” predicts decreased giving. Unfortunately, many donors are not retained into the second five-year period, so there’s not a lot to be confident about.


But what if a donor IS retained? What if we exclude the lapsed donors entirely? Let’s do that:




Excluding non-donors seems to lead to an improvement … The slope does a better job sorting between the risers and fallers when a donor is actually retained. Again, the colour highlighting is referencing columns, not rows. But notice now that, across the rows, Rising has a slight majority for the Rising slope categories, and Falling has a slight majority for the Falling slope categories. (The bar is set too high for Flat, however, given that a donor’s giving in the first five years has to be exactly equal to her giving in the second five years to be called Same.)


Admittedly, these majorities are not generous. If I calculated a donor’s slope of giving as Steeply Rising and that donor was retained, I have only a 56.4% chance of actually being right. And of course there’s no guarantee that donor won’t lapse.


(Note that these are donors of all types — alumni, non-alumni individuals, and entities such as corporations and foundations. Non-alumni donors tend not to have repeating patterns in their giving, at least not to the extent that alumni do. However, when I limit the data file to alumni donors only, the improvement in this method is only very slight.)


Pressing on … I did a regression analysis using total giving in the second five-year period as the dependent variable, then entered total giving in the prior five-year period as an independent variable. (Naturally, R-squared was very high.) This allowed me to see if Slope provides any explanatory power when it is added as the second independent variable — the effect of giving in the first five-year period already being accounted for.


And the answer is, yes, it does. But only under specific conditions: Both five-year giving totals were log-transformed and, most significantly, donors who did not give in the second period were excluded from the regression.


There are other ways to assess the usefulness of “slope” which might lead to an application, and I encourage you to give this a try with your own data. From past experience I know that donors who make big upgrades in giving don’t follow any neat universal pattern such as an upward slope in their giving history. (The concept of volatility is explored here and here.) “Slope” is probably too simple a characteristic to employ on its own.


But as I’ve said before, if it were easy, obvious, or intuitive, it wouldn’t be data analysis.


30 May 2016

Donor volatility: testing years of non-giving as a predictor for the next big gift

Filed under: Annual Giving, Coolness — kevinmacdonell @ 5:02 am

Guest post by Jessica Kostuck, Data Analyst, Annual Giving, Queen’s University


During my first few weeks on the job, my AD set me up on several calls with colleagues in similar, data-driven roles, at universities across the country. One such call was with Kevin MacDonell, keeper of CoolData, with whom I had a delightfully geeked-out conversation about predictive modeling. We ran the gamut of weird and wonderful data points, ending on the concept of donor volatility.


When a lapsed high-end donor has no discernible annual giving pattern, is it possible to use their years of non-giving to predict and influence their next big gift?


Our goal for our Annual Giving program was to identify these “volatile” donors (lapsed high-end donors with an erratic giving history), and reactivate (ideally, upgrade) them, through a targeted solicitation with an aggressive ask string.


(For more on volatility, see Odd but true findings? Upgrading annual donors are “erratic” and “volatile”, which describes findings that suggest the best prospects for a big upgrade in giving are those who are “erratic”, i.e. have prior giving but are not loyal, every-year donors, and “volatile”, i.e. are inconsistent about the amounts they give.)


I did some stock market research (see footnote), decided on a minimum value for the entry point into our volatility matrix ($500), and, together with Senior Programmer Analyst Kim Wilkinson, got cracking on writing a program to identify volatile donors.


[Image: SQL code for identifying volatile donors]



Our ideal volatile donors had given ≥ $500 at least twice in the last 10 years, without any consecutive (“stable”) periods. Year over year, our ideal volatile donor would act in one of three ways: increase their giving by at least 60%, decrease their giving by at least 60%, or not give at all. Given the capacity level displayed by these volatile donors, we replaced years of very low-end giving (<$99) with null values (“throwaway gifts”).


We had strict conditions for what would remove a donor from our table. If a donor had two years of consecutive giving within a ±60% differential from their previous highest giving point (v_value), we considered this a natural (or, at least, for this test, not sufficiently irregular) fluctuation in giving, and they were removed from the table. If the donor had two consecutive years of low-end (but not null) giving ($99-$499), this was considered a deliberate decrease, and they, too, were removed. Conversely, if a donor had two consecutive years of greatly increased giving, this was considered a deliberate increase, and they were also removed.


At any point, a donor could be admitted, or readmitted into our volatility matrix, by establishing, or re-establishing, a v_value and subsequent valid volatility point.
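To give a feel for the logic (our actual program was more involved than this), here is a simplified sketch of the year-over-year test. It compares each giving year against the prior giving year rather than the running v_value, leaves out the removal rules, and uses hypothetical table and column names:

 with yearly AS (
  select g.donor_id, g.fiscal_year,
   sum(case when g.amount < 99 then null else g.amount end) AS yr_total  -- null out throwaway gifts
  from gifts g
  group by g.donor_id, g.fiscal_year)
 select y.donor_id, y.fiscal_year, y.yr_total,
  case
   when y.yr_total is null then 'Non-giving year'
   when lag(y.yr_total) over (partition by y.donor_id order by y.fiscal_year) is null
    then 'No prior year to compare'
   when y.yr_total >= 1.6 * lag(y.yr_total) over (partition by y.donor_id order by y.fiscal_year)
    then 'Volatile increase'  -- up at least 60%
   when y.yr_total <= 0.4 * lag(y.yr_total) over (partition by y.donor_id order by y.fiscal_year)
    then 'Volatile decrease'  -- down at least 60%
   else 'Stable'
  end AS yoy_flag
 from yearly y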


The difference between a lapsed donor and a volatile donor


Below is a sample pool of donors we examined.


[Image: giving histories for three sample donors]


Donor 1 is volatile all the way through, with greatly varying levels of giving, culminating in two years of non-giving. Donor 1 is currently volatile, and thus enters our test group.


Donor 2 is volatile for two years – FY07-08 and FY08-09 (v_value of $5,000 in FY07-08, followed by a valid volatile point in FY08-09 with a decrease of -80%), but then is removed from the table in FY09-10 with only a -50% decrease in giving. They do not establish a new v_value, even though their FY09-10 giving meets the minimum giving threshold for this test, because of their consecutive, only marginally decreased giving in FY10-11. This excludes Donor 2 from our test.


Donor 3 enters our volatility matrix in FY04-05, leaves in FY07-08, reenters in FY10-11, and maintains volatility to current day, and, thus, enters into our test solicitation.


While all three of these donors are lapsed, and are all SYBUNTs, only Donor 1 and Donor 3 are, by our definition, volatile.


Solicitation strategy and results


We now had a pool of constituents who were at least two years lapsed in giving, all of whom had a history of inconsistent, but not insubstantial, contributions to the university. In an email solicitation, we presented constituents with both upgrade language and an aggressive ask matrix, beginning at a minimum of +60% of their highest-ever v_value, regardless of where they were in the ebb and flow of their volatility cycle. Again, the goal of this test was to (1) identify donors with high capacity, (2) whose giving to the university was erratic in frequency and loyalty, and (3) encourage these donors to reactivate at greater than their previously established high-end giving.
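The opening rung of that ask matrix is mechanical to compute. A hypothetical sketch, given a table of volatile donors and their highest recorded v_value:

 -- Open each donor's ask string at 60% above their highest v_value;
 -- subsequent rungs (not shown) step up from there.
 select v.donor_id, v.max_v_value,
  v.max_v_value * 1.6 AS opening_ask
 from volatile_donors v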


In our results analysis, we broadened our examination to include any gifts received from our testing pool within the subsequent four weeks, not just gifts linked to this particular solicitation code, to verify the legitimacy of tagging these donors as volatile – that is, having a higher-than-average probability to reactivate at a high-end giving level.


An important part of our analysis included comparing our testing pool to a control pool, pairing each of our volatile donors with a non-volatile twin who shared as many points of fiscal and biographic information as was possible.


Within the four-week time frame, our test group had about a 7% activity rate, whereas our control group had an activity rate of about 5% (average for the institution during this timeframe). Within our volatility test group, 50% of donors gave an amount that would plot a valid point on our volatility matrix.


Conclusion and next steps


Through our experiment, we sought to identify volatile donors, and test if we could trigger a reactivation in giving, ideally at, or greater than, their highest level on record.


Since not all of the donors within our test group made their gifts to the coded solicitation with the volatile ask matrix, we can’t say for certain whether being presented with language and ask amounts that reflected their elusive giving behavior prompted a gift – volatile or otherwise. However, we do feel confident that we’re onto something when it comes to identifying and predicting the behavior of a particular, valuable set of donors to our institution.


Our above-average response rate (both versus the control group, and institution-wide) supports our “theory of volatility,” insofar as volatile donors are an existing pool with shared behaviors within our donor population. We plan to re-run this test at the same time next year, continuing our search to find a pattern within the instability.


Were we able to gather definitive results that will define and shape future annual giving strategy? Not exactly. But as far as data goes, this was definitely cool.


Jessica Kostuck is the Data Analyst, Annual Giving at Queen’s University in Kingston, Ontario. She can be reached at



1. Varadi, David. “Volatility Differentials: High/Low Volatility versus Close/Close Volatility (HVL-CCV).” CSS Analytics. 29 Mar. 2011. Web. Winter 2015.
