## Guest post by Peter B. Wylie

In the evolving world of analysis for higher ed and non-profits, it’s apparent that a gap is widening: Many well-resourced shops are acquiring analytics talent comfortable with statistics and programming, but many others are unable to make investments in specialized talent.

Today’s guest post is a paper by Peter Wylie that addresses the latter group, the ones at risk of being left behind. Download his paper here: Simple_Score_in_Excel_Wylie

In this piece he uses data from two schools to show you something you can try with your own data, building a very simple predictive score using nothing but Excel.

Some level of data analysis ought to be accessible at some level to every organization, regardless of technical proficiency or tools. And in fact, shops that move too quickly to automate predictive scoring with black-box-like methods risk passing over the insights available to the exploratory analyst using more manual, time-consuming methods.

We hope you enjoy, and above all, that you try this with your own data. The download link again: Simple_Score_in_Excel_Wylie

Comments Off on A simple score you can probably build in Excel

Not long ago, this question came up on the Prospect-DMM list, generating some discussion: How do you measure the rate of increasing giving for donors, i.e. their “velocity”? Can this be used to find significant donors who are poised to give more? This question got Peter Wylie thinking, and he came up with a simple way to calculate an index that is a variation on the concept of “recency” — like the ‘R’ in an RFM score, only much better.

This index should let you see that two donors whose lifetime giving is the same can differ markedly in terms of the recency of their giving. That will help you decide how to go after donors who are really on a roll.

You can download a printer-friendly PDF of Peter’s discussion paper here: An Index of Increasing Giving for Major Donors

Comments Off on How to measure the rate of increasing giving for major donors

The predictive modeling method I use (multiple regression) results in a “**raw score**” that is great for very fine ranking, because it will probably produce almost as many score levels as there are individuals in your sample. But it doesn’t work at all for other purposes.

For example, you can’t use ‘raw score’ to observe how a person’s or a group’s propensity to give **changes from year to year**. Your model changes over time, and so will the output. What does it mean if Joe’s raw score goes from 6349 to 9032? Not much. The value of the score itself has no practical meaning.

Because the values are not easy to explain to end-users, and because they change so much from year to year, you need to provide a more intuitive scoring system.

If you take everyone in the sample and divide them up into **groups of roughly equal numbers**, by their raw score, you produce a much more useful ranking.

Equal quarters are quartiles, equal fifths are quintiles, and so on. For our needs, equal tenths (**deciles**) and equal hundredths (**percentiles**) are the most useful.

For example, if Joe goes from the 60th percentile last year to the 93rd percentile this year, that’s a meaningful change.

But usually we’re not as interested in the score of a single individual as we are in getting a handle on a whole **segment** of a population. If your annual giving coordinator knows that the 9th and 10th deciles are always where the money is, regardless of how your model changes in any given year, you’ve taken a big step towards clarity. If your results aren’t clear, no one will embrace them.

Which type of predictive score you would use, deciles or percentiles, depends on how **selective** you need to be:

- An
**Annual Giving**manager trying to prioritize groups for the Telethon campaign might focus on the top one, two or three**deciles**. That represents thousands of alumni whose raw propensity-to-give scores place them in the top 10% to 30% of the population. - A
**Planned Giving**Officer trying to zero in on the best prospects might focus on no more than the top 1-5% of the population. For that person,**percentiles**will provide a much sharper knife.

When I produce a set of scores, I usually provide all three types, because one can’t always anticipate needs. The screen in Banner that holds the scores (**APAEXRS**) is able to accommodate three ‘flavours’ of scores, so I usually upload raw, decile, and percentile scores for each model.

Unfortunately, the output scores of a regression model are messy! They have to be worked on a bit in order to whip them into shape before you upload them to the database. Here’s what they often look like in their ‘unprocessed’ state:

`0.14878468`

0.14879054

0.1488943

0.14901018

0.14901177

0.14901665

0.1491403

0.14914408

The Banner field I upload scores to is able to contain four digits. So for ‘raw score’, I create a derived variable in Data Desk that multiplies these values by a thousand and rounds to the nearest whole number.

Deciles and percentiles are not exactly available at the push of a button, either. In future posts I will describe the methods I’ve been taught to produce a nice, clean set of scores for upload.