CoolData blog

25 March 2014

An all-SQL way to automate RFM scoring

Filed under: RFM, SQL — Tags: , , , — kevinmacdonell @ 8:44 pm

In my last post I described how to automate RFM scoring using the scripting language Python and pandas. Automating RFM scoring is a great way to learn a bit of Python — but on its own it’s a lousy reason for learning Python. It was pointed out to me that you don’t really need to call on the power of Python if you’re already comfortable with (or learning to use) SQL, and I agree.

Shortly after publishing the Python post, I got an email from Bruce Higgins, who works in the Alumni Affairs & Development office at Cornell University. His all-SQL solution is shown below. I leave it to readers to figure out how to make it work for them. In the wake of my post there was some great discussion and code-sharing on the Prospect-DMM discussion list, and I encourage you to look up that thread.

You’ll notice that Bruce has hard-coded the dollar values used for scoring “Monetary value” instead of dynamically dividing the file into quintiles. As he points out, if you’re scoring the same database over and over, it’s not likely these dollar thresholds are going to change much over time. You will of course need to make adjustments for your own donor data.

WITH Five_years AS
(SELECT id,
 SUM((gift_amount)) AS cash,
 MAX(decode(.gift_fiscalyear, 2014, 5, 2013, 4, 2012, 3, 2011, 2, 2010, 1)) AS recency,
 COUNT(DISTINCT(gift_fiscalyear)) AS frequency
 FROM
(SQL specifying the last five years cash transactions from our warehouse)
GROUP BY id
SELECT Five_years.id,
CASE
WHEN Five_years.cash >= 10000 THEN 5
WHEN Five_years.cash >= 2500 THEN 4
WHEN Five_years.cash >= 1000 THEN 3
WHEN Five_years.cash >= 500 THEN 2
ELSE 1
END + Five_years.recency + Five_years.frequency AS rfm_score
FROM Five_years
ORDER BY five_years.name
Advertisements

Create a free website or blog at WordPress.com.

%d bloggers like this: