Today I will offer some additional notes on how the Giving chart spreadsheet was assembled. As I said before, assembling the data was a lot more work than actually making the chart! It would be helpful for you to have query access to your institution’s database for this, but if you can get your IT people to extract the data, you should be able to prepare that data in Excel.
I started with an MS Access query of our database (Banner) which gave me a row for every single alumnus/na, living and deceased (about 43,000 records), donors and non-donors, with a column for Class Year, and a column for Total Giving for each fiscal year from 1989 to 2009.
(Each fiscal year total was in fact a sub-query – there’s definitely some work involved. Luckily I had all these sub-queries already made, from previous data projects. Gathering all this data by individual may not be the easiest way to go about this. If you can obtain data that is already aggregated in some way, such as by class year of donor, go for it.)
The length of your time series will be limited by the depth of historical giving data available in your database – ours happens to go back 20 years.
Let’s look at the final spreadsheet we need to create. Here are the first few rows of mine:
We need to collapse our tens of thousands of rows of data on individual alumni into something like what you see above. This is a lot easier to do with a statistics software package, but no doubt it can be accomplished in Excel as well.
I use Data Desk for stats. Paste all your individual giving data into a new data file. The first step is to create a derived variable to recode Class Year into decades (1930s, 1940s, etc.) These will be used to populate the first column of our spreadsheet.
The first column defines the ‘bubbles’. The last column (Category) gives the bubbles names. The two columns contain exactly the same data, but the gadget requires this label information to be contained separately.
The second essential column is the time-series data, in this case Year. This isn’t Class Year – this is the fiscal year in which gifts were made. Year is the variable reflected in the time slide-control at the bottom of the chart.
Creating the spreadsheet was a bit tedious from this point on. It involved a lot of cutting and pasting of the desired values from Data Desk into an Excel spreadsheet. You may be able to vastly improve on my methods.
The middle column, Number Alumni, is the count of individuals in each Class Decade. The relative size of each bubble is based on this number, from 719 in the 1930s to more than 10,000 in the 2000s. (I’ll show you how to define bubble size later.)
The other columns provide the data that will form the x-axis and y-axis of the chart.
In my chart, % Participation or Number donors can be on the x-axis – you can choose one or the other. (% Participation = Number donors divided by Number alumni.)
And finally, the data in the column Median $ is used for the y-axis.
As I said, I used Data Desk to help do the calculations and order the data properly before pasting it into Excel and getting it all arranged properly. Here is a Data Desk Summary by Groups table showing giving totals by Class Decade for the year 1995:
This provided me my donor counts as well as my median giving data. Of course I had to do the same thing over and over for each fiscal year, 1989 to 2009.
Give it a shot. Use the tools you’ve got to massage your data until it’s in the form of the spreadsheet I showed above. This really is the time-consuming phase of your project. Once you’re done, the rest is surprisingly easy. I’ll get to that in Part 4.