Last weekend, with so many other pressing things I could have been doing, I got it in my head to analyze people’s astrological signs for potential association with propensity to give. I don’t know what came over me; perhaps it was the Supermoon. But when you’ve got a data set in front of you that contains giving history and good birth dates for nearly 85,000 alumni, why not?
Let me say first that I put no stock in astrology, but I know a few people who think being a Libra or a Gemini makes some sort of difference. I imagine there are many more who are into Chinese astrology, who think the same about being a Rat or a Monkey. And even I have to admit that an irrational aspect of me embraces my Taurus/Rooster nature.
If one’s sign implies anything about personality or fortune, I should think it would be reflected in one’s generosity. Ever in pursuit of the truth, I spent a rather tedious hour parsing 85,000 birth dates into the signs of the zodiac and the animal signs of Chinese astrology. As you will see, there are in fact some interesting patterns associated with birth date, on the surface at least.
Because human beings mate at any time of year, the alumni in the sample are roughly equally distributed among the 12 signs of the zodiac. There seem to be slightly more births in the warmer months than in the period of December to February: Cancer (June 21 to July 22) represents 8.9% of the sample while at the lower end, Capricorn (Dec 22 to Jan 20) represents 7.6% of the sample — a spread of less than two percentage points.
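For anyone who wants to repeat the parsing step, here is a minimal Python sketch of how birth dates might be bucketed into zodiac signs and tallied. The cutoff dates in `SIGN_STARTS` are an assumption: they are chosen to agree with the Cancer and Capricorn ranges quoted above, and published tables differ by a day here and there. The function names are hypothetical, not taken from any stats package.

```python
from collections import Counter
from datetime import date

# (month, day) on which each sign begins, in calendar order.
# Assumed cutoffs; other sources shift some of these by a day.
SIGN_STARTS = [
    ((1, 21), "Aquarius"),
    ((2, 19), "Pisces"),
    ((3, 21), "Aries"),
    ((4, 20), "Taurus"),
    ((5, 21), "Gemini"),
    ((6, 21), "Cancer"),
    ((7, 23), "Leo"),
    ((8, 23), "Virgo"),
    ((9, 23), "Libra"),
    ((10, 23), "Scorpio"),
    ((11, 22), "Sagittarius"),
    ((12, 22), "Capricorn"),
]


def zodiac_sign(birth: date) -> str:
    """Return the zodiac sign for a birth date (the year is ignored)."""
    key = (birth.month, birth.day)
    sign = "Capricorn"  # Jan 1 up to the first cutoff is still Capricorn
    for start, name in SIGN_STARTS:
        if key >= start:
            sign = name  # the last cutoff on or before the date wins
    return sign


def sign_shares(birth_dates):
    """Return each sign's share of the sample as a fraction of 1."""
    counts = Counter(zodiac_sign(d) for d in birth_dates)
    total = sum(counts.values())
    return {sign: n / total for sign, n in counts.items()}
```

Calling `sign_shares` on the full list of birth dates gives the kind of per-sign percentages discussed above.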
What we want to know is whether any one sign is particularly likely to give to their alma mater. I coded anyone who had any giving in their lifetime as ‘1’ and all never-donors as ‘0’. At the high end, Taurus natives have a donor rate of 30.7% and at the low end, Aries natives have a donor rate of 29.0%. All the other signs fall between those two rates, a range of a little more than one and a half percentage points.
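The donor flag and the per-sign donor rate can be sketched in a few lines of Python. This assumes each record is a (sign, lifetime giving) pair; the function name `donor_rates` and the sample records are hypothetical and are not drawn from the actual data set.

```python
from collections import defaultdict


def donor_rates(records):
    """records: iterable of (sign, lifetime_giving) pairs.

    Anyone with lifetime giving above zero counts as a donor (flag 1);
    everyone else is a never-donor (flag 0).
    """
    totals = defaultdict(int)
    donors = defaultdict(int)
    for sign, lifetime_giving in records:
        totals[sign] += 1
        donors[sign] += 1 if lifetime_giving > 0 else 0
    return {sign: donors[sign] / totals[sign] for sign in totals}


# Made-up records, for illustration only.
sample = [("Taurus", 150.0), ("Taurus", 0.0), ("Aries", 0.0), ("Aries", 0.0)]
print(donor_rates(sample))
```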
That’s a very narrow range. If I were seriously evaluating the variable ‘Astrological sign’ as a predictor, I would probably stop right there, seeing nothing exciting enough to make me continue.
But have a look at this bar chart. I’ve arranged the signs in their calendar order, which immediately suggests that there’s a pattern in the data: A peak at Taurus, gradually falling to Scorpio, peaking again at Sagittarius, then falling again until Taurus comes around once more.
The problem with the bar chart is that the differences in giving rates are exaggerated visually, because the range of variance is so limited. What appears to be a pattern may be nothing of the sort.
In fact, the next chart tells a conflicting tale. The Tauruses may have the highest participation rate, but among donors they and three other signs have the lowest median level of lifetime giving ($150), and Aries have the highest median ($172.50). The calendar-order effect we saw above has vanished.
These two charts fail to tell the same tale, which indicates to me that although we may observe some variance in giving between astrological signs, the variance might well be due to mere chance. Is there a way to demonstrate this statistically? I was discussing this recently with Peter Wylie, who helped me sort this out. Peter told me that the supposed pattern in the first chart reminded him of the opening of Malcolm Gladwell’s book, Outliers, in which the author examines why a hugely disproportionate number of professional hockey and soccer players are born in January, February and March. (I won’t go farther than that — read the book for that discussion.)
In the case of professional hockey players, birth date and a player’s development (and career progress) are definitely associated. It’s not due to a random effect. In the case of birth date and giving, however, there is room for doubt. Peter took me through the use of chi-square, a statistic I hadn’t encountered since high school. I’m not going into detail about chi-square — there is plenty out there online to read — but briefly, chi-square is used to determine if a distribution of observed frequencies of a value for a categorical or ordinal variable differs from the theoretical expected frequencies for that variable, and from there, if the discrepancy is statistically significant.
Figuring out statistical significance used to involve looking up the calculated chi-square value in a table indexed by something called degrees of freedom, but nowadays your stats software will automatically report a statistic that tells you whether the result is significant: the p-value, which will be familiar to you if you’ve used linear regression. The rule of thumb for significance is a p-value of 0.05 or less.
As it turns out, the observed differences in the frequency of donors for each astrological sign have a significance value of p = 0.3715. This is way above the 0.05 significance threshold, and therefore we cannot rule out the possibility that these variations are due to mere chance. So astrology is a bust for fundraisers.
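For the curious, here is a rough Python sketch of the chi-square calculation for a table of donors and non-donors by sign, done the old-fashioned way: compute the statistic, then compare it with the critical value from a printed table. The counts are invented for illustration (1,000 alumni per sign) and are not the real figures, and the function name is hypothetical. The critical value 19.675 is the standard table entry for 11 degrees of freedom at the 0.05 level.

```python
def chi_square_statistic(table):
    """table: one [donors, non_donors] row per sign.

    Returns (statistic, degrees_of_freedom) for a test of independence.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            # Expected count if sign and donor status were unrelated.
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df


# Table lookup: chi-square critical value for df = 11 at the 0.05 level.
CRITICAL_05_DF11 = 19.675

# Invented donor counts out of 1,000 alumni per sign (illustration only).
hypothetical_donors = [290, 307, 300, 298, 295, 302, 299, 296, 301, 297, 304, 293]
table = [[d, 1000 - d] for d in hypothetical_donors]

statistic, df = chi_square_statistic(table)
print(f"chi-square = {statistic:.2f}, df = {df}")
print("significant at 0.05:", statistic > CRITICAL_05_DF11)
```

A statistic below the critical value corresponds to a p-value above 0.05, which is the situation described above.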
Now for something completely different. We haven’t looked at the Chinese animal signs yet. Here is a table showing a breakdown by Chinese astrological sign by the percentage of alumni with at least some giving, and median lifetime giving. The table is sorted by donor participation rate, lowest to highest.
Hmm, it would seem that being a Horse is associated with a higher level of generosity than the norm. And here’s the biggest surprise: a chi-square test reveals the differences in donor frequencies between animals to be significant (p-value < 0.0001)!
What’s going on here? Shall we conclude that the Chinese astrologers have it all figured out?
Let’s go back to the data. First of all, how were alumni assigned an animal sign in the first place? You may be familiar with the paper placemats in Chinese restaurants that list birth years and their corresponding animal signs. Anyone born in the years 1900, 1912, 1924, 1936, 1948, 1960, 1972, 1984, 1996 or 2008 is a Rat. Anyone born in 1901, 1913, 1925, etc. etc. is an Ox, and so on, until all the years are accounted for. Because the alumni in each animal category are drawn from birth years with an equal span of years between them, we might assume that each sign has roughly the same average age. This is key, because if the signs differ on average age, then age might be an underlying cause of variations in giving.
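The placemat rule amounts to a little modular arithmetic on the birth year, as in this Python sketch. It assumes, as the placemat does, that the sign depends on the calendar year alone, which ignores the fact that the Chinese New Year falls in late January or February. The function names are hypothetical.

```python
# Animal signs in cycle order, starting from a Rat year (1900).
ANIMALS = ["Rat", "Ox", "Tiger", "Rabbit", "Dragon", "Snake",
           "Horse", "Goat", "Monkey", "Rooster", "Dog", "Pig"]


def animal_sign(birth_year: int) -> str:
    """Return the animal sign for a calendar birth year (placemat rule)."""
    return ANIMALS[(birth_year - 1900) % 12]


def birth_years_for(animal: str, first_year: int, last_year: int):
    """List the birth years in [first_year, last_year] that map to an animal."""
    return [y for y in range(first_year, last_year + 1) if animal_sign(y) == animal]


print(animal_sign(1972))
print(birth_years_for("Rat", 1900, 2008))
```

Listing the birth years behind each animal, as `birth_years_for` does, is a quick way to see whether two signs draw on older or younger cohorts.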
My data set does not include anyone born before 1930, and goes up to 1993 (a single precocious alum who graduated at a very young age). Tigers, with the lowest participation rate, are drawn from the birth years 1938, 1950, 1962, 1974 and 1986. Horses, with the highest participation rate, are drawn from the birth years 1930, 1942, 1954, 1966 and 1978, plus only a handful of young alumni from 1990. For Tigers, 77% were born in 1974 or earlier, but for Horses, 99% of alumni were born in 1978 or earlier.
The bottom line is that the Horses in my data set are older than the Tigers, as a group. The Horses have a median age of 45, while the Tigers have a median age of 37. And we all know by now that older alumni are more likely to be donors.
Again, my conversation with Peter Wylie helped me figure this out statistically. The short answer is: After you’ve accounted for the age of alumni, variations in giving by animal sign are no longer significant.
(The longer answer is: If you perform a linear regression of Lifetime Giving (log-transformed) on Age and compute residuals, then run an Analysis of Variance (ANOVA) for the residuals and Animal Sign, the differences among signs are not significant, p = 0.1118. The residuals can be thought of as Lifetime Giving with the explanatory effect of Age “washed out,” leaving only the unexplained variance. Animal Sign fails to account for any significant amount of the remaining variance in Lifetime Giving, which is an indication that Animal Sign is just a proxy for Age.)
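The two steps in that longer answer can be sketched in plain Python: fit a straight line of log giving against age, keep the residuals, then compute a one-way ANOVA F statistic for the residuals grouped by animal sign. This is only an illustration under stated assumptions: the records are invented, the function names are hypothetical, `log1p` is one possible choice of log transform (the exact transform used is not stated above), and the sketch stops at the F statistic rather than producing a p-value, which stats software would report for you.

```python
import math
from collections import defaultdict


def residuals_after_age(ages, log_giving):
    """Fit log_giving = a + b * age by least squares; return the residuals."""
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(log_giving) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, log_giving))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(ages, log_giving)]


def one_way_anova_f(values, groups):
    """Return (F, df_between, df_within) for values split by group label."""
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    grand_mean = sum(values) / len(values)
    means = {g: sum(v) / len(v) for g, v in by_group.items()}
    ss_between = sum(len(v) * (means[g] - grand_mean) ** 2
                     for g, v in by_group.items())
    ss_within = sum((x - means[g]) ** 2
                    for g, v in by_group.items() for x in v)
    df_between = len(by_group) - 1
    df_within = len(values) - len(by_group)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented records: (age, lifetime giving in dollars, animal sign).
records = [
    (62, 500.0, "Horse"), (50, 150.0, "Horse"), (38, 75.0, "Horse"),
    (54, 300.0, "Tiger"), (42, 100.0, "Tiger"), (30, 25.0, "Tiger"),
]
ages = [r[0] for r in records]
log_giving = [math.log1p(r[1]) for r in records]  # assumed transform
signs = [r[2] for r in records]

residuals = residuals_after_age(ages, log_giving)
f_stat, df_b, df_w = one_way_anova_f(residuals, signs)
print(f"F = {f_stat:.3f} on {df_b} and {df_w} degrees of freedom")
```

A small F statistic on the residuals means the sign explains little once age has been taken into account, which is the pattern described above.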
Does any of this matter? Mostly no. First of all, a little common sense can keep you out of trouble. Sure, some significant predictors will be non-intuitive, but it doesn’t hurt to be skeptical. Second, if you do happen to prepare some predictors based on astrological sign, their non-significance will be evident as soon as you add them to your regression analysis, particularly if you’ve already added Age or Class Year as a predictor in the case of the Chinese signs. Altogether, then, the risk that your models will be harmed by such meaningless variables is very low.