CoolData blog

1 February 2011

The really big list: 100 variables for higher-ed predictive models

Filed under: Predictor variables — kevinmacdonell @ 6:26 am

In April last year I published a post called The big list: 85 predictor variables for alumni models. Since then I’ve added new ideas for predictor variables, some from sources I hadn’t tapped before, such as historical Phonathon data. Many of these predictors are specific to my own data, but you can use this list to suggest ideas for things to explore in your own.

The first list contains 88 variables that are independent of ‘Giving’, and the list after it has 19 variables that are all Giving-related. That’s an impressive total of 107 potential predictors. … OK, I admit a lot of these are just variations on, and transformations of, each other. The list does not include interaction variables (variables that are combinations of each other). There isn’t room here for that, but if you have time, try multiplying some of the continuous variables (‘Age’ x ‘Phonathon talk time’, for instance) and see if it yields a new predictor or two.
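To make the interaction idea concrete, here is a minimal Python sketch of multiplying two continuous variables. The records and field names (`age`, `talk_time_sec`) are hypothetical, not taken from any real database, and returning a missing value when either input is missing is just one possible choice.

```python
# Hypothetical alumni records; the field names are illustrative assumptions.
alumni = [
    {"id": 1, "age": 34, "talk_time_sec": 120},
    {"id": 2, "age": 58, "talk_time_sec": 0},
    {"id": 3, "age": None, "talk_time_sec": 300},  # age is missing
]


def interaction(record, field_a, field_b):
    """Return the product of two continuous fields, or None if either is missing."""
    a = record.get(field_a)
    b = record.get(field_b)
    if a is None or b is None:
        return None
    return a * b


# Add the interaction term to each record as a new candidate predictor.
for rec in alumni:
    rec["age_x_talk_time"] = interaction(rec, "age", "talk_time_sec")
```

Whether the new variable earns a place in a model still has to be checked against the dependent variable, like any other predictor.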

Any model you build will probably have room for a maximum of 15 to 25 significant predictors, but I like to start with as many possibilities as I can. I’m sometimes surprised by the ones that prove most useful.

  1. Graduated Y/N
  2. First name is single initial
  3. Middle name is single initial
  4. Nickname present
  5. Preferred name present
  6. Nickname different from Preferred
  7. First name is in top 10% of popular names
  8. Age (and/or Class Year)
  9. Age missing
  10. Age transformed/binned (Log, sqr root, vingtiles)
  11. Age at graduation
  12. Legacy code present
  13. Prefix present (NOT Ms., Mrs., Mr.)
  14. Suffix present
  15. Female
  16. Marital status present
  17. Single
  18. Married
  19. Divorced
  20. Marital status other
  21. Total Phonathon talk time (totaled from historical calling projects)
  22. Phonathon talk time transformed/binned (log, sqr root, vingtiles)
  23. No phone talk time
  24. Total number Phonathon attempts
  25. Phone attempts transformed/binned (log, sqr root, vingtiles)
  26. Has Phonathon refusal reason code
  27. Number phone pickups
  28. Number phone pickups is 1
  29. Phone pickups > 1
  30. Number phone failed to pick up
  31. Number phonathon ‘No Pledge’
  32. 1 ‘No Pledge’ or more
  33. More than 1 ‘No Pledge’
  34. Number times bad phone number (Wrong Num, Disconnect)
  35. Had a bad phone number
  36. Personal info update is recent
  37. Local resident
  38. Lives in province
  39. Lives in Canada
  40. Lives in USA
  41. International
  42. International phone number
  43. Unlisted phone number
  44. Alumni survey responder
  45. Alumni survey responded by mail
  46. Has other degree(s) (from survey)
  47. Student satisfaction score (from survey)
  48. Income tier (from survey)
  49. Donor likelihood question responses (from survey)
  50. Donor Index score (from survey)
  51. Number of degrees
  52. More than 1 degree
  53. Preferred address type is campus mail
  54. Preferred address type is Business
  55. Preferred address type is Residence
  56. Number of constituent codes (from Banner)
  57. PRNT (parent) code present
  58. NOSP (non-alum spouse) code present
  59. FACT or STAF (faculty/staff) code present
  60. Alumni directory purchaser
  61. Update mailing response activity code
  62. Number of activity codes
  63. Frat or Sorority activity code present
  64. Leader code present (Activities)
  65. Position present (employment)
  66. Number of position updates (employment)
  67. Prominent position (president, CEO, etc.)
  68. Number alumni events attended
  69. 1 or more events attended
  70. Events attended binned (0, 1, 2-9, 10 or more)
  71. Email present (not Gmail, Yahoo, Hotmail)
  72. Email present (Gmail, Yahoo, Hotmail)
  73. Email present (any)
  74. Rural Cdn postal code
  75. Urban Cdn postal code
  76. Number of contact restrictions
  77. Any restriction present
  78. Many restrictions present
  79. Log of number of restrictions
  80. Home phone present
  81. Business phone present
  82. Parent phone present
  83. Any family cross-reference present
  84. Number of family cross-references
  85. Prefers phone over mail solicitation
  86. Prefers mail over phone solicitation
  87. No preference mail/phone
  88. Distance from campus
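Several items in the list above are transformations or binned versions of a raw count or measure (log, square root, vingtiles, the events-attended bins) or simple "missing" flags. The following Python sketch shows one way these might be derived. The function names are hypothetical, `log(x + 1)` is used as an assumption so that zero values stay defined, and the rank-based vingtile rule is only one of several reasonable ways to cut a variable into 20 groups.

```python
import math


def log_transform(x):
    """Natural log of (x + 1); the +1 is an assumption to keep zeros defined."""
    return math.log1p(x)


def sqrt_transform(x):
    """Square root of a non-negative value."""
    return math.sqrt(x)


def vingtiles(values):
    """Assign each value a bin from 1 to 20 based on its rank.

    Tied values all get the bin of their first position in sorted order.
    """
    ordered = sorted(values)
    n = len(values)
    return [ordered.index(v) * 20 // n + 1 for v in values]


def bin_events(count):
    """Bin events attended as 0, 1, 2-9, or 10 or more."""
    if count == 0:
        return "0"
    if count == 1:
        return "1"
    if count <= 9:
        return "2-9"
    return "10 or more"


def missing_flag(value):
    """Return 1 if the value is missing (None), else 0, e.g. for 'Age missing'."""
    return 1 if value is None else 0
```

For example, `bin_events(5)` falls in the `"2-9"` group, and `vingtiles` applied to 100 distinct values puts five values in each of the 20 bins.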

Giving-related variables (to use when the dependent variable is something other than ‘Giving’)

  1. Has made a third-party gift (i.e., via business)
  2. Number third-party gifts (binned: 0, 1, 2 or more)
  3. Number of unique gift designations (binned: 1, 2, 3, or 4)
  4. Has made an in-memory gift
  5. Has made an anonymous gift
  6. Has made an in-honour-of gift
  7. Has made a matched gift
  8. Age at time of first gift is older than variable median age
  9. Log of age at time of first gift
  10. Number of gifts per year
  11. More than 1 gift per year
  12. More than 2 gifts per year
  13. Month of donor’s most recent gift was December
  14. Quarter of donor’s most recent gift was 4th
  15. Gift made using American Express, other cards
  16. Gift made in form of cash
  17. Gift made on any credit card
  18. Has made an in-kind gift
  19. Has made gift of stocks/securities
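
A few of the Giving-related variables can be derived from nothing more than a donor's gift dates. The Python sketch below is an illustration under stated assumptions: the gift dates are made up, and "gifts per year" is defined here as the gift count divided by the number of calendar years from the first gift through an as-of date, which may not match the definition used for the list above.

```python
from datetime import date

# Hypothetical gift dates for one donor.
gifts = [date(2008, 12, 15), date(2009, 3, 2), date(2010, 12, 20)]


def gifts_per_year(gift_dates, as_of):
    """Gift count divided by calendar years from first gift through as_of (assumed definition)."""
    if not gift_dates:
        return 0.0
    years = as_of.year - min(gift_dates).year + 1
    return len(gift_dates) / years


def last_gift_in_december(gift_dates):
    """True if the most recent gift was made in December."""
    return bool(gift_dates) and max(gift_dates).month == 12


def last_gift_quarter(gift_dates):
    """Calendar quarter (1 to 4) of the most recent gift, or None if no gifts."""
    if not gift_dates:
        return None
    return (max(gift_dates).month - 1) // 3 + 1


rate = gifts_per_year(gifts, date(2010, 12, 31))
more_than_1_per_year = rate > 1
```

If your institution reports on a fiscal year rather than a calendar year, the quarter calculation would need to be shifted accordingly.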


  1. Kevin

    Great list! If you are doing major gift models I would add business check and personal check payment method.

    Also it varies by institution but I have found differences in populations using AmEx vs Visa/Mastercard and will tease that out as a separate variable.

    Volunteer info is also valuable–as well as reunion attendance.

    Great stuff!

    Comment by Alex Oftelie — 1 February 2011 @ 5:06 pm

  2. Thanks for sharing, Kevin. I’m working on my next data schema and we need to combine at some point. I’ve got some in my model you might be interested in using.

    Comment by Jason Boley — 1 February 2011 @ 8:12 pm

  3. Thanks Alex and Jason – always on the lookout for new variables! My list is by no means exhaustive. As well, a list for another type of institution such as a large performing-arts nonprofit would have other variables to choose from.

    Comment by kevinmacdonell — 3 February 2011 @ 10:22 am

  4. […] if the individual had not given a gift. Some of the confusion springs from an earlier blog post, my Big List of 100 predictor variables. Near the end of that giant list of suggested predictor variables, I tacked on another list of […]

    Pingback by Giving-related variables: Keep or leave out? « CoolData blog — 4 March 2011 @ 1:37 pm
