CoolData blog

13 December 2011

Finding connections to your major gift prospects in your data

Guest post by Erich Preisendorfer, Associate Director, Business Intelligence, Advancement Services, University of New Hampshire

(Thanks to Erich for this guest post, which touches on something a lot of prospect researchers are interested in: mapping relationships to prospects in their database. Actually, this work is more exciting than that, because it actually helps people find connections they may not have known about, via database queries and a simple scoring system. Is your Advancement Services department working on something like this? Why not ask them? — Kevin.)

Data miners often have an objective of exploring sets of data to determine meaningful patterns which can then be modeled for predictive patterning, hopefully to help meet their organization’s end goal(s).  However, there may be a time when the end behavior is not inherent in your database. Such a situation recently came up for my Advancement organization.

Our prospecting team recently started a program wrapped around peer recommendations: A prospect recommends new suspects to us based on the prospect’s interactions with the suspects. The question then became, what can we provide to the prospect to help get them thinking about potential suspects?

We currently do not have any type of data which would allow us to say, “Yes, this is what a relationship looks like,” outside of family relationships. We had to find a different way to identify potential acquaintances. I looked back at my own relationships to determine how I know the people I know. My friends and acquaintances largely come from some basic areas: school, work, places I’ve gone, etc.

Transforming my experience with relationships into what we have for useable data, I saw three key areas where relationships may exist: work history, education history, and extracurricular activities including one-time events. Fortunately, I was able to pinpoint our constituents’ time in each of the above areas to help isolate meaningful, shared experiences amongst constituents. Our work and extracurricular history includes to/from dates, and we have loads of educational history data that includes specific dates. Using this data, I am able to come up with potential relationships from a single prospect.

Prospect Profile (generated by entering a single prospect’s ID):

  • John Adams
  • Widget Factory, Employee 01/05/1971 – 06/16/1996
  • Student Activities: Football, Student Senate 09/1965-05/1966
  • Bachelor of Arts, Botany 1966

Potential Relationships (each item below is a separate query, using the Prospect Profile results):

  • Those employed by the Widget Factory who started before John ended, and ended after John began.
  • Those students who participated in Football and had a class year within +/-3 years of John.
  • Those students in Student Senate at the same time as John, similar to the Widget Factory example.
  • Those students who were in the same class year as John.
  • Those students who share John’s major.

Currently,since I have no way of proving the value of one point of contact over the other, each row returned in the potential relationships earns the constituent one point. Since my database stores historical records, I may get more than one row per constituent in any one category if they met more than one of John’s associated records – say they participated in Student Senate and played Football. This is great, because I want to give those particular constituents two points since they have more than one touch point in common with John.

I end up with a ranked list of constituents who share potential relationship contacts with my main prospect. The relationship lists provide our prospect researchers a starting point in putting together a solid list of high capacity constituents a single person may have some sort of relationship with, thus a greater insight into potential giving.

As of now, the report is in its infancy but looks to have high potential. As we grow the concept, there are multiple data points where further exploration could result in a higher level of functioning. As prospects use the lists to identify people they know, we can then deconstruct those choices to determine what is more likely a relationship. Should shared employment be ranked higher than shared class year? Should Football rank higher than Student Senate? I would guess yes, but I currently do not have supporting data to make that decision.

Another interesting concept, raised at the recent DRIVE 2011 conference, would be: “How are these two prospects potentially related by a third constituent?”  The result could mean the difference between two separate, forced conversations and one single conversation with three prospects shared over nostalgic conversations, drinks and, hopefully, money in the door!

Erich Preisendorfer is Associate Director, Business Intelligence, working in Advancement Services at the University of New Hampshire.

22 February 2011

Data disasters, courtesy of Mordac

Filed under: Best practices, Data, Pitfalls — Tags: , , — kevinmacdonell @ 6:28 am

(Image used via Creative Commons license. Click image for source.)

Have you ever heard of Mordac, the Preventer of Information Services? He’s a character in the comic strip Dilbert. Mordac cares about technology and security, but he doesn’t give a rip about users and their need to do actual work. I’ve never worked with a Mordac, but judging from some of the stories I’ve been collecting over the past week, I’m sad to say that a lot of YOU have.

Last week I wrote about the ways data is viewed differently by data miners and the good people who work in Advancement Services. (Warning: This data is different.) These differences can lead to misunderstandings, and much worse. When data is treated as essentially a technology issue, instead of a core institutional asset — that is, when Mordac gets to decide what happens to data — disaster can follow.

Data disasters come in three forms:

  1. Data that would be useful to have is never captured or entered.
  2. Historical data is overwritten with newer data.
  3. Data is deliberately deleted, or left out of a database conversion.

Some of these disasters unfurl slowly over years, others happen with the click of a mouse. The result is the same: Key insights into engagement are lost forever. Every annual fund donor who will never be proactively identified as a Major Gift or Planned Giving prospect is a huge loss. Institutional memory is flushed down the toilet, harming not just data mining efforts but prospect research and other data-related work. The word disaster is not too strong to describe the financial impact of the accumulation of such losses.

It’s a hidden disaster, too. No one will ever be able to add up the cost of what’s been lost collectively by the schools and nonprofits who sent me their tales of horror.

Let’s start with the issue of data that never gets entered in the first place:

  • One university established in the 1970s did nothing to capture athletic team membership, distinguished alumni, awards, campus club membership, and other key information.  This institution also never had yearbooks. The same university had a penchant for deleting old addresses.
  • I heard a similar tale from another school, but at least they have yearbooks going back to the 1940s. No one captured athletics, awards, or club memberships in any of the databases over the years. “We are trying to catch up,” this contributor writes, “but I probably won’t live long enough to see it.”
  • At various places where one contributor worked, development contact reports were not entered for long periods because management didn’t enforce policy.  “Future staff (including the new president) were constantly embarrassed when meeting with the prospects because they didn’t know the contact history — including asks, campus tours, first contacts, meetings with the college president, etc.”
  • Non higher-ed organizations are especially prone to neglect gathering data. One person writes: “Understanding that historical information in the database can be used for analytics is a concept that isn’t usually introduced until the organization has significant fundraising capacity. Also, non-higher ed organizations are sometimes so starved for staff that no one has sufficient database experience to consistently maintain data, even when raising millions.”
  • Sometimes, the attitude is that old data is useless data and therefore not worth the bother. An analyst at a large, non-higher ed nonprofit is dealing with a significant number of records for which the first gift date has been incorrectly entered. For example, 1980 is entered accidentally as 1908, 2009 is entered accidentally as 9200, 2001 is accidentally entered as 1901, which causes havoc for any kind of analysis. Correcting the errors is liable to affect the general ledger in Finance, so the bias is to do nothing. The contributor writes: “I was talking with one of the old managers in gift processing (who is no longer here, actually), and his response was, ‘Who cares, it was a long time ago, and it’s over now’.”

Then there is historical data that is lost by being overwritten with current data:

  • A database manager for one university deleted all the old addresses and did a global replacement of these addresses with a four-letter code standing for “Moved, left no address.”
  • “I’ve been in shops where that’s done,” writes a contributor, regarding the overwriting of constituent records.  “What’s worse is when the record gets deleted and the ID gets re-used. I’ve seen some really weird mailings go out in those shops.”
  • One school failed to protect historical fundraiser assignments to a prospect, says a contributor, despite the fact such tracking “is really critical if you think about piecing together a person’s institutional history.” The contributor traced the overwriting to the need for the information to come out correctly on a report. In other words, the technology tail was wagging the strategy dog, a common problem. Even after a way to work around such reporting issues is put in place, this person writes, data overwriting still happens “out of habit.”
  • “I have also experienced this,” writes another person. “No records being deleted (that I know of) but prospect assignments, addresses, employers, etc. all that stuff would get overwritten with the current stuff, and the historical info just goes away. Makes data mining significantly more difficult!”
  • The worst one, I think, is the university where this happened: When any alum returned for post-grad work or a higher degree, their original degree and data was, inexplicably, overwritten.

And finally, the horror of the deliberate destruction of data:

  • Here’s a breathtaking example: “The database had serious size limitations, so in order to free up space, all records marked deceased were deleted. Some old gift data also was deleted, I think (all older than seven years at the time of deletion, since apparently someone thought we only needed seven years of info). The result was that we lost some irreplaceable information, particularly in trying to track down relatives of individuals who made estate gifts or endowments, since the original records (with attached names, contact reports, etc.) were purged.”
  • A terrible tale from another school: “Due to limited database space, this organization did indeed perform a huge purge of information sometime in the early ‘90s.  Luckily, there was someone cognizant enough regarding historical values to not allow any purging of gifts. However, many never-givers’ records and parents of alumni records were all purged from the system with all of the notes and information in those records.  In addition, many historical addresses were purged. Someone decided that only ONE former address for each record was necessary!”
  • One university established in the 1930s, and which is on its third database for registration and alumni records, has been committing a variety of data crimes that include overwriting and deleting. During database conversions the following things have occurred: Those who attended but did not receive a degree were never transferred to the new database; the records of alumni who held certain types of degrees were never transferred from paper to digital format and are not in the database; people who died or had no valid address were not uploaded to the newer databases; and, finally, someone overwrote all the female constituent’s middle names with maiden names when they married. The contributor who sent me this list adds, “One of the older databases that still had some of the old missing data crashed and IT was going to just forget about it. I begged for the tables from this database and built an Access database so we could at least query and lookup this old data, otherwise it would have disappeared.”
  • Sometimes a data crime directly impacts Major Giving: “I was the moves manager and we also had a prospect researcher.  Between the two of us, we added a huge amount of information to the database, including prospect interests, historical contacts, affiliations and financial data.  After she and I left, the operations staff managed the conversion to a different database and, in their infinite wisdom, deleted all the data we had added. The prospect researcher who was hired after us called me to cry on my shoulder and I heard that even the development officers wept over the loss!”
  • Again, even more so than schools with alumni, it is non-higher ed nonprofits that might be most at risk. A contributor writes: “With the advent of databases housed online by the vendor and charged per record, organizations are deleting all non-donor records or donors who have not given in years.”

The reasons behind data disasters range on a spectrum from well-meaning good intentions, through lack of awareness, to criminal wrongdoing: Data integrity concerns, privacy concerns, space and cost concerns, miscommunication, lack of consultation with data users, short-sightedness, ignorance, laziness, expediency, and finally, deliberate sabotage.

What unites almost all of these cases is the identity of the person doing the deed: Mordac. His character represents the attitude that data is just part of the technology. What he doesn’t recognize is that servers and hard drives are replaceable, but data is not. The cost of acquiring and maintaining the technology may indeed be high, but because data has uses we can’t foresee, and because it cannot be recovered once deleted, its value cannot even be calculated. Mordac is simply not qualified to make decisions about data on his own.

I would like to think that most data disasters are caused by the more benign impulses on the spectrum. In fact, a number of contributors noted that sometimes deleting data is necessary in order to maintain data integrity. Often, updates from NCOA (the National Change of Address databases in both Canada and the U.S.) introduce incorrect addresses into our databases. In cases such as this, it would seem prudent to delete the errors completely.

There are two problems with that, both mentioned by contributors. First, data entry staff might not always know the difference between a legitimate historical address and an address introduced in error. And second, if you delete an error, you are doomed to repeat it. As one contributor writes: “We keep old addresses that are wrong and code them ‘incorrect address match’ so that if that address comes up again from NCOA or another update service, we have a record of the first oops.”

Advancement Services and IT staff are the professionals who keep our data ship afloat, and I do not mean to suggest that these horror stories represent typical operating procedure. Moreover, many of the disasters I was told about happened ten years ago or more. Consciousness about the value of historical data is on the rise, and we can hope that cases such as this are becoming increasingly rare. “We have since come to our senses,” one person writes. “(We) have filled out those (missing data) gaps to the best of our ability and we no longer delete information.  We are just discovering how important data analysis can be.”

So therefore, I’m now interested in hearing GOOD things about your IT and IS professionals. What is done at your institution to both protect the integrity of the data for today’s use, and ensure that it remains intact for analysis for many years to come? Do you have any advice for working with others in your organization to foster a culture of respect for historical data? Comment below, or send your ideas to me at kevin.macdonell@gmail.com.

Oh, but I’m still collecting stories about data disasters, though! Send your horrific tales to me in confidence at kevin.macdonell@gmail.com.

———————————————————————-

P.S.: Data disaster stories, and other related tales, sent to me since this post went live:

“We just got a significant donation with the promise of more from someone who attended our school for a short period more than 50 years ago. We found him by picking up his name from an old yearbook, adding him to our database, profiling him for a wealth snapshot, and inviting him to an event. Yes, there is a reason to keep old data!” (22 Feb 2011)

“We have all the scenarios in our institution.  We are also using a campus-wide database, and different departments keep creating duplicates of the same records because of different data standards. We’ve talked with many other people who use this database and they all agree that this is one of the major problems of sharing our database. It’s a dream to have a database integrated with other units of the institution, but quite a different situation when it comes to the reality of it, which is everyone is working on completely different priorities and paradigms (Finance, Development, Student Services in particular).” (22 Feb 2011)

17 February 2011

Warning: This data is different

Filed under: Best practices, Pitfalls — Tags: , — kevinmacdonell @ 2:09 pm

This post is named for a conference keynote I will give this spring for senior managers working in advancement services. These people are no strangers to data, in fact their working lives revolve around data. But they don’t necessarily see data through the same lens as we do, and don’t value the same things as we do.

We’d better learn to understand their perspective, because “our” data is at their mercy. I’m talking about gift processing, alumni records, IT and computing services, database admins — people who can be our best friends, or bring data disasters down on our heads. Oh, and there are disasters!

To illustrate, I will draw a distinction between “everyday” data — processing gifts, updating constituent records, maintaining databases, and pulling reports — and predictive modeling data. The differences might seem a bit philosophical, but they’re real and have real consequences.

Everyday data is used for sense-making and explaining in the present, via reporting and descriptive statistics. (“What were Decembers pledge totals, and how do they compare with this time last year?”) Modeling data is not reporting or explaining anything — so it’s hard for some people to put a value on it. Everyday data might be doing important things such as hunting for causes (“Did pushing the income tax deadline email on Dec 31 boost giving?”). But not modeling data, which only seeks to uncover associations between things without trying to determine causation. (“Is there a connection between giving in December and being a significant donor?”). In short, everyday data work pays off in the short term; modeling data work pays off over a much longer period of time.

When everyday data is messy, it will probably be dismissed as invalid. When modeling data is messy, that’s considered normal, and there are techniques to address it. For everyday data, missing values are an issue; for modeling data, missing values can be useful, (i.e., predictive). When missing data is troublesome rather than predictive, we are free to make up data to fill the gaps, using imputation. This is a foreign concept to people who deal exclusively with everyday data.

In the everyday, we are picky: “Give me these records, but not those, and include this field, and this field, but not those fields.” For modeling, we say, “Give me everything — I want it all!” Everyday data seeks an answer, a single-point destination reached by one route. Modeling data has a destination too, but it gets there via a myriad of routes. Every potential predictor is a new route to explore. And we don’t know in advance what routes will get us there fastest; we have to drive them all.

And finally, one key difference in philosophy which can spell disaster for your institution: In the everyday, the most current data supersedes and replaces old data. Think of address information: Of what use is a mailing list to the Alumni Office if it’s full of addresses from the 1970s? Well, in modeling, that old data is just as valuable as fresh data. For example, I’ve found that the count of address updates an individual has is highly predictive of giving. The only way I can get that count is if I total up the number of deactivated records, and then add the current, active record. No historical records, no predictor.

Yes, some institutions routinely overwrite or just flat-out delete this stuff. But that is not the half of it. Because I wasn’t sure this sort of thing really happened, I started asking around. I received a raft of data disaster stories from all sorts of organizations, from non-profits to universities. I’ve collected so many tales of horror that I’m going to share them with you in a separate post next week.

(By the way, plug plug: the conference I’ll be speaking at is CASE’s Institute for Senior Advancement Services Professionals in Baltimore, April 27-29.)

Blog at WordPress.com.