Guest Post by Jason Boley, Associate Director of Prospect Management and Tracking, Purdue University
NodeXL is a powerful social network analysis tool that is designed as a plug-in to Excel 2007. NodeXL was designed by a team of programmers at Microsoft Research and is now maintained by the Connected Action Consulting Group. In a nutshell, the tool was designed to produce network graphs from a list of network usage data, and includes built-in tools for getting information from Twitter, Flickr, YouTube, etc. (Link to NodeXL site.)
But can it be useful for fundraising too?
If you look at NodeXL from a fundraising perspective you immediately see the correlation. NodeXL is designed to be a discovery tool for discovering deep connections in your social networks. This in itself is pretty useful, especially if your fundraising department is heavily involved in social media (and it probably will be at some point). This tool helps you discover who is having an impact on your social reputation. But let’s put that aside. Even the most basic of fundraising offices are constantly looking to find new connections to and between donors. Who knows who that can provide me with an introduction to Mr. X? Depending on the size and complexity of your database (and your staff) these connections are often easy to miss. Sure, we may note that Bob and Jim sit on the same board or work at the same company, but going a layer deeper than that is often problematic.
With that in mind, I began to explore NodeXL.
There are a couple of things that are appealing about NodeXL. First, it’s free. It downloads as a zip file and operates essentially as a macro plug-in for Excel 2007. But this is unlike any macro you’ve ever seen. The visual capabilities are stunning. Secondly, it is relatively straightforward. While it does have some extremely powerful advanced features, it’s relatively easy to get started.
The only information that NodeXL accepts is called an edge list. An edge list is essentially an Excel list with two columns. In column A you have the first value and in column B you have a second value which is related to the value in Column A. So a sample edge list might look as follows:
These two columns of data are all that NodeXL accepts and needs to do network analysis. The rest of the work, calculating connections between these relationships, is done for you.
I rarely think small, so my first inclination was to see if we could visualize all of the recorded board memberships in our Advance database. The first step was to create my edge list. This was very simple and took the following format:
I_Bob Smith O_Bob_Smith_Foundation
I think you can get the picture. The I and O appended to the front are for facilitating sorting (if you should wish to color code them later you can easily sort the organizations from the individuals). As you might expect, this query returned a couple thousand connections. But I was pleased NodeXL handled them so efficiently. The resulting graph (Harel-Koren Fast Multiscale format) was visually stunning.
Stunning? Yes! Useful? I wasn’t sure yet. I knew that I could isolate an individual within this huge graph. The highlighted green line above, for example, is one of our major donors.
I decided to leave this dataset behind for a bit and move to more of a discovery project. We are very fortunate to have completed a full database screening a few months ago. I turned my eyes towards the data we received from our vendor that includes a table of people in our database with connections the vendor discovered to foundations (based on GuideStar data). I created an edge list from this data that resulted in 4,872 relationships. The resulting graph (Fruchterman-Reingold format) returned a huge blob.
You can click around on a graph like this to isolate individual connections (or edges) and try to do some discovery this way, but in a graph as dense of this it is basically impossible. But the other way to do discovery work is to pose the question from a certain perspective. For instance, I highlighted in the edge list two particular closely related foundations. I then used the subgroup option in NodeXL to select all of the vertices four layers deep, meaning let’s look at relationships between individuals not just directly to the organization, but let’s dig a few layers deeper. You can see from the graph below the highlighted connections from the larger mass.
I then extracted these selected vertices into a brand new NodeXL sheet. After adding some labels, you can reduce the useful information to just a few connections.
What can we tell from this graph? It is likely that the connection between Al Winner and Steven Cook was already known and well documented. But the real advantage here is seeing the tangential connections we may have missed. Steven Cook has a potential relationship with Charles Talor. Indeed, in our database Steven Cook is a well known alumnus, while Charles Talor is someone we would like to know. So a potential for establishing a relationship is provided. The same holds true for potential connections to James Jones and Jean Smith. While some of these connections may be known, the advantages to visualizing them are great. Potential connections jump from the page.
What I have discussed is scratching the surface of this advanced tool in only a cursory manner. I have very little knowledge of network graphing beyond what I’ve read online. Yet with NodeXL I have been able to produce some simple but powerful graphs. Relationships as they are defined in fundraising databases are very different from those that can be extracted from complicated social networking data. And yet there are similarities. As I begin to delve deeper into this tool, it is forcing me to rethink how we have relationships defined in our database, think about existing data that we have that might be useful in a visual format, and think about the future where we will certainly want to supplement our traditional alumni data with information from social networks.
(Jason Boley is the Associate Director of Prospect Management and Tracking at Purdue where he manages prospect information and focuses on data analytics. Jason has over ten years of experience working with fundraising databases and has presented nationally on the topics of database reporting and prospect management.)