CoolData blog

4 January 2010

Complex formula for name concatenation in Excel

Filed under: Excel — Tags: , , , — kevinmacdonell @ 3:42 pm

One common and very time-consuming task I perform in Excel is creating one clean-looking column of names out of half a dozen columns of name data. It’s the nature of the beast: Every time we pull data on individuals, we’re going to have prefixes (Dr., Rev., etc.), nicknames, and first, middle and last names as separate bits of data. Stringing names together using simple concatenation is not difficult, but the absence of data in some of these columns (prefix, middle name, and nickname) makes the process hard to automate. If you just string the whole row together, you’ll end up with extra spaces and other anomalies.

Here’s a formula that uses nested conditional statements to account for empty cells:

=SUBSTITUTE(TRIM(A2 & " "& B2 & " " & C2 & " " & IF(LEN(B2)=1," ", IF(LEN(D2)=0,"",IF(B2=D2," ","("&D2&")"))) & " " & E2), " ", " ")

Yes, it’s scary, but it works.

(For the record, my nickname is NOT Kev.)

Your columns MUST be in this order to work:

A: Prefix
B: First name
C: Middle name
D: Preferred first name or Nickname
E: Last name

Here’s what the formula does:

  • Concatenates the whole name from whatever “parts” are available
  • Automatically removes extra spaces from the concatenation for cells that are empty (for example, if there’s no Prefix)
  • If the first name is a single initial (see note below), then the first part of the name is the initial plus the middle name
  • If the first name is the same as the preferred first name (nickname), then column D is skipped
  • If the first name is different from the preferred first name (if it is present), then the value of column D is expressed in parentheses

NOTE: By data-entry convention, our Banner records express single initials without a period, therefore the test for a single initial is “length = 1”. If your data includes an initial, you’ll have to modify this to LEN(B2) = 2, which might cause a problem with names that are two letters in length.

Your first column might not be Prefix. In fact it probably isn’t: “ID” is probably your first column. In that case, it might be easier to paste the formula into a simple text editor and shift all the letters however many places to the right that you need, before using it in Excel, like so:

=SUBSTITUTE(TRIM(B2 & " "& C2 & " " & D2 & " " & IF(LEN(C2)=1," ", IF(LEN(E2)=0,"",IF(C2=E2," ","("&E2&")"))) & " " & F2), " ", " ")

This formula, and others like it, would be especially useful to save as a sort of template, if you frequently pull data that has the same columns. This idea was suggested to me by Sarah Ellison, Development Research Associate at the Curtis Institute of Music:

As long as you can get your data into an Excel spreadsheet with the same columns in the same order (she uses Raiser’s Edge, with saved queries and exports), you can take some of the hassle out of using Excel for reporting. She recommends that you have a tab/worksheet with all of your raw data (exactly the way it gets exported from the database), and another tab/worksheet with the formulas to make your page look the way you want it. Just ensure that all your cell references refer to cells in the other tab, and don’t overwrite the formulas sheet when you update/save over the page with the raw data.

Advertisement

Blog at WordPress.com.