
14. Split Up a Name (complex)

Description

Parses a full name into its component names.

Code

Sophisticated Parser:

""
SET FirstName-t TO ""
SET MiddleName-t TO ""
SET LastName-t TO ""
SET Suffix-t TO ""

// What kind of name is it?
IF FullName-t CONTAINS ","
    SET pos TO POSITION( FullName-t, "," )
    // A short tail after the comma, with a space somewhere before it, is read as a suffix
    IF LENGTH( FullName-t ) - pos < 5
    AND POSITION( FullName-t, " " ) < pos
        "FML"
    ELSE
        "LFM"
    END IF
ELSE
    "FML"
END IF

// FullName-t = Last, First Middle
IF RESULT = "LFM"
    ""
    SET pos TO POSITION( FullName-t, "," )
    SET LastName-t TO FIRST( FullName-t, pos - 1 )
    SET FirstName-t TO LAST( FullName-t, LENGTH( FullName-t ) - pos - 1 )
    // A second comma marks a suffix
    IF FirstName-t CONTAINS ","
        SET pos TO POSITION( FirstName-t, "," )
        SET Suffix-t TO LAST( FirstName-t, LENGTH( FirstName-t ) - pos - 1 )
        SET FirstName-t TO FIRST( FirstName-t, pos - 1 )
    END IF
    // Split the remainder into first and middle names
    IF FirstName-t CONTAINS " "
        SET pos TO POSITION( FirstName-t, " " )
        SET MiddleName-t TO LAST( FirstName-t, LENGTH( FirstName-t ) - pos )
        SET FirstName-t TO FIRST( FirstName-t, pos - 1 )
        // An extra name goes back with the first name
        IF MiddleName-t CONTAINS " "
            SET pos TO POSITION( MiddleName-t, " " )
            SET FirstName-t TO FirstName-t + " " + FIRST( MiddleName-t, pos - 1 )
            SET MiddleName-t TO LAST( MiddleName-t, LENGTH( MiddleName-t ) - pos )
        END IF
    END IF

// FullName-t = First Middle Last
ELSE
    ""
    // Strip off the suffix, if any
    IF FullName-t CONTAINS ","
        SET pos TO POSITION( FullName-t, "," )
        SET Suffix-t TO LAST( FullName-t, LENGTH( FullName-t ) - pos - 1 )
        SET FullName-t TO FIRST( FullName-t, pos - 1 )
    END IF
    SET pos TO POSITION( FullName-t, " " )
    IF pos < 1
        SET pos TO LENGTH( FullName-t ) + 1
    END IF
    SET FirstName-t TO FIRST( FullName-t, pos - 1 )
    SET LastName-t TO LAST( FullName-t, LENGTH( FullName-t ) - pos )
    // A period marks the end of a middle initial
    IF LastName-t CONTAINS "."
        SET pos TO POSITION( LastName-t, "." )
        SET MiddleName-t TO FIRST( LastName-t, pos )
        SET LastName-t TO LAST( LastName-t, LENGTH( LastName-t ) - pos - 1 )
        IF MiddleName-t CONTAINS " "
            SET pos TO POSITION( MiddleName-t, " " )
            SET FirstName-t TO FirstName-t + " " + FIRST( MiddleName-t, pos - 1 )
            SET MiddleName-t TO LAST( MiddleName-t, LENGTH( MiddleName-t ) - pos )
        END IF
    ELSE IF LastName-t CONTAINS " "
        SET pos TO POSITION( LastName-t, " " )
        SET MiddleName-t TO FIRST( LastName-t, pos - 1 )
        SET LastName-t TO LAST( LastName-t, LENGTH( LastName-t ) - pos )
        IF LastName-t CONTAINS " "
            SET pos TO POSITION( LastName-t, " " )
            // A first word of 2 or 3 characters (de, Van) stays with the last name
            IF pos < 3 OR pos > 4
                SET FirstName-t TO "«FirstName-t» «MiddleName-t»"
                SET MiddleName-t TO FIRST( LastName-t, pos - 1 )
                SET LastName-t TO LAST( LastName-t, LENGTH( LastName-t ) - pos )
            END IF
        END IF
    END IF
END IF

Explanation

This is a moderately sophisticated name parser. It does not recognize prefixes, but it copes fairly well with names of four, five, or more elements, including a suffix. Note how it parses the following names (shown as first : middle : last):

Sally Johnson > Sally : : Johnson
Sally Ann Johnson > Sally : Ann : Johnson
Sally Ann P. Johnson > Sally Ann : P. : Johnson
Sally Ann de Johnson > Sally : Ann : de Johnson
Sally Ann Pauline Johnson > Sally Ann : Pauline : Johnson
Sally Ann Pauline de Johnson > Sally Ann : Pauline : de Johnson

This computation will work as-is. Simply create the following variables, or change the names in the script to match your own variables:

FullName-t - A text variable containing the full name to be parsed
FirstName-t - A temporary text variable where the first name will be placed
MiddleName-t - A temporary text variable where the middle name will be placed
LastName-t - A temporary text variable where the last name will be placed
Suffix-t - A temporary text variable where the suffix will be placed
pos - A temporary number variable (named pos in the script above)
(For all temporary variables, set the Advanced options to "Ask only in dialog," "Don't warn if unanswered," and "Don't save in answer file.")

The computation will work with names in either of the following formats:

First Middle Last
Last, First Middle
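
The format check at the top of the script can be sketched outside HotDocs as follows. This is a Python port for illustration only, not HotDocs code; the function name detect_format is hypothetical, and the threshold of 5 characters is taken from the script above.

```python
def detect_format(full_name: str) -> str:
    """Return "LFM" for "Last, First Middle", otherwise "FML"."""
    comma = full_name.find(",")  # 0-based index, -1 if there is no comma
    if comma == -1:
        return "FML"
    # Number of characters that follow the comma
    after_comma = len(full_name) - (comma + 1)
    space = full_name.find(" ")  # -1 if there is no space
    # A short tail after the comma (fewer than 5 characters, such as " Jr.")
    # with a space earlier in the name is treated as a suffix, so the name
    # itself is still First Middle Last.
    if after_comma < 5 and space < comma:
        return "FML"
    return "LFM"
```

Under this rule a longer suffix such as ", Esq." leaves five characters after the comma, so the name would be classified as Last, First Middle.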

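To use the computation, simply place the full name in the variable FullName-t, then invoke the computation. After the computation has run, the broken-down elements will be found in the variables FirstName-t, MiddleName-t, LastName-t, and Suffix-t. Note that when a First Middle Last name carries a suffix, the computation also removes the suffix from FullName-t itself.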

The Nitty-Gritty. For names in the format Last, First Middle, the computation can easily determine the last name as everything before the comma. If the remainder contains a second comma, everything after it becomes the suffix. The rest of the name is then inspected for spaces. If a space is found, the name is split at that space to form the first and middle names. The middle name is then inspected for a space. If one is found, it is assumed that the first half belongs back with the first name.
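
The Last, First Middle steps can be sketched in Python as below. This is an illustrative port under assumptions, not HotDocs code: the function name split_last_first is hypothetical, and the sketch strips surrounding whitespace, whereas the script assumes exactly one space after each comma.

```python
def split_last_first(full_name: str) -> tuple:
    """Split "Last, First Middle[, Suffix]" into (first, middle, last, suffix)."""
    # Everything before the first comma is the last name
    last, _, rest = full_name.partition(",")
    # A second comma, if present, introduces the suffix
    rest, _, suffix = rest.strip().partition(",")
    rest, suffix = rest.strip(), suffix.strip()
    # Split the remainder at its first space into first and middle names
    first, _, middle = rest.partition(" ")
    # If the middle name still contains a space, its first word goes
    # back onto the end of the first name
    if " " in middle:
        extra, _, middle = middle.partition(" ")
        first = f"{first} {extra}"
    return first, middle, last.strip(), suffix
```

For example, split_last_first("Johnson, Sally Ann Pauline, Jr.") yields ("Sally Ann", "Pauline", "Johnson", "Jr.").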

The First Middle Last format is more difficult. The name is split at the first space to form the first and last names. We then inspect the last name for a period ".", which presumably marks the end of the middle initial. If one is found, everything after the period becomes the last name and everything up to and including the period becomes the middle name. We then inspect the middle name for a space. If one is found, there is an extra name, which is shifted from the middle name to the end of the first name.
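
The middle-initial rule can be sketched in Python as follows. This is an illustrative port, not HotDocs code; the function name split_at_middle_initial is hypothetical, and the sketch handles only the case where a period is present, returning the unsplit remainder as the last name otherwise.

```python
def split_at_middle_initial(full_name: str) -> tuple:
    """Split "First [Extra] M. Last" into (first, middle, last)."""
    # Split at the first space: first name, and a provisional last name
    first, _, rest = full_name.partition(" ")
    dot = rest.find(".")
    if dot == -1:
        # No period: this rule does not apply
        return first, "", rest
    # Everything up to and including the period is the middle name
    middle = rest[:dot + 1]
    # Everything after the period (minus the separating space) is the last name
    last = rest[dot + 1:].lstrip()
    # An extra word before the initial moves to the end of the first name
    if " " in middle:
        extra, _, middle = middle.partition(" ")
        first = f"{first} {extra}"
    return first, middle, last
```

With "Sally Ann P. Johnson" as input, the sketch returns ("Sally Ann", "P.", "Johnson"), matching the parsed example listed earlier.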

If there is no period in the last name, then we inspect it for a space. If one exists, the last name will be split into middle and last names. The last name is again inspected for a space. If one is found, a judgment call must be made: does the extra name belong with the last name (as in "Van Winkle") or should it be shifted out of the last name? The decision is based on the length of the first portion of the last name. If it is just two or three characters long (de, dos, Van, Von, etc.) we will assume that it belongs with the last name and leave it as-is. Otherwise, we shift it to the middle name position and append the current middle name to the end of the first name.
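
The judgment call described above can be sketched in Python as below. This is an illustrative port, not HotDocs code; the function name split_without_initial is hypothetical, and the two-or-three-character test mirrors the script's check on the position of the space.

```python
def split_without_initial(full_name: str) -> tuple:
    """Split "First Middle Last" (no middle initial) into (first, middle, last)."""
    first, _, rest = full_name.partition(" ")
    if " " not in rest:
        # Only two elements: no middle name
        return first, "", rest
    # Split the remainder into middle and last names
    middle, _, last = rest.partition(" ")
    if " " in last:
        head, _, tail = last.partition(" ")
        # A first word of two or three characters (de, dos, Van, Von) is
        # assumed to belong to the last name; anything else is shifted left
        if len(head) not in (2, 3):
            first = f"{first} {middle}"
            middle, last = head, tail
    return first, middle, last
```

The sketch reproduces the parsed examples listed earlier: "Sally Ann de Johnson" keeps "de Johnson" as the last name, while "Sally Ann Pauline Johnson" shifts "Ann" into the first name.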

This is still an imperfect process, but it will produce very good results for most names.
