Another early morning question posed on Facebook…
The person was trying to use Word’s wildcard find and replace on citation strings of the form Authorname nnnn, Authorname nnnn, Authorname nnnn, Authorname nnnn (i.e. any author’s name followed by a 4-digit year, such as Smith 2005 or Jones 1997, then a comma, then the next author’s name, and so on). He wanted to convert all the comma separators to semicolons, ending up with Authorname nnnn; Authorname nnnn; Authorname nnnn; Authorname nnnn. (I’ve italicised the text for clarity; it wasn’t in his original.)
Wildcard find/replace is all about spotting the pattern in the text, then working out how to express that pattern in the Find field so you match only what you want, and in the Replace field so you end up with what you want.
In this example, an author name always ends in a lower case letter and is followed by a space, then four digits for the year, a comma, a space, then an upper case letter that starts the next author’s name. The last item in the list doesn’t quite match the pattern (no comma, space, or upper case letter follows it), but that one doesn’t need to change, so we can ignore that variation from the pattern. He wanted to keep everything except the comma, which he wanted to change to a semicolon.
Here’s how I solved it using Word’s wildcard find and replace (there may be a more elegant solution, but this one worked for me):
- Find: ([0-9]{4})(,)( )([A-Z])
- Replace: \1;\3\4
If you need to use this, I suggest you copy it as there’s a space in the third set of parentheses that you can’t see.
How this works:
- Find: Look for any digit from 0 to 9 ([0-9]) repeated exactly four times ({4}). This is the first element, so it is surrounded by parentheses. Then look for a comma (another element, so also surrounded by parens). Next look for a space (wow, more parens), and finally look for any upper case letter ([A-Z]); as it’s a separate element, surround it with parens too.
- Replace: Replace the first element (the 4-digit number) with itself (that’s the \1 bit), then type a semicolon, then replace the third and fourth elements of the find with themselves (i.e. \3\4). The second element (the comma) isn’t referenced in the replace, so it is dropped.
You keep everything you don’t want to change (elements 1, 3, and 4) and only change the second element by typing a semicolon in between elements 1 and 3.
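If you want to try the same idea outside Word, here is a minimal Python sketch. It assumes the text is available as a plain string; the function name `commas_to_semicolons` and the sample citations are made up for illustration. Word’s wildcard syntax is not the same as regular expressions in general, but this particular find and replace pair happens to be valid in both.

```python
import re

# Same four elements as the Word pattern:
# (4 digits)(comma)(space)(upper case letter)
PATTERN = re.compile(r"([0-9]{4})(,)( )([A-Z])")


def commas_to_semicolons(text: str) -> str:
    """Swap the comma after a 4-digit year for a semicolon (hypothetical helper)."""
    # Keep elements 1, 3 and 4; element 2 (the comma) is replaced by ';'.
    return PATTERN.sub(r"\1;\3\4", text)


sample = "Smith 2005, Jones 1997, Brown 2010, Lee 1988"
print(commas_to_semicolons(sample))
# Smith 2005; Jones 1997; Brown 2010; Lee 1988
```

As in Word, the last item is left alone because nothing after it matches the comma, space, upper case letter part of the pattern.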