Posts Tagged ‘find and replace’


Word: Transpose Surname, Firstname to Firstname Surname

December 26, 2017

I came across a heap of names styled ‘Surname,<space>Firstname’ (e.g. Smith, Jane) and needed to change them to ‘Firstname Surname’ (i.e. Jane Smith).

As with any find/replace operation, identifying the pattern is the first step. Once you’ve done that, the rest is pretty easy. In this example, the pattern was clear — each surname and first name started with a capital letter followed by one or more lower case letters, there was a comma after the surname, and then a space before the start of the first name. Each surname only had a single first name. Because names vary in length, I needed to use wildcards to specify matching the pattern for any number of letters.

Below is what I came up with for this swap — others more clever than me may have a more elegant way to do this, but this worked for me.

CAUTIONS and WARNINGS:

  • I don’t advise doing a ‘replace all’ with this — if there’s anything else that matches the pattern that ISN’T a name, it will get changed too.
  • This find/replace only finds whole names with a single capital letter (i.e. it finds Smith, Jones, Haythornthwaite, Jane, Rosemary, Jonathan). It does NOT find names with more than one capital (e.g. McDonald, AnnMarie) or with an apostrophe (e.g. O’Malley).
  • Hyphenated words are found, but transpose incorrectly (e.g. Smith, Jane-Ann changes to Jane Smith-Ann not Jane-Ann Smith; similarly Jones-Brown, John changes to Jones-John Brown).
  • Surnames with a first and middle name or initial will be found but transposed incorrectly (e.g. Smith, Jane K. Susan will become Jane Smith K. Susan instead of Jane K. Susan Smith). Surnames with an initial letter instead of a first name will not be found (e.g. Philips, A. is not found).
  • Names separated with anything other than a comma, or that have two or more spaces between the comma and the first name will NOT be found.
  • Names with accents, umlauts, and other diacritical marks over letters (e.g. René) are found and transposed correctly.

Despite all the cautions and warnings above, if you have a long list of names to change, then you could run this find/replace, replacing one at a time and manually fixing the others that aren’t found or that will transpose incorrectly. It’s still quicker than doing them all manually.

Steps:

  1. Save your document.
  2. Press Ctrl+H to open the Find/Replace window.
  3. Click the More button.
  4. Select Use Wildcards.
  5. In the Find What field, enter this (copy it from here and paste as there’s a space in the string of characters): (<[A-Z])([a-z]@>)(, )(<[A-Z])([a-z]@>)
  6. In the Replace With field, enter this (again, copy/paste as there’s a space in here that’s hard to see): \4\5 \1\2
  7. Click Find Next.
  8. Click Replace if it finds a name you want to transpose; if not, click Find Next to go to the next one. (Note: Replace All is super powerful and you could change things you don’t want to, so err on the side of caution and click Find Next > Replace > Find Next until all are done.)

Explanation:

  • Parentheses surround each ‘element’ of the find. These are represented by numbers in the replace (i.e. the 4th set of parentheses in the find becomes \4 in the replace).
  • < indicates the beginning of a word; > indicates the end of a word
  • [A-Z] looks for any upper case letter; [a-z] looks for any lower case letter
  • @ looks for one or more occurrences of the instruction immediately previous (e.g. [a-z]@> looks for one or more lower case letters up to the end of a word, which covers the varying length of names)
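
For anyone who also works outside Word, here is a rough Python analogue of the same find/replace. It is a sketch, not an exact translation: it assumes Python's re module, with \b standing in for Word's < and > and + standing in for @, and the name transpose_name is made up for illustration. Python's [a-z] only covers unaccented letters, so a name like René is left alone here, and other edge cases (such as apostrophes) may not behave the way they do in Word.

```python
import re

# Rough analogue of the Word wildcard string:
#   (<[A-Z])([a-z]@>)(, )(<[A-Z])([a-z]@>)
# \b approximates Word's < and > (word start/end); + approximates @.
NAME_PATTERN = re.compile(r"\b([A-Z])([a-z]+)\b(, )\b([A-Z])([a-z]+)\b")


def transpose_name(text: str) -> str:
    """Swap 'Surname, Firstname' to 'Firstname Surname' wherever the pattern matches."""
    # Same idea as Word's replace string: \4\5 \1\2
    return NAME_PATTERN.sub(r"\4\5 \1\2", text)


if __name__ == "__main__":
    for sample in ["Smith, Jane", "Smith, Jane-Ann", "McDonald, Jane"]:
        print(sample, "->", transpose_name(sample))
```

Note that sub() behaves like Replace All, so the same caution applies: anything that matches the pattern gets changed, name or not.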

 


Word: Find duplicated words

December 6, 2017

This find/replace is based on Paul Beverley’s work, so full acknowledgement to him for teaching me how to do this via his YouTube videos and his free book.

********

Some of my authors inadvertently type the same word twice (e.g. is is, the the), and it’s often hard to pick these up when editing. If you run spellcheck, you may find them, but there’s no guarantee of that. The find and replace below uses wildcards to find any word that is immediately repeated, where the two copies are separated by one or more spaces or common punctuation marks, and replaces the pair with a single copy of the word.

NOTE: This find/replace only finds words with the exact same case, so it will find ‘the the’, ‘THE THE’, and ‘The The’, but it won’t find instances where each word has the same letters but with different cases (e.g. ‘the The’, ‘The the’, ‘tHe thE’ etc.)

Steps:

  1. Press Ctrl+H to open the Find and Replace dialog box.
  2. Click More, then select the Use wildcards option.
  3. In the Find field, type: (<[A-Za-z]@)[ ,.;:]@\1>
    (Note: There’s a space in there, so I suggest you copy this Find string.)
  4. In the Replace field, type: \1
  5. Click Find Next then click Replace. Repeat.

 

How this works — at least how I *think* it works:

  • Find: Look for the start of any word (<) made up of one or more (@) letters ([A-Za-z]); the parentheses make this the first element. Then look for one or more (@) spaces or punctuation marks ([ ,.;:]), followed by the same word again (\1) running up to the end of a word (>).
  • Replace: Replace everything that was found with the first element (that’s the \1 bit), which effectively deletes the separator and the second copy of the word.
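
The same idea can be sketched in Python, assuming the re module, where \b plays the role of < and > and + plays the role of @. The function name remove_duplicate_words is hypothetical, and this is only an approximation of what Word does.

```python
import re

# Rough analogue of the Word wildcard string: (<[A-Za-z]@)[ ,.;:]@\1>
# Group 1 captures a whole word; \1 then demands the identical word again.
DUPLICATE_PATTERN = re.compile(r"\b([A-Za-z]+)[ ,.;:]+\1\b")


def remove_duplicate_words(text: str) -> str:
    """Collapse 'word word' (same case) into a single 'word'."""
    # Replacing the whole match with \1 drops the separator and the second copy.
    return DUPLICATE_PATTERN.sub(r"\1", text)


if __name__ == "__main__":
    print(remove_duplicate_words("This is is the the end."))
```

As with the Word version, the match is case-sensitive, so ‘the The’ is left untouched.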

[Links last checked December 2017]


Word: Find a year followed by a comma and replace with a semicolon

November 22, 2017

Another early morning question posed on Facebook…

The person was trying to use Word’s wildcard find and replace to convert all strings of Authorname nnnn, Authorname nnnn, Authorname nnnn, Authorname nnnn (i.e. any author’s name, followed by a 4-digit number for a year, such as Smith 2005, Jones 1997, etc., followed by a comma, followed by another author’s name etc.). He wanted to convert all the comma separators to semicolons, ending up with Authorname nnnn; Authorname nnnn; Authorname nnnn; Authorname nnnn. (I’ve italicised the text for clarity — it wasn’t in his original.)

Wildcard find/replace is all about finding the pattern and then figuring out how best to interpret that pattern in a meaningful way in how you search for what you want, and how you replace it with what you want.

In this example, an author name always ends in a lower case letter, is followed by a space, then four digits for the year, a comma, a space, then an upper case letter for the next author’s name. The last item in the list doesn’t quite match the pattern (no comma, space, upper case letter following it), but that one doesn’t need to change so we can ignore that variation to the pattern. He wanted to keep everything except the comma, which he wanted to change to a semicolon.

Here’s how I solved it using Word’s wildcard find and replace (there may be a more elegant solution, but this one worked for me):

  • Find: ([0-9]{4})(,)( )([A-Z]) 
  • Replace: \1;\3\4

If you need to use this, I suggest you copy it as there’s a space in the third set of parentheses that you can’t see.

How this works:

  • Find: Look for any digit from 0 to 9 [0-9] repeated exactly four times {4}. This is the first element and is surrounded by parentheses. Then look for a comma (another element, so also surrounded by parens). Next look for a space (wow, more parens), and finally look for any upper case letter [A-Z] and, as it’s a separate element, surround it with parens too.
  • Replace: Replace the first element (the 4-digit number) with itself (that’s the \1 bit), then a semicolon, then replace the third and fourth elements of the find with themselves (i.e. \3\4).

You keep everything you don’t want to change (elements 1, 3, and 4) and only change the second element by typing a semicolon in between elements 1 and 3.
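
Here is a minimal Python sketch of the same swap, assuming the re module; this particular pattern happens to carry over almost character for character. The function name commas_to_semicolons is made up for illustration.

```python
import re

# Same four elements as the Word string: ([0-9]{4})(,)( )([A-Z])
YEAR_COMMA_PATTERN = re.compile(r"([0-9]{4})(,)( )([A-Z])")


def commas_to_semicolons(text: str) -> str:
    """Turn 'Smith 2005, Jones 1997' into 'Smith 2005; Jones 1997'."""
    # Keep elements 1, 3 and 4; a typed semicolon takes the place of element 2.
    return YEAR_COMMA_PATTERN.sub(r"\1;\3\4", text)


if __name__ == "__main__":
    print(commas_to_semicolons("Smith 2005, Jones 1997, Brown 2010"))
```

A year followed by a comma and a lower case word (e.g. ‘in 2005, we’) doesn’t match the pattern, so it is left as it is.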

 

 


Word: Wildcard replace with a backslash

November 22, 2017

This morning, well before I was properly awake, I solved a problem someone had posed on a Facebook group I’m in. They had an issue getting Word’s wildcard find and replace to do what they wanted and had asked members of the group to help. I’m writing this up for my own future reference as there’s some information in here about the peculiarities of the backslash character that I may need to use again in the future. [Random fact: The backslash character is known by several names, including the reverse virgule and the reverse solidus.]

The person was trying to find an easy way to find all instances of 3x and replace with 3\x\. Actually, she was trying to do more than that — if she’d only been looking for that, then a normal find/replace should work. For the rest of the string, however, she really needed to use wildcards. Where she was getting stuck was defining the Find correctly, and then the Replace.

Here’s my solution (using wildcards):

  • Find: (3)(x) 
  • Replace: \1^92\2^92

How this works:

  • First, look for 3 followed immediately by x. I separated them in the Find string with parentheses so that I could treat them as separate elements in the Replace string.
  • Next, for the replace, type \1 to replace the first element (the 3) with itself, then type ^92 to add a backslash character (you can’t type a \ as that won’t work), then \2 to replace the second element of the Find with itself (i.e. the x), then another ^92 for a final backslash character.

Two things to note:

  • The backslash is an escape character in a Find, so if you need to find one, you need to surround it with square brackets and ‘escape’ it — i.e. [\\] in a Find.
  • The backslash is a special character in Replace too as it designates the element you want to replace with itself. Instead, you have to use ^92 in place of a \.
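
Python’s re module has a similar quirk, so here is a small sketch for comparison. It is not Word syntax: ^92 is Word-only, and in a Python replacement string a doubled backslash stands for one literal backslash instead. The function name add_backslashes is hypothetical.

```python
import re


def add_backslashes(text: str) -> str:
    """Replace every '3x' with '3', a backslash, 'x', and another backslash."""
    # In the replacement template, \1 and \2 are the captured elements and
    # each \\ produces one literal backslash (the role ^92 plays in Word).
    return re.sub(r"(3)(x)", r"\1\\\2\\", text)


if __name__ == "__main__":
    print(add_backslashes("3x"))
```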

 


Word: Switch the number and punctuation order

October 29, 2017

On another blog post, Peter asked for some help:

I have hundreds of superscript characters (not footnote markers) that have a space before them and punctuation (periods and commas only) after them. I’m trying to delete the space and move the punctuation in front of it.

You can do this using a find/replace with wildcards. However, the instructions below DON’T differentiate between numbers that are superscripted and numbers that aren’t, so it will switch those too. If you don’t have any instances of <space>single ordinary number<period or comma>, then you should be fine. I suggest you try this on a COPY of your document and make sure you get what you want and nothing more, before using it on your main document.

Steps:

  1. Press Ctrl+H to open the Find and Replace dialog box.
  2. Click More, then select the Use wildcards option.
  3. In the Find field, type: ( )([0-9])([.,])
    (Note: There’s a space inside the first set of parentheses. Because you have hundreds of these, there’s a good chance that you won’t have just single digit numbers; for multi-digit numbers, use [0-9]@ in the second set of parentheses instead, i.e. ( )([0-9]@)([.,]) as the whole Find string.)
  4. In the Replace field, type: \3\2
  5. Click Find Next then click Replace. Repeat.

(Note: Only click Replace All if you are certain that no other ordinary numbers will be affected.)

What you are doing here is looking for a space (item 1), followed by any single digit number (item 2), followed by either a period or a comma (item 3). Then you’re replacing that string with the period or comma (item 3) then the number (item 2).
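
A rough Python equivalent of the multi-digit version is sketched below, assuming the re module with + in place of @. The function name move_punctuation is made up for illustration. Like the Word version, it knows nothing about superscript formatting, so it treats every number the same way.

```python
import re


def move_punctuation(text: str) -> str:
    """Drop the space before a number and move the trailing period/comma in front of it."""
    # Element 1 is the space, element 2 the digits, element 3 the period or comma.
    # The replacement keeps elements 3 and 2 (in that order) and discards the space.
    return re.sub(r"( )([0-9]+)([.,])", r"\3\2", text)


if __name__ == "__main__":
    print(move_punctuation("as shown earlier 12. Next"))
```

This also shows why testing on a copy matters: ordinary text such as ‘costs 3.50’ matches the pattern too and would come out as ‘costs.350’.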

 


Word: Find and replace multiple spaces after punctuation

December 21, 2016

You receive a document that has multiple spaces after standard punctuation — periods, commas, semicolons, colons, question marks, and exclamation marks. Sometimes the author used two spaces, sometimes three, sometimes five!

How to fix it?

Well, you can run several find/replace routines but as the number of spaces is unknown and as there are many types of punctuation, that could take quite a lot of time. Instead, you can use a wildcard find/replace routine to find them all at once, keep the punctuation, and replace the unknown number of spaces with a single space. Here’s how (for your own peace of mind, test this on a COPY of your document first):

  1. Press Ctrl+H to open the Find and Replace dialog box.
  2. Click the More button to show more find/replace options.
  3. Select the Use wildcards checkbox.
  4. In the Find what field, type: ([,.;:\?\!])( {2,9})
    NOTE: There’s a single space before the {2 — make sure you include that. To be safe, copy the ‘code’ in this step, and paste it into your Find what field.
  5. In the Replace with field, type: \1 followed immediately by a single space.
  6. Click Find Next to find the first instance, then Replace to replace the multiple spaces with a single space.
  7. Repeat step 6 as many times as you need to be confident that it’s finding the right things. Once you’re confident, click Replace All to run through the whole document and fix all instances.

[Screenshot: Use wildcards to find and replace multiple spaces after defined punctuation]

Explanation for how this works:

  • ([,.;:\?\!]) looks for any of the listed punctuation characters. Question and exclamation marks are special cases and need to be ‘escaped’ with a \. The parentheses define this as the first section of the Find, so that you can refer to it as \1 in the Replace.
  • ( {2,9}) looks for a run of between two and nine spaces (you can put whatever numbers you like inside the curly braces; if you think you might have some instances of punctuation followed by 15 spaces, then change these numbers to {2,20}, for example). Again, this section is surrounded by parentheses to define it as a separate section.
  • \1 replaces the first part of the wildcard string with itself. In other words, the punctuation character found is replaced with itself, so no change apparently occurs.
  • The space after \1 replaces the multiple spaces found in the second part of the wildcard string with a single space.
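
For comparison, here is a minimal Python sketch of the same clean-up, assuming the re module. The function name collapse_spaces is hypothetical. Inside a Python character class the question and exclamation marks don’t need escaping, which is one of the places where the two syntaxes differ.

```python
import re


def collapse_spaces(text: str) -> str:
    """Replace two to nine spaces after punctuation with a single space."""
    # Element 1 is the punctuation mark; element 2 is the run of spaces.
    # The replacement is element 1 followed by exactly one space.
    return re.sub(r"([,.;:?!])( {2,9})", r"\1 ", text)


if __name__ == "__main__":
    print(collapse_spaces("Wait!  What?   Yes.     Done"))
```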

Word: Find expanded text or spaces

July 26, 2016

Problem

Today I edited an activity guide. It had a formatting oddness that took me a while to figure out. Every so often (not consistently, but at least once or twice per paragraph), there would be a single space that looked like a double space.

It took me forever to figure out what the problem was (Expanded font style), then research how to fix it. I couldn’t find anything that indicated that I could do a global search & replace. If anyone knows a way to search & replace on particular formatting on Word, I’d love to know.

Solution

In Word for Windows, you can search for expanded text and replace it with normal, BUT you need to know how much it’s expanded by, and hope that all is expanded to the same degree.

In the screenshot below, some spaces (highlighted in green) are expanded by 2 pt. All others are not expanded. The yellow highlight shows an instance of a normal space followed by a ‘Y’ so you can see the difference between that and the green one with the expanded space in front of another ‘Y’. These things are hard to see, so make sure your formatting marks are turned on and zoom in — I zoomed in to 150% in this example.

[Screenshot: expanded spaces highlighted in green and a normal space highlighted in yellow]

  1. Select one of the expanded spaces and check the Font settings > Advanced tab to find out what degree of expansion is used (e.g. 1 pt, 1.1 pt, 2 pt etc.). Write it down.
  2. Go to the Find and Replace dialog box (Ctrl+H), then the Replace tab.
  3. Type the space into the Find what field.
  4. Click More.
  5. Click Format > Font.
  6. Click the Advanced tab and select Expanded from the Spacing options, then enter the point size you found out earlier into the By field.
  7. Click OK to return to the Replace tab — you should have ‘Expanded by xx pt’ below the Find what field.
  8. Go to the Replace with field, type a space, then More > Format > Font > Advanced tab, select Spacing = Normal.
  9. Click OK to return to the Replace tab. The Replace with field should have ‘Not Expanded by /Condensed by’ below it.
  10. Click Find Next and then Replace to find each expanded space and replace it with a normal space (if you’re confident, click Replace All).