Posts Tagged ‘find and replace’

Word: Carriage returns, not paragraph returns

August 28, 2012

One of my readers emailed me with a problem they couldn’t figure out. Here’s a summary of the problem:

I’ve been working on a macro to clean up a document converted from PDF. A lot of these conversions place a pilcrow at the end of each line, I guess because Word isn’t sure what else to do with a line break in a PDF.

Part of the macro runs a find and replace that deletes most of the pilcrows but retains them where it recognizes a paragraph. What’s important here, and what I hope you can help with, is that when it makes a new pilcrow (I’m using ^13 in the find & replace), it treats it like a carriage return. The symbol on the screen is definitely a pilcrow—the same, filled-in black pilcrow I’m used to seeing at the end of a paragraph. I went into the xml of the document, and sure enough there was <cr> instead of <p>.

The paragraphs are all set to 12pts after, but none of them are actually displaying that space. The printed version is the same. It’s not a display issue. It’s just that Word is accepting the command to insert a pilcrow, inserting it, but treating it like a carriage return.

I found a solution, which is to search the document for pilcrows, select them, then type a paragraph. But I’d really like to understand what’s happening (and why) with that find and replace. Why is manually typing a paragraph working while find & replace is not?

My first step in trying to solve the problem was to create a new Word document, add a couple of soft line returns and some normal paragraph returns, then do a find/replace on the soft line returns (^l – that’s a lower case ‘L’) replacing them with ^13. Sure enough, the pilcrow looks the same, but the paragraph formatting (paragraph above/below spacing mostly) wasn’t quite right. When I saved the doc as XML, I found ‘cr’ instead of ‘p’ where I’d done this find/replace, just as my worried reader had found.

I know that ^13 is often required when you’re doing a find/replace with wildcards as ^p doesn’t work (or at least, it doesn’t work under most circumstances).

I did the test again, this time replacing ^l (soft line return) with ^p (hard line return), and the paragraph formatting changed to what it should be for each paragraph.

I did a final test. I added in some soft line returns, replaced ^l with ^13, then to see if ^p would work, I replaced ^13 with ^p and it worked! Those pesky carriage returns created from the ^l to ^13 replace changed to hard line returns when I replaced them with ^p.

Now I’m not sure how all this would work in the macro my reader had created, but at least they now have something they know will work.

All this testing was just in the normal Find/Replace dialog — no special settings at all.

Interestingly, Wikipedia says that ‘In ASCII and Unicode, the carriage return is defined as 13’ (http://en.wikipedia.org/wiki/Carriage_return) but says nothing about the code for paragraphs/hard line returns.

[Link last checked August 2012]

Word: More find and replace with wildcards

April 20, 2012

I needed to make a global change to some text that the author had written incorrectly (i.e. not according to our house style guide).

Scenario

They had written [Ref 1] for a reference instead of our style of [Ref. 1], which not only has a full stop (period) after the f but also uses a non-breaking space to separate the full stop and the following numeral. All the numbers were different. Some were a single number, while those from References 10 and higher were two-digit numbers. There were a LOT of references like this in the document.

So I used my knowledge of find and replace wildcards to globally make the change in just a few seconds.

Solution

This solution works in Word 2003, Word 2007, and Word 2010.

  1. In the Word document where you want to make this change, press Ctrl+H to open the Find and Replace dialog box; the Replace tab should be in focus.
  2. In the Find what field, type: (Ref)( )([0-9])
    Note: There is only ONE space in this string; it’s between the two parentheses that don’t appear to enclose anything. The 0 in 0-9 is a zero, not the letter ‘o’.
  3. In the Replace with field, type: \1.^s\3
    Note: There are NO spaces in this string and the s must be in lower case. Don’t forget the full stop!
  4. Click the More button.
  5. Select the Use wildcards check box.
  6. Click Find Next then click Replace to test that it works fine. If so, click Replace All.

Explanation for how this works:

  • (Ref) looks for the string of letters: Ref
  • ( ) looks for a space immediately after Ref (i.e. no punctuation)
  • ([0-9]) looks for any number that follows immediately after Ref and its following space. The square brackets indicate a range — in this case any number from 0 to 9 will be found. It doesn’t matter whether the numbers are one or two-digit numbers — the critical thing the find/replace is looking for is any numeral after Ref<space>.
  • \1 replaces the first part of the Find string with itself (in other words, Ref gets replaced with Ref)
  • .^s replaces the second part of the Find string (the space) with a full stop followed immediately by a non-breaking space. For a non-breaking space, you MUST use a lower case s and precede it with the ^ (Shift+6). Note: If you just want an ordinary space, not a non-breaking one, then use .\2 instead of .^s (keep the full stop).
  • \3 replaces the third part of the Find string with itself (in other words, the number found gets replaced with the same number).
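
If you ever need to make the same change outside Word, the three-part Find string carries over almost unchanged to a regular expression. This is only a rough Python sketch: Python’s re module is not Word’s wildcard engine, and the function name fix_refs and the NBSP constant are made up for this illustration.

```python
import re

NBSP = "\u00a0"  # non-breaking space, the character Word's ^s stands for

def fix_refs(text: str) -> str:
    """Turn 'Ref 1' into 'Ref.' + non-breaking space + '1'."""
    # Same three groups as the Find string: (Ref)( )([0-9]).
    # \1 puts 'Ref' back, the space group is dropped, \3 puts the digit back.
    return re.sub(r"(Ref)( )([0-9])", "\\1." + NBSP + "\\3", text)

# repr() makes the non-breaking space visible as \xa0
print(repr(fix_refs("See [Ref 1] and [Ref 12].")))
```

As in Word, only the first digit is matched, so two-digit reference numbers come through untouched after the inserted full stop and space.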

Word: Delete tabs and page numbers from the end of a paragraph

February 13, 2012

One of my colleagues wanted me to grab all the tables of contents (TOCs) out of 18 separate chapters of a really long report and put them in a single document that she could share with the stakeholders. She only wanted the outline numbering and the heading titles, down to three TOC levels.

I copied each table of contents and pasted it as plain text into a new document. That preserved the outline numbering, followed by a tab, then the heading title, but it also added a tab after the heading title and the page number, neither of which were required. As the plain text version of the 18 chapter TOCs came to well over 12 pages (!), there were several hundred lines, one for each TOC entry. I could manually delete these tabs and page numbers, but that was going to get very tedious very quickly.

What I wanted was a single command to get rid of the end tabs followed by the page number. But I needed to keep the tabs after the outline numbers.

Find and replace wildcards to the rescue!

My first attempt only found the single digit page numbers and replaced them, so I tested a bit more to find a way to delete page numbers no matter what their length.

Here’s how:

  1. Open Word’s Find and Replace dialog box (Ctrl+H).
  2. Click More to show more options.
  3. Select the Use wildcards check box.
  4. In the Find what field, type: (^t)([0-9]*)(^13)
    Note: There are NO spaces in this string, the t must be lower case, and there’s an asterisk (*) immediately after the [0-9] bit.
  5. In the Replace with field, type: ^p
    Note: The p must be in lower case.
  6. Click Find Next, then click Replace to test that it works fine. If so, click Replace All.

Explanation for how this works:

  • (^t) looks for a tab character; you MUST use a lower case t and precede it with the ^ (Shift+6). Because you are using wildcards, you need to surround the characters you want to find in parentheses.
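  • ([0-9]*) looks for a numeral that follows immediately after the tab character. The square brackets indicate a range, in this case any numeral from 0 to 9. In Word’s wildcard syntax the asterisk matches any string of characters, so [0-9]* means ‘a numeral followed by whatever comes before the next part of the Find string’; because the next part is the paragraph marker, this picks up all the one, two, three etc. digit page numbers. (To insist on numerals only, use [0-9]@ instead, as @ means one or more of the preceding character or range.)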
  • (^13) looks for the paragraph marker immediately following the page number. Note: When using wildcards you can’t use the usual ^p for the paragraph marker — you MUST use ^13 (the control code for a carriage return). See: http://office.microsoft.com/en-us/word-help/find-and-replace-text-or-other-items-HA001230392.aspx#_Toc282602054
  • ^p in the Replace field replaces everything found (the tab followed by the page number[s] followed by the paragraph mark) with a paragraph mark, effectively deleting the tab and page number(s).
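
The same clean-up can be sketched outside Word for a TOC saved as plain text. This is a Python approximation, not the Word method itself: it assumes each TOC entry is on its own line ending in a tab and a page number, and strip_page_numbers is a name invented for the example.

```python
import re

def strip_page_numbers(toc: str) -> str:
    """Remove a trailing tab + page number from every line of a plain-text TOC."""
    # \t[0-9]+ is a tab followed by one or more digits; with MULTILINE,
    # $ anchors the match at the end of each line, so the tab after the
    # outline number is left alone.
    return re.sub(r"\t[0-9]+$", "", toc, flags=re.MULTILINE)

toc = "1.1\tIntroduction\t5\n1.2\tScope and limits\t112\n"
print(strip_page_numbers(toc))
```

Because the digits must run right up to the end of the line, a heading that happens to start with a number is not swallowed along with the page number.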

It worked like a charm! My colleague was super impressed and I learned something new — can’t ask for more than that!

[Links last checked February 2012]

Word: Insert a space between a number and a letter

February 8, 2012

A document I edited the other day was peppered with hundreds of values followed immediately by the unit of measure (e.g. 5km, 20mm, 50m/s etc.). Our house style follows the Australian Style Manual, which is to have a space between almost every value and its unit of measure (e.g. 5 km, 20 mm, 50 m/s).

While I could run several find and replace (F/R) passes looking for various measurement units (e.g. km) and then replacing them with a space followed by the measurement unit, there were a LOT of different units used and some, such as ‘m’ for meter, weren’t easy to catch using the normal F/R methods — either that, or I caught more than I wanted and ended up with two spaces in front of any word starting with ‘m’. Doing 20+ passes of F/R hoping to get most of the measurement units didn’t seem like a lot of fun.

So I applied some of the knowledge I’ve learned recently about wildcard F/R to catch them all. I still had to replace them one at a time as I needed to avoid adding a space between legitimate instances of numbers followed by a letter (e.g. chemical symbols such as H2S). But using the wildcard search made it much easier to find every instance of a number followed immediately by a letter, and replace the relevant ones with the same number followed by a space then the same letter.

Here’s how:

  1. Open Word’s Find and Replace dialog box (Ctrl+H).
  2. Click More to show more options.
  3. Select the Use wildcards check box.
  4. In the Find what field, type: ([0-9])([A-z])
  5. In the Replace with field, type: \1 \2
    Note: There’s ONE space immediately after the \1, so make sure you type that too.
  6. Click Find Next, then click Replace to insert a space into each one that’s relevant. Repeat until you’ve done them all.

Explanation for how this works:

  • ([0-9]) looks for any numeral. This string defines the first section of the Find. Because you are using wildcards, you need to surround the characters you want to find in parentheses. The square brackets indicate a range — in this case any numeral from 0 to 9 will be found.
  • ([A-z]) looks for any letter, upper or lower case. This string defines the second section of the Find. Again, you define the range of letters to be found with square brackets; the [A-z] means any upper case letter from A through the lower case letters to z.
  • \1 replaces the first part of the wildcard string with itself. In other words, the numeral found is replaced with itself, so no change apparently occurs.
  • The single space after \1 adds a space between the two parts found.
  • \2 replaces the second part of the wildcard string with itself. In other words, the letter found is replaced with itself, so no change apparently occurs.
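
To see the two-group idea in isolation, here is a small Python sketch of the same find and replace. It is an approximation only (Python’s re module rather than Word’s wildcards), it spells the letter range out as [A-Za-z], and add_unit_space is a hypothetical name.

```python
import re

def add_unit_space(text: str) -> str:
    """Insert a space between a digit and the letter that follows it."""
    # ([0-9]) is group 1 (the digit), ([A-Za-z]) is group 2 (the letter);
    # the replacement puts both back with one space between them.
    return re.sub(r"([0-9])([A-Za-z])", r"\1 \2", text)

print(add_unit_space("5km, 20mm and 50m/s"))
print(add_unit_space("H2S"))  # chemical symbols get split too
```

The second call shows why a blanket Replace All is risky here: the pattern can’t tell a unit of measure from a chemical symbol, so each hit still needs a human decision.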

However, what I couldn’t figure out how to do was make the space I inserted a non-breaking space. In a normal F/R, I’d use ^s for a non-breaking space, but I couldn’t figure out how to add that successfully to the Replace field. Anything I tried just put in ^s or (^s) as text, NOT as a non-breaking space. Anyone know how?

[Links last checked August 2012]

Word: Find and replace multiple spaces between words

January 6, 2012

Your document has varying numbers of spaces between words — for example, 2, 3, 4 etc. spaces instead of just one space. You want to be able to find all multiple spaces after any word and replace them with a single space in one Find/Replace action.

NOTE: This find/replace DOES NOT look for multiple spaces in front of numbers, or after punctuation characters — only letters.

Here’s how to do it.

  1. Open Word’s Find and Replace dialog box (Ctrl+H).
  2. Click More to show more options.

  3. Select the Use wildcards check box.

  4. In the Find what field, type: ([A-Za-z])( {2,9})
    Note: There’s ONE space in this text — it’s between the second ( and the {. To be safe, copy the text from Step 4 and paste it into your Find what field.
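  5. In the Replace with field, type: \1<space>
    Note: There’s ONE space immediately after the \1, so make sure you type that too. I’ve indicated it with <space>, but you don’t type ‘<space>’; just press the spacebar instead.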

  6. Click Find Next to find the next instance of multiple spaces between words. If the Find is successful and you are confident you’re not going to mess anything up, click Replace All. If you want to check each instance before replacing it, just click Replace then Find Next, Replace until you’ve dealt with them all.

Explanation for how this works:

  • ([A-Za-z]) looks for any letters, upper (A-Z) or lower case (a-z). Because you are using wildcards, you need to surround the text you want to find in parentheses. This string defines the first section of the Find.
  • ( {2,9}) looks for a run of two to nine consecutive spaces. You can put whatever numbers you like inside the curly braces; if you think you might have some instances of a letter followed by 15 spaces, then change these numbers to {2,20}, for example. Again, this section is surrounded by parentheses to define it as a separate section.
  • \1 replaces the first part of the wildcard string with itself. In other words, the letter found is replaced with itself, so no change apparently occurs.
  • The space after \1 replaces the multiple spaces found in the second part of the wildcard string with a single space.
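
The {2,9} counter works the same way in most regular expression engines, so the logic can be tried out in a few lines of Python. This is a sketch under that assumption, not a description of Word’s internals, and squeeze_after_letter is an invented name.

```python
import re

def squeeze_after_letter(text: str) -> str:
    """Collapse 2 to 9 spaces after a letter into a single space."""
    # Group 1 is the letter, group 2 the run of spaces; only the letter
    # is put back, followed by one ordinary space.
    return re.sub(r"([A-Za-z])( {2,9})", r"\1 ", text)

print(squeeze_after_letter("one  two     three"))
print(squeeze_after_letter("x;   y"))  # untouched: the spaces follow a semicolon
```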

[Links last checked December 2011]

Word: Find and replace multiple spaces before a number

January 5, 2012

Your document has varying numbers of spaces before a number string — for example, 2, 3, 4 etc. spaces before a number like 85. You want to be able to find all multiple spaces before a numeral and replace them with a single space in one Find/Replace action.

Here’s how to do it.

  1. Open Word’s Find and Replace dialog box (Ctrl+H).
  2. Click More to show more options.

  3. Select the Use wildcards check box.

  4. In the Find what field, type: ( {2,9})([1-9])
    Note: There’s ONE space in this text — it’s between the first ( and the {. To be safe, copy the text from Step 4 and paste it into your Find what field.
  5. In the Replace with field, type: <space>\2
    Note: There’s ONE space immediately before the \2, so make sure you type that too. I’ve indicated it with <space>, but you don’t type ‘<space>’ — just press the spacebar instead.

  6. Click Find Next to find the next instance of a string of spaces followed by a number. If the Find is successful and you are confident you’re not going to mess anything up, click Replace All. If you want to check each instance before replacing it, just click Replace then Find Next, Replace until you’ve dealt with them all.

Explanation for how this works:

  • ( {2,9}) looks for a run of two to nine consecutive spaces. You can put whatever numbers you like inside the curly braces; if you think you might have some instances of 15 spaces before a number, then change these numbers to {2,20}, for example. This section is surrounded by parentheses to define it as a separate section.
  • ([1-9]) looks for a numeral. [1-9] says to find any single numeral from 1 to 9 inclusive (use [0-9] if your numbers can start with a zero), and the parentheses define this as a separate section.
  • The <space> before \2 replaces the multiple spaces found in the first part of the wildcard string with a single space.
  • \2 replaces the second part of the wildcard string with itself. In other words, the numbers found are replaced with themselves, so no change apparently occurs.
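
Here is the same replace as a short Python sketch, mainly to show what the [1-9] range does and doesn’t catch. It uses Python’s re module as a stand-in for Word’s wildcards, and squeeze_before_number is a made-up name.

```python
import re

def squeeze_before_number(text: str) -> str:
    """Collapse 2 to 9 spaces in front of a digit 1-9 into a single space."""
    # Group 1 is the run of spaces, group 2 the digit; the replacement is
    # one space followed by the digit that was found.
    return re.sub(r"( {2,9})([1-9])", r" \2", text)

print(squeeze_before_number("page    85 of   120"))
print(squeeze_before_number("value   0.5"))  # untouched: 0 is outside [1-9]
```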

You can use this same technique for multiple spaces before any other character: just replace the [1-9] with [A-Z] for any upper case letter, [a-z] for any lower case letter, or [A-z] for any letter whether in upper or lower case (however, Word MVP Graham Mayor says it’s safer to use [A-Za-z] for any upper or lower case letter; see http://www.gmayor.com/replace_using_wildcards.htm).
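
One likely reason for that advice is that a handful of non-letter characters sit between Z and a in the ASCII table. The Python sketch below shows this for Python’s regex engine only; whether Word orders characters exactly the same way is an assumption, not something the post confirms.

```python
import re

# The characters whose code points fall between 'Z' and 'a'
between = [chr(c) for c in range(ord("Z") + 1, ord("a"))]
print(between)

loose = re.compile(r"[A-z]")      # one range from A through to z
strict = re.compile(r"[A-Za-z]")  # two separate ranges, letters only

print(bool(loose.fullmatch("_")))   # the underscore is inside A-z
print(bool(strict.fullmatch("_")))  # but not inside A-Z or a-z
```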

[Links last checked December 2011]

Word: Find and replace multiple spaces after a punctuation mark

January 4, 2012

Your document has varying numbers of spaces after a punctuation mark — for example, 2, 3, 4 etc. spaces after a semicolon, colon, period, comma etc. You want to be able to find all multiple spaces after a defined punctuation character and replace them with a single space in one Find/Replace action.

Here’s how to do it for a semicolon. Use the same process for other punctuation marks, substituting your punctuation character where I’ve used a semicolon.

  1. Open Word’s Find and Replace dialog box (Ctrl+H).
  2. Click More to show more options.

  3. Select the Use wildcards check box.

  4. In the Find what field, type: (;)( {2,9})
    Note: There’s ONE space in this text — it’s between the second ( and the {. To be safe, copy the text from Step 4 and paste it into your Find what field.
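  5. In the Replace with field, type: \1<space>
    Note: There’s ONE space immediately after the \1, so make sure you type that too. I’ve indicated it with <space>, but you don’t type ‘<space>’; just press the spacebar instead.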

  6. Click Find Next to find the next instance of a semicolon followed by more than one space. If the Find is successful and you are confident you’re not going to mess anything up, click Replace All. If you want to check each instance before replacing it, just click Replace then Find Next, Replace until you’ve dealt with them all.

Explanation for how this works:

  • (;) looks for the semicolon. Because you are using wildcards, you need to surround the text you want to find in parentheses.
  • ( {2,9}) looks for a run of two to nine consecutive spaces. You can put whatever numbers you like inside the curly braces; if you think you might have some instances of semicolons followed by 15 spaces, then change these numbers to {2,20}, for example. Again, this section is surrounded by parentheses to define it as a separate section.
  • \1 replaces the first part of the wildcard string with itself. In other words, the semicolon is replaced with itself, so no change apparently occurs.
  • The <space> after \1 replaces the multiple spaces found in the second part of the wildcard string with a single space.

You can use this same technique for multiple spaces after any other punctuation character — just replace the semicolon with a comma, colon, period, etc.

NOTE: Some characters (for example ? * [ ] ( ) { } and \) have a special meaning in a wildcard search and need to be ‘escaped’ before they can be found. To ‘escape’ a special character, type \ in front of it. For example, to run the same find/replace above for a question mark, you’d type this into the Find what field: (\?)( {2,9})
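
Escaping works on the same principle in ordinary regular expressions, which makes it easy to experiment with. In this Python sketch, re.escape adds the backslash for you when the punctuation mark is a special character; the function name squeeze_after_mark is hypothetical, and Python’s set of special characters is not identical to Word’s.

```python
import re

def squeeze_after_mark(text: str, mark: str) -> str:
    """Collapse 2 to 9 spaces after the given punctuation mark into one space."""
    # re.escape() puts a backslash in front of the mark if it is special,
    # so '?' becomes '\?' just as in the Find what string above.
    pattern = "(" + re.escape(mark) + ")( {2,9})"
    return re.sub(pattern, r"\1 ", text)

print(squeeze_after_mark("a;   b;  c", ";"))
print(squeeze_after_mark("Why?    Because.", "?"))
```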

[Links last checked December 2011]

Word: Add something just before a table cell marker

July 7, 2011

Did you know that you can add something (perhaps a missing period or a % symbol) just before each cell marker in a Word table? By default, there’s no easy way to find the cell markers using Word’s Find and Replace, but with a little trickery you can do it.

The critical thing that allows this to work is knowing the style used for the table cells. Word has to have something to find and as it can’t find a cell marker and as the text in each cell is different, there needs to be something it can hook on to — the style is what we get it to look for. Once Word’s got something to find, it can perform the replace action.

In the steps below, I take you through how to add a period to the end of each selected cell in a table (substitute your own character(s) for the period if you want something different). This technique works in Word 2003 and later.

  1. Locate the table where you want to add the period to the end of each cell.
  2. Check the style used for the cells (e.g. Table Cell). The style should be the same for each cell where you want to make the change. If it’s not the same, make it the same — you can always change it back later.
  3. Select the cells you want to change. Don’t worry about what text is in the cells — it will all still be there after you do the Find/Replace.
  4. Press Ctrl+H to open the Find and Replace dialog box. The cursor is in the Find what field — DO NOT type anything in here!
  5. Click the More button.
  6. Click the Format button and select Style from the list.
  7. In the Find Style dialog box, type the first letter of the style’s name (e.g. t for Table Cell) to jump to that section in the list. Scroll to find the style’s name (e.g. Table Cell), select it, then click OK. The style name is listed below the Find What field.
  8. Move your cursor into the Replace with field.
  9. Type ^&. (that’s the caret character [Shift+6], followed immediately by the ampersand [Shift+7], followed immediately by a period — the character you’re adding in this exercise).
  10. Click Replace All.

And you’re done!

By the way, the ^& part of the replace string stands for whatever was found, so ^&. puts back the text you already had and adds the period after it.
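
The ‘put back what was found, then add something’ idea has a close cousin in regular expressions, where the whole match can be reused in the replacement. The Python sketch below shows only that part; searching by style has no regex equivalent, and append_to_matches is a name invented for the example.

```python
import re

def append_to_matches(text: str, pattern: str, suffix: str) -> str:
    """Re-insert every match unchanged and add a suffix straight after it."""
    # m.group(0) is the entire matched text, playing the role of ^& in Word
    return re.sub(pattern, lambda m: m.group(0) + suffix, text)

# Add a % sign after every number
print(append_to_matches("10 and 25", r"[0-9]+", "%"))
```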

[Links last checked July 2011; thanks to my friend Char who asked the question and sent me on the chase to find out the answer]

Word: Replace and reformat text inside square brackets using wildcards

June 20, 2011

My husband wanted to select a long column of text and find any text that was inside square brackets and reformat it so that the text — and the square brackets — was 4 pt and blue (no, I don’t know why either…).

This is an ideal job for using wildcards in Word’s find and replace. However, square brackets are special characters in wildcard searches, so they have to be treated differently. With some help from http://word.mvps.org/faqs/general/usingwildcards.htm and a bit of trial and error, I figured it out. I explain what all the settings mean after these steps, if you’re interested. Meantime, here’s my solution, which works in all versions of Word:

  1. Select the text you want to change (e.g. entire document, selected paragraphs, selected columns or rows of a table).
  2. Press Ctrl+H to open the Find and Replace dialog box.
  3. Click the More button.
  4. Select the Use wildcards check box.
  5. Put your cursor into the Find what field.
  6. Type the following exactly (or copy it from here): (\[)(*)(\])
  7. Go to the Replace with field and type: \1\2\3
  8. Click the Format button, and select Font.
  9. On the Font dialog box change the settings to what you want — in my husband’s case, this was 4 pt and blue — then click OK. Your Find and Replace dialog box should now look like this:

    Find and reformat text inside square brackets

  10. Click Replace All.
  11. Once all replacements have been made, check that you got what you expected before making further changes to the document. If it’s all OK, save your document with the new changes.

What it all means

The three elements of the Find are:

  1. (\[): You need to find a specific character (the opening square bracket), so you need to enclose it in parentheses. However, because the square brackets are special wildcard characters in their own right, you need to tell Word to treat them as normal text characters and not as special characters, so you put in a backslash ‘\’ (also known as an ‘escape’ character) before the [.
  2. (*): This tells Word to look for any characters after the opening square bracket. There’s no limit on what sort of characters (alpha, numeric, or symbols) Word is to find, or on how many there are.
  3. (\]): This tells Word to stop the find at the first closing square bracket it finds after an opening square bracket followed by any other characters. As with the opening square bracket (1. above), the closing square bracket is a special wildcard character, so needs a backslash in front of it for Word to treat it as ordinary text, and it needs to be enclosed in parentheses as it’s an exact match you want.

There are no spaces between any of these elements. The aim is to find a string such as [green frog] and replace it with exactly the same text but formatted in a different color and with a different font size.

The three elements of the Replace are:

  1. \1: Tells Word to replace the first element of the Find with what was in the Find (the opening square bracket).
  2. \2: Tells Word to replace the second element of the Find with the same text as what was found. In other words, keep the exact text as was found, but change its font size and color.
  3. \3: Tells Word to replace the third element of the Find with what was in the Find (the closing square bracket).

As with the Find elements, there are no spaces between these elements. You still want [green frog], not [ green frog ].
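
The three-element match can be tried in plain text with a short Python sketch. Plain text can’t carry font size or color, so this version wraps each bracketed string in visible markers instead; the << and >> markers and the name mark_bracketed are stand-ins chosen for the example, and Python’s .*? is used as a rough counterpart to Word’s * that stops at the first closing bracket.

```python
import re

def mark_bracketed(text: str, start: str = "<<", end: str = ">>") -> str:
    """Wrap every [bracketed] string, brackets included, in marker text."""
    # (\[) is a literal opening bracket, (.*?) the shortest run of any
    # characters, (\]) a literal closing bracket; all three are put back.
    return re.sub(
        r"(\[)(.*?)(\])",
        lambda m: start + m.group(1) + m.group(2) + m.group(3) + end,
        text,
    )

print(mark_bracketed("a [green frog] and [red toad]"))
```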

[Links last checked June 2011]

Word: Find and replace any number of spaces

September 13, 2010

Big thanks go to Mike Starr (http://www.writestarr.com) for sharing this tip!

If you’re editing a document written by one or more authors who don’t have a lot of Word experience/knowledge, there’s a good chance you’ll come across instances of multiple spaces where there should only be one space. The authors might have used multiple spaces to force layout (instead of tabs, styles, tables etc.), or they may have pasted words from another source into the document thus adding an extra space or two where there should only be one space.

Your job is to clean up the document — including finding all those extra spaces and removing them.

There are several methods you can use to do this (see the list at the end of this post for alternative methods I’ve documented), but the best — and quickest — method is to use a wildcard search to find any number of consecutive spaces and replace them with one space.

Here’s how:

  1. Press Ctrl+H to open the Find and Replace dialog box.
  2. Click the More button to display the search options.
  3. Select the Use wildcards check box (this method won’t work unless this option is turned on).
  4. In the Find what field, press the spacebar once followed by {2,10}. (See the Notes below about these numbers.)
  5. In the Replace with field, press the spacebar once.
  6. Click Replace All.

NOTES:

  • You can use any numbers you like. The first number is the minimum number of spaces to find and replace and the second represents the maximum number of spaces to find and replace — 2,10 represents a minimum of two and a maximum of 10 consecutive spaces. If you legitimately use double spaces after a period, type {3,10} after the space instead, and if you know you’ve got long strings of spaces in the document, use a range such as {2,80}.
  • You can also use this method to find all instances of two or more spaces after a period and replace them with one space. For that you’d type . {2,5} (note the space immediately after the period) in the Find what field, and a period followed by a single space in the Replace with field.
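
Both variations can be tried in a few lines of Python. This is a sketch using Python’s re module rather than Word’s wildcard engine, with invented function names; one difference to note is that Python needs the period escaped as \. because an unescaped period is special there.

```python
import re

def squeeze_spaces(text: str, lo: int = 2, hi: int = 10) -> str:
    """Replace any run of lo to hi consecutive spaces with one space."""
    return re.sub(" {%d,%d}" % (lo, hi), " ", text)

def squeeze_after_period(text: str) -> str:
    """Replace 2 to 5 spaces after a period with a single space."""
    return re.sub(r"\. {2,5}", ". ", text)

print(squeeze_spaces("a  b     c d"))
print(squeeze_spaces("End.  Next", lo=3))  # double spaces left alone
print(squeeze_after_period("One.   Two.  Three. Four"))
```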

[Links last checked September 2010; thanks again to Mike Starr for sharing this very simple wildcard method!]
