One of my readers emailed me with a problem they couldn’t figure out. Here’s a summary of the problem:
I’ve been working on a macro to clean up a document converted from PDF. A lot of these conversions place a pilcrow at the end of each line, I guess because Word isn’t sure what else to do with a line break in a PDF.
Part of the macro runs a find and replace that deletes most of the pilcrows but retains them where it recognizes a paragraph. What’s important here, and what I hope you can help with, is that when it makes a new pilcrow (I’m using ^13 in the find & replace), it treats it like a carriage return. The symbol on the screen is definitely a pilcrow—the same, filled-in black pilcrow I’m used to seeing at the end of a paragraph. I went into the xml of the document, and sure enough there was <cr> instead of <p>.
The paragraphs are all set to 12pts after, but none of them are actually displaying that space. The printed version is the same. It’s not a display issue. It’s just that Word is accepting the command to insert a pilcrow, inserting it, but treating it like a carriage return.
I found a solution, which is to search the document for pilcrows, select them, then type a paragraph. But I’d really like to understand what’s happening (and why) with that find and replace. Why is manually typing a paragraph working while find & replace is not?
My first step in trying to solve the problem was to create a new Word document, add a couple of soft line returns and some normal paragraph returns, then do a find/replace on the soft line returns (^l, where that’s a lowercase ‘L’), replacing them with ^13. Sure enough, the pilcrow looked the same, but the paragraph formatting (mostly the spacing above and below the paragraph) wasn’t quite right. When I saved the doc as XML, I found ‘cr’ instead of ‘p’ where I’d done this find/replace, just as my worried reader had found.
I know that ^13 is often required in the ‘Find what’ field when you’re doing a find/replace with wildcards turned on, as ^p doesn’t work there (or at least, it doesn’t work under most circumstances).
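If you’d rather not eyeball the XML, the check I did by hand can be roughly sketched in Python. This is only a sketch under some assumptions: it assumes the file uses the 2006 WordprocessingML namespace (as .docx files do), where paragraphs are `w:p` elements and carriage returns inside a run are `w:cr` elements. The function names and the sample XML are my own inventions for illustration, not anything Word produces verbatim.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML in .docx files (assumed here;
# older Word 2003 XML files use a different namespace).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_breaks(document_xml):
    """Count real paragraphs (w:p) and carriage returns (w:cr) in the XML."""
    root = ET.fromstring(document_xml)
    return {
        "paragraphs": sum(1 for _ in root.iter(f"{{{W_NS}}}p")),
        "carriage_returns": sum(1 for _ in root.iter(f"{{{W_NS}}}cr")),
    }


def count_breaks_in_docx(path):
    """Hypothetical helper: read word/document.xml from a .docx and count."""
    with zipfile.ZipFile(path) as zf:
        return count_breaks(zf.read("word/document.xml"))


# Made-up sample: one paragraph holding a carriage return, one plain paragraph.
SAMPLE = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>First line</w:t><w:cr/><w:t>second line</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>A real paragraph</w:t></w:r></w:p>"
    "</w:body></w:document>"
)

print(count_breaks(SAMPLE))
```

A non-zero carriage return count after a ^13 replace would suggest the same ‘cr instead of p’ situation described above.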
I did the test again, this time replacing ^l (soft line return) with ^p (paragraph mark, or hard return), and the paragraph formatting changed to what it should be for each paragraph.
I did a final test. I added in some soft line returns, replaced ^l with ^13, then to see if ^p would work, I replaced ^13 with ^p and it worked! Those pesky carriage returns created by the ^l to ^13 replace changed to proper paragraph marks when I replaced them with ^p.
Now I’m not sure how all this would work in the macro my reader had created, but at least they now have something they know will work.
All this testing was just in the normal Find/Replace dialog — no special settings at all.
Interestingly, Wikipedia says that ‘In ASCII and Unicode, the carriage return is defined as 13’ (http://en.wikipedia.org/wiki/Carriage_return) but says nothing about the code for paragraphs/hard line returns.
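For the curious, here’s a small Python sketch of what character code 13 is. It only shows what the character standards say; it makes no claim about what Word does internally with ^13 or ^p. Unicode does define a separate PARAGRAPH SEPARATOR character (U+2029), but I’m not assuming Word uses it. The variable names are just mine.

```python
import unicodedata

# Character 13 is the carriage return control character in ASCII and Unicode.
carriage_return = chr(13)
print(carriage_return == "\r")                  # True
print(unicodedata.category(carriage_return))    # Cc (a control character)

# Unicode also defines a dedicated paragraph separator at U+2029.
# Whether Word relates this to ^p is NOT something this sketch shows.
paragraph_separator = "\u2029"
print(unicodedata.name(paragraph_separator))      # PARAGRAPH SEPARATOR
print(unicodedata.category(paragraph_separator))  # Zp
```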
[Link last checked August 2012]

















