Word annoyance: Spell checking duplicate species names

August 17, 2010

In my current contract, I’ve been editing hundreds of environmental science reports. Many of these reports contain Latin names of species, families, genera, etc. And some reports have thousands of these Latin species names.

So that I’m not continually faced with these when I run a spell check over each document, I’ve added many to my custom dictionary so that Word doesn’t tell me these Latin words are spelling errors.

However, Word does not deal with duplicate Latin species names very well. For other spelling errors, there’s an Ignore All option, but not when there’s a duplicate word.

For example, when Word’s spell checker comes across Rattus rattus (the Latin name for the Black Rat), Caretta caretta (Loggerhead Turtle), and the like, it flags the first word as an unknown word.

I add it to the dictionary so that it’s not flagged again. But now Word skips to the next word (rattus in the example above) and flags that as not in the dictionary either because it’s a lower case variation. So I add it to the dictionary too.

Then, instead of moving on to the next unknown word, Word flags the second word (the one I just added to the dictionary) as a repeated word. The only option I have is Ignore Once. Ignore All (which is what I really want for this scientific name) is grayed out and is not available for me to click.

So even after adding both forms of the word (initial capped and lower case) to the dictionary, Word still flags the second word as a duplicate EVERY time and I have no other choice except to click Ignore Once. As many of these scientific reports I’m editing use LOTS of Latin names and repeat many of them throughout the document, that’s an awful lot of time I waste just clicking Ignore Once on the repeated word, which is a very legitimate repeated word in species nomenclature, by the way.

I tried adding the two words together to the dictionary but that made no difference either. Word continues to flag the second word as a repeated word. After I’ve clicked Ignore Once several hundred times in a document, I get pretty annoyed with this limitation.

Microsoft could improve the spell checking functionality to deal with things such as Latin names by any of the following:

  • Provide an ‘Ignore All’ option for a repeated word.
  • Allow two- and three-word scientific names or phrases to be added to the dictionary — and recognize them. This could be expanded to other ‘words’ like email addresses that Word sometimes wants to flag as incorrect, either in part or as a whole. If a short phrase is added to the dictionary, then Word should treat it as a correctly spelled word every time.
  • Add an option to the Spelling & Grammar options to ignore italicized words. We can tell Word to ignore uppercase words and those containing numbers — why can’t we choose to ignore those in italics too?
  • Add an option to the Spelling & Grammar options to ignore case variations of a word. For example, if I add Rattus to the dictionary and have the ‘ignore case variations’ option checked, then I shouldn’t be told that rattus/rAttus/RAttus etc. are new words. This option should be used with caution, but it should be up to the user to decide whether to use it or not. Currently, case variations (except all upper case) are treated as new words and are not ignored.
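
To make the first and last suggestions concrete, here is a minimal sketch in Python of a repeated-word check that honours an exception list and ignores case. It is not how Word works internally and it doesn't touch Word at all; the names `ALLOWED_REPEATS` and `flag_repeated_words` are made up for illustration, and the exception list just holds examples mentioned on this page.

```python
import re

# Hypothetical exception list: legitimate repeated-word names, stored in lower case.
ALLOWED_REPEATS = {"rattus rattus", "caretta caretta", "wagga wagga", "walla walla"}

WORD = re.compile(r"[A-Za-z]+")


def flag_repeated_words(text, allowed=ALLOWED_REPEATS):
    """Return (offset, pair) for each repeated word that is not on the exception list."""
    flagged = []
    words = list(WORD.finditer(text))
    for prev, cur in zip(words, words[1:]):
        # Only treat the words as a repeat if nothing but whitespace separates them.
        only_space_between = text[prev.end():cur.start()].strip() == ""
        same_word = prev.group().lower() == cur.group().lower()
        if same_word and only_space_between:
            pair = f"{prev.group()} {cur.group()}"
            # The 'Ignore All' behaviour: skip pairs the user has approved,
            # regardless of capitalisation.
            if pair.lower() not in allowed:
                flagged.append((cur.start(), pair))
    return flagged


result = flag_repeated_words("The the Black Rat (Rattus rattus) lives here.")
print(result)
```

In this sketch the genuine typo "The the" is still reported, while Rattus rattus passes because the whole pair is on the exception list.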

Do you have any other suggestions you’d like to see for checking spelling?

Do you have any other workable suggestions for dealing with these repeated Latin names, so that I don’t have to click Ignore Once all the time?


  1. I do a lot of work with multilingual documents, and many use Latin. Consider using character styles. Here’s what I do…

    Set up a character style — I call mine “Latin char” — based on the underlying text (the default) and set the language attribute to “Do not check spelling or grammar”. Now when you tag text with the style, the words won’t be checked so you won’t see the red squigglies. But you also won’t see that the invisible language attribute has been used either.

    To make that visible, alter the style definition to add a colour — I have my Latin char style set to Grey.

    During editing and proofing, the colour is useful; for final publication, I alter the style definition to use colour = Automatic. (In fact, I create a 2nd template with all similar tricks adjusted appropriately.)

    I use character styles for other languages — French is blue; Spanish is green; Portuguese is orange… — but also ones for email addresses (light blue), web addresses (purple), citations (brown), and notes (red bold underline).

    Beyond visibility and being able to use the correct language, this approach has other hidden benefits:

    — I can extract all instances of a given style (Find All, copy). This lets me proof them separately: a Latin specialist will spot errors I would miss. Being able to sort extracted URLs makes it way easier to catch typos. (Plus you can also use 3rd party tools to check all extracted links used in a document.)

    — I can make text disappear: my “notes” style shows up to remind me about something, but is set to invisible for the final. (This is handy for tagging instructor notes in course handouts for example.)

    — For technical documents with cited references (e.g. “according to Brown et al, 2008, the …”), I can extract all the citations, then sort them alongside the references to help confirm during proofreading that all are included.

    — When I zoom way out to see many pages at once, the coloured text really stands out, so I can click in or near it, then zoom back in to examine it more closely. (Ctrl-roll)

    — If you need to be compatible to other layout programs, having styled content makes life much easier. I strive to avoid using ANY direct formatting — particularly in large complex documents.

    Hope this helps… I hadn’t intended to take so many words!

  2. Thanks Eric! That’s a comprehensive list of things you’ve contributed — I’ll have to try some of those in my own docs. Unfortunately, I’m editing these client docs, not writing them. Getting the multiple (and often third party) authors to use even the rudimentary styles can be difficult. Getting them to paste as unformatted text when they copy text from another document then apply the styles is almost impossible for many to grasp and most forget to do it. I guess it keeps me in a job ;-)

  3. That’s what I do too. It was way too frustrating for all concerned to force authors to adhere to standards, so we decided to assemble tools and methods to deal with whatever was thrown at us (Mac, Windows, Linux; different media; different word processors…). Now I run a clean-up utility that basically strips everything from the supplied content, but fixes some obvious stuff (like double hyphen to em dash; digit hyphen digit to digit en dash digit; smart quotes; etc.). Then I can apply styles to manage all formatting. All of our documents use field codes extensively for automation, and to link to external objects (tables, pictures, formulae, charts…).

    We’ve produced documents as big as 600 pages with hundreds of charts and formulae, yet the Word files are seldom >2MB. If you tried to prepare a 600 page book with embedded objects and direct formatting, Word would choke.

    (And yeah, like you, knowing how to use the tools means that there will always be work from people who don’t!)
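
For anyone curious what the text fixes in a clean-up pass like the one described in comment 3 might look like, here is a rough Python sketch. It is not the commenter's utility; the function name `clean_up` and the exact rules are assumptions, it reads "to em dash" as meaning a double hyphen, and it works on plain text only (no stripping of Word formatting).

```python
import re


def clean_up(text):
    """Apply a few typographic fixes to plain text (illustrative rules only)."""
    # Double hyphen -> em dash (U+2014).
    text = text.replace("--", "\u2014")
    # Hyphen between two digits -> en dash (U+2013), e.g. page ranges.
    text = re.sub(r"(?<=\d)-(?=\d)", "\u2013", text)
    # Straight double quotes -> curly: opening after start, whitespace, or an
    # opening bracket; every remaining one is treated as closing.
    text = re.sub(r'(^|[\s(\[{])"', "\\1\u201c", text)
    text = text.replace('"', "\u201d")
    # Same idea for single quotes; remaining ones double as apostrophes.
    text = re.sub(r"(^|[\s(\[{])'", "\\1\u2018", text)
    text = text.replace("'", "\u2019")
    return text


print(clean_up('See pages 10-12 -- the "notes" section'))
```

Rules this simple will get some cases wrong (a phone number would pick up en dashes, for example), so treat it as a starting point rather than a finished tool.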

  4. It isn’t just Latin phrases. There are many Australian place names that have repeated words, e.g. Obi Obi, Bli Bli, Wagga Wagga, Gin Gin.
    The way many of the Aboriginal languages work is that plurals are created by saying the word twice, e.g. the plural of dog would be dog dog instead of dogs. This has extended into place names.
    I have tried adding place names into the dictionary – but it still flags up the second part of the name as an error every time.

  5. Great point, Belinda.

  6. Agree this is extremely annoying, as I work for a municipality which has a road whose name is a repeated word (Fee Fee), so I have to repeatedly select “Ignore Once” when checking spelling. Why can’t the user be allowed to add exceptions to this function? Even simply allowing the user to select “Ignore All” would be an improvement. The only alternative is to turn off “Flag repeated words”, which is not desirable. Microsoft’s HQ is in Washington – have they never heard of Walla Walla, WA?

  7. I came upon this blog whilst trying to find out if there is any way to save binomials in my custom dictionary. After all, for me it is not a matter of ignoring repeats, such as Rattus rattus above, but to recognise this as a (one) unique word – which as far as Latin names go is definitely a requirement for proofing. There is no point in saving all generic and species names separately; binomials are unique and you can’t combine just any and every generic and specific name. But Microsoft obviously haven’t addressed this at all! In previous versions of Word I had to store them with some sort of ‘joiner’, a middle dot or _. The problem here is that these need to be replaced (once finished) with a space. I have also tried a non-breaking space (since this is invisible) but this doesn’t work as the species name is still flagged as a separate misspelling.
    There appears to be no answer to this problem yet…
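
The "replace the joiner with a space once finished" step from comment 7 could be done in bulk on exported text. Here is a small Python sketch under some assumptions: the joiner is a middle dot or an underscore, the genus is capitalised and the species is lower case, and the names `JOINERS`, `BINOMIAL`, and `restore_spaces` are invented for this example. It works on plain text, not on a Word file.

```python
import re

# Assumed joiner characters: middle dot (U+00B7) and underscore.
JOINERS = "\u00b7_"

# Capitalised genus, one joiner, lower-case species.
BINOMIAL = re.compile(rf"\b([A-Z][a-z]+)[{JOINERS}]([a-z]+)\b")


def restore_spaces(text):
    """Replace the joiner inside genus/species pairs with an ordinary space."""
    return BINOMIAL.sub(r"\1 \2", text)


print(restore_spaces("Rattus_rattus and Caretta\u00b7caretta were recorded."))
```

Anything else that happens to look like Capital_lowercase would be changed too, so the output still needs a quick check.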
