Indexes v. full text searches
October 4, 2008
Comment from an email discussion list member
In my opinion, indexing is pointless. Searching is much more important than indexing.
I think you should provide the simplest automatic indexing, based on topic headings, and rely on your users to use full-text search to actually find anything.
Full-text search is easy and it works perfectly. Indexing, on the other hand, depends on the skill and commitment of the indexer. For myself, I feel that my life is too short to spend much of it on indexing.
Yes, full text search (FTS) has its place (and Google et al have a lot to answer for in this regard).
BUT FTS brings with it a lot of garbage, because it finds any matching words without any reference to their context. You can narrow the results list by judicious use of Boolean operators and syntax such as quote marks, parentheses and the like, but most users don’t know about them or how to use them. Even then, this still doesn’t put the words into context. In many ways FTS is like a concordance: ‘let’s list every word we find in the document no matter how important it is’. Concordances treat every word equally.
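To make the concordance comparison concrete, here is a minimal sketch in Python. The function name `build_concordance` and the sample sentence are invented for illustration and don't come from any particular search tool. It records every word with the same weight, so a filler word like "the" gets exactly as many entries as the word a reader actually cares about.

```python
import re
from collections import defaultdict


def build_concordance(text):
    """Map every word to the positions where it occurs, with no weighting."""
    concordance = defaultdict(list)
    for position, match in enumerate(re.finditer(r"[a-z']+", text.lower())):
        concordance[match.group()].append(position)
    return dict(concordance)


# Hypothetical sample text, not taken from any real document.
sample = "Save the file. To change the file name, open the file menu."
concordance = build_concordance(sample)

# "the" and "file" are listed equally often; nothing marks one as more useful.
for word in ("the", "file"):
    print(word, concordance[word])
```

A human indexer would skip "the" entirely and would probably file only one of the three "file" mentions, under something like "file names, changing".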
Indexes on the other hand—at least, well-constructed indexes—serve a different purpose in focusing the user’s attention on the words most likely to have relevance.
Well-constructed indexes are done by humans who can detect nuances and meaning of words—in the context of their usage. Indexes created by software (of any kind) just don’t have this sophistication at the moment. A good indexer can create “see” references from unused terms to used terms, and “see also” references to similar terms. Complex indexes also refer users to broader and narrower terms.
Others have mentioned synonyms and they are one of the prime reasons why human indexers are much better at this than software. An FTS can’t tell that words such as “editing”, “amending”, “changing”, “altering” and “modifying” (and their variations) mean much the same thing, yet they could all be used in the document. A human indexer can set one term as the preferred term and refer all other uses to that term, thus offering the user the FULL range of topics that cover anything to do with changing something.
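The preferred-term idea can be sketched in a few lines of Python. Everything here is made up for illustration: the topic titles, the `SEE_REFERENCES` table and the two lookup functions are assumptions, not the behaviour of any real help system or indexing tool. The point is only that a literal text match returns whatever happens to contain the typed word, while a "see" reference sends every synonym to the one heading the indexer chose.

```python
# Hypothetical help topics: title -> body text.
TOPICS = {
    "Editing a customer record": "Open the record and edit the fields.",
    "Amending an invoice": "Select the invoice and amend the totals.",
    "Modifying report layouts": "Modify the layout in the designer.",
}

# "See" references chosen by a (hypothetical) human indexer.
SEE_REFERENCES = {
    "editing": "changing",
    "amending": "changing",
    "altering": "changing",
    "modifying": "changing",
}

# The index itself: preferred term -> every topic the indexer filed under it.
INDEX = {"changing": sorted(TOPICS)}


def full_text_search(query):
    """Return the topics whose title or body literally contains the query."""
    query = query.lower()
    return sorted(
        title
        for title, body in TOPICS.items()
        if query in f"{title} {body}".lower()
    )


def index_lookup(term):
    """Follow a 'see' reference to the preferred term, then list its topics."""
    preferred = SEE_REFERENCES.get(term.lower(), term.lower())
    return preferred, INDEX.get(preferred, [])


print(full_text_search("altering"))  # no topic uses this exact word
print(full_text_search("amending"))  # only the one topic that uses it
print(index_lookup("altering"))      # redirected to "changing", all topics
```

In this sketch a user who types "altering" gets nothing from the literal search but all three topics from the index, because the indexer decided in advance that they belong together.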
Others have also mentioned that print output (NOT screen versions of print, like PDF) doesn’t have FTS, and a good TOC and Index is the way users access the information. I believe there is a place for both… and I would be very sad to see the day when indexes are relegated to the trash.