Bay Area Editors' Forum

Managing a Multi-Volume Indexing Project

November 18, 2003
Forum organized by Virginia Rich
Presentation by Cynthia Berman and Ellen Perry
Notes by Dawn Adams

As the technical field moves more and more toward using multi-volume reference works, indexing has had to change along with it. Indexing software manuals already differs substantially from trade book indexing (traditional back-of-the-book indexing) in that the manuals are often produced electronically in PDF or HTML formats rather than print. Now that many of the manuals are no longer standalone books but volumes in a set, the complexity of the indexing process has increased correspondingly, necessitating new processes and project management guidelines.

"One thing we want to point out is that a well-formed index is a well-formed index, well-formed in social science, software, or whatever," said Cynthia Berman, BAEF member and freelance technical editor, indexer, and writer, formerly with Siebel Systems. "Indexing has been a real area of growth in editing and software. Now very few hardbound books are going out any more in our world. The indexing process is evolving from just creating the index to creating hierarchies and taxonomies. We work with content objects and content management systems rather than books."

Ellen Perry, BAEF member and lead technical editor for AutoCAD Technical Publications at Autodesk, Inc., said that one issue can be getting upper management's buy-in for creating an index in the first place. Once that's accomplished, the next hurdle is convincing the writers to close the files (i.e., stay out of the manual's files) long enough for the files to be indexed. Perry has had to work nights and weekends to accommodate writers who need daytime access to their files.

"You have to gauge upper management support for help files and the time needed to edit them," Perry said. "You have to push hard up front to make sure that everyone is on board, consider staffing issues, and define the schedule."

Planning out the process

One of the key elements in making the indexing process for a multi-volume work go smoothly is figuring out staffing and project requirements. For example:

* Skill set: what kind of skill set are you looking for in an indexer? Someone who uses dedicated indexing software, or someone who embeds index entries in a FrameMaker document?

* Toolset: what tools do the indexers need from you? Do you need to provide an add-on product for FrameMaker or PDF files?

* Work environment: will the indexer work onsite or offsite?

* Project management tool: what will you use to track milestones and maintain schedules? According to Perry and Berman, the tools they have used run the gamut from Excel to Word to MS Project.

* Defining project start: when is a book ready for indexing? For Perry, indexing is occurring earlier and earlier in the production process, and now happens at the proofreading stage. Localization (translating the help files into different languages for different user markets) often determines her deadlines for indexing. Berman noted that indexing can begin when a book is 85% done. In her projects, the documents were often prioritized for indexing by how they were used.

Maintaining indexing standards

In multi-volume projects especially, it is vital to set up indexing standards. When several indexers are working on different parts of the same work, there is huge potential for different people to handle concepts and products differently. Berman recommends having writers follow a writing model on the front end and having the indexers follow an indexing style guide on the back end.

"If you have several indexers working, you want to make sure that a product or feature is spelled the same way throughout the index by using a glossary or something similar," Berman said. "That will speed up editing and delivery, help with translating the content, and help your user build a mental model of how the index and volume relate."

According to Berman, having the writers follow a writing model can help to "seed" an index. The writers deal with important concepts in the same manner, using standard ways to refer to standard content elements. This will help the editor to develop the indexing style guide as well as enable the indexers to consistently identify concepts for indexing. Berman cautioned that only product names and feature names that have been approved by marketing should go into the style guide and index.

Some of the important guidelines that belong in the style guide are:

* What your indexable subjects are (e.g., tasks and concepts, or one or the other?)

* How to handle cross-references and double-posting

* What the entry structure should be, for example, gerund followed by noun (Printing, reports)

* What the sort sequence is: do cross-references belong at the top or the bottom of an entry?

* Whether any special formatting is used (e.g., bold or italic) and how

* Where the index entries are embedded (in the text or in the heading). According to Berman, the delivery mode has implications for the placement of entries. For example, when using WebWorks Publisher she had to put entries in headings. Perry added that quite often you have to put entries in a certain location for the localization team.
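
A sort-sequence rule like the one above can be applied mechanically once the style guide settles it. The following Python sketch is illustrative only: the function name `order_subentries`, the `cross_refs_first` flag, and the assumption that cross-references begin with "See" or "See also" are hypothetical and do not come from the presentation.

```python
def order_subentries(subentries, cross_refs_first=True):
    """Sort subentries alphabetically, grouping cross-references together.

    Assumption: a cross-reference is any subentry that starts with
    "See " (this also covers "See also "), ignoring case.
    """
    def is_cross_ref(text):
        return text.casefold().startswith("see ")

    refs = sorted((s for s in subentries if is_cross_ref(s)), key=str.casefold)
    plain = sorted((s for s in subentries if not is_cross_ref(s)), key=str.casefold)
    # The style guide decides whether cross-references lead or trail.
    return refs + plain if cross_refs_first else plain + refs


subentries = ["reports", "See also Plotting", "drawings"]
print(order_subentries(subentries, cross_refs_first=True))
print(order_subentries(subentries, cross_refs_first=False))
```

Keeping the choice in a single flag means that a change to the style guide only has to be made in one place.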

Berman emphasized the need for controlled vocabulary. According to Berman, using particular terminology not only builds a good mental model, but it facilitates information mapping, making it much easier to track types of information. It also enables content to be more easily reused later, since the vocabulary is standardized, and facilitates the work of the localization team responsible for translating the documents and indexes into languages other than English.

"There are advantages to controlled vocabulary," said Berman. "It is easier to edit, easier for contractors to get to know the vocabulary. And for content management systems, it seeds the metadata and has positive implications for searching."

In addition, consider up front how you will allow for updates to the style guide as styles evolve, Berman said.
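
One way to enforce a controlled vocabulary across several indexers is to check main headings against the approved glossary. The sketch below is a minimal illustration under stated assumptions: entries are plain strings in "Heading, subheading" form, the glossary is a set of approved terms, and the name `find_unapproved_terms` and the sample data are hypothetical.

```python
def find_unapproved_terms(index_entries, approved_terms):
    """Return (entry, suggestion) pairs for headings not in the glossary.

    The suggestion is the approved spelling when the heading differs only
    in capitalization, otherwise None.
    """
    approved_by_key = {term.casefold(): term for term in approved_terms}
    problems = []
    for entry in index_entries:
        # Assumption: the main heading is the text before the first comma.
        heading = entry.split(",")[0].strip()
        if heading in approved_terms:
            continue
        problems.append((entry, approved_by_key.get(heading.casefold())))
    return problems


approved = {"Printing", "Reports", "WebWorks Publisher"}
entries = ["Printing, reports", "Webworks Publisher, headings", "Plotting, drawings"]
for entry, suggestion in find_unapproved_terms(entries, approved):
    print(entry, "->", suggestion or "not in glossary")
```

A report like this lets the editor separate spelling variants of approved terms from genuinely new terms, which would still need approval before entering the style guide.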

Merging volumes into a whole

At times, volumes that were indexed separately will need to have their indexes merged into a consolidated whole. Perry noted that once indexes are merged, there is normally quite a bit of duplication, some mysterious entries appear, and sometimes top-level entries will get lost. In the editing process, Perry recommended deciding up front whether you're going to maintain your edits for separate delivery methods (e.g., for both print and HTML), since some HTML edits don't apply to print.
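
The duplication that shows up after a merge can be surfaced before editing begins. This Python sketch is only an illustration and assumes a simplified data model that is not from the presentation: each volume supplies (heading, subheading, page) tuples. The volume labels "UG" and "REF" and the function name `merge_indexes` are hypothetical.

```python
from collections import defaultdict


def merge_indexes(volumes):
    """Merge per-volume entries and report headings with competing spellings.

    `volumes` maps a volume label to a list of (heading, subheading, page)
    tuples. Merged locators carry the volume label so that identical page
    numbers from different volumes stay distinguishable.
    """
    merged = defaultdict(list)
    spellings = defaultdict(set)
    for volume, entries in volumes.items():
        for heading, subheading, page in entries:
            # Entries that differ only in capitalization share one key.
            key = (heading.casefold(), subheading.casefold())
            merged[key].append(f"{volume}:{page}")
            spellings[key].add((heading, subheading))
    # More than one spelling for a key is a likely duplicate to resolve by hand.
    conflicts = {k: sorted(v) for k, v in spellings.items() if len(v) > 1}
    return dict(merged), conflicts


volumes = {
    "UG": [("Printing", "reports", 12), ("Layers", "", 40)],
    "REF": [("printing", "reports", 7)],
}
merged, conflicts = merge_indexes(volumes)
print(merged[("printing", "reports")])
print(conflicts)
```

The sketch only flags candidates. Deciding which spelling survives remains an editorial call.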

"In editing a merged index, you have to decide what tradeoffs can you live with as an editor?" she said. "You need to make some compromises."

Scheduling and planning are also an integral part of editing a merged index. According to Perry, you should plan not only what you will do in your indexing passes but also how long you will allow for each pass.

"I once had about 17,000 entries to edit in 3 to 4 days," Perry said. "In the best of all possible worlds, I would have had a week: you edit subentries, then cross-references; check for style adherence; spell-check; and then merge entries until you drop."

Perry highly recommends using IXgen if you are indexing in FrameMaker. IXgen pulls all of the index markers into a list, enabling you to search in the file. Once in IXgen, you can edit the text any way you'd like, both deleting and adding index entries. Once you have done all your edits, you apply the markers in the Frame document.

Tools of the trade

"Software: it's great to have lots of it," Berman said.

As far as software tools go, quite a few options are available. According to Berman, some of the more common ones are FrameMaker with IXgen as a plug-in; Word; Excel; dedicated indexing software such as Macrex, Cindex, or Sky Index; and specially developed scripts. The index can be delivered as PDF or through WebWorks Publisher or HTML Help Workshop.

"In FrameMaker, it's easier to generate an index, but harder to enter and edit index entries," Berman said. "IXgen is a plug-in that turbo-charges indexing in Frame. Some say that Word is easy to use for indexing, but it's absolutely impossible for multi-volume works. And custom scripts, while versatile, they do require special skills."

Single-source indexes

Single-source indexes are a special beast, according to Berman and Perry. Because a single work may be published in multiple formats (e.g., print, PDF, and HTML), there can be implications for the indexing process. Everything that you take into account for a multi-volume work applies, along with additional considerations of marker placement for the PDF vs. the HTML versions. But no matter what, all page ranges must go, said Berman, and using controlled vocabulary becomes even more important.
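
Since page ranges have to be removed from a single-source index, it can help to list the entries that still contain them. The sketch below is a rough illustration, assuming the index has been exported as plain-text lines with page locators at the end. The export format, the name `entries_with_page_ranges`, and the sample entries are assumptions rather than anything described in the presentation.

```python
import re

# Two numbers joined by a hyphen or an en dash, e.g. "12-15" or "7–9".
PAGE_RANGE = re.compile(r"\b\d+\s*[-\u2013]\s*\d+\b")


def entries_with_page_ranges(entries):
    """Return the entries whose text contains a page-range locator."""
    return [entry for entry in entries if PAGE_RANGE.search(entry)]


sample = ["Printing, reports 12-15", "Layers 40", "Plotting, drawings 7\u20139"]
for entry in entries_with_page_ranges(sample):
    print(entry)
```

A pattern this simple will also match hyphenated numbers that are not locators, so the output is a review list, not an automatic fix.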

Reindexing or reinventing the wheel?

One issue that comes up over and over again with management is whether to reindex a work that has been revised or whether to update the existing index, said Berman. Indexers generally maintain that it's both easier and cheaper to reindex, while managers claim that that cannot be so, she said.

"Managers see indexing as a software function not an analytical function," Berman said. "Revision can work, if the writers are using a writing model that promotes chunking and labeling such as information mapping. But indexes still can get stale, so you need clear delineation of what's new, updated, and deleted."

Updating an existing index that was created with embedded entries can be difficult because writers quite often will do a lot of cutting and pasting. Sometimes the entry gets pasted, and sometimes it doesn't.

"If the writer doesn't change the index entry, and the indexer doesn't have the time to do a really thorough job, no one may notice that that entry is wrong," Berman said. "This gets to be a problem with conditional text in Frame."

In addition, it can be problematic to identify new material that needs to be indexed if you are updating an index for a legacy document. Change bars are not always a good indicator of new text, Berman said. The flip side is that it can be difficult to find deleted text and make sure that its index entry has been deleted too. An updated index can also become stale, as terminology, features, and products all change.
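
Where change bars are unreliable, comparing the heading lists of the old and new versions is one way to get a first cut of what is new and what is gone. This Python sketch is an illustration under assumptions of its own: headings are available as plain lists, and each existing index entry is recorded against the heading it sits under. The name `compare_revisions` and the sample data are hypothetical.

```python
def compare_revisions(old_headings, new_headings, index_targets):
    """Return (added_headings, stale_entries) between two document versions.

    `index_targets` maps each existing index entry to the heading under
    which it was embedded in the old version.
    """
    old, new = set(old_headings), set(new_headings)
    added = sorted(new - old)  # candidates for new index entries
    removed = old - new        # headings that no longer exist
    stale = sorted(e for e, target in index_targets.items() if target in removed)
    return added, stale


old = ["Printing reports", "Plotting drawings", "Exporting data"]
new = ["Printing reports", "Exporting data", "Publishing to the web"]
targets = {
    "Plotting, drawings": "Plotting drawings",
    "Printing, reports": "Printing reports",
}
added, stale = compare_revisions(old, new, targets)
print("Needs indexing:", added)
print("Possibly stale:", stale)
```

Exact matching treats a reworded heading as one deletion plus one addition, so the lists are a starting point for review and not a substitute for it.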

The bottom line is that it may be quicker to reindex than to perform a thorough review, said Berman, but in any case there is almost never enough money, and almost never enough time.

Resources:
Indexing Books, Nancy Mulvany, 1994
Indexing From A to Z, Hans Wellisch, 2nd ed., 1995
ANSI/NISO Standard for Monolingual Thesauri
Information Architecture for the World Wide Web, Rosenfeld and Morville, 2nd ed., 2002
Managing Enterprise Content, Ann Rockley, 2003
The Content Management Bible, Bob Boiko, 2002
What is a Controlled Vocabulary? Karl Fast, Fred Leise, and Mike Steckel
www.willpower.org

This panel is adapted from a presentation that Perry and Berman gave with Jan C. Wright in Vancouver, B.C., June 19-21, 2003, at the annual meeting of the American Society of Indexers (ASI) and the Indexing and Abstracting Society of Canada/Société canadienne pour l'analyse de documents (IASC/SCAD).