Migrating Existing Content

By Eric Severson  |  Posted 2009-02-18

Migrating existing content

As with authoring new content, the most difficult part of converting legacy content is making it topic-oriented. This involves the following three considerations:

1. Deciding what level of information should constitute a "topic" in the new system. - Keep in mind that a topic should have a specific subject and a specific purpose, such as describing a single concept or a single, well-defined task.

2. Ensuring that each topic is self-contained. - This includes removing context-specific assumptions and references (for example, assuming you've just read the previous section of the book, or stating "see below").

3. Ensuring that topics are reusable across multiple contexts. - This includes generalizing context-specific descriptors (for example, changing "replacement memory card" and "new memory card" to simply "memory card").
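The second and third considerations lend themselves to a simple automated first pass over legacy text. The following is a minimal sketch, not a complete tool: the phrase patterns, the qualifier list, and the function names are all assumptions made for illustration, and an author would still need to review each hit by hand.

```python
import re

# Hypothetical phrase patterns that signal a context-specific reference.
# Extend this list for your own content set.
CONTEXT_PATTERNS = [
    r"\bsee (above|below)\b",
    r"\b(previous|preceding|next|following) (section|chapter)\b",
    r"\bas (mentioned|described|noted) (earlier|above|previously)\b",
]


def find_context_references(text):
    """Return context-dependent phrases found in text, grouped by pattern order."""
    hits = []
    for pattern in CONTEXT_PATTERNS:
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append(match.group(0))
    return hits


def generalize(text, noun, qualifiers=("replacement", "new")):
    """Strip context-specific qualifiers that directly precede the given noun."""
    alternatives = "|".join(re.escape(q) for q in qualifiers)
    pattern = r"\b(?:%s)\s+%s\b" % (alternatives, re.escape(noun))
    return re.sub(pattern, noun, text, flags=re.IGNORECASE)
```

For instance, `generalize("Insert the replacement memory card.", "memory card")` yields a sentence that refers simply to the "memory card", while `find_context_references` flags phrases such as "see below" so an author can rewrite them.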

Making one topic out of many

Where there's opportunity for content reuse, the challenge is also to make one topic out of many. For example, the following variations might occur across four existing documents:

Variation No. 1: To install the widget, remove the screw on the right-hand side of the tray, slide the widget into the tray, and replace the screw to secure the widget.

Variation No. 2: You will need a standard Phillips screwdriver to install the widget. First, locate the tray and remove the screw. Then slide the widget in and replace the screw.

Variation No. 3: Locate the tray and remove the screw with a Phillips screwdriver. After sliding in the widget, replace the screw.

Variation No. 4: After locating the tray and removing the screw, slide in the new widget. When finished, replace the screw.

When legacy content is converted to DITA, all four of these versions will still exist. Ideally, authors will consolidate them into a single topic that can be reused across all four of the original publications. This can be done by picking the best, most reusable version, or by creating a new version that captures the best of each. In this example, perhaps the following:

New variation: Locate the tray and remove the screw with a Phillips screwdriver. Then slide the widget into the tray, and secure the widget by replacing the screw.
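Finding variations like these across a large content set is tedious by hand. One possible aid, sketched below, is to compare passages pairwise and list the similar ones as consolidation candidates. This is only an illustration: it uses the `difflib` module from the Python standard library, and the function names and the `0.3` threshold are assumptions that would need tuning against real content.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Word-level similarity ratio between 0.0 (nothing shared) and 1.0 (identical)."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def consolidation_candidates(passages, threshold=0.3):
    """Return (id_a, id_b, score) for passage pairs at or above the threshold.

    `passages` maps a passage identifier to its text. Pairs are sorted with
    the most similar first. The threshold is a hypothetical starting point.
    """
    pairs = []
    for (id_a, text_a), (id_b, text_b) in combinations(passages.items(), 2):
        score = similarity(text_a, text_b)
        if score >= threshold:
            pairs.append((id_a, id_b, round(score, 2)))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)
```

Feeding the four widget variations into `consolidation_candidates` as a dictionary keyed by source document would give authors a ranked list of pairs to review. The decision about which wording to keep remains an editorial one.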

Finally, this new set of reusable topics must be linked back into a set of DITA maps that allow the output deliverables to be assembled and produced.
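To make the linking step concrete, here is a minimal sketch that generates a DITA map referencing a list of topic files. The `map`, `title`, and `topicref` elements are standard DITA, but the function name and the file names are hypothetical, and a production map would also carry a DOCTYPE declaration and usually a nested `topicref` hierarchy, both omitted here.

```python
import xml.etree.ElementTree as ET


def build_map(title, topic_hrefs):
    """Return a bare-bones DITA map as an XML string.

    Each entry in topic_hrefs becomes one <topicref>, in the order given.
    The DOCTYPE declaration a real map needs is not emitted by this sketch.
    """
    root = ET.Element("map")
    ET.SubElement(root, "title").text = title
    for href in topic_hrefs:
        ET.SubElement(root, "topicref", href=href)
    return ET.tostring(root, encoding="unicode")


# Hypothetical file names: the consolidated topic is referenced from two maps.
install_guide = build_map("Installation Guide",
                          ["locate_tray.dita", "install_widget.dita"])
service_manual = build_map("Service Manual",
                           ["install_widget.dita", "replace_fan.dita"])
```

Because both maps point at the same `install_widget.dita` file, a correction to that topic reaches both deliverables the next time they are produced.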

Of course, doing all this across your entire set of content can be a tremendous amount of work. Luckily, DITA doesn't have to be an all-or-nothing approach. In practice, there is usually a "sweet spot" of content that's really worth the effort, while other content can be used as is until there's time and motivation to work on it. Content in the sweet spot typically is core material (as opposed to introductory or supplementary information), has the potential for significant reuse, changes frequently, and has significant cost or risk if it's inaccurate or inconsistent.
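The four sweet-spot criteria can be turned into a rough triage score for prioritizing conversion work. The sketch below is purely illustrative: the `ContentItem` fields, the thresholds, and the equal weighting of the criteria are all assumptions, not part of DITA or any tool.

```python
from dataclasses import dataclass


@dataclass
class ContentItem:
    name: str
    is_core: bool          # core material rather than introductory or supplementary
    reuse_count: int       # number of deliverables that could share this content
    changes_per_year: int  # how often the content is revised
    high_risk: bool        # significant cost or risk if inaccurate or inconsistent


def sweet_spot_score(item, min_reuse=2, min_changes=4):
    """Count how many of the four sweet-spot criteria an item meets (0 to 4).

    The thresholds are hypothetical defaults; set them to suit your content.
    """
    return sum([
        item.is_core,
        item.reuse_count >= min_reuse,
        item.changes_per_year >= min_changes,
        item.high_risk,
    ])


def prioritize(items):
    """Return items ordered from the highest score to the lowest."""
    return sorted(items, key=sweet_spot_score, reverse=True)
```

Items that meet all four criteria are the natural candidates for full conversion first, while low scorers can be carried over as is.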

Other content, even though it may not meet the strict definition for standalone and reusable topics, can still be broken up into "topics" and linked into DITA maps. However, such topics should not yet be marked as reusable. It's also okay if we continue to have some redundancy across these lower-priority topics. We can keep multiple versions of topics and include them in different maps. Later, we can work to consolidate them and make them fully reusable as time permits.

Eric Severson is co-founder and Chief Technology Officer of Flatirons Solutions Corporation. Eric is also on the board of directors for IDEAlliance and is a former president of OASIS, both XML industry consortiums. He can be reached at Eric.Severson@flatironssolutions.com.
