Deciphering the Lawsuit Claims

By Jeff Cogswell  |  Posted 2009-08-14

Consider the following sentence, in which the word "quick" is bold, the word "over" is in italics, and the rest is normal text:

The quick brown fox jumped over the lazy dogs.

The content is the sentence itself, stripped of all formatting:

The quick brown fox jumped over the lazy dogs.

When formatting is included, the word processing program can use different means to store the formatting information. The patent notes that a common way of storing such formatting is by placing the formatting right inside the sentence. For example, one might store it as HTML like this:

 The <b>quick</b> brown fox jumped <i>over</i> the lazy dogs.

The formatting code <b> means "turn on bold" while </b> means "turn off bold." The codes <i> and </i> do the same for italics. This text with the formatting codes could be stored in a file and then read character-by-character from left to right by a word processing program. The software would display it on a screen, not with the formatting codes, but instead with the formatting actually applied:

 The quick brown fox jumped over the lazy dogs.
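The left-to-right reading described above can be sketched in a few lines of Python. This is only an illustration, not code from any real word processor: the function name render_inline and the (text, bold, italic) run format are assumptions made for the example, and only the four codes used in the sentence are recognized.

```python
def render_inline(marked_up):
    """Scan the text left to right, toggling formatting as codes are met.

    Returns a list of (text, bold, italic) runs. The run format is a
    hypothetical choice for this sketch.
    """
    # Each code maps to the attribute it changes and the new value.
    codes = {
        "<b>": ("bold", True),
        "</b>": ("bold", False),
        "<i>": ("italic", True),
        "</i>": ("italic", False),
    }
    state = {"bold": False, "italic": False}
    runs = []
    buf = []

    def flush():
        # Emit the characters collected so far with the current formatting.
        if buf:
            runs.append(("".join(buf), state["bold"], state["italic"]))
            buf.clear()

    i = 0
    while i < len(marked_up):
        for code, (attr, value) in codes.items():
            if marked_up.startswith(code, i):
                flush()              # text before the code keeps the old state
                state[attr] = value  # then the code switches formatting
                i += len(code)
                break
        else:
            buf.append(marked_up[i])  # ordinary character
            i += 1
    flush()
    return runs


if __name__ == "__main__":
    sentence = "The <b>quick</b> brown fox jumped <i>over</i> the lazy dogs."
    for text, bold, italic in render_inline(sentence):
        print(repr(text), "bold" if bold else "", "italic" if italic else "")
```

The point to notice is that the formatting state only exists while the text is being scanned; to know how any one word is formatted, the program has to have read every code that comes before it.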

So what is claimed in the patent? The patent is long: it makes 20 claims and provides step-by-step algorithms for processing such text and its formatting codes (or metacodes). Ultimately the algorithms create a data structure called a metacode map, which contains information about where in the text formatting is applied. The end result is two separate data structures: the metacode map with formatting information, and the plain text with no formatting. The program uses both data structures to display the fully formatted text on the screen.
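To make the two-structure idea concrete, here is a minimal Python sketch of the separation. It is not the patent's algorithm: the representation of the map as a list of (position, code) pairs and the function names split_content_and_map and recombine are assumptions chosen for illustration, and only the bold and italic codes from the example sentence are handled.

```python
import re

# Matches the four codes used in the example sentence: <b>, </b>, <i>, </i>.
CODE = re.compile(r"</?[bi]>")


def split_content_and_map(marked_up):
    """Separate marked-up text into plain content and a metacode map.

    The map is a list of (position, code) pairs, where position is an
    offset into the plain content (a hypothetical layout for this sketch).
    """
    pieces = []
    metacode_map = []
    pos = 0    # current offset in the plain content
    last = 0   # end of the previous code in the marked-up text
    for match in CODE.finditer(marked_up):
        chunk = marked_up[last:match.start()]
        pieces.append(chunk)
        pos += len(chunk)
        metacode_map.append((pos, match.group()))
        last = match.end()
    pieces.append(marked_up[last:])
    return "".join(pieces), metacode_map


def recombine(content, metacode_map):
    """Rebuild the marked-up text from the two separate structures."""
    out = []
    last = 0
    for pos, code in metacode_map:
        out.append(content[last:pos])
        out.append(code)
        last = pos
    out.append(content[last:])
    return "".join(out)


if __name__ == "__main__":
    sentence = "The <b>quick</b> brown fox jumped <i>over</i> the lazy dogs."
    content, metacode_map = split_content_and_map(sentence)
    print(content)
    print(metacode_map)
    print(recombine(content, metacode_map) == sentence)
```

For the example sentence, the content comes out as the plain sentence and the map records the four codes at offsets 4, 9, 27 and 31. The content can then be edited, searched or stored without touching the codes, and the map can be replaced without touching the content.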

This is in contrast to the way things were apparently done before the patent: word processors would use metacodes embedded right inside the text (similar to the HTML I showed earlier, but usually using other, earlier types of codes). The word processor would read in the text character-by-character, turning formatting on and off sequentially as it went along and displaying the text on the screen formatted appropriately (or, in the case of early monitors that couldn't show formatting, displaying codes to denote how parts of the text are formatted).

So far what we're talking about isn't rocket science. But remember, the patent was written back in 1994 (and granted in 1998). This may well have been the first time such technology was seen. The word XML doesn't even appear in the patent, because XML didn't exist yet in 1994. Instead, the patent mentions SGML (Standard Generalized Markup Language), which is essentially a precursor to XML.

Now consider Microsoft Word. Prior to Word 2007, the main format for Word documents was a proprietary, secret format that didn't use XML (technically speaking, it was a binary format). With Word 2003 came additional XML capabilities; with Word 2007 came the Open XML format, which stores each document as a single .zip file containing several files: any images embedded in your document, as well as all your text, which is stored as XML. Is this in violation of the patent? To try to answer this, I now want to figure out what "custom XML" means.

Jeff Cogswell is the author of Designing Highly Useable Software, among other books, and is the owner/operator of CogsMedia Training and Consulting. Currently Jeff is a senior editor with Ziff Davis Enterprise. Prior to joining Ziff, he spent about 15 years as a software engineer, working on Windows and Unix systems, mastering C++, PHP, and ASP.NET development. He has written over a dozen books.
