Measuring Office Format Fidelity with Acrobat 9

When considering alternatives to Microsoft's Office productivity suite, one of the most important questions is how well Office rivals such as OpenOffice.org handle Microsoft's ubiquitous binary file formats.

Over the past few years, eWEEK Labs has approached the issue of file format fidelity between MS Office and OpenOffice.org several times. Our conclusions haven't changed much since 2004, when Anne Chen and I helped one of our corporate partners test the pair of productivity suites for itself:

"Although OpenOffice.org does a good job of handling Microsoft Office file formats, small formatting inconsistencies will require reworking of complex documents."

While the phrase "small formatting inconsistencies" still sums up the situation fairly accurately, organizations and individuals out to bring the open-source suite into their application mix could use a more rigorous means of measuring OpenOffice.org's handling of MS Office formats.

That's why, when Adobe briefed me on Acrobat 9, I was particularly interested in Acrobat's new "compare documents" feature, which analyzes two PDF documents and parses out all of the inconsistencies between them.

I grabbed a Word-formatted reviewer's guide document from Microsoft's Web site, opened it up in Word 2007, and printed it to a PDF using Acrobat 9.

Next, I opened the document in OpenOffice.org 2.4 and used Acrobat 9 to print it to a PDF document. I could have used OpenOffice.org's built-in PDF export function, or Office 2007's plugin-based PDF exporter, but I opted to stick with Acrobat in order to minimize inconsistencies that the differing PDF exporters might have introduced.

I fired up Acrobat 9 (I tested with a beta version of the software) and pointed the application's "compare documents" feature at my Office- and OpenOffice.org-rendered PDF documents. The result? Good fidelity overall, but various inconsistencies remained. This time, however, I had Acrobat 9 on hand to point the inconsistencies out to me.
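For readers without Acrobat 9 on hand, a rough, text-only stand-in for this kind of comparison can be sketched in Python. This is not how Acrobat's feature works, and it ignores images and layout entirely. It assumes the text of each PDF has already been extracted to a string by some other tool; the function name changed_lines and the sample strings are invented for illustration.

```python
import difflib

def changed_lines(text_a, text_b):
    """Return the line-level differences between two extracted texts.

    Each entry is (kind, lines_from_a, lines_from_b), where kind is
    'replace', 'delete', or 'insert'. Unchanged runs are skipped.
    """
    a = text_a.splitlines()
    b = text_b.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    changes = []
    for kind, i1, i2, j1, j2 in matcher.get_opcodes():
        if kind != "equal":
            changes.append((kind, a[i1:i2], b[j1:j2]))
    return changes

# Invented sample text, standing in for text pulled from the two PDFs
word_text = "Title\nFirst bullet\nSecond bullet"
ooo_text = "Title\nFirst bullet\nSecond  bullet"

changes = changed_lines(word_text, ooo_text)
for kind, old, new in changes:
    print(kind, old, new)
```

A text diff like this would catch changed or reflowed wording, but not the pixel-level discrepancies that made Acrobat's comparison useful in my tests.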

For instance, right on the first page of the document, OpenOffice.org rendered a 935-by-227-pixel logo at 936 by 234 pixels--a formatting inconsistency that resulted in a slightly misplaced logo, but one that I would have had a tough time putting my finger on without Acrobat 9's help.
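To put those logo dimensions in perspective, here is a small Python sketch of the arithmetic. The pixel sizes are the ones reported above; the function name size_drift and the shape of its result are my own invention, not anything Acrobat outputs.

```python
from fractions import Fraction

def size_drift(original, rendered):
    """Compare two (width, height) pixel sizes.

    Returns per-axis differences in pixels and percent, plus the
    exact aspect ratio of each size.
    """
    (ow, oh), (rw, rh) = original, rendered
    return {
        "width_delta_px": rw - ow,
        "height_delta_px": rh - oh,
        "width_delta_pct": round(100 * (rw - ow) / ow, 2),
        "height_delta_pct": round(100 * (rh - oh) / oh, 2),
        "original_aspect": Fraction(ow, oh),
        "rendered_aspect": Fraction(rw, rh),
    }

report = size_drift((935, 227), (936, 234))
print(report["width_delta_px"], report["width_delta_pct"])    # 1 0.11
print(report["height_delta_px"], report["height_delta_pct"])  # 7 3.08
print(report["rendered_aspect"])                              # 4
```

The width is off by a single pixel, but the height grows by about 3 percent, and the rendered logo lands on an exact 4:1 aspect ratio. That suggests the image was rescaled rather than simply shifted on the page, though the numbers alone don't say why.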

Another odd, slight inconsistency came in the document's table of contents, in which OpenOffice.org rendered 146 periods between the section name and page number, where Office had rendered 145 periods.
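A discrepancy like this one is easy to check mechanically once the table of contents text is in hand. The Python sketch below counts the run of leader periods on each line and reports lines where the two renderings disagree. It assumes the lines have already been extracted from each PDF and pair up by position; the function names and the sample lines are invented for illustration and are not taken from my test document.

```python
import re

# A run of three or more periods is treated as a dot leader
LEADER = re.compile(r"\.{3,}")

def leader_length(toc_line):
    """Length of the longest run of leader periods in a line, or 0 if none."""
    runs = LEADER.findall(toc_line)
    return max((len(run) for run in runs), default=0)

def leader_mismatches(lines_a, lines_b):
    """Pair lines by position and list those whose leader lengths differ.

    Each entry is (line_index, length_in_a, length_in_b).
    """
    mismatches = []
    for index, (a, b) in enumerate(zip(lines_a, lines_b)):
        len_a, len_b = leader_length(a), leader_length(b)
        if len_a != len_b:
            mismatches.append((index, len_a, len_b))
    return mismatches

# Invented sample lines mirroring the 145-versus-146 case described above
word_toc = ["Overview" + "." * 145 + "3"]
ooo_toc = ["Overview" + "." * 146 + "3"]

mismatches = leader_mismatches(word_toc, ooo_toc)
print(mismatches)  # [(0, 145, 146)]
```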

I also downloaded a test version of the upcoming OpenOffice.org version 3, and compared that version's Word document rendering to that of OpenOffice.org 2.4. Both versions appeared to render my test Word document exactly the same--a result that Acrobat's compare function confirmed.

Since support for Microsoft's new Office Open XML formats is one of the new features in OpenOffice.org 3.0, I fetched another document from Microsoft's Web site, this time in the DOCX format, and cheffed up some PDFs to gauge the open-source suite's OOXML chops. This time, the formatting differences were much more pronounced and included misplaced images and jumbled bullet lists.

I expect to see OpenOffice.org 3.0 improve its handling of OOXML documents as it moves closer to its release. I'll be testing the suite's OOXML capabilities as subsequent test releases emerge, and I expect that I'll be using Acrobat 9 to help with those tests.

For a walkthrough of my Acrobat-fueled Office vs. OpenOffice.org file format adventures, see our slide show.