Wednesday, June 28, 2006

Stress-testing

Several times since last year I have heard complaints about the lack or insufficiency of QA in small FOSS projects. Judge for yourself from the example of libwpd.

Recently we had a crasher regression discovered just a few days after a release, and your servant was not happy at all about it. To prevent this kind of situation from arising again, I approached sum1 of AbiWord fame, and he agreed to QA the CVS head tree after each major commit and before each release. Here is his methodology:

  • First, he crawls the web and, using a nice Firefox add-on called DownThemAll!, he downloads all the WordPerfect documents he finds.
  • That way, he has built himself a stock of about 40k real-life documents in different WordPerfect formats. He throws 10k to 15k of these documents at wpd2raw, a nice small utility that is distributed as part of the libwpd-tools package and is used by the libwpd developers in their regression test suite. This QA run takes about an hour.
  • The documents that crash the tool and/or throw an exception are filed as bugs in the AbiWord Bugzilla, so that they can be dealt with.
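
The batch run described above could be sketched roughly as follows. This is only an illustration, not sum1's actual script: the function names `run_one` and `stress_test` are made up for this example, and it assumes that the tool takes the document path as its last argument, that a non-zero exit status signals a failure, and that a crash shows up as a negative return code (which is the POSIX behaviour when a process is killed by a signal).

```python
import subprocess
from pathlib import Path


def run_one(command, document, timeout=60):
    """Run command + [document] once and classify the outcome."""
    try:
        proc = subprocess.run(
            [*command, str(document)],
            stdout=subprocess.DEVNULL,  # the dump itself is not needed here
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"  # the tool hung on this document
    if proc.returncode < 0:
        return "crash"    # POSIX: killed by a signal such as SIGSEGV
    if proc.returncode > 0:
        return "error"    # the tool exited with a failure status
    return "ok"


def stress_test(command, corpus_dir, timeout=60):
    """Return {path: status} for every file under corpus_dir that is not 'ok'."""
    failures = {}
    for document in sorted(Path(corpus_dir).rglob("*")):
        if document.is_file():
            status = run_one(command, document, timeout)
            if status != "ok":
                failures[document] = status
    return failures
```

Under those assumptions, a call such as stress_test(["wpd2raw"], "corpus/") would return the documents worth filing as bugs, each with the reason it was flagged. The timeout is an addition of this sketch, so that a hung document cannot stall the whole run.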

A simple plan that has a lot of benefits. First of all, the library is stress-tested against a considerable quantity of real-life documents, so that the developers can deal with any crasher found. Second, this QA testing helps to find quite complicated documents in rare WordPerfect formats, which makes it much easier to add new items to the list of converted features, especially where the documentation is a bit fuzzy or unavailable. Third, the documents that produced crashes that were really hard to fix or not trivial to spot take their place in the above-mentioned regression test suite, so that they do not bite again.

Just a little question: How thoroughly stress-tested are the proprietary, binary, stale and unmaintained WordPerfect filters distributed with StarOffice 8?

Friday, June 23, 2006

The change is urgent

It is now urgent to find a new job. If you are interested in hiring me, check my resume.

Thanks in advance

Sunday, June 11, 2006

Kind request towards libwpd packagers/maintainers

Hello. One of the best QA guys in the known universe, Sum1 of AbiWord fame, ran a lot of WP documents through wpd2raw 0.8.5 and discovered a libwpd crash in the WP5 parser in a very rare situation (the font descriptor packet is there, but the font list packet is not). Although it is a really rare case, the problem exists and this patch solves it. I would kindly ask the libwpd packagers to issue a new package containing this patch for their distributions.

This means that I will try to push 0.8.6 out sometime at the beginning of July, just after I flesh out the WP42 parser a little more. My apologies for this regression.

Thursday, June 01, 2006

libwpd 0.8.5 "reward for your patience" in the wild

It has been almost 6 months without a libwpd release. So, after you have waited for it patiently, the libwpd developer community has the pleasure to announce that libwpd 0.8.5, "reward for your patience", has been released today (1st June 2006), just to cast some shadow on the rather insignificant event of the Ubuntu Dapper release.

So, what has been done:

  • We added some new items to the list of converted features for the WP5 and WP3 formats: font information (face, size, colour), headers/footers and footnotes/endnotes for WP5, and headers/footers for the WP3 file format.

  • We fixed some bugs (one of them a crasher) and annoyances.

  • We now get the page/section/paragraph margins right even in multicolumn sections, which allows us to know which absolute position corresponds to which column and to count the relative position from the beginning of the column. This also allows us to compute tabulator positions correctly in multicolumn sections.

  • We now prevent negative paragraph margins inside the document body. They used to result from a page margin change in the middle of a page, and preventing them removes the ugly text border lines running across the text in some WP documents opened in OpenOffice.org. It is still possible to craft a document that ends up with negative margins in a header or footer, but it is very unlikely to find something like this in real life.

  • We now avoid closing page spans and/or sections inside a paragraph, which used to add paragraph breaks where they were not supposed to be. Instead, we defer the page span change to the end of the paragraph.

  • We added an option switch "--info" to the wpd2text tool; if called with this switch, wpd2text will not convert the document to plain text, but will dump the document meta-data instead. This feature could be used in a beagle WPD indexer if it is designed as a wrapper around wpd2text.
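
To give an idea of what such a wrapper might look like, here is a minimal sketch. It is not part of libwpd or beagle: the function name `dump_metadata` is made up for this example, and it assumes that the switch goes before the file name, that the tool exits with status 0 on success, and that its output can be decoded as UTF-8. The layout of the dumped meta-data is not described here, so the sketch returns the text as-is and leaves parsing to the indexer.

```python
import subprocess


def dump_metadata(document, command=("wpd2text", "--info")):
    """Return the text the tool prints for `document`, or None on failure."""
    try:
        proc = subprocess.run(
            [*command, str(document)],
            capture_output=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None  # tool not installed, not executable, or hung
    if proc.returncode != 0:
        return None  # assumption: non-zero exit status means failure
    # Assumption: the output is UTF-8; undecodable bytes are replaced.
    return proc.stdout.decode("utf-8", errors="replace").strip()
```

An indexer would call dump_metadata("letter.wpd") for each document and skip those for which it returns None.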

Future plans:

  • A fine hacker known from several FOSS projects, Ariya Hidayat, is sponsored by Google to work on the wpg2odg converter, which means that the conversion of images in WP documents is not an abstract issue anymore. Welcome to Ariya and thank you to Google.

  • As for me, I would like to implement tabulators in the WP3 and WP5 file formats for 0.8.6, as well as to try to bring all the formats to the same level of converted features. This means adding multicolumn sections for WP5 and list styles for both WP3 and WP5. At the same time, I would like to flesh out the WP42 parser a little.

  • One of my other objectives would be to motivate someone with a copy of WP 5.x to do the QA of changes in this file format. By the same token, I would like to thank Smokey Ardisson for his great QA work on the WP3 changes. Without Smokey it would not have been possible to get so far with WP3 support. Thanks, Smokey, for being my eyes :-)

Bottom line: With these changes, the libwpd-based filter is more powerful than the proprietary WP filters shipped with StarOffice 8. Time to get rid of stale, unmaintained, proprietary binaries, isn't it?