Wednesday, April 26, 2006

Writerperfect and KWord - code reuse with many benefits!

Writerperfect is the import filter for WordPerfect™ documents in OpenOffice.org 2.x. It uses a library, libwpd, to read WordPerfect™ documents and converts libwpd's output to SAX messages in the OpenOffice.org 1.0 text file format (sxw). It was originally designed as an add-on for OpenOffice.org 1.1 and does not need any of the advanced features of the OpenDocument format, so the sxw output remained and works well in OpenOffice.org 2.x too.

Written principally by William Lachance (big thanks, Will), Writerperfect has an intelligent design that does not tie it to a specific document handler. This allows us to use Writerperfect as the basis of the wpd2sxw command-line conversion tool, of the import filter integrated into OpenOffice.org 2.x, of an import filter add-on for OpenOffice.org 1.1.x and, soon, of a WordPerfect™ import filter for KWord, the word-processing component of the KDE office suite, KOffice.
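To illustrate what "not tied to a specific document handler" can look like in code, here is a minimal sketch. All the names in it (`DocumentHandler`, `StringHandler`, `CountingHandler`, `emitParagraph`) are made up for this example and are not Writerperfect's actual API. The conversion code only talks to an abstract SAX-like interface, so the same code can feed a command-line tool, an OpenOffice.org filter or a KWord filter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical attribute list: (name, value) pairs.
using Attributes = std::vector<std::pair<std::string, std::string>>;

// Hypothetical SAX-like sink. The converter only knows this interface.
class DocumentHandler {
public:
    virtual ~DocumentHandler() = default;
    virtual void startElement(const std::string &name, const Attributes &attrs) = 0;
    virtual void endElement(const std::string &name) = 0;
    virtual void characters(const std::string &text) = 0;
};

// One back end: serializes the events into an in-memory string.
// XML escaping is deliberately left out of this sketch.
class StringHandler : public DocumentHandler {
public:
    void startElement(const std::string &name, const Attributes &attrs) override {
        out_ += "<" + name;
        for (const auto &attr : attrs)
            out_ += " " + attr.first + "=\"" + attr.second + "\"";
        out_ += ">";
    }
    void endElement(const std::string &name) override { out_ += "</" + name + ">"; }
    void characters(const std::string &text) override { out_ += text; }
    const std::string &str() const { return out_; }

private:
    std::string out_;
};

// Another back end: only counts the opened elements.
class CountingHandler : public DocumentHandler {
public:
    void startElement(const std::string &, const Attributes &) override { ++elements_; }
    void endElement(const std::string &) override {}
    void characters(const std::string &) override {}
    std::size_t elements() const { return elements_; }

private:
    std::size_t elements_ = 0;
};

// Conversion code written once against the interface.
void emitParagraph(DocumentHandler &handler, const std::string &styleName,
                   const std::string &text) {
    assert(!styleName.empty());
    Attributes attrs;
    attrs.emplace_back("text:style-name", styleName);
    handler.startElement("text:p", attrs);
    handler.characters(text);
    handler.endElement("text:p");
}
```

Calling `emitParagraph` with a `StringHandler` produces markup, while the same call with a `CountingHandler` just counts. Swapping the back end requires no change to the conversion code.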

In fact, your servant and Ariya Hidayat of KDE fame have spent the last few weeks working on integrating Writerperfect with KWord. KOffice has a nice feature that allows different filters to be chained, so Writerperfect's sxw output can be handed over to KWord's existing sxw importer.

Since version 1.4.0, KWord, like all major FOSS word processors, has used the libwpd library as the basis of its import filter for WordPerfect™ documents. Since the use of libwpd was something new for KOffice, the existing import filter is a bit rudimentary. Reusing Writerperfect's code allows the import filter to benefit almost instantly from all features converted by libwpd, modulo those that KWord's sxw importer does not support. Along the way, we have already found bugs in the sxw importer, so it benefits from the change as well. And, last but not least, Writerperfect itself is likely to become more robust too.

Kendy, would you ever believe that your servant would be contributing to KDE? :-)

Tuesday, April 18, 2006

WordPerfect Graphics conversion WANTS YOU!

I have often heard complaints about how difficult it is to join the OpenOffice.org project as a developer. While giving credit to some of these voices, I would like to draw your attention to a simple way to do some development for OpenOffice.org without having to be familiar with UNO or any other of the strange OpenOffice.org concepts. At http://sourceforge.net/projects/libwpg there is an interesting project trying to write a library for reading (and converting) WordPerfect Graphics (WPG) files. It is a nice little project written in C++ that is somewhat stalled now because those who contribute to it are busy with other things (like, for instance, the WordPerfect Document import).

What should be done?

IMHO, there are some things that could be interesting for a beginner. In no particular order:

  • iostreams - Since WPG files are not embedded in an OLE stream, it is not necessary to use libgsf as the abstraction layer for input and output streams. To reduce the library's list of dependencies, porting the input/output operations to C++ standard library classes would be a good thing. Moreover, it could be a good starting point for understanding how the library works without needing in-depth knowledge of the WPG file format.
  • Add conversion of bitmap parts of WPG1 files.
  • Add conversion of WPG2 files supporting single and double precision.
  • Write a small command-line tool that would use this library to convert graphics into OpenDocument Drawing file format.
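As an idea of what the iostreams item could look like, here is a minimal sketch of an input layer built only on the C++ standard library. The class name `StdInputStream` and its methods are hypothetical and are not libwpg's actual API. The little-endian byte order used for 16-bit values is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical input abstraction over std::istream (illustrative only).
class StdInputStream {
public:
    explicit StdInputStream(std::istream &in) : in_(in) {}

    // Reads one byte; returns false when no more data is available.
    bool readU8(std::uint8_t &value) {
        char c;
        if (!in_.get(c))
            return false;
        value = static_cast<std::uint8_t>(static_cast<unsigned char>(c));
        return true;
    }

    // Reads a 16-bit value, assuming little-endian byte order.
    bool readU16(std::uint16_t &value) {
        std::uint8_t lo = 0, hi = 0;
        if (!readU8(lo) || !readU8(hi))
            return false;
        value = static_cast<std::uint16_t>(lo | (hi << 8));
        return true;
    }

    // Moves to an absolute offset from the beginning of the stream.
    bool seek(std::streamoff offset) {
        assert(offset >= 0);
        in_.clear(); // drop a possible EOF/fail state before seeking
        in_.seekg(offset, std::ios::beg);
        return !in_.fail();
    }

    // Returns the current absolute read position.
    std::streamoff tell() { return in_.tellg(); }

private:
    std::istream &in_;
};
```

Because the class only needs a `std::istream`, the same parser code could read from a `std::ifstream` opened in binary mode or from a `std::istringstream` in a unit test.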

What already exists?

The vector parts of WPG1 files are quite correctly converted although some attributes may be omitted. There is a command line tool called wpg2svg that can be used to convert and visualize the vector graphics from WPG files in SVG format.
If you are interested, just grab the CVS module "libwpg" and start hacking. Send your patches to libwpg-devel@lists.sourceforge.net. Your work will be most appreciated.

What do I gain by doing it?

"I do not have silver or gold, but what I have I will give it to you ..." Again in no particular order:

  • You will be my personal hero :-)
  • You will learn a lot about how to design and optimize a library that will possibly be used by many applications.
  • You will have the opportunity to collaborate in integrating this library into OpenOffice.org. You will in fact become an OpenOffice.org developer. This will give you the possibility to work with people like Caolán McNamara, Michael Meeks, Pavel Janík or Jens-Heiner Rechtien ;-)

So, come! Success, glory and prosperity are waiting for you!

Monday, April 03, 2006

Back again

It has been a long time since I last blogged! No valid excuses come to mind, so I am not giving any.

SF.net vs. Colab.net - close call

Every time I find a bit of time, I try to make progress on converting headers and footers in older WordPerfect formats (WP 5.x for Unix/DOS/Windows and WP 3.0 - 3.5e for Mac). A nice bunch of code is sitting on my computer. The only problem is that I cannot commit it, since the SF.net project CVS services have been off-line (partial outage) since Wednesday. Although the SF.net vs. Colab.net match started out clearly badly for the latter, it is becoming a close call now.

WordPerfect for Mac 3.5 extended file-format

It is starting to be a bit problematic to convert documents that I am not able to view in the original application. For simple features it was OK. For headers and footers, one has to go through about 2 A4 sheets of hexadecimal garbage for every meaningful paragraph. It was a really brilliant idea to incorporate all formatting information into the document stream. It makes WordPerfect quick to render the document, but a programmer with ghex2 trying to convert it has a hard task.

mdbdriver02

Thanks to Hanno Meyer-Thurow of Gentoo phame and his skills with gdb, the M$ Access driver in the mdbdriver02 CWS no longer crashes with the Orders table of Northwind.mdb. Nevertheless, there is still a bit of work to be done before it is ready for the spotlight. But we are advancing.

Code review

It is something that is missing in OpenOffice.org, but it could really improve the quality of the code. (IMHO of a trained monkey, naturally.) I am excited that Frank Schönheit wants to review the code of the mdbdriver02 CWS.