tag:blogger.com,1999:blog-134796142024-03-14T04:22:17.947+01:00Trained Monkey Hacking ExperienceOn LibreOffice, programming, reverse-engineering and file formatsFridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comBlogger101125tag:blogger.com,1999:blog-13479614.post-68338146268966586322014-05-23T17:15:00.001+02:002014-05-23T17:15:22.494+02:00librevenge 0.0.0 is out: Document Liberation Project's framework is available to be used<p>It is not without emotion that the <a href="http://www.documentliberation.org" target="_blank">Document Liberation Project</a> announces today the first release of the new framework library, <code><b>librevenge-0.0.0</b></code>. This release means that the API of <a href="http://sourceforge.net/p/libwpd/librevenge/ci/master/tree" target="_blank"><code>librevenge</code></a> is now set in stone (at least until the 0.1.x series) and thus the library can be used by willing filter-writers.</p><p>You might be familiar with some aspects of the <code>librevenge</code> framework from <a href="http://fridrich.blogspot.com/2013/11/libreoffice-import-filters-what-is.html" target="_blank">this blog</a> or from <a href="https://speakerdeck.com/fridrich/librevenge-is-suite-what-is-new-in-the-world-of-import-filters-and-what-is-coming-soon" target="_blank">this FOSDEM 2014 presentation</a>. <a href="http://davetardon.wordpress.com" target="_blank">David Tardon</a> started a <a href="http://davetardon.wordpress.com/2014/05/06/writing-import-libraries-with-librevenge-part-i-getting-started" target="_blank">nice series</a> of articles explaining how to use the framework. 
So, there are no valid excuses remaining not to use it and not to contribute to the world domination that is the ultimate destiny of the <a href="http://www.documentliberation.org" target="_blank">Document Liberation Project</a>.</p><h2>Standing on the shoulders of giants</h2><p>But the first release of a new framework would be empty without mentioning those on whose shoulders we stand. First we would love to thank <a href="https://twitter.com/wlach" target="_blank">Will Lachance</a> and <a href="http://uwog.net" target="_blank">Mark Maurer</a> for having started more than 10 years ago the development of <a href="http://libwpd.sourceforge.net" target="_blank"><code>libwpd</code></a>. It is this library and its wise interface design that allowed us to move incrementally to the current framework. <b>Thank you guys, you know that without you we would be nowhere!</b></p><p>Besides <a href="http://fridrich.blogspot.com" target="_blank">your servant</a>, <a href="http://davetardon.wordpress.com" target="_blank">David Tardon</a>, and <a href="https://plus.google.com/108983215764171548842/posts" target="_blank">Valek Filipov</a>, we would love to single out a discreet person, who speaks little but codes a lot. It is <a href="http://www.loria.fr/~alonso" target="_blank">Laurent Alonso</a>, without whom we would never be able to recover a huge amount of old Macintosh documents. We equally thank all our past and present <a href="http://www.google-melange.com" target="_blank">Google Summer of Code</a> students, without whom the road would be much more thorny.</p><h2>LibreOffice and The Document Foundation</h2><p>It would be a very big mistake if we did not thank the project from which we all originate, the <a href="http://www.libreoffice.org" target="_blank">LibreOffice project</a>. 
The community gravitating around <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> is caring, encouraging and creates the right environment to foster innovation.</p><p>Last but not least, our thanks go to <a href="http://www.documentfoundation.org" target="_blank">The Document Foundation</a> that did not hesitate to take us under its umbrella and provide all the necessary institutional support.</p><h2>How to contribute</h2><p>Now a new phase starts and you can be part of it! There are many <a href="http://www.documentliberation.org/contribute" target="_blank">ways to contribute</a>. You can drop by at the <a href="irc://chat.freenode.net/documentliberation-dev"><code>#documentliberation-dev</code></a> channel at <a href="http://webchat.freenode.net" target="_blank"><code>irc.freenode.net</code></a>. There will always be someone to help you to join this exciting journey.</p><p>For more information about our activities, follow <a href="https://twitter.com/DocLiberation" target="_blank">@DocLiberation</a> on Twitter, <a href="https://plus.google.com/communities/109370061751362503274" target="_blank">join our Google+ community</a> or <a href="http://www.facebook.com/documentliberation" target="_blank">like us on Facebook</a>.</p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-45190579598689978512014-04-22T12:46:00.002+02:002014-04-22T12:46:30.957+02:00LibreOffice projects for Google Summer of Code 2014<p>We are happy to announce that the <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> project has 10 Google Summer of Code projects for this 10th edition of the program. 
The selected projects and students are:</p><table align="center"><tr><td><p><b>Project Title</b></p></td><td><p> </p></td><td><p><b>Selected Student</b></p></td></tr><tr><td><p>Connection to SharePoint and Microsoft OneDrive</p></td><td><p> </p></td><td><p>Mihai Varga</p></td></tr><tr><td><p>Calc / Impress tiled rendering support</p></td><td><p> </p></td><td><p>Andrzej Hunt</p></td></tr><tr><td><p>Improved Color selection</p></td><td><p> </p></td><td><p>Krisztián Pintér</p></td></tr><tr><td><p>Enhancing text frames in Draw</p></td><td><p> </p></td><td><p>Matteo Campanelli</p></td></tr><tr><td><p>Implement Adobe Pagemaker import filter</p></td><td><p> </p></td><td><p>Anurag Kanungo</p></td></tr><tr><td><p>Improvements to the Template manager</p></td><td><p> </p></td><td><p>Efe Gürkan YALAMAN</p></td></tr><tr><td><p>Dialog Widget Conversion</p></td><td><p> </p></td><td><p>freetank</p></td></tr><tr><td><p>Dialog Widget Conversion</p></td><td><p> </p></td><td><p>sk94</p></td></tr><tr><td><p>Improve Usability of Personas</p></td><td><p> </p></td><td><p>Rachit Gupta</p></td></tr><tr><td><p>Refactor god objects</p></td><td><p> </p></td><td><p>Valentin</p></td></tr></table><p>We wish all of them a lot of success and let the coding start!</p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-28239150175177530522014-04-06T08:52:00.001+02:002014-04-06T21:38:35.600+02:00LibreOffice CorelDraw import filter - support of version x7 landed<p>Corel released CorelDraw x7 on 27 March 2014. We had some time to look at the changes in file-format and we adapted <a href="http://www.freedesktop.org/wiki/Software/libcdr/" target="_blank"><code>libcdr</code></a> to be able to open it. 
The changes landed this week in <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> code, in the master and libreoffice-4-2 branches. That means that support will be available in the next 4.2.x release.</p><p>It is good to note that while inspecting the files we discovered a flaw in CorelDraw x7 that makes files using the Pantone palette number 30 pretty unusable for CorelDraw users. We worked around it and the files are opening just fine in <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a>. Take this as a first contribution by the new <a href="http://www.documentliberation.org" target="_blank">Document Liberation Project</a>.</p>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-68065953123688984882014-03-20T17:37:00.002+01:002014-03-20T17:37:58.358+01:00LibreOffice and Google Summer of Code 2014<p>Hello, dear students!</p><p>This little blog is to remind you that in a bit more than 24 hours, the student applications for the 10<sup>th</sup> edition of Google Summer of Code will be closed. It is always better to submit an imperfect proposal before the deadline than to miss the deadline by 5 minutes with a perfect proposal. So, check our <a href="https://wiki.documentfoundation.org/Development/GSoC/Ideas" target="_blank">Ideas page</a> and hurry up with applying.</p>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-67603645981510614592014-01-17T17:48:00.000+01:002014-01-17T17:48:43.366+01:00AbiWord import filter in LibreOffice: another tool for the swiss army knife<p>It all started with an innocent (?) question on 28th of November 2013. 
The inimitable Caolán asked whether anybody had considered writing an import filter for the AbiWord document format. And the distinguished readership of this blog knows well what makes your servant tick. So, that very evening, a skeleton was written and <a href="http://www.freedesktop.org/wiki/Software/libabw/" target="_blank"><code>libabw</code></a>, a library to read the AbiWord file-format, was started. It was pretty exciting to write -- after a host of libraries for file-formats that are not documented anywhere -- a filter for a file-format of our <a href="http://abisource.com" target="_blank">cousin</a>. There was a hope that the existence of a reference implementation whose source code is widely accessible would make the endeavour easy. It is undeniable that grepping for values of different enums made the work a bit easier. Nonetheless, a huge part of the work was still figuring out what is permitted in <a href="http://abisource.com" target="_blank">AbiWord</a> and how a change of one parameter affects the rendering of a document. Another thing to find out was how to map the concepts in the ABW files into the <a href="http://libwpd.sf.net" target="_blank"><code>libwpd</code></a> API that is heavily influenced by ODF concepts.</p><p>But the date of the start meant that Christmas soon came and with it a possibility to spend some free time on the library. Eventually it became very usable and the import filter made it -- as a late feature -- into the <a href="http://www.libreoffice.org" target="_blank">LibreOffice 4.2</a> line and will reach users of the upcoming <a href="http://www.libreoffice.org" target="_blank">LibreOffice 4.2.0</a> release.</p><p>The library currently supports both the plain XML ABW files and the gzipped ZABW files. 
The converted features include:</p><ul><li>Tables, including nested tables</li><li>Headers and footers, including different left, right and first page headers/footers</li><li>Footnotes and endnotes</li><li>Multi-column sections</li><li>Embedded images</li></ul><p>And since a picture speaks louder than a hundred words, here are some screenshots:</p><table align="center"><tr><td align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/ABW_In_Abi.png" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_ABW_In_Abi.png"/></a></td><td> </td><td align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/ABW_In_LO.png" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_ABW_In_LO.png"/></a></td></tr><tr><td align="center">A sample ABW file opened<br>in AbiWord</td><td> </td><td align="center">The same ABW file opened<br>in the upcoming LibreOffice 4.2.0</td></tr></table><table align="center"><tr><td align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/ZABW_In_Abi.png" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_ZABW_In_Abi.png"/></a></td><td> </td><td align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/ZABW_In_LO.png" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_ZABW_In_LO.png"/></a></td></tr><tr><td align="center">A sample (zlib compressed) ZABW<br>file opened in AbiWord</td><td> </td><td align="center">The same ZABW file opened<br>in the upcoming LibreOffice 4.2.0</td></tr></table><p>As you can see from the screenshots, the world domination that we are actively seeking has several contenders. But if you believe that we are the closest to its realization, please join the filter-writing fun! 
Show up on the <a href="irc://chat.freenode.net/libreoffice-dev" target="_blank"><code>#libreoffice-dev</code></a> channel at <a href="http://webchat.freenode.net/" target="_blank"><code>irc.freenode.net</code></a>. You are also encouraged to follow my <a href="https://twitter.com/FridrichStrba" target="_blank">Twitter</a> and <a href="https://plus.google.com/108382325637135111255/posts" target="_blank">Google+</a> accounts. And stay tuned for more exciting news in the near future. We can promise you that you will have a lot of fun in the growing community of <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> filter writers.</p>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-47045857578369649092014-01-08T14:34:00.003+01:002014-01-08T14:34:53.320+01:00Thank You!<p>Dear friends!</p><p>From the bottom of my heart I would like to thank you for your support during the past elections for The Document Foundation Board of Directors. Without you my election would never have been possible and I never took it for granted. I am thankful for your trust. You cannot even imagine how happy and grateful I am for your support. Especially at a moment when my relationship with our project is undergoing major changes.</p><p>I pray to be always up to the task to co-guide our project with wisdom and integrity.</p><p>I love you</p><p>Fridrich</p>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-8980882245314332172013-12-10T11:46:00.001+01:002013-12-10T11:46:09.729+01:00Fridrich Štrba, candidate for TDF Board of Directors<p>The time has come when The Document Foundation will elect a new Board of Directors. 
As you might already know, there are many good candidates. And since I clearly think I am the best of them, I am writing this to ask you to vote for me. Some of you might know me a bit already, but it is never bad to present myself.</p><p></p><p>My name is Fridrich Štrba, national of Switzerland and Slovakia, happily married to Susan for more than 12 years and father of 3 wonderful children: Patrick (9), Miriam (6) and Nathanael (3).</p><p></p><p>My story with LibreOffice started around 2004, with its predecessor, OpenOffice.org. I was just trying to contribute to <code>libwpd</code> which is the horse-power of our WordPerfect import and the OpenOffice.org integration was an interesting thing to contribute to. And since then, my love story with our project went through different stages, but we are still together and sometimes even happy.</p><p></p><p>I have been mentoring Google Summer of Code students since 2006 and recently I was co-responsible for several import filters for reverse-engineered formats (i.e. Visio, CorelDraw, MS Publisher). I can frankly say that my development and marketing work around the filters are a huge part of the reason why LibreOffice is called the "Swiss army knife of file-formats". We managed quite recently to bootstrap a vibrant community of filter-writers and the number of supported file-formats will only grow.</p><p></p><p>Between 2007 and 2013, I was highly blessed to be working on LO as my day-job, employed by Novell, then SUSE. Since September 2013, I am again a volunteer like many of you. This newly acquired independence is an advantage. I have no monetary interests of any kind in LibreOffice and, if elected, I will take decisions only and only considering the good of the project as such.</p><p></p><p>The advantage of my election would be that I am part of various native language communities. I speak several languages and can understand the aspirations of the corresponding communities. 
Besides that, I was part of the Membership Committee from 2010 and, during the last year, I was its Chairman. In this capacity, I was able to push forward my vision of a diverse, open and inclusive community that goes beyond personal sympathies or aversions. And this is the vision I desire to pursue if you give me your trust.</p><p></p><p>And since it is written "You don't have because you don't ask", with this message I ask you to cast your vote for me.</p>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-41172588989109512032013-11-22T11:13:00.001+01:002013-11-22T12:40:33.646+01:00The Document Foundation elections: an intimacy between you, your choices and (maybe) the NSA guy<p>As many who follow the LibreOffice mailing lists know, soon we will have the elections for the Board of Directors again. Without doubt, there will be a lot of good candidates and the choice will be difficult. Different competencies, personalities, sensibilities. As many parameters as there could ever be. Nonetheless, there is one parameter that was eliminated from before the first election: the corporate pressure.</p><p>From the very beginning of The Document Foundation, the Steering Committee and the initial Membership Committee knew that while corporations can contribute a lot to open source, they can also in some moments try to use the community bodies for their own interest. That is the reason that all elected bodies of The Document Foundation have the 30 per cent rule, where no more than 30 per cent of any body can have the same affiliation. In the same spirit, the election system was designed in such a way that it is technically impossible for anybody to know how a given member voted. 
From the experience with the "old good times" of OpenOffice.org, it was obvious that corporate influence can do a lot of harm and skew the elections in a considerable way. And even if the rule of 30 per cent is in place, it might be hard for an election officer or for an MC member to stand strong against corporate pressure. And this was the reason why we chose a design that makes it impossible even for the election officer to know whom you voted for. This information is known only to you.</p>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-54474266704951794822013-11-19T15:15:00.001+01:002013-11-19T17:04:10.813+01:00LibreOffice Import filters - what is stewing in the sauce-pan<p>Long time no see, dear friends. But that does not mean that there is nothing to speak about. So, here is a new blog post for those that were wondering what was happening in the reverse-straight engineering partnership.</p><p>After the moments in August and September, where I transitioned from working on LibreOffice to working on SUSE Linux Enterprise and after some breathing pause to give to Caesar (also known as family) what belongs to Caesar, the activity on LibreOffice related stuff restarted in October. Just this time, during nights, weekends and other free time.</p><p align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/keynote_lo.png" alt="Sample Keynote presentation in LibreOffice 4.2" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_keynote_lo.png" /></a></p><p align="center">Sample Keynote presentation in LibreOffice 4.2</p><p>It is with a huge pleasure that I realized that we start to have a vibrant developer community around the libwpd/libwpg family, as well as around Valek's reverse-engineering framework. 
SUSE Hackweek 10 helped me to produce an initial importer for the Freehand file-format. Close to that, David Tardon of RedHat fame added a library to parse Keynote files and a library to convert different e-book file-formats. Laurent Alonso works like a bee on importing Microsoft Works spreadsheets (*.wks). Many exciting things in the pipeline, as you can see.</p><p align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/freehand_lo.png" alt="Sample Freehand drawing in LibreOffice 4.2" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_freehand_lo.png" /></a></p><p align="center">Wireframe of shapes from a sample Freehand drawing in LibreOffice 4.2</p><p>With the extension to presentations and spreadsheets, we decided that the time has come to simply break the super-stable libwpd/libwpg API and take the opportunity to make it even more future-proof and, by the same token, solve some of the API issues that were preventing us from importing correctly several features, most notably the Visio connectors.</p><p><b><code>librevenge</code></b></p><p>We decided to drastically reduce duplication of code and we extracted from <code>libwpd</code>, <code>libwpg</code> and from <code>libetonyek</code> the API classes along with the used types. We created a new library, <code>librevenge</code>, where we also added as sub-libraries the (structured) stream implementations that used to be in <code>libwpd-stream</code>, as well as several classes that the libraries used to copy and paste between them. The structured stream implementations now support both OLE2 and Zip containers and the relevant libraries assume this. 
That means that we will have to eventually extend the <i>WPXSvStream</i> implementation in LibreOffice's "writerperfect" module to cater for Zip too.</p><p>A new sub-library, <code>librevenge-generators</code>, has the simple implementations of the interface classes that we use to convert documents into html, text, or that we use to see the raw API calls for the purpose of regression testing. The exception is the <i>RVNGSVGDrawingGenerator</i> class. In the current stable branches, all of the libraries that convert graphics file-formats contain an SVG generator and they rely on its presence in several cases for things like fills with vector graphics. This class is thus not part of the <code>librevenge-generators</code> library, but of the base <code>librevenge</code>, which is a hard dependency of all of the converter libraries.</p><p><b>RVNGPropertyList</b></p><p>The base type for passing information using the API callbacks is <i>RVNGPropertyList</i>, which was born from <code>libwpd</code>'s <i>WPXPropertyList</i>. We modified the design of this class so that each attribute can have as a value either a simple property or an array of RVNGPropertyList elements. This allows us to do more or less all that JSON is able to do. The API classes are even more flexible and future-proof, since extending the information passed in the different callbacks will not modify function signatures.</p><p><b>Quality improvement</b></p><p>Although the relevant libraries were quite extensively regression-tested in the past, the new <code>librevenge</code> extends the coverage of unit tests. We hope that this helps us to keep under control the basic functionalities without having to use the heavy regression tests on each commit.</p><p>Another effort is to avoid copying huge data structures in the API calls. 
This effort will result in some performance improvements especially if a document contains a lot of shapes that are filled by different bitmap fills.</p><p><b>When will it be ready?</b></p><p>When it is ready! But seriously, we are trying to take our time and get the APIs right. This way we intend to prevent gratuitous breakages of binary compatibility in the future. So, it will not be in LibreOffice 4.2 for sure.</p><p>If this is interesting for you, please drop by at the <a href="irc://chat.freenode.net/libreoffice-dev" target="_blank"><code>#libreoffice-dev</code></a> channel at <a href="http://webchat.freenode.net/" target="_blank"><code>irc.freenode.net</code></a> in order to meet us. We cannot promise you that you will become rich, but we can guarantee you fame and eternal gratitude.</p>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-59799269258850070792013-08-20T15:38:00.003+02:002013-08-20T15:38:23.820+02:00Service announcement: opening of SXW files produced by early versions of wpd2sxw in LibreOffice<p>Just a service announcement for those that might still have SXW files around that were generated from WordPerfect documents by the <code>wpd2sxw</code> tool version 0.6.x or earlier (years 2004 and before). Those files used to open fine in early OpenOffice.org versions, but they miss a crucial element. That is the reason why LibreOffice, the modern successor of OpenOffice.org, will refuse them. Nevertheless, they are not lost!</p><p>LibreOffice development team, in its constant quest of increased user satisfaction, has a workaround for you!</p><p>First grab the <a href="http://people.freedesktop.org/~fridrich/sxw_manifest.zip" target="_blank">zip file with the required manifest</a>. 
Then get the <code>zipmerge</code> tool that comes with <code>libzip</code>, and merge the manifest into the corresponding SXW file. As an example, this command line could work:</p><p><code>for i in <i>&lt;sxw-file-list&gt;</i>; do zipmerge temporary_sxw.sxw /path/to/sxw_manifest.zip "$i" &amp;&amp; mv temporary_sxw.sxw "$i"; done</code></p><p>This way you ensure that if the original SXW file already had a manifest, it will not be overwritten by the one from <code>sxw_manifest.zip</code>, which would not be a desirable outcome. Nonetheless, if you only have to repair one SXW file and you have already checked with a tool like <code>zipinfo</code> that the manifest is missing in it, you can safely use:</p><p><code>zipmerge <i>&lt;original-sxw-file&gt;</i>.sxw /path/to/sxw_manifest.zip</code></p><p>This merges the manifest directly into that file. Naturally, you can merge the manifest from the <code>sxw_manifest.zip</code> into the SXW file using any other zip-manipulation tool you prefer.</p><p>Enjoy and continue using LibreOffice, the free and open source office suite of reference.</p>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-123502097784474242013-08-16T18:57:00.000+02:002013-08-16T18:57:37.431+02:00Extending the Swiss Army knife - an overview about writing of filters for LibreOffice<p>LibreOffice is sometimes regarded as the Swiss army knife when it comes to opening office file-formats. Although it might be a slight exaggeration, it is a point of honour of the development team to try to allow users to load into the suite as many of their documents as possible. Every major release from the first LibreOffice 3.3 came with new and improved import filters, often for file-formats that are under-documented, if any documentation can be found at all. 
In this article, we would like to present the way import filters interface with LibreOffice and give an interested developer a starting point for adding her favourite file-format among those LibreOffice is able to open.</p><p><b>Filters creating documents directly into LibreOffice internal structures</b></p><p>In general, an import filter's task is to parse the foreign document, extract from it useful information, and feed it to the application in a way it can understand. Many internal filters, like the MS Word filter, use a direct way of communicating with LibreOffice. They import the document directly into the internal structures that represent those documents. The advantage of this approach is the lack of intermediary: the document is immediately understood by the application and no additional processing is needed. The disadvantage is that this approach requires an intimate knowledge of the internal structures used and has thus a steep learning curve. The next two types of filters will better suit a developer who does not want to dive too deep into LibreOffice internals, yet wants to get the work done.</p><p><b>OpenDocument format as an interchange format</b></p><p>Who has not heard about OpenDocument? Hardly anybody is unaware of its existence. But it is also a convenient interchange format for filter writers. No need in this case to understand the LibreOffice internals apart from some hundred lines of boilerplate code that are documented in various places. It suffices to read the source document and generate a "flat" OpenDocument representation of it. LibreOffice is able to load this kind of representation as if it was loading an ODF document.</p><p><b>XSLT filters</b></p><p>The easiest way to write a filter for an XML-based file-format is using the XSLT filter dialogue. 
All you need is to have an XSL transform that converts a foreign XML-based file-format to the "flat" ODF, for import filters; and one that converts a corresponding ODF XML to the XML used by the foreign file-format, for export filters. Once those transforms exist, the integration with LibreOffice can be done using the user interface.</p><p align="center"><img src="http://people.freedesktop.org/~fridrich/blogs/xmlfiltersettings1.png" /></p><p align="center">Picture 1</p><p>In the Tools menu, choose XML Filter Settings. You will see listed all the XSLT filters that are already present in your LibreOffice installation along with the information about the application that is supposed to receive the resulting ODF document. The list also shows the direction of the conversion: whether it is an import filter, an export filter, or a filter that can both import and export a foreign file-format.</p><p>If you click on "New", this dialogue will appear.</p><p align="center"><img src="http://people.freedesktop.org/~fridrich/blogs/xmlfiltersettingsnew.png"/></p><p align="center">Picture 2</p><p>In the "General" tab, you will be able to choose the user-visible information about the filter: its name, the application that will receive the converted document (for instance LibreOffice Calc (.ods) for a spreadsheet converted to the OpenDocument Spreadsheet format). This information is also used by LibreOffice to group different types of documents. If you choose presentations in the file-picker and your filter specifies that it is converting into the LibreOffice Impress application, then all files having the file-extension associated with the file-format will be shown in the list.</p><p>In the "Name of file type", you will be able to describe the file-format that your filter will handle and in the "File extension" field, you will need to put a semicolon-separated list of possible extensions for files in the given file-format. 
For instance, files in the Microsoft Excel 2003 XML file-format typically have the extension xml or xls. You can add a comment in the "Comments" field. This last field is optional and you can leave it empty if you desire.</p><p align="center"><img src="http://people.freedesktop.org/~fridrich/blogs/xmlfiltersettingstransform.png"/></p><p align="center">Picture 3</p><p>The next tab is the actual information about the XSL transformations that will do the conversion. The DocType field makes sense principally for import filters. The type detection of the XSLT filters will scan for the string you enter there in the first 4000 bytes of the file. Since the type detection searches for this string only in those first 4000 bytes, it is necessary to ensure that the string one specifies can be found invariably in the very beginning of the file. You can leave the field empty if you desire. Then the type detection will be done purely on the basis of the extension.</p><p>If you are writing an export filter, you will provide in the "XSLT for export" field the transform that will do the conversion from the OpenDocument XML to the file-format for which you write your filter. If this field remains empty, LibreOffice will know that your filter is not an export filter. The same is valid for the "XSLT for import" field. It will contain the path to the XSLT sheet that does the import transformation. Leaving it empty is telling LibreOffice that your filter is not an import filter. There are already several filters bundled with LibreOffice that do conversion only in one direction. For instance, the XHTML filters or the MediaWiki filter are used only to export to the corresponding file-formats.</p><p>You also have the option to specify the default template for filters that import from file-formats that don't carry style information. For instance, the bundled DocBook filter uses a template to specify styles of different outline levels. 
If you don't specify the template, there are two possibilities. Either your transform creates a document with full styles, or you rely on the default styles that LibreOffice uses.</p><p>The check-box "The filter needs XSLT 2.0 processor" is to be checked only if your transforms use features exclusive to XSLT 2.0. It is nevertheless advisable to write XSLT sheets of the 1.0 version. They are much simpler and, because of the performance issues of other XSLT processors out there, LibreOffice uses libxslt under the hood. The fact that libxslt has only limited support of the 2.0 features is largely offset by the performance improvement that its use brought.</p><p>Now you are done with the integration of your filter. The dialogue in Picture 1 allows you to test your transforms, and even to export your filter as an extension package and deploy it on different installations of LibreOffice or to distribute it over our extension web-site <a href="http://extensions.libreoffice.org" target="_blank">http://extensions.libreoffice.org</a>.</p><p>As you can see, the integration of an XSLT-based filter into LibreOffice is rather simple. That is the biggest advantage of this approach. Nevertheless, there are also some disadvantages. Despite the migration of the XSLT engine to the relatively fast libxslt, the use of XSL transforms on large documents can be relatively slow. Another disadvantage is that the transforms are not really good at converting documents where the concepts of the source and target file-formats cannot be easily mapped.</p><p><b>XFilter framework</b></p><p>The XFilter framework is the other way to integrate import filters with LibreOffice. In fact, the XSLT-based filters described above use an intermediary layer that uses this framework too. 
The advantage of using the XFilter framework directly is the use of higher-level programming languages that allow much easier mapping of incompatible concepts, parsing of documents in several passes, as well as much more complex processing of gathered information. Moreover, this is the way to go if you need to write a filter for a file-format that is not XML-based, since the XSLT-based filters cannot be used to convert binary document file-formats.</p><p>The use of the XFilter framework is a bit more complicated than the use of the XSLT-based filter dialogue. Nevertheless, it is far from being rocket science. We will examine the steps needed for a typical import filter using the example of the recently added Microsoft Publisher filter in LibreOffice 4. For the sake of simplicity, we first start with the configuration files. You will need to craft two XML fragments, one for the filter description and one for the file-type.</p><p>Filter description:</p><p><code><node <font color="blue">oor:name=</font><font color="red">"Publisher Document"</font> <font color="blue">oor:op=</font><font color="red">"replace"</font>><br> <prop <font color="blue">oor:name=</font><font color="red">"Flags"</font>><br> <value>IMPORT ALIEN USESOPTIONS 3RDPARTYFILTER PREFERRED</value><br> </prop><br> <prop <font color="blue">oor:name=</font><font color="red">"FilterService"</font>><br> <value>com.sun.star.comp.Draw.MSPUBImportFilter</value><br> </prop><br> <prop <font color="blue">oor:name=</font><font color="red">"UIName"</font>><br> <value <font color="blue">xml:lang=</font><font color="red">"x-default"</font>>Microsoft Publisher <font color="green">97-2010</font></value><br> </prop><br> <prop <font color="blue">oor:name=</font><font color="red">"FileFormatVersion"</font>><br> <value><font color="green">0</font></value><br> </prop><br> <prop <font color="blue">oor:name=</font><font color="red">"Type"</font>><br> <value>draw_Publisher_Document</value><br> </prop><br> <prop <font 
color="blue">oor:name=</font><font color="red">"DocumentService"</font>><br> <value>com.sun.star.drawing.DrawingDocument</value><br> </prop><br></node></code></p><p>The <code><font color="blue">oor:name</font></code> attribute gives the name of the filter used internally. This name is important because the file-type and a corresponding filter are linked using it. As to the flags, I will mention here only two or three. The others can be used just as they are. The IMPORT flag specifies that we are implementing an import filter. For export filters, the flag is EXPORT and both flags are present for a bi-directional filter. The ALIEN flag indicates that the filter handles a file-format that is non-native from the point of view of LibreOffice. When used with the EXPORT flag, it will trigger a dialogue warning about a possible data loss on export to the given file-format.</p><p>The FilterService property specifies the service that will be used for converting the document. It is necessary that it corresponds exactly to the implementation name of your import filter. Since the filter is a so-called UNO component, it uses Java-like naming. The part com.sun.star.comp.Draw indicates that the filter is a component and converts a drawing, and MSPUBImportFilter is the actual name of the filter. The UIName indicates a name that will appear in the file-selection dialogue for file-formats where none of the typedetections is able to detect them. The DocumentService property specifies which service will receive the result of the conversion. Here we are converting the Microsoft Publisher files into LibreOffice Draw as a drawing, which is why the document service will be the <code>com.sun.star.drawing.DrawingDocument</code>. If we were converting a text document, the document service would be the <code>com.sun.star.text.TextDocument</code>.</p><p>The Type property specifies the file type that the filter handles. 
This value is important because it must correspond to the oor:name attribute of the corresponding file-type description. It is necessary that the name of the file-type starts with the indication of the receiving application. Here we use the <code><font color="red">draw_Publisher_Document</font></code> and for instance for the WordPerfect file-format, we use in LibreOffice the <code><font color="red">writer_WordPerfect_Document</font></code>. But let's take this opportunity to have a look at the second XML fragment, the file-type one. Here is one that corresponds to our example:</p><p><code><node <font color="blue">oor:name=</font><font color="red">"draw_Publisher_Document"</font> <font color="blue">oor:op=</font><font color="red">"replace"</font>><br> <prop <font color="blue">oor:name=</font><font color="red">"DetectService"</font>><br> <value>com.sun.star.comp.Draw.MSPUBImportFilter</value><br> </prop><br> <prop <font color="blue">oor:name=</font><font color="red">"Extensions"</font>><br> <value>pub</value><br> </prop><br> <prop <font color="blue">oor:name=</font><font color="red">"MediaType"</font>><br> <value>application/x-mspublisher</value><br> </prop><br> <prop <font color="blue">oor:name=</font><font color="red">"Preferred"</font>><br> <value>true</value><br> </prop><br> <prop <font color="blue">oor:name=</font><font color="red">"PreferredFilter"</font>><br> <value>Publisher Document</value><br> </prop><br> <prop <font color="blue">oor:name=</font><font color="red">"UIName"</font>><br> <value>Microsoft Publisher</value><br> </prop><br></node></code> </p><p>The DetectService specifies a service that is able to determine whether a document is of the given file-format. In our case, the <code>com.sun.star.comp.Draw.MSPUBImportFilter</code> is able to do both the conversion and the type-detection. In the Extensions property, semicolon-separated values indicate possible extensions for files of the given file-format. 
In the case of an export filter, the first extension in the list is used for saving with automatic file-extension enabled. The MediaType property basically specifies the MIME type of the file-format. The other element that links the file-format with the corresponding filter is the PreferredFilter property. LibreOffice will invoke the "Publisher Document" filter to convert the document if the typedetection identifies it as "draw_Publisher_Document". As to the UIName, it specifies the way the document format will be referenced in the list of file-formats in the file-picker.</p><p>Now we have finished crafting the configuration files. It is time to create some boilerplate C++ code. Our filter not only converts from Microsoft Publisher files, but is also able to determine whether a given document is in a file-format it can import. For this purpose, it has to support two services: "com.sun.star.document.ImportFilter" and "com.sun.star.document.ExtendedTypeDetection". If we were implementing an export filter, we would also have to support the service "com.sun.star.document.ExportFilter". Besides the com::sun::star::document::XFilter interface that both are bound to implement, the ExportFilter service must also implement the com::sun::star::document::XExporter interface and the ImportFilter service has to implement the com::sun::star::document::XImporter interface. For initialization, the filter must also implement com::sun::star::lang::XInitialization. And since the filter implements UNO services, it should also implement the com::sun::star::lang::XServiceInfo interface.</p><p>But let us concentrate on the interfaces that are specific to the import filter. The XFilter interface has two functions, filter and cancel. In our example we will implement cancel() as a do-nothing function. 
As for the filter function, it is the one that will do the actual filtering.</p><p><code>sal_Bool SAL_CALL MSPUBImportFilter<font color="darkcyan">::filter</font><font color="green">(</font><font color="blue">const</font> Sequence<font color="darkblue"><</font><font color="darkcyan">PropertyValue</font><font color="darkblue">></font> &aDescriptor<font color="green">)</font> <font color="green">{</font></code></p><p>First, we will have to get the reference to the InputStream that represents the document we want to import. The aDescriptor is a sequence of pairs consisting of the value name and the actual value. The operator>>= will extract the value from the UNO Any (that can contain values of different types) into a variable of the requested type.</p><p><code> sal_Int32 nLength = aDescriptor.<font color="darkcyan">getLength</font><font color="green">()</font><font color="darkcyan">;</font><br> <font color="blue">const</font> PropertyValue *pValue = aDescriptor.<font color="darkcyan">getConstArray</font><font color="green">()</font><font color="darkcyan">;</font><br> OUString sURL<font color="darkcyan">;</font><br> Reference <font color="darkblue"><</font>XInputStream<font color="darkblue">></font> xInputStream<font color="darkcyan">;</font><br> <font color="blue">for</font> <font color="green">(</font>sal_Int32 i <font color="darkblue">=</font> <font color="blue">0</font><font color="darkcyan">;</font> i<font color="darkblue"><</font>nLength<font color="darkcyan">;</font> i<font color="darkblue">++</font><font color="green">)</font><br> <font color="blue">if</font> <font color="green">(</font>pValue<font color="green">[</font>i<font color="green">]</font>.<font color="darkcyan">Name</font> <font color="darkblue">==</font> <font color="red">"InputStream"</font><font color="green">)</font><br> pValue<font color="green">[</font>i<font color="green">]</font>.<font color="darkcyan">Value</font> <font color="darkblue">>>=</font> xInputStream<font 
color="darkcyan">;</font></code></p><p>Next we will have to specify the import service that will receive the converted document in the form of SAX messages. The com.sun.star.comp.Draw.XMLOasisImporter service is a service that receives the OpenDocument Graphics XML.</p><p><code> OUString sXMLImportService <font color="green">(</font><font color="red">"com.sun.star.comp.Draw.XMLOasisImporter"</font><font color="green">)</font><font color="darkcyan">;</font><br> Reference <font color="darkblue"><</font>XDocumentHandler<font color="darkblue">></font> xInternalHandler<font color="green">(</font><br> comphelper<font color="darkcyan">::ComponentContext</font><font color="green">(</font>mxContext<font color="green">)</font>.<font color="darkcyan">createComponent</font><font color="green">(</font>sXMLImportService<font color="green">)</font>,<br> UNO_QUERY<font color="green">)</font><font color="darkcyan">;</font></code></p><p>The XImporter sets up an empty target document for XDocumentHandler to write to.</p><p><code> Reference <font color="darkblue"><</font>XImporter<font color="darkblue">></font> xImporter<font color="green">(</font>xInternalHandler, UNO_QUERY_THROW<font color="green">)</font><font color="darkcyan">;</font><br> xImporter<font color="darkblue">-></font>setTargetDocument<font color="green">(</font>mxDoc<font color="green">)</font><font color="darkcyan">;</font></code></p><p>At this point, there is enough to plug into a filter that will read the xInputStream and write the resulting XML into the xInternalHandler. On success of the filtering operation, the filter function should return true and false on failure. 
After the implementation of this filter function, we will have to implement XImporter's setTargetDocument function.</p><p><code><font color="blue">void</font> SAL_CALL MSPUBImportFilter<font color="darkcyan">::setTargetDocument</font><font color="green">(</font><font color="blue">const</font> Reference <font color="darkblue"><</font>XComponent<font color="darkblue">></font> & xDoc<font color="green">)</font><br><font color="green">{</font><br> mxDoc <font color="darkblue">=</font> xDoc<font color="darkcyan">;</font><br><font color="green">}</font></code></p><p>In our case we just keep the Reference to XComponent in a member variable that we used in the previous snippet to set up an empty target that receives our imported document. And that would be all for the integration of an import filter. For an export filter we would also have to implement XExporter's setSourceDocument, which is basically symmetrical to XImporter's setTargetDocument.</p><p>It is good to note that another way of integrating filters into LibreOffice is to use the com::sun::star::xml::XExportFilter and com::sun::star::xml::XImportFilter interfaces, which are roughly equivalent to the described method. The difference is that the FilterService in the configuration XML file will in this case always be com.sun.star.comp.Writer.XmlFilterAdaptor, and the actual filter component, as well as the target and source services, are specified in the configuration file in the UserData property. But I mention this only in passing, since the method I described in detail is much more generic.</p><p>When we were creating the XML configuration files, we said that the <code>com.sun.star.comp.Draw.MSPUBImportFilter</code> component is also able to do the type-detection. 
For that purpose, it must support the com::sun::star::document::XExtendedFilterDetection interface, and thus its detect function. This function should return the string corresponding to the type name in the configuration file if it detects the document and an empty string for the cases when it is not able to identify the document.</p><p><code>OUString SAL_CALL MSPUBImportFilter<font color="darkcyan">::detect</font><font color="green">(</font>Sequence <font color="darkblue"><</font>PropertyValue<font color="blue">></font> &Descriptor<font color="green">)</font><br><font color="green">{</font><br> OUString sTypeName<font color="darkcyan">;</font><br> sal_Int32 nLength <font color="darkblue">=</font> Descriptor.<font color="darkcyan">getLength</font><font color="green">()</font><font color="darkcyan">;</font><br> sal_Int32 location <font color="darkblue">=</font> nLength<font color="darkcyan">;</font><br> <font color="blue">const</font> PropertyValue *pValue <font color="darkblue">=</font> Descriptor.<font color="darkcyan">getConstArray</font><font color="green">()</font><font color="darkcyan">;</font><br> Reference <font color="darkblue"><</font>XInputStream<font color="blue">></font> xInputStream<font color="darkcyan">;</font><br> <font color="blue">for</font> <font color="green">(</font>sal_Int32 i <font color="darkblue">=</font> <font color="blue">0</font><font color="darkcyan">;</font> i<font color="darkblue"><</font>nLength<font color="darkcyan">;</font> <font color="darkblue">++</font>i<font color="green">)</font><br> <font color="blue">if</font> <font color="green">(</font>pValue<font color="green">[</font>i<font color="green">]</font>.<font color="darkcyan">Name</font> <font color="darkblue">==</font> <font color="red">"TypeName"</font><font color="green">)</font><br> location<font color="darkblue">=</font>i<font color="darkcyan">;</font><br> <font color="blue">else if</font> <font color="green">(</font>pValue<font color="green">[</font>i<font 
color="green">]</font>.<font color="darkcyan">Name</font> <font color="darkblue">==</font> <font color="red">"InputStream"</font><font color="green">)</font><br> pValue<font color="green">[</font>i<font color="green">]</font>.<font color="darkcyan">Value</font> <font color="darkblue">>>=</font> xInputStream<font color="darkcyan">;</font></code></p><p>As in the filter function we need to extract from the sequence the InputStream that we will examine. There is one difference, we will keep the reference of the TypeName property, so that we can fill it with the name of the type in case we detected it. The detect function should fill the variable sTypeName with the right string in case the detection was successful. And it is in this case that we will specify this information to the Descriptor and return the name of the type.</p><p><code> <font color="blue">if</font> <font color="green">(</font>!sTypeName.<font color="darkcyan">isEmpty</font><font color="green">())</font><br> <font color="green">{</font><br> <font color="blue">if</font> <font color="green">(</font>location <font color="darkblue">==</font> Descriptor.<font color="darkcyan">getLength</font><font color="green">())</font><br> <font color="green">{</font><br> Descriptor.<font color="darkcyan">realloc</font><font color="green">(</font>nLength<font color="darkblue">+</font><font color="blue">1</font><font color="green">)</font><font color="darkcyan">;</font><br> Descriptor<font color="green">[</font>location<font color="green">]</font>.<font color="darkcyan">Name</font> <font color="darkblue">=</font> <font color="red">"TypeName"</font><font color="darkcyan">;</font><br> <font color="green">}</font><br> Descriptor<font color="green">[</font>location<font color="green">]</font>.<font color="darkcyan">Value</font> <font color="darkblue"><<=</font> sTypeName<font color="darkcyan">;</font><br> <font color="green">}</font><br> <font color="blue">return</font> sTypeName<font color="darkcyan">;</font><br><font 
color="green">}</font></code></p><p>It would not be true to say that this is all that is needed to integrate a filter into LibreOffice. There are still some ten to fifty lines of code needed for the implementation of the generic UNO boilerplate, an XML file for the UNO component registration during the build, and some makefile changes. Nevertheless, those changes are trivial and can be done by mimicking existing filters like those in the writerperfect module of the LibreOffice code.</p><p><b>Getting involved</b></p><p>Free software is about people, and the LibreOffice project highly values all contributors, regardless of the size of their contribution. The community is thrilled to welcome anybody who wants to lend a hand to make the software better. And why not you? If you think that writing filters for LibreOffice is enough fun for you, there are plenty of dedicated developers ready to help you either on the developer list <a href="mailto:libreoffice@lists.freedesktop.org">libreoffice@lists.freedesktop.org</a> or on IRC at the #libreoffice-dev channel of the Freenode server. Just drop by and we will help you to write your first filter. We guarantee that you will enjoy it and stick with the project.</p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-85381546170508860092013-06-21T11:22:00.001+02:002013-06-21T12:08:59.268+02:00LibreOffice import filter for legacy Mac file-formats - smile and say "mwaw"!<p>Attentive readers of this blog will remember that, besides improvements in the most frequently used file-formats, each major release of <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> adds to the list of document file-formats that are freed from the dungeon of vendor lock-in. 
In a collaboration with <i>re-lab</i>'s Valek Filippov and (then GSoC student and now Lanedo's LibreOffice developer) Eilidh McAdam, LibreOffice 3.5 brought the possibility to open and see <a href="http://fridrich.blogspot.com/2011/11/it-has-been-long-time-since-i-last-time.html" target="_blank">the most commonly used Visio files</a> to the FLOSS world. LibreOffice 3.6 was able to claim <a href="http://fridrich.blogspot.com/2012/04/libreoffice-coreldraw-import-filter.html" target="_blank">the most comprehensive coverage of the CorelDraw file-format</a> with the ability to open even the oldest CorelDraw 1 and 2 files that modern versions of CorelDraw are not able to open any more.</p><p>The latest major release of <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> was also full of goodies. First, the fruitful collaboration of re-lab's Valek Filippov with (then GSoC student and now amazon.com employee) Brennan T. Vincent <a href="http://fridrich.blogspot.ch/2012/06/libreoffice-ms-publisher-import-filter.html" target="_blank">produced the first ever possibility of reading Microsoft Publisher files</a> in the FLOSS world. Second, with the advent of Microsoft Office 2013 and the change in the Visio 2013 file-format, <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> extended the coverage of the Visio file-format to <a href="http://fridrich.blogspot.ch/2012/12/libreoffice-visio-import-filter-20.html" target="_blank">all files any version of Visio ever produced</a>.</p><p>The <a href="http://www.libreoffice.org/" target="_blank">LibreOffice</a> 4.1 release is approaching quickly. And that is excellent news for bad teenage poetry and other literary production from the late 80s and early 90s. With the upcoming new release, <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> extends support for a host of pre-OS X Mac text formats. 
This is a result of a continuous effort to open as many legacy file-formats as possible to our users, and help them to settle for ODF.</p><p>This particular improvement was possible thanks to the integration of <a href="http://sourceforge.net/p/libmwaw/wiki/Home/" target="_blank"><code>libmwaw</code></a> written by <a href="http://www.loria.fr/~alonso/" target="_blank">Laurent Alonso</a>, <a href="http://www.libreoffice.org/" target="_blank">LibreOffice</a> contributor and already co-maintainer of <a href="http://sourceforge.net/projects/libwps/" target="_blank"><code>libwps</code></a> and of the Microsoft Works import filter inside <a href="http://www.libreoffice.org/" target="_blank">LibreOffice</a>. The horsepower doing the conversions, <code>libmwaw</code> is one of the libraries from the <a href="http://libwpd.sf.net" target="_blank"><code>libwpd</code></a> family. In the same way as <a href="http://libwps.sourceforge.net" target="_blank"><code>libwps</code></a>, <code>libmwaw</code> reuses <code>libwpd</code>'s interfaces and the ODF generator classes in <code>libodfgen</code> in order to convert its callbacks into an XML stream in the flat ODF file-format. The import filter lives in the module <code>writerperfect</code>.</p><p>The supported file-formats include Microsoft Word for Mac from versions 1 to 5.1, Mac versions of Microsoft Works, different versions of ClarisWorks and AppleWorks, to name but a few. The list of supported file-formats and of imported features is increasing literally every day. This promises further good news with every minor release of <a href="http://www.libreoffice.org/" target="_blank">LibreOffice</a> 4.1. More teenage poetry and bad literature will be freed from the pit of discontinued software products.</p><p>After having found a way to get screenshots of some sample documents in their respective generating applications, we are able to satisfy those readers who are hungry for pictures. 
First is a <a href="http://people.freedesktop.org/~fridrich/blogs/Business_Letter">sample document</a> in Mac Word 5.1 (1992) file-format opened in the originating application and in the up-coming <a href="http://www.libreoffice.org/" target="_blank">LibreOffice</a> 4.1:</p><p><table align="center"><tr><td><a href="http://people.freedesktop.org/~fridrich/blogs/Business_Letter_Mac_Word_5_1.png" alt="Business Letter in Mac Word 5.1" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_Business_Letter_Mac_Word_5_1.png" /></a></td><td> </td><td><a href="http://people.freedesktop.org/~fridrich/blogs/Business_Letter_lo41.png" alt="Business Letter in LibreOffice 4.1" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_Business_Letter_lo41.png" /></a></td></tr>
</table></p><p>Following is a <a href="http://people.freedesktop.org/~fridrich/blogs/Business_Letter">simple document</a> with picture produced by Write Now 4.1 from about 1993. It demonstrates the reason why <a href="http://www.libreoffice.org/" target="_blank">LibreOffice</a> is frequently called the "Swiss Army knife" of file-formats:</p><p><table align="center"><tr><td><a href="http://people.freedesktop.org/~fridrich/blogs/Invitation_Write_Now_4_0.png" alt="Invitation in Write Now 4.0" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_Invitation_Write_Now_4_0.png" /></a></td><td> </td><td><a href="http://people.freedesktop.org/~fridrich/blogs/Invitation_lo41.png" alt="Invitation in LibreOffice 4.1" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_Invitation_lo41.png" /></a></td></tr>
</table></p><p>Following is an example of conversion of a <a href="http://people.freedesktop.org/~fridrich/blogs/Newsletter">document</a> in MacWrite Pro 1.5 file-format from 1994:</p><p><table align="center"><tr><td><a href="http://people.freedesktop.org/~fridrich/blogs/Newsletter_MacWrite_Pro_1_5.png" alt="Newsletter.back in Mac Write Pro 1.5" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_Newsletter_MacWrite_Pro_1_5.png" /></a></td><td> </td><td><a href="http://people.freedesktop.org/~fridrich/blogs/Newsletter_back_lo41.png" alt="NewsLetter.back in LibreOffice 4.1" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_Newsletter_back_lo41.png" /></a></td></tr>
</table></p><p>And, last but not least is an example of conversion of a <a href="http://people.freedesktop.org/~fridrich/blogs/Teacher_Letterhead">wordprocessing documents</a> in AppleWorks 6.0 from the late 90s. The software was discontinued by Apple with the end-of-life of their PowerPC series. But <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> can resurrect your documents:</p><p><table align="center"><tr><td><a href="http://people.freedesktop.org/~fridrich/blogs/Teacher_Letterhead_AppleWorks6_0.png" alt="Teacher Letterhead in Apple Works 6.0" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_Teacher_Letterhead_AppleWorks6_0.png" /></a></td><td> </td><td><a href="http://people.freedesktop.org/~fridrich/blogs/Teacher_Letterhead_lo41.png" alt="Teacher Letterhead in LibreOffice 4.1" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_Teacher_Letterhead_lo41.png" /></a></td></tr>
</table></p><p>Pretty exciting news! But the most exciting thing is that you can be part of this adventure. Join the fun by <a href="https://www.libreoffice.org/get-help/bug/" target="_blank">submitting bugs</a> or by fixing your personal itches. So, if you want to help, patches can be sent to the <a href="mailto:libreoffice@lists.freedesktop.org"><code>libreoffice-dev</code></a> mailing list. And do not forget to find a way to join the <a href="irc://chat.freenode.net/libreoffice-dev" target="_blank"><code>#libreoffice-dev</code></a> channel at <a href="http://webchat.freenode.net/" target="_blank"><code>irc.freenode.net</code></a> in order to meet other developers. We can promise you that you will have a lot of fun in the <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> community.</p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-49179265593060078672013-05-28T09:18:00.000+02:002013-05-28T18:08:56.167+02:00LibreOffice Google Summer of Code 2013 - selected projects<p>It is once again the time of the year when the results of the <a href="https://developers.google.com/open-source/soc/" target="_blank">Google Summer of Code</a> selection are public. As for the <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> project, we got 13 slots this year. We love you, Google! We really do!</p><p>Nonetheless, we had many more good applications than slots, and we had to make hard choices based on a variety of parameters. And the final line-up that came out is:</p><p><table><tr><td width="70%"><b>Project</b></td><td width="5%"> </td><td width="25%"><b>Student</b></td></tr>
<tr><td>Adding alternating row coloring to database ranges and supporting new structured reference syntax</td><td> </td><td>she91</td></tr>
<tr><td>Code completion in the Basic IDE</td><td> </td><td>stalker08</td></tr>
<tr><td>Extend support for Document Management Systems</td><td> </td><td>Cuong Cao Ngo</td></tr>
<tr><td>Implement Firebird SQL connector for LibreOffice Base</td><td> </td><td>Andrzej Hunt</td></tr>
<tr><td>Implementing an about:config functionality</td><td> </td><td>Efe Gürkan YALAMAN</td></tr>
<tr><td>Implementing Proper Table Styles in Writer</td><td> </td><td>Ivan Nicolae-Alexandru</td></tr>
<tr><td>Impress Remote Control for iOS</td><td> </td><td>LIU Siqi</td></tr>
<tr><td>Improve toolbars in LibreOffice</td><td> </td><td>Prashant Pandey</td></tr>
<tr><td>Improved Android / Impress Remote Control</td><td> </td><td>Artur Dryomov</td></tr>
<tr><td>Slide Layout Extendibility</td><td> </td><td>Vishv Brahmbhatt</td></tr>
<tr><td>Use Widget Layout for the Start Center</td><td> </td><td>Krisztian Pinter</td></tr>
<tr><td>VLC integration into LibreOffice</td><td> </td><td>Minh Ngo</td></tr>
<tr><td>Writer: Border around characters</td><td> </td><td>Zolnai Tamás</td></tr>
</table></p><p>Congratulations to the selected students. We expect you to be bonding hard during the community bonding period that just started. Your presence on IRC, and even a start on the hacking, is required now!</p><p>For the students that unfortunately could not be selected, do not be discouraged. Your <a href="https://wiki.documentfoundation.org/Development/Easy_Hacks" target="_blank">Easy Hack</a> patches made a real difference; sorry it did not work out this time. The <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> community is always welcoming and you can learn a lot just by staying around and working at your pace on your chosen <a href="https://wiki.documentfoundation.org/Development/Easy_Hacks" target="_blank">Easy Hack</a>.</p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-38854745621046895702012-12-07T17:19:00.001+01:002012-12-12T22:50:27.023+01:00LibreOffice Visio Import Filter: 20 years of drawings opened in your favourite office suiteIt is true that the support of the most used Microsoft Visio file formats in <a href="http://www.libreoffice.org/" target="_blank">LibreOffice</a> will celebrate its first birthday next February. And I will gladly have a birthday talk with any of you who will be freezing in Brussels during the next <a href="http://fosdem.org/" target="_blank">FOSDEM 2013</a>. Nonetheless, even though libvisio had already been in development for several months, the Visio story was far from finished when we released that day. As I already mentioned in <a href="http://fridrich.blogspot.ch/2012/11/libreoffice-coreldraw-import-filter.html" target="_blank">another blogpost</a> concerning reverse-engineering of file formats, assessing conversion quality in cases like this is illusory until real users get to stress-test the filter with real-life documents.<br />
Since the first release of our filter in <a href="http://www.libreoffice.org/" target="_blank">LibreOffice</a> 3.5.0, we have been improving it thanks to bug reports from our users. I would like to say a big <b>thank you</b> to all those who took the trouble to submit reports in our Bugzilla. Without you, guys, this filter would be only a moot exercise.<br />
But wait... Do I write this blog now only to thank the people who contributed to the current quality of the filter? Yes to a big extent! Nevertheless, I know that the distinguished readers of this blog would like to have some news. And, yes, we have some news.<br />
The <a href="http://www.freedesktop.org/wiki/Software/libvisio" target="_blank"><code>libvisio</code></a> library underwent heavy re-factoring as we started to understand more and more details about the underlying file-format.<br />
<ol><li>A particular bug report about files imported as empty pages provided us with a document structure that we had never seen before. This resulted in a more generic parser and a unified way of parsing master shapes and visible pages.</li>
<li>This re-factoring in its turn allowed us to extend our file-format coverage to all earlier binary Visio file-format versions. We now support all binary Visio documents starting from Visio 1 (released in 1992).</li>
<li>Extending the support to earlier file-format versions allowed us to better understand the development of the file-format, to find more information that we did not parse before, and improve the conversion quality for other binary versions too.</li>
<li>Another re-factoring came with our work to support the <code>XML</code>-based Visio file-formats, namely the "XML Drawing" also known as <code>*.vdx</code>; and the Microsoft Visio 2013 new file-format, known as <code>*.vsdx</code>.</li>
</ol>So the news is that <a href="http://www.libreoffice.org/" target="_blank">LibreOffice</a> 4.0.0 will be able to open <b>ALL</b> Visio files, from Visio 1 (released in 1992) up to Microsoft Visio 2013 (released just a few weeks ago).<br />
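Supporting all of these families means that an importer first has to work out which kind of container it has been handed. The Python sketch below illustrates one way to do that by peeking at the first bytes of a file. It is only an illustration under stated assumptions, not how <code>libvisio</code> itself detects formats: the name <code>guess_container</code> is hypothetical, the sketch assumes that the later binary files are wrapped in an OLE2 compound document, that <code>*.vsdx</code> is a ZIP package and that <code>*.vdx</code> is plain XML, and anything else (possibly including the very oldest binary versions) simply falls through to <code>unknown</code>.<br />

```python
# Generic container signatures; none of them is specific to Visio.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound document
ZIP_MAGIC = b"PK\x03\x04"                       # ZIP local file header


def guess_container(header: bytes) -> str:
    """Guess the outer container from the first bytes of a file.

    Hypothetical helper for illustration; returns one of
    "ole2", "zip", "xml" or "unknown".
    """
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    # Skip an optional UTF-8 byte order mark and leading whitespace
    # before checking for the start of an XML document.
    stripped = header.lstrip(b"\xef\xbb\xbf \t\r\n")
    if stripped.startswith(b"<"):
        return "xml"
    return "unknown"


if __name__ == "__main__":
    import sys

    # Usage: python guess_container.py drawing.vsd
    with open(sys.argv[1], "rb") as handle:
        print(guess_container(handle.read(64)))
```

A real importer would go on to inspect what is inside the container, since many unrelated formats share these same signatures.<br />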
And since the readers of this blog are more interested in pictures than in pointless words, here come some candies for your eyes:<br />
<table align="center"><tbody>
<tr><td>File opened in Visio 1.0</td><td> </td><td>The same file opened in LibreOffice 4.0.0 beta1</td></tr>
<tr><td align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/Visio_Flowchart_vsd.png" target="_blank"><img alt="File in Visio 1.0" src="http://people.freedesktop.org/~fridrich/blogs/thumb_Visio_Flowchart_vsd.png" /></a></td><td> </td><td align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/Draw_FlowChart_vsd.png" target="_blank"><img alt="File in LibreOffice 4.0.0 beta1" src="http://people.freedesktop.org/~fridrich/blogs/thumb_Draw_FlowChart_vsd.png" /></a></td></tr>
</tbody></table><br />
<table align="center"><tbody>
<tr><td>VSDX File opened in Microsoft Visio 2013</td><td> </td><td>The same file opened in LibreOffice 4.0.0 beta1</td></tr>
<tr><td align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/Visio_Rack_vsdx.png" target="_blank"><img alt="VSDX File in Microsoft Visio 2013" src="http://people.freedesktop.org/~fridrich/blogs/thumb_Visio_Rack_vsdx.png" /></a></td><td> </td><td align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/Draw_Rack_vsdx.png" target="_blank"><img alt="File in LibreOffice 4.0.0 beta1" src="http://people.freedesktop.org/~fridrich/blogs/thumb_Draw_Rack_vsdx.png" /></a></td></tr>
</tbody></table>So, download the <a href="http://www.libreoffice.org/" target="_blank">LibreOffice</a> 4.0.0 beta1 and help us test the new major release. We are interested in bug reports that help us to improve our quality. And for those who would love to support us with donations, just click here:<br />
<div align="center"><a href="http://donate.libreoffice.org/" target="_blank"><img alt="Donate for LibreOffice" src="http://www.libreoffice.org/assets/Donations/LibOWebsiteBannersDonateEN400b.png" /></a></div><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-35156615255188959972012-11-26T15:57:00.001+01:002012-11-26T16:34:40.121+01:00LibreOffice CorelDraw import filter: improvements by user input<p>It has been a long time without communicating with the distinguished readership of my blog. There was a hard decision to be made between producing code and producing literature. The code won until now. But now I have found a time to lift my head up from the coding, so the literature is back.</p><p>Many of you might be wondering what happened since my <a href="http://fridrich.blogspot.ch/2012/06/libreoffice-coreldraw-import-filter.html" target="_blank">post about the text support in CorelDraw files from last June</a>. Things are going pretty well. Since the CorelDraw import filter was released with <a href="http://www.libreoffice.org" target="_blank">LibreOffice 3.6</a>, the users started to use the feature and report bugs. We were working on fixing them and improving the <a href="http://www.freedesktop.org/wiki/Software/libcdr" target="_blank"><code>libcdr</code>'s</a> quality.</p><p><b>Quick overview of reverse-engineering process</b></p><p>From my discussions with our users and developers on-line and during some of the conferences that I attended, I realize that there is a slight misunderstanding in the large public about how the reverse-engineering works. So, here are some thoughts that may help understand it a bit more:</p><p>At the beginning of the process, there is a file-format. We don't know anything about its internal structure. There is no documentation whatsoever about it. 
One tries to generate a file in this file-format and examine it in a hexadecimal viewer. Next, one tries to make some small change in the document and examine what changed in the file itself. Eventually, after many iterations, one might find regularities and some structure that helps to divide the file into several sections or blocks of more manageable size. It is essential in this phase that one can encode this information into some kind of introspection tool, since a plain hexadecimal viewer is not a very productive tool in the long run. For introspection of documents we use <a href="http://bugware.livejournal.com/" target="_blank">Valek Filippov</a>'s <a href="http://libregraphicsworld.org/blog/entry/whats-new-with-re-lab-and-ole-toy" target="_blank"><code>oletoy</code></a>, a Python tool that stores our knowledge about the structure of different file-formats.</p><p>Once there is enough information about how to parse the document structure, the next goal is to get some visible results. In order to get visible results quickly, libraries such as <code>libcdr</code> or <code>libvisio</code> use the <a href="http://libwpg.git.sourceforge.net/git/gitweb.cgi?p=libwpg/libwpg;a=blob_plain;f=src/lib/WPGPaintInterface.h;hb=refs/heads/STABLE-0-2-0" target="_blank"><code>libwpg</code> interface</a>. Reusing this interface means a considerable saving of time, since there are already working generators of ODG and SVG from the callbacks of this interface. Having visible results early in the development/reverse-engineering cycle also allows us to visually assess the import results and correct them if necessary. Eventually, one can realize that some necessary information is missing and go back to reverse-engineering to find it.</p><p><b>Users' feedback is essential</b></p><p>The support of reverse-engineered file-formats is a constant work-in-progress. A subtle dance between implementation and information digging.
In this process, user feedback is an essential element. The theories about the meaning of some information inside a file hold only until a file comes to falsify them. Even a complex file generated by a developer is easily beaten by real-life documents. And each file that shows a "weird" bug advances the understanding of the file-format. Let us look at this example:</p><p>After the release of <a href="http://www.libreoffice.org" target="_blank">LibreOffice 3.6.1</a>, we got a not so good assessment of the quality of the CorelDraw import filter in the <a href="http://www.heise.de/ct/inhalt/2012/20/76/">heise.de's c't</a> review. Those of you who understand German can delight in the nuanced evaluation:</p><blockquote><p>Ein neuer Import-Filter in Draw öffnet jetzt auch CorelDraw-Dateien, was uns im Test allerdings nur mit sehr einfachen Zeichnungen fehlerfrei gelang. In dieser Form ist er schlicht unbrauchbar.</p></blockquote><p>Which can be mildly translated into English (given the understatements so common in en-GB):</p><blockquote><p>A new import filter in Draw now also opens CorelDraw files, which we managed to do without errors only with very simple drawings. In this form, it is rather unusable.</p></blockquote><p>Since we are really concerned about the quality of our software, we are thankful for any bug report, whether it is brought to us in a friendly or other manner. This specific bug report helped us to understand how chains of matrix transforms are stored in newer CorelDraw files. And since a picture speaks louder than a thousand words, compare the document c't was referring to opened in LibreOffice 3.6.2 and then in LibreOffice 3.6.3, after we fixed the position bits.</p><table align="center"><tr><td>File opened in LibreOffice 3.6.2</td><td> </td><td>The same file opened in LibreOffice 3.6.3</td></tr>
<tr><td align="center" ><a href="http://people.freedesktop.org/~fridrich/blogs/heise-avant.png" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_heise-avant.png" alt="File in LibreOffice 3.6.2" /></a></td><td> </td><td align="center" ><a href="http://people.freedesktop.org/~fridrich/blogs/heise-apres.png" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_heise-apres.png" alt="File in LibreOffice 3.6.3"/></a></td></tr>
</table><p>So feel encouraged to submit bugs against the CorelDraw import filter, or — even better — send us patches for your favorite itch.</p><br />
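<p>For readers curious about the change-and-compare step from the overview earlier in this post, here is a minimal Python sketch of it: save a document, make one small change, save it again, and list the byte ranges that differ. It is not part of <code>oletoy</code> or <code>libcdr</code>; the names <code>diff_ranges</code> and <code>diff_files</code> are made up for this illustration, and a plain offset-by-offset comparison like this is only useful as long as the change does not shift the rest of the file.</p>

```python
from pathlib import Path


def diff_ranges(before: bytes, after: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) offset ranges where the buffers differ."""
    ranges = []
    start = None
    common = min(len(before), len(after))
    for offset in range(common):
        if before[offset] != after[offset]:
            if start is None:
                start = offset          # a new differing run begins here
        elif start is not None:
            ranges.append((start, offset))
            start = None
    if start is not None:
        ranges.append((start, common))  # run reaches the end of the overlap
    if len(before) != len(after):
        # Whatever lies beyond the shorter buffer counts as one more range.
        ranges.append((common, max(len(before), len(after))))
    return ranges


def diff_files(path_before: str, path_after: str) -> list[tuple[int, int]]:
    """Compare two saved versions of the same document on disk."""
    return diff_ranges(Path(path_before).read_bytes(),
                       Path(path_after).read_bytes())


if __name__ == "__main__":
    import sys

    # Usage: python bytediff.py before.cdr after.cdr
    for start, end in diff_files(sys.argv[1], sys.argv[2]):
        print(f"0x{start:08x}-0x{end:08x} ({end - start} bytes)")
```

<p>Once the file has been divided into blocks, the same comparison is far more telling when run block by block, which is one reason why a dedicated introspection tool beats a hexadecimal viewer in the long run.</p>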
<div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-63456147472918794792012-07-02T10:52:00.001+02:002012-07-02T10:52:39.867+02:00Susan's Book on Intellectual Property and Access to Education<p>I am happy to announce <a href="http://www.brill.nl/international-copyright-law-and-access-education-developing-countries" target="_blank">the upcoming book of my dear wife</a>. A must read for all interested in intellectual property, in access to copyrighted materials and in development issues.</p><p>This book originates from a PhD thesis defended at the <a href="http://graduateinstitute.ch/" target="_blank">Graduate Institute of International and Development Studies, Geneva, Switzerland</a>. It has been awarded "summa cum laude" mention.</p><p>Check, please, with your libraries whether they know about the book and advise them strongly to purchase it for the biggest good of the humanity :)</p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-56550316581572146972012-06-12T10:01:00.000+02:002012-06-12T10:01:00.451+02:00LibreOffice CorelDraw Import filter - text support hatches out<p>Uff, it is done!!!</p><p>We started to work on the text support inside <a href="http://www.freedesktop.org/wiki/Software/libcdr" target="_blank"><code>libcdr</code></a> already before the <a href="http://libre-graphics-meeting.org/2012/" target="_blank">Libre Graphics Meeting in Vienna</a>. 
We worked hard during the talks and the long evenings after having eaten some portions of <i>Wienerschnitzl</i>.</p><p>Now we are proud to announce that we managed to release yesterday <a href="http://cgit.freedesktop.org/libreoffice/libcdr/" target="_blank"><code>libcdr-0.0.8</code></a> with "basic initial primitive [u]ncomplete" (further BIPU) <b>text support</b>. At the moment, we are supporting only a couple of parameters as a font face and font size and we are able to detect the encoding and produce a corresponding utf-8 string. Far from being perfect, it is nonetheless a milestone, because in the FOSS world, there was no support for CorelDraw text before.</p><p>We know that you prefer to look at nice pictures instead of reading bad text. So, this gives your heart's desires.</p><p>A simple document with text in CorelDraw 7:<p><p align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/fancytext_cdr7.png" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_fancytext_cdr7.png" alt="fancytext_cdr7.cdr in CorelDraw 7"/></a></p><p>The same document opened in a build of <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> from yesterday:</p><p align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/fancytext_draw.png" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_fancytext_draw.png" alt="fancytext_cdr7.cdr in CorelDraw 7"/></a></p><p>At the moment, <a href="http://www.freedesktop.org/wiki/Software/libcdr" target="_blank"><code>libcdr</code></a> is able to convert text in CorelDraw documents from versions 7 to 16. Nonetheless, we know already roughly how to read it in files of lower versions and we will add the support for next release. In the same way, we will extend our support of other text properties, like font colour, transparency, effects, paragraph alignments, character positions, etc.</p><p>How can I test it? 
All this goodness will be part of <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> 3.6.0 release. You will be able to test the text support in the 3.6.0 beta2 pre-release. For the brave, any of the <a href="http://dev-builds.libreoffice.org/daily" target="_blank">daily builds</a> that are built from a code checkout after June 11th also include <a href="http://cgit.freedesktop.org/libreoffice/libcdr/" target="_blank"><code>libcdr-0.0.8</code></a> and thus the text support in CorelDraw files.</p><p>As usual, this is a free and open source software project and, as such, it delights in developers that want to help. So, if you feel the itch, patches can be sent to <a href="mailto:libreoffice@lists.freedesktop.org"><code>libreoffice-dev</code></a> mailing list. And, do not forget to find a way to join the <a href="irc://chat.freenode.net/libreoffice-dev" target="_blank"><code>#libreoffice-dev</code></a> channel at <a href="http://webchat.freenode.net/" target="_blank"><code>irc.freenode.net</code></a> in order to meet other developers. We can promis you that you will feel at home in the <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> community.</p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-64697306657219060542012-06-06T15:23:00.000+02:002012-06-12T20:06:55.153+02:00LibreOffice MS Publisher Import filter - young but strong baby<p>As <a href="http://sophiegautier.com/blog/index.php/2012/06/05/157-une-nouvelle-branche-de-printemps-la-36" target="_blank">Sophie Gauthier</a> announced in the language of Voltaire, <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> was branched for the beta phase in view of the 3.6 release. This is a major step in order to bring the features we were working on during the last half a year to the end users. 
But, it is also oportunity to bring to the main codebase all the nifty nice features that were developed in feature branches and targeted for the next big release, presumably the 3.7.</p><p>It is this way that the first version of our new Microsoft Publisher import filter landed to the master. This filter is developed by Brennan Vincent from the University of Arizona in the frame of the <a href="http://www.google-melange.com/gsoc/org/google/gsoc2012/libreoffice" target="_blank">Google Summer of Code</a>. Although being a work in progress and supporting for the while only the Publisher 2003 file-format, the progress is spectacular. Brennan has been busy like a bee even long before the start of the program. After only two weeks from the official kick-off, we have a first (non-)release, <a href="http://dev-www.libreoffice.org/src/libmspub-0.0.0.tar.gz" target="_blank">libmspub-0.0.0</a>.</p><p>And as the careful readers of this blog already know, an image speaks louder then thousand words, here are the pics:</p><p>A random document from the Internet opened in Microsoft Publisher 2003:</p><p align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/mspub_russian_pub2k3.png" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_mspub_russian_pub2k3.png" alt="Document in Publisher 2003"/></a></p><p>The same document opened in LibreOffice master build from yesterday:</p><p align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/mspub_russian_draw.png" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_mspub_russian_draw.png" alt="The same document in LibreOffice Draw"/></a></p><p>With <a href="http://bugware.livejournal.com/" target="_blank">Valek Filippov</a>, we have a lot of fun mentoring this project. 
If anybody of the distinguished readership wants to join this effort, the code of <code>libmspub</code> lives in <a href="http://cgit.freedesktop.org/libreoffice/libmspub" target="_blank">LibreOffice freedesktop.org repository</a>. The patches can be sent to <a href="mailto:libreoffice@lists.freedesktop.org"><code>libreoffice-dev</code></a> mailing list. And, do not forget to find a way to join the <a href="irc://chat.freenode.net/libreoffice-dev"><code>#libreoffice-dev</code></a> channel at <a href="http://webchat.freenode.net/"><code>irc.freenode.net</code></a> in order to meet other developers.</p><p>You will never regret the decision to get involved in <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a>.</p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-22114867229227632422012-04-23T21:57:00.001+02:002012-04-23T22:05:31.258+02:00Google Summer of Code 2012 - accepted projects for LibreOffice<p>Google announced today the accepted students for Google Summer of Code 2011.</p><p>The students working on LibreOffice will be:</p><table><tr><td align="left">Student</td><td align="left">Title</td><td align="left">Mentor</td></tr><tr><td align="left">Andrzej Hunt</td><td align="left">Smartphone remote control for LibreOffice Impress</td><td align="left">Muthu Subramanian</td></tr><tr><td align="left">ArturoPL</td><td align="left">Tooling - More and better tests </td><td align="left">Michael Stahl</td></tr><tr><td align="left">Brennan Vincent</td><td align="left">Implementing a Microsoft Publisher import filter for LibreOffice</td><td align="left">Valek Filippov</td></tr><tr><td align="left">Daniel Bankston</td><td align="left">Calc Performance Improvements</td><td align="left">Kohei Yoshida</td></tr><tr><td align="left">Daniel Korostil</td><td align="left">Lightproof 
improvements</td><td align="left">László Németh</td></tr><tr><td align="left">Gökcen Eraslan</td><td align="left">Signed PDF export</td><td align="left">Stephan Bergmann</td></tr><tr><td align="left">iainb</td><td align="left">Java GUI for Libre-Office Based Android App(s)</td><td align="left">Tor Lillqvist</td></tr><tr><td align="left">Marco Cecchetti</td><td align="left">Enhanced Impress svg export filter</td><td align="left">Thorsten Behrens</td></tr><tr><td align="left">Matúš Kukan</td><td align="left">Telepathy for collaboration</td><td align="left">Eike Rathke (erAck)</td></tr><tr><td align="left">Rafael</td><td align="left">New templates picking UI</td><td align="left">Cédric Bosdonnat</td></tr></table><p>Let the summer start immediately and let quality code fall like a spring rain!</p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-14620923965520617222012-04-02T17:37:00.001+02:002012-04-03T13:55:22.208+02:00LibreOffice CorelDraw Import filter - the best file-format coverage in the FOSS world<p>I just realized that has been a long long time since I last <a href="http://fridrich.blogspot.com/2012/01/libreoffice-coreldraw-import-filter.html" target="_blank">blogged</a> about <a href="http://cgit.freedesktop.org/libreoffice/libcdr/" target="_blank"><code>libcdr</code></a> and the CorelDraw import filter in <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a>. Those that know me well can imagine that it is much more fun to write code then to write blogs. Nonetheless, one serious breakthrough happened this weekend and I cannot prevent myself from climbing on the roofs and shout.</p><p>On 20<sup>th</sup> of March 2012, <a href="http://www.corel.com/corel/" target="_blank">Corel</a> released a new version of CorelDraw Graphics Suite X6. 
We got the information from this <a href="http://en.wikipedia.org/wiki/CDR_%28file_format%29#CDR_file_format" target="_blank">Wikipedia</a> page and downloaded the evaluation version on Friday. Although it was usual to see the file-format mutate a bit with every released version, this release changed the file-format substantially in what concerns the <a href="http://en.wikipedia.org/wiki/Resource_Interchange_File_Format" target="_blank">RIFF chunks</a>. To cut the long story short, we managed to get the last pieces reverse-engineered today and we released libcdr-0.0.6 with support of all 32-bit CorelDraw formats, from version 6 to 16.</p><p>The new release tarball was integrated in <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> which became the first and only FOSS application that supports versions 6, 15 and 16 of the CorelDraw file-format. This goodness will be part of our 3.6 release later this year. For those that do not know fear, the feature can be tested in daily builds that will start to appear tomorrow morning <a href="http://dev-builds.libreoffice.org/daily/" target="_blank">here</a>.</p><p>I know that the distinguished readership prefers pictures to words. 
Here is this simple document in CorelDraw X6 format:</p><p align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/terra_v16_1.png" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_terra_v16_1.png" alt="Terra in Corel 1"/></a> <a href="http://people.freedesktop.org/~fridrich/blogs/terra_v16_2.png" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_terra_v16_2.png" alt="Terra in Corel 2"/></a></p><p>Here is the same document opened by LibreOffice Draw:</p><p align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/terra_v16_lodraw.png" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_terra_v16_lodraw.png" alt="Terra in LibreOffice Draw"/></a></p><p>And here is the <code>libcdr</code>-generated SVG opened in Inkscape:</p><p align="center"><a href="http://people.freedesktop.org/~fridrich/blogs/terra_v16_inkscape.png" target="_blank"><img src="http://people.freedesktop.org/~fridrich/blogs/thumb_terra_v16_inkscape.png" alt="Terra in converted to SVG"/></a></p><p>If you are tempted and think that it might be fun to participate in a reverse-engineering endavour, we have with Valek two project proposals for Google Summer of Code 2012. The first is the implementation of <a href="http://wiki.documentfoundation.org/Development/Gsoc/Ideas#Implement_Microsoft_Publisher_Import_filter_for_LibreOffice" target="_blank">MS Publisher import filter for LibreOffice</a> and the second is to help to <a href="http://wiki.documentfoundation.org/Development/Gsoc/Ideas#Improve_and_extend_the_CorelDraw_Import_filter_for_LibreOffice" target="_blank">improve and extend the Corel Draw import filter</a> I am currently blogging about. 
Try to apply with <a href="http://www.google-melange.com/gsoc/org/google/gsoc2012/libreoffice" target="_blank">LibreOffice</a> and your life will never be the same again.</p><p align="center"><a href="http://www.google-melange.com/gsoc/org/google/gsoc2012/libreoffice" target="_blank"><img src="http://code.google.com/images/GSoC2012_300x200.png"/></a></p><p>Be aware though that the application deadline is the 6<sup>th</sup> of April and you will need to accomplish a simple programing task in order to be eligible. More details in this <a href="http://blog.documentfoundation.org/2012/03/26/libreoffice-google-summer-of-code/" target="_blank">blog</a>.</p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-44351781194096342622012-03-19T14:30:00.002+01:002012-03-19T14:32:29.589+01:00Google Summer of Code at LibreOffice: become famous and maybe rich!<p align="center"><a href="http://www.google-melange.com/gsoc/org/google/gsoc2012/libreoffice" target="_blank"><img src="http://code.google.com/images/GSoC2012_300x200.png"/></a></p><p>So, just before the weekend, we received the great news that Google chose <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> as a mentoring organisation for <a href="http://code.google.com/soc/" target="_blank">Google Summer of Code</a> again this year. Some of you might remember that last year we had several extremely successful Google Summer of Code projects and that two of our successful students are currently employed working on free and opensource software as a direct consequence of their participation in the program. 
I had a priviledge to mentor <a href="http://www.derivativezero.com/blog/author/eilidh/" target="_blank">Eilidh McAdam</a> and we implemented a <a href="http://fridrich.blogspot.com/2011/11/it-has-been-long-time-since-i-last-time.html" target="_blank">Visio import filter</a> that is one of the flagship features of LibreOffice 3.5. Eilidh is now employed by <a href="http://www.lanedo.com/" target="_blank">Lanedo</a>.</p><p>This year, we proposed with Valek Filippov two projects related to reverse-engineered file-formats. The first is the implementation of <a href="http://wiki.documentfoundation.org/Development/Gsoc/Ideas#Implement_Microsoft_Publisher_Import_filter_for_LibreOffice" target="_blank">MS Publisher import filter for LibreOffice</a> and the second is to help to <a href="http://wiki.documentfoundation.org/Development/Gsoc/Ideas#Improve_and_extend_the_CorelDraw_Import_filter_for_LibreOffice" target="_blank">improve and extend the Corel Draw import filter</a> that will be part of LibreOffice 3.6 release. Both projects require working knowledge of C++ and a lot of good will. Each of the import filters consists of a standalone library and a glue that plugs the library into LibreOffice. These libraries can be built as system libraries and LibreOffice can use them from the system. The advantage of this approach for a student participating at the development is that there is only a minimum need of recompiling LibreOffice if some substantial part of the glue (that is rather small) changes. Therefore, I encourage all of you who are considering applying with LibreOffice for this year's Google Summer of Code to have a close look at those two projects. 
As a bonus is that if you are successful, you become famous and eventually rich.</p><p>You can have a look at <a href="http://cgit.freedesktop.org/libreoffice/libcdr/" target="_blank"><code>libcdr</code></a>, the horsepower behind the Corel Draw import filter and at the skeleton of <a href="http://cgit.freedesktop.org/~fridrich/libmspub" target="_blank"><code>libmspub</code></a>, that will be the basis of the Publisher import filter. And don't hesitate to <b>become rich and famous with Google Summer of Code at LibreOffice</b></p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-9955963671386804662012-01-31T15:56:00.001+01:002012-01-31T16:03:22.940+01:00LibreOffice CorelDraw Import filter - don't despise the humble beginnings<p>You might still remember <a href="http://fridrich.blogspot.com/2011/11/it-has-been-long-time-since-i-last-time.html" target="_blank">some</a> <a href="http://fridrich.blogspot.com/2011/07/libreoffice-visio-import-filter-round.html" target="_blank">of</a> <a href="http://fridrich.blogspot.com/2011/06/libreoffice-visio-import-filter-shaping.html" target="_blank">my</a> <a href="http://fridrich.blogspot.com/2011/06/libreoffice-visio-import-filter-first.html" target="_blank">blogs</a> about <a href="http://wiki.documentfoundation.org/ReleaseNotes/3.5#Filters" target="_blank">our new and shiny MS Visio import filter</a> in the upcoming <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> 3.5.0.</p><p>But what about 3.6.0? Is it going to be an exciting version too? 
Well, the answer depends on what kind of things excite you generally, but for sure, there will be a lot of goodness as usual to make the best free office suite even better.</p><p>In my free time, I have been working for some time already on the next graphics import filter for <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a>. This time it will be a CorelDraw import filter. The horse-power is a library, <a href="http://cgit.freedesktop.org/libreoffice/libcdr/" target="_blank"><code><b>libcdr</b></code></a>. In the same way as <code>libvisio</code>, <code>libcdr</code> reuses the API of <code>libwpg</code> and thus is easily pluggable into <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> reusing all the ODG generator classes of the current <code>writerperfect</code> module. The importer is currently part of the git master tree.</p><p>You might be already shouting: "Where are the screenshots?" I know that a picture speaks louder then hundred words, and so here you are served:</p><p><a href="http://www.picturestoragebin.com/images/990shapes_coreldraw7.png" target="_blank"><img src="http://www.picturestoragebin.com/images/990shapes_coreldraw7_tn.jpg" alt="Shapes in CorelDraw 7"></a></p>Simple and more complex shapes in CorelDraw 7</p><p><a href="http://www.picturestoragebin.com/images/902shapes_our_draw.png" target="_blank"><img src="http://www.picturestoragebin.com/images/902shapes_our_draw_tn.jpg" alt="Shapes in LibreOffice Draw"></a></p><p>The same shapes imported into LibreOffice Draw.</p><p>As you can see, it is an initial implementation, which cannot but get better. 
If you want to participate in this adventure, you can drop around at our IRC channel <a href="irc://chat.freenode.net/libreoffice-dev"><code>#libreoffice-dev</code></a> channel at <a href="http://webchat.freenode.net/"><code>irc.freenode.net</code></a> where a community of smart and friendly developers can direct you.</p><p>Stay tuned for more nice pictures as this project advances.</p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-1310408486183909672012-01-25T23:19:00.002+01:002012-02-09T16:11:19.051+01:00FOSDEM 2012 - How to make the best of it and become LibreOffice developer<p align="center"><a href="http://www.fosdem.org"><img src="http://www.fosdem.org/promo/going-to" alt="I'm going to FOSDEM, the Free and Open Source Software Developers' European Meeting" /></a></p><p><a href="http://fosdem.org/2012/" target="_blank">FOSDEM 2012</a> is just round the corner and, as you might know, <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> will have a <a href="http://blog.documentfoundation.org/2012/01/24/libreoffice-devroom-at-fosdem-2012-in-brussels/" target="_blank">DevRoom</a> this year too. And, as it was already <a href="http://libregraphicsworld.org/blog/entry/whats-coming-at-fosdem-2012" target="_blank">publicized</a>, your servant and Eilidh McAdam of <a href="http://www.freedesktop.org/wiki/Software/libvisio" target="_blank">libvisio</a> fame will attend too. The goal of this event will be to help you to become a <a href="http://www.libreoffice.org" target="_blank">LibreOffice</a> developer, by helping you to get your first contact with the code from inside.</p><p><b>How to prepare for the event?</b></p><p>In order to give as many community members the possibility to speak, the presentations will not take more then 15 minutes each. 
But we will be there for one-to-one contacts and hacking goodness. If you are interested in contributing to our new Visio import filter, or the upcoming Corel Draw and MS Publisher filters, here is what you can do:</p><ol><li>Find a bug that is bothering you in the current Visio import filter, or some simple feature that the importer currently does not support.</li><li>Check out the following libraries:<ul><li>master branch of libwpd (<code>git clone git://libwpd.git.sourceforge.net/gitroot/libwpd/libwpd</code>)</li><li>STABLE-0-2-0 branch of libwpg (<code>git clone -b STABLE-0-2-0 git://libwpg.git.sourceforge.net/gitroot/libwpg/libwpg</code>)</li><li>master branch of libwps (<code>git clone git://libwps.git.sourceforge.net/gitroot/libwps/libwps</code>)</li><li>master branch of libvisio (<code>git clone git://anongit.freedesktop.org/libreoffice/contrib/libvisio</code>), and</li><li>master branch of libcdr (<code>git clone git://anongit.freedesktop.org/libreoffice/libcdr</code>)</li></ul></li><li>Build them as system libraries and install them in the same order.</li><li>Then build LibreOffice according to <a href="http://wiki.documentfoundation.org/Development#Getting_your_first_build_done" target="_blank">these instructions</a>. <b>The important thing is to use those system libraries that you just built.</b> To do so, make sure you add the following configure flags: <ul><li><code>--with-system-libwpd</code></li><li><code>--with-system-libwpg</code></li><li><code>--with-system-libwps</code></li><li><code>--with-system-libvisio</code></li><li><code>--with-system-libcdr</code></li></ul></li></ol><p>With this kind of build, you will be ready to make the most of your Brussels weekend. 
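</p><p>The checklist above can be condensed into a small dry-run script. Take it as a sketch only: it prints the clone commands in the order listed and assembles the configure flags, but it does not clone or build anything. The function names <code>clone_cmd</code>, <code>system_flags</code> and <code>print_plan</code> are made up for illustration, and the repository URLs are copied verbatim from the list above.</p>

```shell
#!/bin/sh
# Dry-run sketch of steps 2-4: it only prints a plan, it runs nothing.
# Function names are hypothetical; URLs and branches come from the post.

# Libraries in the order the post asks you to build and install them.
LIBS="libwpd libwpg libwps libvisio libcdr"

# Print the clone command for one library, exactly as listed in the post.
clone_cmd() {
    case "$1" in
        libwpd)   echo "git clone git://libwpd.git.sourceforge.net/gitroot/libwpd/libwpd" ;;
        libwpg)   echo "git clone -b STABLE-0-2-0 git://libwpg.git.sourceforge.net/gitroot/libwpg/libwpg" ;;
        libwps)   echo "git clone git://libwps.git.sourceforge.net/gitroot/libwps/libwps" ;;
        libvisio) echo "git clone git://anongit.freedesktop.org/libreoffice/contrib/libvisio" ;;
        libcdr)   echo "git clone git://anongit.freedesktop.org/libreoffice/libcdr" ;;
        *)        return 1 ;;
    esac
}

# Build the space-separated list of --with-system-NAME configure flags.
system_flags() {
    flags=""
    for lib in $LIBS; do
        flags="${flags:+$flags }--with-system-$lib"
    done
    echo "$flags"
}

# Print the whole plan: clone, build and install each library in order,
# then the flags to hand to the LibreOffice configure step.
print_plan() {
    for lib in $LIBS; do
        clone_cmd "$lib"
        echo "# build and install $lib system-wide before moving on"
    done
    echo "# configure flags for the LibreOffice build:"
    system_flags
}

print_plan
```

<p>Running it as-is only prints the plan; the actual build commands for each library are deliberately left out, since they are not spelled out here.</p><p>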
Nevertheless, you can drop by our IRC channel <a href="irc://chat.freenode.net/libreoffice-dev"><code>#libreoffice-dev</code></a> at <a href="http://webchat.freenode.net/"><code>irc.freenode.net</code></a> for more information and ideas.</p><p><b>Starting to do it instead of planning to do it ...</b></p><p>... is the best way to enter FOSS development. That is why your servant and Eilidh will be around to hold your hand with debugging and finding a way to implement your favourite features. We will answer your questions about the library design. We will point you to the place in the code where your bug might linger. And for more complicated stuff, we will debug it with you.</p><p>Don't expect us to give you a fish, but we will certainly teach you how to catch it by yourself. And by the same token, you will become a contributor inside a community of smart people that is fun to hang out and hack with.</p><p>See you in Brussels on the 4<sup>th</sup> and 5<sup>th</sup> of February 2012.</p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-30901818645158587092012-01-02T15:19:00.002+01:002012-01-02T15:19:41.184+01:00Take a decision to enter FOSS in 2012<p>So, the year has changed again, and with it quite often come new decisions. Some swear to work off the superfluous kilos, pounds, or whatever standardized measure your country uses, gained too fast during the festivities. If that is your decision, it is for sure good for your body and I wish you success that goes beyond the act of subscribing to a local gym (and never appearing there after the first month).</p><p>But this could also be a nice time to take a decision that you have been putting off for too long. This one is good for your intellect and programming skills (even if you don't consider yourself a programmer yet). 
What about starting to contribute to a Free and Open Source Software (FOSS) project?</p><p>Sounds interesting? Then I have one for you. It has a big and growing community. It can accommodate all levels of skills. And the impact you will have is multiplied by the wide adoption of the product itself.</p><p>Well, you must have guessed right by now. I am speaking about the <a href="http://libreoffice.org">LibreOffice</a> project, your natural entry point into the marvelous world of FOSS.</p><p>Whether you are an expert or a beginner programmer, or C++ sounds like Traditional Chinese to you, just find a way to join the <a href="irc://chat.freenode.net/libreoffice-dev"><code>#libreoffice-dev</code></a> channel at <a href="http://webchat.freenode.net/"><code>irc.freenode.net</code></a> in order to meet other developers, and visit our <a href="http://wiki.documentfoundation.org/Development/Easy_Hacks">Easy Hacks</a> for ideas on where to start.</p><p>I promise you that a year from now, you will not regret having started. Although it is quite probable that you will shed a tear over an unused year-pass from the local gym.</p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.comtag:blogger.com,1999:blog-13479614.post-21408879628911015592011-11-15T12:34:00.002+01:002011-11-15T15:23:53.730+01:007th ODF Plugfest in Gouda<p>For those who might care, your servant will be attending the <a href="http://www.odfplugfest.org/2011-gouda/" target="_blank">ODF Plugfest #7</a> in Gouda (Netherlands) this week.</p><p>On Friday, I will give a short presentation of the <a href="http://cgit.freedesktop.org/libreoffice/contrib/libvisio/" target="_blank">best free and open source library for parsing Microsoft Visio Documents</a>. 
The other exciting thing is that after more than 6 years of collaboration I will get to meet in person one of my <a href="http://libwpd.sourceforge.net" target="_blank"><code>libwpd</code></a> co-maintainers, <a href="http://abicollab.net/documents/download/462/latest/pdf" target=" _blank">Johannes Marcus Maurer</a>, also known as <a href="http://uwog.net/" target="_blank">"uwog"</a>.</p><p>What an exciting time ahead of us!!!</p><div class="blogger-post-footer"><a rel="me" href="https://tooting.ch/@fridrich">Mastodon</a></div>Fridrich Strbahttp://www.blogger.com/profile/01546456836568779001noreply@blogger.com