Import filters and Features

Fridrich Strba fridrich.strba at graduateinstitute.ch
Tue May 28 03:55:54 PDT 2013


Hello, Tomas,

On 27/05/13 21:49, Tomas Hlavaty wrote:
> it was great talking to you at the Linux Tag in Berlin on Saturday.  I
> hope you don't mind me CCing the mailing list, in case somebody else
> finds it interesting and could provide some info.

It was also a delight to chat with you in rainy Berlin. And a guy who
still writes Lisp in 2013 is a good guy for sure :)

> Inspired by your talk about import filters, I would like to ask what is
> the usual strategy to addressing impedance mismatch between features
> covered in an input file format and LibreOffice internal representation?
> E.g. when a feature is missing in the ODF format?

OK, ODF is not exactly LO's internal representation, there is a subtle
difference there, but that is not the topic of this e-mail, so we will not
lose potential readers by writing an essay on the differences ;)

The filters I was speaking about actually use flat ODF as an
exchange file-format (see Linux Magazine 5/13, page 26 onwards).

Besides what Miklos already answered, there are several potential ways 
to handle the conceptual differences:

1) If the file-format allows it and LO simply does not implement that
part of the file-format, we document it and pray that someone will work
on it. Eventually the prayer is answered or we try to work on it ourselves.

2) If the file-format does not allow that and the internal 
representation does not either, we try to emulate the feature in the 
library. For instance for Visio arrows (line-end markers) we basically 
try to output the paths that ODF uses pre-scaled in the library itself 
according to the line width.

In a similar way one could emulate things like HSK gradients with a
series of linear RGB gradients, chosen so that they look reasonably
similar to the HSK ones.

There are also cases where we just cannot express a feature and we drop
some elements of it, still trying to retain as much information as
possible so that we keep the need for manual tweaking to a minimum.

3) If the internal representation allows something, but the file-format 
does not, we use the ODF extended format where we put our own draft 
features and implement them.
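
To make the gradient emulation from 2) a bit more concrete, here is a
rough Python sketch. It is not code from any of the import libraries,
and since I do not want to guess at the exact colour model, it uses HSV
from Python's standard colorsys module as a stand-in for a hue-based
gradient. The function name and the number of segments are made up for
illustration. The idea is only this: sample the hue-based gradient at a
few offsets and emit RGB stops, so that each piece between two stops is
a plain linear RGB gradient.

```python
import colorsys


def hue_gradient_as_rgb_stops(start_rgb, end_rgb, segments=8):
    """Approximate a gradient interpolated in HSV by linear RGB pieces.

    start_rgb and end_rgb are (r, g, b) tuples with components in 0..1.
    Returns a list of (offset, (r, g, b)) stops; a consumer that only
    knows linear RGB gradients can interpolate between adjacent stops.
    HSV is an assumption here, standing in for the source colour model.
    """
    h0, s0, v0 = colorsys.rgb_to_hsv(*start_rgb)
    h1, s1, v1 = colorsys.rgb_to_hsv(*end_rgb)
    stops = []
    for i in range(segments + 1):
        t = i / segments
        # Interpolate in HSV space (no hue wrap-around handling, to keep
        # the sketch short), then convert the sample back to RGB.
        h = h0 + (h1 - h0) * t
        s = s0 + (s1 - s0) * t
        v = v0 + (v1 - v0) * t
        stops.append((t, colorsys.hsv_to_rgb(h, s, v)))
    return stops


if __name__ == "__main__":
    # Red to blue: the hue-based path passes through green on the way.
    for offset, rgb in hue_gradient_as_rgb_stops((1, 0, 0), (0, 0, 1), 4):
        print(offset, tuple(round(c, 3) for c in rgb))
```

More segments give a closer match at the cost of more stops in the
output, so in practice one would pick the smallest number that still
looks reasonably similar.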

> For example, MS Word have list numbering restarts which ODF doesn't
> have.  In such cases the documents don't import and export correctly.  I
> guess you hit similar issues with your import filters even though they
> were less about word processing and more about picture drawing iirc.  In
> case of my example, I figured out that I can parse the exported DOC file
> (e.g. using <http://logand.com/sw/cl-olefs.html> which I wrote based on
> MS spec and which gives me as a bonus file positions for every data
> record), find out where the numbering restart bit is located in the DOC
> file and set it accordingly.  It would be better though if no such
> postprocessing was needed.  In such case, is it permissible to
> accommodate the "feature" in the internal LibreOffice representation or
> is the internal representation strictly dictated/limited by the ODF
> spec?  If the latter, I guess input filters can never be made to work
> correctly?

OK, it might be slightly naive to think that the fidelity could ever be
100% compared to the original. But yes, we try to evolve the standard to
cover the features used out there.

Btw, I really really really enjoyed our talk and look forward to you
hanging out in #libreoffice-dev at irc.freenode.net :)

Cheers

F.


