[Libreoffice] Testing writerfilter import

Cedric Bosdonnat cedric.bosdonnat.ooo at free.fr
Thu May 19 03:52:41 PDT 2011

Hi Miklos,

On Thu, 2011-05-19 at 12:26 +0200, Miklos Vajna wrote:
> Preparing for the GSoC project, I wanted to decide what is an easy way
> to test an import filter. Given that the new RTF filter will be in
> writerfilter, testing docx import is a good example, I guess.

Sure, though we have no unit testing here.

> So opening a docx document and checking the visual result is one way,
> though in case it goes wrong, I don't think it's helpful. One method is
> to export to odt, AFAIK that's lossless. So here is what I tried (build
> from master, I pulled and did an incremental build today):

As you'll be working on the tokenizer, I think it would be nice to
introduce some kind of token dumper that replaces the dmapper and dumps
what would otherwise go into it. That would give us a way to tell
whether an import problem comes from the tokenizer (specific to each
format) or from the domain mapper (which affects all handled formats).
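
To make the idea more concrete, here is a rough sketch in Python (the
real code would be C++ inside writerfilter). Every name in it
(TokenHandler, DumpingHandler, tokenize_toy_rtf) is hypothetical and not
part of writerfilter, and the toy tokenizer only understands braces,
backslash keywords and plain text. It only illustrates plugging a dumper
in where the domain mapper would normally sit:

```python
from typing import List, Protocol


class TokenHandler(Protocol):
    """Hypothetical interface: whatever the tokenizer feeds its tokens into."""

    def start_group(self) -> None: ...
    def end_group(self) -> None: ...
    def keyword(self, name: str) -> None: ...
    def text(self, chars: str) -> None: ...


class DumpingHandler:
    """Stands in for the domain mapper and records every token it receives."""

    def __init__(self) -> None:
        self.lines: List[str] = []
        self._depth = 0

    def _emit(self, line: str) -> None:
        # Indent by nesting depth so the dump is easy to diff and to read.
        self.lines.append("  " * self._depth + line)

    def start_group(self) -> None:
        self._emit("group {")
        self._depth += 1

    def end_group(self) -> None:
        self._depth -= 1
        self._emit("}")

    def keyword(self, name: str) -> None:
        self._emit("keyword " + name)

    def text(self, chars: str) -> None:
        self._emit("text " + repr(chars))


def tokenize_toy_rtf(source: str, handler: TokenHandler) -> None:
    """Toy tokenizer for a tiny RTF-like subset (assumption, not real RTF)."""
    i = 0
    while i < len(source):
        ch = source[i]
        if ch == "{":
            handler.start_group()
            i += 1
        elif ch == "}":
            handler.end_group()
            i += 1
        elif ch == "\\":
            j = i + 1
            while j < len(source) and source[j].isalnum():
                j += 1
            handler.keyword(source[i + 1:j])
            # A single space after a keyword is a delimiter, not text.
            if j < len(source) and source[j] == " ":
                j += 1
            i = j
        else:
            j = i
            while j < len(source) and source[j] not in "{}\\":
                j += 1
            handler.text(source[i:j])
            i = j


dumper = DumpingHandler()
tokenize_toy_rtf(r"{\rtf1 Hello {\b world}}", dumper)
print("\n".join(dumper.lines))
```

The dump can then be compared against a stored expected dump for each
test document, without involving the domain mapper at all.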

You would then have a much more reliable way to test that your tokenizer
is working... but that wouldn't help with testing the domain mapper. To
test that one, I think conversions like the ones you describe are the
most helpful approach.
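
A conversion-based check could look roughly like the sketch below,
assuming the docx has already been converted to ODT by some other means.
It relies only on the fact that an ODT file is a zip archive containing
a content.xml; the helper names (odt_paragraphs, missing_paragraphs) are
made up for illustration, and comparing paragraph text is of course a
much weaker check than comparing the full document:

```python
import xml.etree.ElementTree as ET
import zipfile
from typing import List

# Namespace of the ODF text: elements (text:p, text:span, ...).
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def odt_paragraphs(path: str) -> List[str]:
    """Return the plain text of every text:p element in content.xml."""
    with zipfile.ZipFile(path) as odt:
        root = ET.fromstring(odt.read("content.xml"))
    # itertext() also collects text inside child elements such as spans.
    return ["".join(p.itertext()) for p in root.iter("{%s}p" % TEXT_NS)]


def missing_paragraphs(path: str, expected: List[str]) -> List[str]:
    """Return the expected paragraphs that are absent from the ODT."""
    actual = odt_paragraphs(path)
    return [text for text in expected if text not in actual]
```

An empty list from missing_paragraphs("converted.odt", [...]) would mean
all the expected paragraphs survived the import; the file name here is
only a placeholder.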

> (I already heard of the xml dumper for the rendered layout, is there
> something similar for the internal document model?)

Yes, the ODF is a pretty good representation of the internals... though
we could surely implement something closer to the actual data
structures. Let me know if such a dumper would be of any use to you...
I'm sure we could get to something useful pretty quickly.


Cédric Bosdonnat
LibreOffice hacker
OOo Eclipse Integration developer
