How far can the LO file's contents be stretched?
Johannes Sixt
j6t at kdbg.org
Mon Feb 6 13:30:53 PST 2012
On 06.02.2012 10:39, Michael Meeks wrote:
>
> On Sun, 2012-02-05 at 21:50 +0100, Johannes Sixt wrote:
>> The native file format of LO files is simply a ZIP archive with various
>> types of files in it.
>
> Right - although, there is another variant of our native file format
> that is almost as efficient to read/write and that is Flat ODF - try
> saving as '.fodt' eg.
>
>> I intend to do some gymnastics with the ZIP files and the XML in them,
>> and I would like to know how far it can be stretched ;) In particular, I
>> want to explode the archive into git tree, mangle its contents in
>> various ways, and then generate an ODF file from the tree using git-archive.
>
> So - unless you're in love with zipping things, you can use flat odf,
> which prolly diffs, and compresses more pleasantly inside git. We're
> most interested in people helping improve the stability (and similarity
> of that eg. for automatic styles) during write too.
>
>> 2) Another side effect of the detour via git will be that empty folders
>> will be absent in the package resulting from git-archive:
>
> It would really be rather nice if we didn't create empty directories
> there in the first instance ;-)
>
>> in the result. Again, is it a problem?
>
> It shouldn't be; if it is - we should fix it - patches most welcome.
>
>> 3) Can the styles listed in an <office:automatic-styles> section be
>> renamed? For example:
>
> I believe the names are arbitrary and can be replaced without problems;
> of course - do you want to get a well defined order here based on some
> hash of the attributes or somesuch ? if so, patches most welcome :-)
>
> Looking forward to seeing what you're up to :-)
So here's the full story. I want to keep software manuals in a git
repository. We keep variants of the software+manuals in branches, and
from time to time we want to merge the development progress into the
branches. And that's where this fun task starts.
My first attempt to merge documents was to use the, well, "Merge
documents" feature. But my self-built 3.5.0-rc2 (3a7ae48b) reports
merely "Cannot merge documents". I think it cannot handle conflicting
changes. [*]
Now I'm exploring plan B: textual merging of the XML data. For this, I'm
exploding the zip archive into a git tree so that I can use the git
toolbox. Unfortunately, there are many textual conflicts because the
automatic styles are numbered, and as styles come and go between
document versions, many unnecessary conflicts arise at the point where
the styles are used. To work around this, I would rename the styles based
on a hash of the style definition and sort them (to be able to better
merge the <automatic-styles> section).
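A minimal sketch of that renaming idea in Python, not a tested tool: it assumes the styles sit in a direct <office:automatic-styles> child of the root (as in content.xml or a flat ODF file), that automatic style names do not collide with common style names, and that every reference is an attribute whose name ends in "style-name". The function names and the "a" + hash naming scheme are hypothetical. Styles that refer to other automatic styles are hashed with the old names, which a real tool would have to handle.

```python
import hashlib
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
NAME = f"{{{STYLE}}}name"

# Keep a few familiar prefixes on output; real files declare many more.
ET.register_namespace("office", OFFICE)
ET.register_namespace("style", STYLE)
ET.register_namespace("text", "urn:oasis:names:tc:opendocument:xmlns:text:1.0")
ET.register_namespace("fo", "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0")


def canonical(elem, skip=()):
    """Prefix-independent, attribute-order-independent form of an element."""
    attrs = sorted((k, v) for k, v in elem.attrib.items() if k not in skip)
    kids = [canonical(child) for child in elem]
    return repr((elem.tag, attrs, (elem.text or "").strip(), kids))


def style_digest(style):
    """Hash a style definition, ignoring its (arbitrary) style:name."""
    data = canonical(style, skip=(NAME,)).encode("utf-8")
    return hashlib.sha1(data).hexdigest()[:10]


def normalize_automatic_styles(xml_text):
    """Rename automatic styles by content hash, drop duplicates, sort them."""
    root = ET.fromstring(xml_text)
    auto = root.find(f"{{{OFFICE}}}automatic-styles")
    if auto is None:
        return xml_text

    # Old numbered name -> name derived from the definition itself.
    mapping = {}
    for style in auto:
        old = style.get(NAME)
        if old is not None:
            mapping[old] = "a" + style_digest(style)
            style.set(NAME, mapping[old])

    # Rewrite references (text:style-name, table:style-name, ...).
    for elem in root.iter():
        for key, value in list(elem.attrib.items()):
            if key.endswith("style-name") and value in mapping:
                elem.set(key, mapping[value])

    # Identical definitions now share a name: keep one, then sort.
    seen, kept = set(), []
    for style in auto:
        key = (style.tag, style.get(NAME))
        if key[1] is not None and key in seen:
            continue
        seen.add(key)
        kept.append(style)
    auto[:] = sorted(kept, key=lambda s: (s.tag, s.get(NAME) or ""))

    return ET.tostring(root, encoding="unicode")
```

With names derived this way, two document versions that number the same styles differently should produce the same text at the points of use, so only real style changes show up as merge conflicts.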
But now that you mention flat ODF files (which I didn't know about
before), I had better change the plan and use that.
... does some tests ...
Ouch! I have some embedded LO Draw graphics (all converted from MS Word
2000 format). They are only displayed as a big pixelated plug, and are
also exported to PDF as such rather than as the graphics proper, although
I can still double-click and edit them in LO Draw. That's a show-stopper.
It would be great if there were a simple solution for this.
-- Hannes
[*] If you think that "Merge documents" should work, I can provide
details of what I did. I can also run a merge attempt in the debugger to
observe what LO is doing, but I'm not yet prepared to fix the merge
procedure.