[Libreoffice-bugs] [Bug 102616] EDITING: Compare documents on near-identical files flags 99.9% of the contents as different

bugzilla-daemon at bugs.documentfoundation.org
Thu May 23 12:23:15 UTC 2019


https://bugs.documentfoundation.org/show_bug.cgi?id=102616

--- Comment #11 from Luke Kendall <luke.kendall at gmail.com> ---
This bug is as bad, or worse, in Writer 6.2.3.2.

I was preparing 4 editions of my book, so I have 4 versions of the manuscript
(MS). I'll need to report several other Writer bugs that I hit along the way.

Anyway, because of these other bugs, it was important to compare the different
MSes. Writer was never able to provide a useful comparison. In the end I had
to use Microsoft Office365 to compare .docx versions of the MSes saved from
Writer.

This was while doing a merge comparison.

The main problems I had with Writer were (in order of seriousness):

1. As reported, basically the entire document is flagged as changed. This
applied even when the documents used the same paragraph styles (except for the
chapter title paragraph style), the same page format, and the same body text
paragraph style.
2. Undo of a file comparison is not just incomplete: applying Undo can
introduce unexpected changes. On one occasion, Undo changed the final body
paragraphs by applying an All Capitals attribute. My recollection is that it
also left some Chapter Title paragraphs in a strange state and lost some page
breaks.
3. Writer would often crash as soon as the comparison started.
4. After recovery from said crash, the Manage Changes dialog would be open for
every recovered document, and each one had to be closed separately.

I also noticed that the .docx files produced by Writer are difficult for other
word processors to compare. Regardless of how little editing I had done to my
MS, for every real change I had made the other word processor would find 10 or
20 "null differences". These usually appeared as a space character at the end
of a paragraph, but sometimes between words within a paragraph.

I don't know if it's indicative of an underlying problem, but ONLY Office365
could produce useful comparisons of Writer-generated .docx files of the same
root document. The others I tried tended to mark very large swathes of text as
changed even though painstaking visual comparison showed them to be identical.

So I suspect one issue with the Writer file compare algorithm is that it may be
comparing the underlying data structures rather than the visible data, i.e.
what a user can actually detect. By visible data I mean, for any run of text,
its paragraph, character, and page styles, and the literal characters.
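
To illustrate the kind of comparison I mean, here is a rough Python sketch. It
is not how Writer works internally (I don't know that), and the names Run,
Paragraph, canonical and visible_changes are made up for this example. Page
styles are left out for brevity. The idea is to reduce each paragraph to what a
user can see (style names plus literal characters) before diffing, so that
text split differently into runs, or empty runs, does not count as a change.

```python
from difflib import SequenceMatcher
from typing import List, NamedTuple, Tuple


class Run(NamedTuple):
    """Hypothetical text run: literal characters plus a character style name."""
    text: str
    char_style: str


class Paragraph(NamedTuple):
    """Hypothetical paragraph: a paragraph style name plus its runs."""
    para_style: str
    runs: Tuple[Run, ...]


def canonical(par: Paragraph) -> Paragraph:
    """Reduce a paragraph to its user-visible form.

    Empty runs are dropped and adjacent runs with the same character style
    are merged, so only the visible text and styles remain.
    """
    merged: List[Run] = []
    for run in par.runs:
        if not run.text:
            continue  # an empty run is invisible to the user
        if merged and merged[-1].char_style == run.char_style:
            merged[-1] = Run(merged[-1].text + run.text, run.char_style)
        else:
            merged.append(run)
    return Paragraph(par.para_style, tuple(merged))


def visible_changes(old: List[Paragraph], new: List[Paragraph]):
    """Return only the non-equal diff opcodes between two documents,
    comparing paragraphs in canonical (visible) form."""
    a = [canonical(p) for p in old]
    b = [canonical(p) for p in new]
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

With this, a paragraph stored as two runs "Hello " + "world" and the same
paragraph stored as one run "Hello world" compare as identical, whereas a
comparison of the raw run structure would flag them as different.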
