[Libreoffice-bugs] [Bug 108714] New: DOCX IMPORT: Page break is missing in a specific document
bugzilla-daemon at bugs.documentfoundation.org
Fri Jun 23 11:42:56 UTC 2017
https://bugs.documentfoundation.org/show_bug.cgi?id=108714
Bug ID: 108714
Summary: DOCX IMPORT: Page break is missing in a specific document
Product: LibreOffice
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: filter:docx
Severity: normal
Priority: medium
Component: Writer
Assignee: libreoffice-bugs at lists.freedesktop.org
Reporter: mikekaganski at hotmail.com
Created attachment 134226
--> https://bugs.documentfoundation.org/attachment.cgi?id=134226&action=edit
A sanitized DOCX that has a page break in Word
This document (a sanitized minimal reproducer that shows a problem found in a
real-life document) has a page break between its two paragraphs when opened in
Word. LibreOffice doesn't import the page break and shows both paragraphs on one
page.
The reason is that LibreOffice, correctly per the specification, doesn't accept
a <w:br> element as a child of <w:body>.
ECMA-376-1:2016 §17.3.3.1 describes br as an element of run content
and points to CT_Br in §A.1.
CT_Br may appear only as part of EG_RunInnerContent.
In turn, EG_RunInnerContent may appear only inside CT_R.
So using <w:br> outside of <w:r> produces OOXML that is well-formed XML but
violates the schema.
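
The containment rule above (br only inside a run) can be checked mechanically.
The following is a minimal sketch, not part of the original report: the sample
markup, the w:type="page" attribute and the function name misplaced_breaks are
assumptions made for illustration, and only Python's standard
xml.etree.ElementTree is used.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R_TAG = f"{{{W_NS}}}r"
BR_TAG = f"{{{W_NS}}}br"

# Hypothetical document.xml content: one <w:br> directly under <w:body>
# (not allowed by the schema) and one inside a run (allowed).
SAMPLE = f"""<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:p><w:r><w:t>First paragraph</w:t></w:r></w:p>
    <w:br w:type="page"/>
    <w:p><w:r><w:t>Second paragraph</w:t><w:br/></w:r></w:p>
  </w:body>
</w:document>"""


def misplaced_breaks(xml_text):
    """Return the parent's local name for every <w:br> that is not inside <w:r>."""
    root = ET.fromstring(xml_text)
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    bad = []
    for br in root.iter(BR_TAG):
        parent = parent_of.get(br)
        if parent is None or parent.tag != R_TAG:
            bad.append(None if parent is None else parent.tag.split("}", 1)[1])
    return bad


print(misplaced_breaks(SAMPLE))  # prints ['body']
```

Only the body-level break is reported; the break inside the run of the second
paragraph passes the check.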
The Open XML SDK 2.5 Productivity Tool for Microsoft Office confirms this,
showing an OpenXmlUnknownElement error.
However, Word accepts <w:br> as a direct child of <w:body>. This is another Word
bug that leads third parties to create invalid real-life documents
and requires LibreOffice to be bug-for-bug compatible.
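
To illustrate what tolerating such input could look like, here is a
hypothetical preprocessing sketch that moves a body-level <w:br> into a new
paragraph and run so that the markup satisfies the containment rule. It does
not describe LibreOffice's import code, and the result may not match Word's
rendering exactly (it adds a paragraph). The sample markup, the w:type="page"
attribute and the name wrap_stray_breaks are assumptions.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
P_TAG = f"{{{W_NS}}}p"
R_TAG = f"{{{W_NS}}}r"
BR_TAG = f"{{{W_NS}}}br"
BODY_TAG = f"{{{W_NS}}}body"

# Keep the conventional "w:" prefix when serializing.
ET.register_namespace("w", W_NS)

# Hypothetical input with a <w:br> directly under <w:body>.
SAMPLE = f"""<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:p><w:r><w:t>First paragraph</w:t></w:r></w:p>
    <w:br w:type="page"/>
    <w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>
  </w:body>
</w:document>"""


def wrap_stray_breaks(xml_text):
    """Wrap each <w:br> that is a direct child of <w:body> in <w:p><w:r>."""
    root = ET.fromstring(xml_text)
    body = root.find(BODY_TAG)
    if body is None:
        return xml_text
    # Iterate over a copy because the body is modified in the loop.
    for index, child in enumerate(list(body)):
        if child.tag == BR_TAG:
            paragraph = ET.Element(P_TAG)
            run = ET.SubElement(paragraph, R_TAG)
            # Keep inter-element whitespace at body level, not inside the run.
            paragraph.tail, child.tail = child.tail, None
            body.remove(child)
            run.append(child)
            body.insert(index, paragraph)
    return ET.tostring(root, encoding="unicode")


fixed = wrap_stray_breaks(SAMPLE)
```

Replacing the element at the same index keeps the break between the two
original paragraphs, which is where Word shows the page break.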