[Libreoffice-bugs] [Bug 108849] New: DOCX IMPORT: Extra pages and wrong page sizes in a specific document
bugzilla-daemon at bugs.documentfoundation.org
Thu Jun 29 08:20:12 UTC 2017
https://bugs.documentfoundation.org/show_bug.cgi?id=108849
Bug ID: 108849
Summary: DOCX IMPORT: Extra pages and wrong page sizes in a
specific document
Product: LibreOffice
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: filter:docx
Severity: normal
Priority: medium
Component: Writer
Assignee: libreoffice-bugs at lists.freedesktop.org
Reporter: mikekaganski at hotmail.com
Created attachment 134374
--> https://bugs.documentfoundation.org/attachment.cgi?id=134374&action=edit
A sanitized DOCX that has only 2 pages in Word
The attached test document has only 2 pages in Word: the first is 15x10 cm and
the second is 25x20 cm (both landscape). Each page contains one paragraph of
short text.
When imported into LibreOffice, it has 4 pages: the first (empty) is 10x15 cm
portrait, the second (empty) is 25x20 cm landscape, the third is 15x10 cm
landscape (with the text "Page 1"), and the fourth is Letter-sized (with the
text "Page 2").
The document is a sanitized version of a real-life document generated by a
third-party report generator. It is actually invalid OOXML: the last section is
defined in the wrong place.
According to ISO/IEC 29500-1:2016(E) 17.6.17 sectPr (Document Final Section
Properties), the final <w:sectPr> must be the last child element of the body
element. The schema for the CT_Body complex type (Annex A. (normative)
Schemas – W3C XML Schema, A.1 WordprocessingML, page 3866) also enforces this:
sectPr is part of an <xsd:sequence>, and thus *must* appear at a specific place
in the sequence, namely as the last element, and may occur at most once.
However, the test document has two sectPr elements before the other body
contents. Unfortunately, MS Word seems to accept this standards-violating
content, and thus encourages third-party generators to create non-standard
documents.
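
The placement rule described above can be checked mechanically. The following
is a minimal Python sketch, not part of LibreOffice or of the attached test
case: the function names are made up for illustration, and it assumes the
transitional WordprocessingML namespace and the usual word/document.xml part
name. It reports every <w:sectPr> that is a direct child of <w:body> but is not
the last child.

```python
import xml.etree.ElementTree as ET
import zipfile

# Assumption: transitional OOXML namespace; strict documents use a different URI.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def misplaced_sectpr_positions(document_xml):
    """Return 0-based child positions of w:sectPr elements that are direct
    children of w:body but are not its last child."""
    root = ET.fromstring(document_xml)
    body = root.find(f"{{{W_NS}}}body")
    if body is None:
        return []
    children = list(body)
    last = len(children) - 1
    sectpr_tag = f"{{{W_NS}}}sectPr"
    return [
        i for i, child in enumerate(children)
        if child.tag == sectpr_tag and i != last
    ]


def check_docx(path):
    """Hypothetical helper: read word/document.xml from a DOCX and check it."""
    with zipfile.ZipFile(path) as docx:
        return misplaced_sectpr_positions(docx.read("word/document.xml"))
```

A body shaped like the one described in this report (two sectPr elements
followed by paragraphs) yields two positions, while a conforming body with a
single trailing sectPr yields an empty list. Section properties nested inside
<w:pPr> are deliberately ignored, since those are valid per-section breaks.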