<html>
    <head>
      <base href="https://bugs.documentfoundation.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_UNCONFIRMED "
   title="UNCONFIRMED - DOCX IMPORT: Extra pages and wrong page sizes in a specific document"
   href="https://bugs.documentfoundation.org/show_bug.cgi?id=108849">108849</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>DOCX IMPORT: Extra pages and wrong page sizes in a specific document
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>LibreOffice
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>UNCONFIRMED
          </td>
        </tr>

        <tr>
          <th>Keywords</th>
          <td>filter:docx
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Writer
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>libreoffice-bugs@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>mikekaganski@hotmail.com
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="http://bugs.documentfoundation.org/attachment.cgi?id=134374" name="attach_134374" title="A sanitized DOCX that has only 2 pages in Word">attachment 134374</a> <a href="http://bugs.documentfoundation.org/attachment.cgi?id=134374&action=edit" title="A sanitized DOCX that has only 2 pages in Word">[details]</a></span>
A sanitized DOCX that has only 2 pages in Word

The attached test document has only 2 pages in Word: the first 15x10 cm and the
second 25x20 cm (both landscape), each containing one paragraph of short text.

When imported into LibreOffice, it has 4 pages: the first (empty) 10x15 cm
portrait, the second (empty) 25x20 cm landscape, the third 15x10 cm landscape
(with text "Page 1"), and the fourth Letter-sized (with text "Page 2").

The document is a sanitized version of a real-life document generated by a
third-party report generator. It is actually invalid OOXML, with the last
section defined in the wrong place.

According to ISO/IEC 29500-1:2016(E) 17.6.17 sectPr (Document Final Section
Properties), the final &lt;w:sectPr&gt; must be the last child element of the
body element. The schema for the CT_Body complex type (Annex A. (normative)
Schemas – W3C XML Schema, A.1 WordprocessingML, page 3866) enforces this too:
sectPr is part of an &lt;xsd:sequence&gt;, so it *must* occupy a specific place
in the sequence, namely the last, and may occur at most once.
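
The CT_Body rule above can be checked mechanically. Below is a minimal sketch
in Python, not a validator: it only looks at direct children of w:body and
reports where sectPr elements sit. It assumes the Transitional namespace URI
for the "w:" prefix; the function names and check_docx are hypothetical
helpers made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumption: Transitional WordprocessingML namespace (the "w:" prefix).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def body_sectpr_positions(document_xml):
    """Return (positions, child_count) for direct sectPr children of w:body."""
    root = ET.fromstring(document_xml)
    body = root.find("{%s}body" % W_NS)
    if body is None:
        raise ValueError("no w:body element found")
    children = list(body)
    sectpr_tag = "{%s}sectPr" % W_NS
    positions = [i for i, child in enumerate(children) if child.tag == sectpr_tag]
    return positions, len(children)


def final_sectpr_is_valid(document_xml):
    """True if w:body has no sectPr, or exactly one that is its last child."""
    positions, count = body_sectpr_positions(document_xml)
    return positions == [] or positions == [count - 1]


def check_docx(path):
    """Hypothetical helper: run the check on word/document.xml in a DOCX."""
    with zipfile.ZipFile(path) as archive:
        return final_sectpr_is_valid(archive.read("word/document.xml"))
```

Section properties stored inside a paragraph (w:pPr/w:sectPr) are not direct
children of w:body, so this sketch does not count them.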

However, the test document has two sectPr elements before the other body
content. Unfortunately, MS Word seems to accept this standards-violating
content, and thus encourages third-party generators to create non-standard
documents.</pre>
        </div>
      </p>


    </body>
</html>