<html>
    <head>
      <base href="https://bugs.documentfoundation.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_UNCONFIRMED "
   title="UNCONFIRMED - Converting docx in headless mode hangs"
   href="https://bugs.documentfoundation.org/show_bug.cgi?id=122192">122192</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Converting docx in headless mode hangs
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>LibreOffice
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>5.3 all versions
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>x86-64 (AMD64)
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux (All)
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>UNCONFIRMED
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>LibreOffice
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>libreoffice-bugs@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>rb@awave.com
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Description:
My application runs on an Ubuntu 16.04 web server where uploaded files are
automatically converted to PDF with doc2pdf, which is part of unoconv, which
in turn uses LibreOffice in headless mode. When I try to convert a corrupted
DOCX document, the conversion hangs with 100% of the CPU utilized, and
eventually I have to reboot to recover.

Steps to Reproduce:
1. Have a Word document (DOCX) that is corrupted
2. Try to convert it to PDF: libreoffice --headless --convert-to pdf broken.docx
3. Trying with a Word document that is not corrupted works fine

Actual Results:
When I try to convert the corrupted DOCX document, the conversion hangs with
100% of the CPU utilized, and eventually I have to reboot to recover:

javaldx: Could not find a Java Runtime Environment!
Warning: failed to read path from javaldx
W: Unknown node under /registry/extlang: deprecated
W: Unknown node under /registry/grandfathered: comments
W: Unknown node under /registry/grandfathered: comments
Fontconfig warning: ignoring UTF-8: not a valid region tag
convert /home/forge/broken.docx -> /home/forge/broken.pdf using filter : writer_pdf_Export

Expected Results:
Command exits to shell with an error.


Reproducible: Always


User Profile Reset: No



Additional Info:
I would suggest that one of the following happens:

1. Command exits with an error
2. Set a timeout and, if it is reached, the command exits
3. Detect whether the DOCX document is broken
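
Until something like this exists in LibreOffice itself, suggestions 2 and 3
could be approximated on the caller's side. The following is only a rough
sketch, not a tested fix: the function names and the 120-second limit are
made up for illustration, it assumes a POSIX system, and the ZIP check only
catches archive-level damage (a DOCX whose XML content triggers the hang
would still pass it). Checking for word/document.xml relies on the
conventional part name, which is also an assumption.

```python
import os
import signal
import subprocess
import zipfile
import zlib


def looks_like_valid_docx(path):
    """Cheap structural check: a DOCX is a ZIP archive with a main document part."""
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the name of the first damaged member, or None.
            if archive.testzip() is not None:
                return False
            # Assumption: the main part uses the conventional name.
            return "word/document.xml" in archive.namelist()
    except (zipfile.BadZipFile, zlib.error, EOFError, OSError):
        return False


def run_with_timeout(command, timeout_seconds):
    """Run command in its own session and kill the whole group on timeout.

    Returns the exit code, or None if the timeout was reached.
    """
    # start_new_session=True makes the child a process group leader, so any
    # helper processes it starts can be signalled together with it.
    process = subprocess.Popen(command, start_new_session=True)
    try:
        return process.wait(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        os.killpg(process.pid, signal.SIGKILL)
        process.wait()
        return None


def convert_to_pdf(path, timeout_seconds=120):
    """Return True only if the pre-check passes and the conversion exits with 0."""
    if not looks_like_valid_docx(path):
        return False
    command = ["libreoffice", "--headless", "--convert-to", "pdf", path]
    return run_with_timeout(command, timeout_seconds) == 0
```

Killing the whole process group rather than only the launched command is
meant to cover the case where the launcher starts further processes; whether
that reliably cleans up everything in this situation is something I have not
verified.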

Unfortunately, I cannot provide the broken Word document because it contains
sensitive information. Censoring the sensitive information would require me
to create a new document, which would not be corrupted.

The question was originally asked here (I will attach a text version to this
bug report):

<a href="https://ask.libreoffice.org/en/question/174451/converting-docx-in-headless-mode-hangs/">https://ask.libreoffice.org/en/question/174451/converting-docx-in-headless-mode-hangs/</a></pre>
        </div>
      </p>


    </body>
</html>