<html>
    <head>
      <base href="https://bugs.freedesktop.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Pathological case demonstrating massive slowdown"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=106135">106135</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Pathological case demonstrating massive slowdown
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>poppler
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>Other
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>general
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>poppler-bugs@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>solo@yopmail.com
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="attachment.cgi?id=138921" name="attach_138921" title="before">attachment 138921</a> <a href="attachment.cgi?id=138921&amp;action=edit" title="before">[details]</a></span>
before

From a bug reported to pdfgrep at <a href="https://gitlab.com/pdfgrep/pdfgrep/issues/25">https://gitlab.com/pdfgrep/pdfgrep/issues/25</a>

    The original file, before.pdf, took pdfgrep only 7 seconds to search.
    I then decompressed and recompressed the file to produce after.pdf.
    pdfgrep takes 80 seconds to search this new file. I also tested this
    procedure against some ebooks and found much worse results, such as
    an increase from 4s to 250s.

    This might be poppler-related, since timing pdftotext on the two files
    also shows a 10x difference in performance. Every other PDF viewer
    (Mac OS X Preview and Skim, mupdf, PDF.js) and parser (mutool, podofo,
    pdf-parser.py, pstotext/ghostscript) I tried shows no significant
    performance difference between the two files.</pre>
        </div>
      </p>
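      <p>The pdftotext timing comparison described above could be reproduced with a small script along these lines. This is only a sketch, not part of the original report: it assumes pdftotext (from poppler-utils) is on PATH and that before.pdf and after.pdf are local copies of the two files; the helper names time_command, slowdown and compare are hypothetical.</p>

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    # Discard the extracted text; only the elapsed time matters here.
    subprocess.run(cmd, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start


def slowdown(before_s, after_s):
    """Return how many times slower the second measurement is."""
    return after_s / before_s


def compare(before_path, after_path, runner=time_command):
    """Time text extraction on both files.

    Returns (before_seconds, after_seconds, ratio). The runner argument
    exists only so the timing step can be swapped out, e.g. for testing.
    """
    # "pdftotext FILE -" writes the extracted text to standard output.
    before_s = runner(["pdftotext", before_path, "-"])
    after_s = runner(["pdftotext", after_path, "-"])
    return before_s, after_s, slowdown(before_s, after_s)
```

      <p>Calling compare("before.pdf", "after.pdf") returns both timings and their ratio. A single run is noisy, so repeating the measurement a few times would give a more trustworthy ratio.</p>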


    </body>
</html>