<html> <head> <base href="https://bugs.freedesktop.org/" /> </head> <body> <div> <a class="bz_bug_link bz_status_NEW " title="NEW - Huge spike in CPU and memory usage by tracker extractor due to rogue file" href="https://bugs.freedesktop.org/show_bug.cgi?id=85196#c2">Comment # 2</a> on <a class="bz_bug_link bz_status_NEW " title="NEW - Huge spike in CPU and memory usage by tracker extractor due to rogue file" href="https://bugs.freedesktop.org/show_bug.cgi?id=85196">bug 85196</a> from <a class="email" href="mailto:ajohnson@redneon.com" title="Adrian Johnson &lt;ajohnson@redneon.com&gt;"> Adrian Johnson</a> <pre>The PDF draws the dots in the chart with the Unicode character U+22C5 DOT OPERATOR. If you have enough memory and patience, the file will be processed successfully. On my machine it takes 202 seconds and has a peak memory usage of 2.7 GB. The output file contains over 100,000 U+22C5 characters. I recall a discussion a few years ago about improving the efficiency of the text extraction: <a href="http://lists.freedesktop.org/archives/poppler/2010-November/006646.html">http://lists.freedesktop.org/archives/poppler/2010-November/006646.html</a> I'm not sure what happened to those patches.</pre> </div> <hr> You are receiving this mail because: <ul> <li>You are the assignee for the bug.</li> </ul> </body> </html>