[poppler] poppler really slow when reading some documents

Mon Jan 2 18:46:30 PST 2006

Hi,

I've noticed that poppler based pdf readers are incredibly slow when rendering
the serial ata specification. (Available
http://www.serialata.org/specifications.asp:
http://www.serialata.org/docs/serialata10a.pdf)
Slow means, that there is not even a single page displayed within 20
minutes. Acroread can display the document within seconds. ;-)

I've filed bug https://bugs.freedesktop.org/show_bug.cgi?id=5484.

I know that there are some more speed related bugs, but I'll opening a
new one, because I've found the reason why _this_ document is so slow.
Checking if this
also solves the other speed problems should be done later.

Ok, first I've tried strace -f:
...
[pid   414] mmap2(NULL, 299008, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xa5cb7000
[pid   414] munmap(0xa5c6e000, 299008)  = 0
[pid   414] mmap2(NULL, 299008, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xa5c6e000
[pid   414] munmap(0xa5cb7000, 299008)  = 0
...

It seems, that there are many memory allocations and deallocations.
Sysprof (big thanks to Soren for this very cool tool) proves this assumption:
Most time is spend in memcpy calls.

Because there are several memcpy calls in evince/poppler I've started
evince with gdb, halted it after evice has displayed its screen with
empty pages.

#0  0xa714e147 in memcpy () from /lib/libc.so.6
#1  0xa748d3f9 in GooString::append () from /usr/lib/libpoppler.so.0
#2  0xa745e39d in Lexer::getObj () from /usr/lib/libpoppler.so.0
#3  0xa746756a in Parser::shift () from /usr/lib/libpoppler.so.0
#4  0xa7467a09 in Parser::getObj () from /usr/lib/libpoppler.so.0
#5  0xa7472248 in XRef::fetch () from /usr/lib/libpoppler.so.0
#6  0xa7463b8f in Object::fetch () from /usr/lib/libpoppler.so.0
#7  0xa74647bd in OutlineItem::readItemList () from /usr/lib/libpoppler.so.0
#8  0xa7464ad8 in OutlineItem::open () from /usr/lib/libpoppler.so.0
#9  0xa74efd89 in poppler_index_iter_get_child ()
   from /usr/lib/libpoppler-glib.so.0
#10 0x0808e1a3 in build_tree (pdf_document=0x822d530, model=0x82d09a8,
    parent=0xa6be72e0, iter=0x82fd350) at ev-poppler.cc:751
#11 0x0808e1bd in build_tree (pdf_document=0x822d530, model=0x82d09a8,
    parent=0xa6be7360, iter=0x82f9058) at ev-poppler.cc:753
#12 0x0808e1bd in build_tree (pdf_document=0x822d530, model=0x82d09a8,
    parent=0x0, iter=0x82d0930) at ev-poppler.cc:753
#13 0x0808e290 in pdf_document_links_get_links_model (document_links=0x822d530)
    at ev-poppler.cc:776
#14 0x0808c333 in ev_document_links_get_links_model (document_links=0x822d530)
    at ev-document-links.c:67
#15 0x08060c8f in ev_job_links_run (job=0x82c9160) at ev-jobs.c:227
#16 0x0805f6cc in handle_job (job=0x82c9160) at ev-job-queue.c:98
#17 0x0805f926 in ev_render_thread (data=0x0) at ev-job-queue.c:187
#18 0xa72c8aba in g_thread_create_proxy (data=0x8108050) at gthread.c:564
#19 0xa7216167 in start_thread () from /lib/libpthread.so.0

I've looked deeper in the GooString::append function and the operation with the
most costs is the resize function which calls the problematic memcpy.

This function resizes the array for the string object. It seems that a very
simple algorithm is used:
If the requested len < 256 then the size is increased every 16 bytes and else
every 256 bytes.

The advantage is cleary that very less bytes are wasted. The
disadvantage is, especially for large strings composed (e.g. by
"append"ing) out of many small ones, that there are many memcpy calls.
Every 256 bytes the strings grows, the complete string is copied.

I've not done a lot of research, but it seems that a kind of
increasing of the delta depending on the size would make sense.
Poppler does this already, but even 256 bytes are far too small for
large strings.

It seems that the desired document uses very large strings and so the
current algorithm scales not very well.

Now there are two possibilities to solve the problem:

1. do the memory allocation in poppler, deal with different chunk sizes:
static inline int size(int len) {
  int delta;

  if (len < 0xff+1) {
      delta = 0xf;
  } else if (len < 0xfff+1) {
      delta = 0xff;
  } else if (len < 0xffff+1) {
      delta = 0xfff;
  } else if (len < 0xfffff+1) {
      delta = 0xffff;
  } else {
      delta = 0xfffff;
  }

  return ((len + 1) + delta) & ~delta;
}

Using this approach it is possible to open the document on my system
in 7 seconds.

2. use realloc and let the lib decide which size leads to a new
allocation (and memcpy) is needed
inline void GooString::resize(int length1) {
  if (!s) {
    s = (char*)malloc(length1+1);
  } else if (length1 != length) {
    s = (char*)realloc(s, length1+1);
  }
}

Using this simple algorithm opens the document on my system in 14 seconds.

If I assume that the document would be loaded in 20 minutes (I don't
know because I've not waited until the unpatched poppler lib has
finished reading it), there is a speedup of 170. ;-)

Sure, it is possible to use more sophisticated algorithms here...
Nevertheless I kindly ask  you to change the "size" function in the
mentioned way. I've attached a patch to my bug report.

Thanks in advance.

Best regards,
Christian