[poppler] [RFC] Small string optimization in GooString

Adam Reichold adam.reichold at t-online.de
Mon Jun 29 05:40:07 PDT 2015


Hello,

I did notice today that GooString uses the small string optimization but
uses a hard-coded size for the static buffer, so that the final object
size depends on the architecture on which Poppler is built.

For example, in contrast to the source code comment, on my amd64
machine, sizeof(GooString) is actually 40 instead of the desired 32,
which could mean that we are wasting (48 - 40) / 48 = 16.7 % of the
memory allocated for small strings.

The first part of the attached patch adds a private helper class
GooString::MemoryLayout to simulate the memory layout of GooString that
the compiler will generate and uses this to compute the static buffer
size based on the desired final object size.

However, if this is done using 32 bytes the small buffer is shortened
from 24 to 16 bytes on LP64 systems. Hence the second part of the patch
puts the static buffer and the pointer to a dynamic buffer into a union
to save space which means that small strings will have 24/40 bytes
within 32/48-byte GooString objects.

Deciding whether a GooString is allocated statically or dynamically is
done using the sign of the length field, i.e. effectively using of one
its bits as a flag. The changes this implies throughout the GooString
class can mostly be encapsulated using some inline helper methods, but
the code will become more branch-heavy due to the necessary checks and
decisions.

I am still working on measuring the effects this has on the runtime but
would already like to ask for comments whether the increased complexity
of the second part of the patch seems sensible. I think the first part
of the patch should be included in some form (maybe using a different
technique than the helper class) if the behaviour of malloc implied by
the comment is still valid.

Some simple measurements using cold runs of a release build of the
stress-poppler-dir program from the Qt tests with a directory of some
mathematical documents and the call to Page::renderToImage replaced by
calls to Page::textList and Page::search yields the following:

[layout variant (static buffer size/final object size)]

current (24/40):  278 s
adjusted (16/32): 277 s
adjusted (24/48): 274 s
compact (24/32):  272 s
compact (40/48):  274 s

I would preliminarily interpret this as no significant change in runtime
and would therefore argue that the memory usage improvements should be
implemented, probably using the compact 24/32 layout after further tests
and benchmarks.

I will try runnning the complete regression test suite to make sure
there are none, since this does seem like a high-risk change. Maybe this
also yields some more definite performance numbers. I would greatly
appreciate any advice on how to benchmark this most effectively.

Best regards, Adam.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: goo-string-improvements.patch
Type: text/x-patch
Size: 17066 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/poppler/attachments/20150629/242d0074/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 473 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freedesktop.org/archives/poppler/attachments/20150629/242d0074/attachment.sig>


More information about the poppler mailing list