[poppler] [RFC] Small string optimization in GooString

Adam Reichold adam.reichold at t-online.de
Mon Jun 29 12:26:36 PDT 2015


Hello William,

On 29.06.2015 at 20:48, William Bader wrote:
> Does slightly increasing or decreasing STR_STATIC_SIZE make a
> difference? Maybe the compiler is aligning on a larger boundary than the
> calculation assumes, or maybe a large static area increases cache misses
> by not allowing adjacent small strings to fit in the same cache block.

Increasing STR_STATIC_SIZE so that the final object size goes from 32 to
48 did not change the basic behaviour for insertion of short and long C
strings.

The alignment of data within the GooString class should already be
accounted for by the size computation (sizeof(GooString) changes as
intended). The alignment of GooString itself does not change between the
variants (__alignof__(GooString) stays at 8).
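
To illustrate what I mean by the size computation taking alignment into
account, here is a minimal sketch. It is not the actual GooString code:
SmallString, Header, kFinalSize and kStaticSize are hypothetical names,
and the two-member header is an assumed layout chosen only for
illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a small-string class (not poppler's GooString).
struct SmallString {
    // Target size of the whole object in bytes (assumed value).
    static constexpr std::size_t kFinalSize = 32;

    // Bookkeeping members grouped in one struct, so that sizeof(Header)
    // already includes whatever padding the compiler adds for alignment.
    struct Header {
        char *data;  // points at inlineBuf or at a heap buffer
        int length;  // number of characters in use
    };

    // The inline buffer gets whatever is left of the target size.
    static constexpr std::size_t kStaticSize = kFinalSize - sizeof(Header);

    Header header;
    char inlineBuf[kStaticSize];
};

// The object ends up exactly at the target size ...
static_assert(sizeof(SmallString) == SmallString::kFinalSize,
              "inline buffer size does not match the target object size");
// ... and its alignment is determined by the header, not by the buffer.
static_assert(alignof(SmallString) == alignof(SmallString::Header),
              "unexpected alignment");
```

Changing kFinalSize from 32 to 48 in such a layout only grows the inline
buffer; the alignment of the object stays the same.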

Alignment probably does play some role, as insertion is faster for
heap-allocated than for stack-allocated GooString instances. But this was
already true for the current code and does not change with the new
memory layout.

> Are there ways to reduce the use of dynamic strings like what I did in
> https://bugs.freedesktop.org/show_bug.cgi?id=89096 ?

There are certainly such opportunities [1] [2], but I'd say the small
string optimization is used exactly so that users of the class get such
optimizations almost without effort. And of course, one could optimize
such places and improve the small string optimization at the same time.

Of course, in this case, my primary aim is to prevent wasting 8 bytes of
memory for each dynamically allocated small string. Even if this can be
done without any speed improvement, it would be a win IMHO.

Best regards, Adam.

[1] For example, I found calls of GooString::clear immediately followed
by GooString::insert which should probably be replaced by a single call
to GooString::Set.
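
A rough analogy of this pattern, using std::string as a stand-in because
I do not want to misquote the GooString signatures here (the helper names
replaceTwoStep and replaceOneStep are made up for this sketch):

```cpp
#include <cassert>
#include <string>

// Two-step pattern: empty the string first, then insert the new
// contents at position 0.
inline void replaceTwoStep(std::string &s, const char *text) {
    s.clear();
    s.insert(0, text);
}

// One-step pattern: replace the contents with a single call.
inline void replaceOneStep(std::string &s, const char *text) {
    s.assign(text);
}
```

Both helpers leave the string with the same contents; the second one just
says so directly and gives the implementation a single call to optimize.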

[2] I also suspect that it might be beneficial if GooString::format and
GooString::formatv returned a value instead of a pointer, since that
would prevent a lot of dynamic allocation of small strings in the
first place. But this would imply changing all 115 or so call sites
instead of a localized change within GooString... (It would also make
sense IMHO as GooString already is a handle to a value instead of a
value itself.)
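
A sketch of the difference between the two styles, again with std::string
as a stand-in and with hypothetical helper names (formatPagePtr and
formatPageValue are not poppler functions):

```cpp
#include <cassert>
#include <cstdio>
#include <memory>
#include <string>

// Pointer-returning style: every call allocates the string object
// itself on the heap, and the caller has to delete it.
inline std::string *formatPagePtr(int page) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "page-%d", page);
    return new std::string(buf);
}

// Value-returning style: the result lives wherever the caller puts it,
// so a short result typically needs no separate heap allocation for
// the string object.
inline std::string formatPageValue(int page) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "page-%d", page);
    return std::string(buf);
}
```

With the value-returning variant, a call site can simply write
`std::string label = formatPageValue(7);` and has nothing to free
afterwards.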
