[poppler] Big image splash support

Martin (gzlist) gzlist at googlemail.com
Tue Sep 1 10:34:11 UTC 2020


On Thu, 20 Aug 2020 at 12:43, Martin (gzlist) <gzlist at googlemail.com> wrote:
>
> There is an argument that all widths, heights, and coordinates should
> be size_t instead of int, but to limit the size and pain of the
> change, keeping the current interfaces using int seems reasonable.
...
> For the splash code, what I've implemented and propose landing is
> casting to size_t on index into a buffer, generally where
> multiplication happens.

I've posted a couple of initial patches to gitlab, with most of the
rest of the changes depending on the one that adds new buffer
accessors to SplashBitmap. One extra change is that to support the
negative rowSize convention where bottom-up data layout stores a
pointer to the start of the last pixel row rather than to the first,
the buffer offset type uses ptrdiff_t instead of size_t.

Albert has mentioned in review that he's not clear on why the casts
are there, and is not convinced about using size_t for height and
width values. So, I'll have another go at explaining this.

    int w = 1 << 16;
    int h = 1 << 16;
    memset(some_ptr, 0, w * h);

What happens? The third parameter to [memset] is size_t but the
[multiplication] is happening as int. On a LP64 machine, this is
signed [integer overflow] and undefined behaviour.

This can be fixed by assigning into a wider type:

    - int w = 1 << 16;
    + size_t w = (int)(1 << 16);

Or by casting to the result type in the operation:

    - memset(some_ptr, 0, w * h);
    + memset(some_ptr, 0, (size_t)w * h);

Assigning or casting both values would be a little clearer, but the
promotion to the wider type happens regardless.

Because this is fiddly to get right, and verbose when spelled out with
c++ static_cast<>() syntax, I've [introduced helpers] for getting at
the buffer data to minimise the changes required. The general rule is
sizes should be promoted to size_t and memory offsets should be
promoted to ptrdiff_t.

One final note, I don't expect any performance impact, but
microbenchmarks aren't going to be very informative as the compiler
optimisations have a big effect here.

Martin


[memset] memset manpage
https://man7.org/linux/man-pages/man3/memset.3.html
[multiplication] How to Write Multiplies Correctly in C Code
https://www.ti.com/lit/an/spra683/spra683.pdf
[integer overflow] Understanding Integer Overflow in C/C++
https://wdtz.org/files/tosem15.pdf
[introduced helpers] splash: New buffer accessors for SplashBitmap
https://gitlab.freedesktop.org/poppler/poppler/-/merge_requests/623


More information about the poppler mailing list