[Pixman] [PATCH] Resolve implementation-defined behaviour for division rounded to -infinity

Wed Aug 26 02:22:22 PDT 2015

On Fri, 14 Aug 2015 15:06:07 +0100
Ben Avison <bavison at riscosopen.org> wrote:

> The previous implementations of DIV and MOD relied upon the built-in / and %
> operators performing round-to-zero. This is true for C99, but rounding is
> implementation-defined for C89 when divisor and/or dividend is negative, and
> I believe Pixman is still supposed to support C89.

Do you have a practical example of an existing C89 compiler, which
differs from C99 when handling '/' and '%' operators?

My understanding is that C compilers just used to emit a single
division instruction as implemented by the target processor. This
provides the best performance, but is not perfectly portable in
theory. But in practice, after decades of evolution, all the
remaining (major) CPU architectures happen to handle integer
division rounding in the same way (round-to-zero). And C99 just
promoted the de-facto standard to the official standard status
(regardless of whatever was the official justification).

Right now pixman also assumes two's complement representation
of negative numbers and is doing right shifts with integer types
(pixman_fixed_t). In theory this all is implementation defined.
In practice, non-two's complement systems are already extinct
and they are even not supported by GCC:
   https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html

The pixman way of dealing with this stuff is to make sure that all
the assumptions are verified by the test suite. If you are worried
about / and % operators behaviour, then it's best to make sure that
it has proper coverage in the pixman tests and will report an error
if somebody ever tries to build pixman for an unusual system with
an unusual compiler. As an example, we used to assume that powerpc
is always big endian and x86 is always little endian. Now it turned
out that little endian powerpc systems actually exist. This did not
cause any serious troubles for distro maintainers and users because
the test suite was able to catch this problem:
   https://bugs.freedesktop.org/show_bug.cgi?id=81229
And it was reasonably easy to workaround (by disabling vmx
optimizations) and then add support for the little endian variant.
Should we start worrying about a hypothetical big endian x86
variant right now? Maybe not yet.

Over years, pixman has evolved into a rather hostile environment
for bugs. And this did not happen magically itself, but is a result
of taking care to adjust the test suite to catch even more bugs
and trying more corner cases. One more example, again powerpc
related. We got a bug report:
    http://lists.freedesktop.org/archives/pixman/2013-August/002871.html
It was only reproducible on power7 system, so the test suite was
obviously not good enough to detect this reliably. We found the
root cause of the bug, fixed it:
    http://cgit.freedesktop.org/pixman/commit/?id=b6c5ba06f0c5c0bd8d186e7a4879fd3b33e7e13f
And also extended the test suite with a more reliable test:
    http://cgit.freedesktop.org/pixman/commit/?id=0438435b9c915b61af21446b2cb2f77a2b98a3b9
Now if anything like this ever happens again (on powerpc or
any other architecture), we should get it detected.

There are also compiler bugs. In fact, pixman happens to be a compiler
bug magnet. We have found and reported a handful of bugs to GCC and
Clang. Here is an example of the last one:
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64172
Based on our experience, compiler bugs are not so uncommon. So when
somebody wants to compile pixman for some unusual system with an
unusual or very new version of the compiler, we can't be sure if
the resulting binary is going to work fine unless it passes the
tests.

> ---
>  pixman/pixman-private.h |    8 ++++----
>  1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/pixman/pixman-private.h b/pixman/pixman-private.h
> index 73108a0..80506be 100644
> --- a/pixman/pixman-private.h
> +++ b/pixman/pixman-private.h
> @@ -889,12 +889,12 @@ pixman_list_move_to_front (pixman_list_t *list, pixman_link_t *link)
>  #endif
>  
>  /* Integer division that rounds towards -infinity */
> -#define DIV(a, b)					   \
> -    ((((a) < 0) == ((b) < 0)) ? (a) / (b) :                \
> -     ((a) - (b) + 1 - (((b) < 0) << 1)) / (b))
> +#define DIV(a, b) \
> +    ((a) / (b) - ((a) % (b) != 0 && ((a) % (b) < 0) != ((b) < 0) ? 1 : 0))
>  /* Modulus that produces the remainder wrt. DIV */
> -#define MOD(a, b) ((a) < 0 ? ((b) - ((-(a) - 1) % (b))) - 1 : (a) % (b))
> +#define MOD(a, b) \
> +    ((a) % (b) + ((a) % (b) != 0 && ((a) % (b) < 0) != ((b) < 0) ? (b) : 0))
>  
>  #define CLIP(v, low, high) ((v) < (low) ? (low) : ((v) > (high) ? (high) : (v)))

I'm not saying that this patch is wrong, but how can we be sure that it
is doing the right thing? The new variant of this code still relies on /
and % operators (which are implementation-defined) and uses them with
negative numbers. A more in-depth explanation would be useful, or a
confirmation that it fixes a real problem on a real system.

Moreover, previously we assumed that / and % operators are rounding
towards zero and had special DIV and MOD macro variants, which are
rounding towards minus infinity. If we are really worried about
rounding in general, should we review all the uses of / and %
operators in the code too? And for the sake of consistency introduce
new macro variants, which are rounding towards zero?

There is also additional concern about the performance. Is the new
code slower/faster than the old one?

To sum it up. I think that we really should just rely on the test
suite to take care of verifying the rounding behaviour (and improve
the test suite if it is not good enough to catch this type of problems).
If the rounding behaviour is confirmed to be a real problem on some
real hardware with a real compiler, then we can deal with it.

-- 
Best regards,
Siarhei Siamashka