[Pixman] testsuite fails on power7

Lennart Sorensen lsorense at csclub.uwaterloo.ca
Mon Sep 16 09:37:06 PDT 2013


On Fri, Aug 30, 2013 at 12:26:44AM +0300, Siarhei Siamashka wrote:
> I'm not really familiar with the Altivec intrinsics. They might provide
> some syntax sugar (which also might be compiler specific). But the
> intrinsics are converted to the Altivec instructions in the generated
> code in the end. There are two Altivec manuals (for assembly and
> intrinsics) here:
> 
> http://www.freescale.com/files/32bit/doc/ref_manual/ALTIVECPEM.pdf
> http://www.freescale.com/files/32bit/doc/ref_manual/ALTIVECPIM.pdf
> 
> Reads outside the malloc area are fine. If we need to read only a single
> last pixel of the image, then reading the whole 16 byte chunk it
> belongs to is also fine. It is not going to cause any segfaults (the
> needed bytes and the extra bytes from this 16 byte chunk all belong to
> the same memory page). And merely reading obviously can't corrupt
> memory. I think similar tricks are used by glibc also for x86, that's
> why valgrind ships with a list of suppressions for false positives.
> 
> But any writes outside of the malloc area are really bad. Even if they
> write back the same value that was read from this memory location just
> a few instructions ago.

It certainly seems bad.
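
For what it's worth, the "same memory page" argument for reads can be
checked with plain address arithmetic. This is only a sketch: the helper
names are made up, and the 4096 byte page size is an assumption (any page
size that is a multiple of 16 gives the same result). It relies on vec_ld
ignoring the low 4 bits of the address, so it always fetches the aligned
16 byte chunk containing the address.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK 16u    /* AltiVec vector size in bytes */
#define PAGE  4096u  /* assumed page size; must be a multiple of CHUNK */

/* Start of the aligned 16-byte chunk that contains addr. */
static uintptr_t chunk_base(uintptr_t addr)
{
    return addr & ~(uintptr_t)(CHUNK - 1);
}

/* Returns 1 if every byte of the chunk containing addr lies in the
   same page as addr itself, 0 otherwise. */
static int chunk_in_same_page(uintptr_t addr)
{
    assert(PAGE % CHUNK == 0);
    uintptr_t first = chunk_base(addr);
    uintptr_t last  = first + CHUNK - 1;
    return first / PAGE == addr / PAGE && last / PAGE == addr / PAGE;
}
```

Since an aligned chunk can never straddle a page boundary, a read of the
whole chunk cannot fault if any byte in it is readable. The same is not
true of safety for writes, which is the point made above.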

> A possible solution is to add some extra code before the VMX combiner
> loops to align the destination to a 16 byte boundary. Just like it is
> done for SSE2:
> 
>     http://cgit.freedesktop.org/pixman/tree/pixman/pixman-sse2.c?id=pixman-0.30.2#n663

Hmm, having some issues trying to figure out what that code is trying
to do.  I really wish people could use meaningful variable names.
That is, meaningful to other people too.
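
As far as I can tell, the general idea is a scalar "head" loop that runs
until the destination is 16 byte aligned, then the vector loop, then a
scalar "tail". Below is a minimal sketch of that shape, not the actual
pixman-sse2.c or pixman-vmx.c code. The function names are made up, and
the 4-pixel body is a scalar stand-in for what would be one vector load,
add and store per iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturating per-channel add of two 32-bit pixels (scalar reference). */
static uint32_t add_pixel(uint32_t d, uint32_t s)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t c = ((d >> shift) & 0xff) + ((s >> shift) & 0xff);
        out |= (c > 0xff ? 0xffu : c) << shift;
    }
    return out;
}

/* Hypothetical combiner: dst[i] = saturate(dst[i] + src[i]). */
static void combine_add_aligned(uint32_t *dst, const uint32_t *src, int width)
{
    /* Head: one pixel at a time until dst is on a 16-byte boundary. */
    while (width > 0 && ((uintptr_t)dst & 15) != 0) {
        *dst = add_pixel(*dst, *src);
        dst++; src++; width--;
    }
    assert(width == 0 || ((uintptr_t)dst & 15) == 0);

    /* Body: each group of 4 pixels is exactly one aligned 16-byte chunk
       of dst, so a vector store here touches only bytes we own. */
    while (width >= 4) {
        for (int i = 0; i < 4; i++)
            dst[i] = add_pixel(dst[i], src[i]);
        dst += 4; src += 4; width -= 4;
    }

    /* Tail: the remaining 0-3 pixels, scalar again, so nothing past
       the end of the row is ever written. */
    while (width > 0) {
        *dst = add_pixel(*dst, *src);
        dst++; src++; width--;
    }
}
```

Note that src may still be unaligned in the body, so a real vector
version has to handle unaligned loads from src separately. Only the
destination alignment is dealt with here.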

> You can try to disable most of the VMX combiners by commenting out
> the pointers initialization here:
> 
>    http://cgit.freedesktop.org/pixman/tree/pixman/pixman-vmx.c?id=pixman-0.30.2#n1622
> 
> For debugging purposes keep just one combiner, which can still reliably
> trigger the problem. Then try to fix this problem. And then apply the
> same fix to the rest of the VMX code.
> 
> We may also have trouble accessing memory before the malloc area. The
> address of the allocated memory block should also be 16 byte aligned
> to work around the problem. So just allocating 16 bytes extra is not
> enough. You can try using memalign/posix_memalign to test this. In any
> case, that's only a test to investigate/confirm the problem. It might
> not be worth the time.
>  
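
The posix_memalign test suggested above could look something like the
following. This is only a sketch for experimenting: the helper name is
made up, and it is not how pixman allocates its image buffers. It aligns
the start of the block to 16 bytes and also pads the size up to a whole
number of 16 byte chunks, so that neither the first nor the last chunk
extends outside the allocation.

```c
#define _POSIX_C_SOURCE 200112L  /* for posix_memalign */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate room for n_pixels 32-bit pixels, starting on a 16-byte
   boundary and padded to a multiple of 16 bytes. Returns NULL on
   failure; release with free(). */
static uint32_t *alloc_pixels_aligned16(size_t n_pixels)
{
    void *p = NULL;
    size_t bytes  = n_pixels * sizeof(uint32_t);
    size_t padded = (bytes + 15) & ~(size_t)15;

    if (padded == 0)
        padded = 16;  /* always hand back at least one chunk */
    if (posix_memalign(&p, 16, padded) != 0)
        return NULL;
    assert(((uintptr_t)p & 15) == 0);
    return p;
}
```
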
> > I am not sure if the vec_ld is implemented in the compiler or libc,
> 
> The intrinsics are converted to assembly instructions by the compiler.
> 
> > and I can't remember if I still used the same gcc version when testing
> > with libc 2.17.  I am using gcc 4.6 from Debian wheezy at the moment.
> > I am pretty sure I tried with 4.7 as well with no change in behaviour.
> 
> I suspect that the only relevant difference between glibc versions
> affecting this bug could be the malloc implementation.
> 
> Some other possible sources of problems are the OpenMP implementation
> and TLS. But if everything works fine with VMX disabled, then they are
> probably not at fault here.

OK I found some time to play with it a bit.

If I comment out:
    imp->combine_32[PIXMAN_OP_OVER] = vmx_combine_over_u;

then affine-test passes.

If I also comment out:
    imp->combine_32[PIXMAN_OP_ADD] = vmx_combine_add_u;

then scaling-test also passes.

I can leave the rest enabled, so either those are fine, or they just
aren't in use in those tests.

I also just tried turning off OpenMP, and that appears to solve the problem.
Interesting.  So for some reason without VMX it is fine with OpenMP,
but VMX + OpenMP = crash.

And wow, the tests run a lot slower without OpenMP.

I still can't make sense of why upgrading libc from 2.13 to 2.17 makes
the crashes go away.  I don't even have to recompile after changing the
version of libc, it just stops crashing.

So it seems OpenMP + VMX code + libc 2.13 = memory corruption and crash.
Change any one of them, and it doesn't crash.

Maybe this one will just have to remain a mystery.

-- 
Len Sorensen

