[Pixman] testsuite fails on power7
Lennart Sorensen
lsorense at csclub.uwaterloo.ca
Mon Sep 16 12:31:18 PDT 2013
On Mon, Sep 16, 2013 at 12:37:06PM -0400, Lennart Sorensen wrote:
> On Fri, Aug 30, 2013 at 12:26:44AM +0300, Siarhei Siamashka wrote:
> > I'm not really familiar with the Altivec intrinsics. They might provide
> > some syntax sugar (which also might be compiler specific). But the
> > intrinsics are converted to the Altivec instructions in the generated
> > code in the end. There are two Altivec manuals (for assembly and
> > intrinsics) here:
> >
> > http://www.freescale.com/files/32bit/doc/ref_manual/ALTIVECPEM.pdf
> > http://www.freescale.com/files/32bit/doc/ref_manual/ALTIVECPIM.pdf
> >
> > Reads outside the malloc area are fine. If we need to read only a single
> > last pixel of the image, then reading the whole 16 byte chunk it
> > belongs to is also fine. It is not going to cause any segfaults (the
> > needed bytes and the extra bytes from this 16 byte chunk all belong to
> > the same memory page). And merely reading obviously can't corrupt
> > memory. I think similar tricks are used by glibc also for x86, that's
> > why valgrind ships with a list of suppressions for false positives.
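To convince myself of the "same memory page" argument, here is a
minimal sketch of the arithmetic. It assumes 4096-byte pages
(PAGE_SIZE_ASSUMED is my own name, not anything from pixman) and relies
on vec_ld ignoring the low four bits of the address. Because the page
size is a multiple of 16, an aligned 16-byte chunk can never straddle a
page boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration only; real code would ask
 * sysconf(_SC_PAGESIZE) instead of hard-coding this. */
#define PAGE_SIZE_ASSUMED 4096u

/* Start of the 16-byte aligned chunk containing addr. This is the
 * address an aligned vector load effectively uses, since the low
 * four address bits are ignored. */
static uintptr_t chunk_start(uintptr_t addr)
{
    return addr & ~(uintptr_t)15;
}

/* Returns 1 if the first and last byte of that chunk lie on the
 * same page, 0 otherwise. */
static int chunk_in_one_page(uintptr_t addr)
{
    uintptr_t first = chunk_start(addr);
    uintptr_t last = first + 15;
    return first / PAGE_SIZE_ASSUMED == last / PAGE_SIZE_ASSUMED;
}
```

Even for the last byte of a page (0x1fff here) the chunk starts at
0x1ff0 and ends at 0x1fff, so the load stays within that page.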
> >
> > But any writes outside of the malloc area are really bad. Even if they
> > write back the same value that was read from this memory location just
> > a few instructions ago.
>
> It certainly seems bad.
>
> > The possible solution is to add some extra code before the VMX combiner
> > loops to align the destination to 16 bytes boundary. Just like it is
> > done for SSE2:
> >
> > http://cgit.freedesktop.org/pixman/tree/pixman/pixman-sse2.c?id=pixman-0.30.2#n663
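As far as I can tell, the idea is roughly the following. This is only a
scalar sketch of the head/body/tail structure, not the actual SSE2 or
VMX code: combine_add_aligned_sketch and add_pixel_sat are names I made
up, and the 4-pixel inner loop stands in for what would be a single
aligned vector load/store in a real combiner.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-channel saturating add of two 32-bit pixels (scalar stand-in
 * for the vector operation). */
static uint32_t add_pixel_sat(uint32_t d, uint32_t s)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t c = ((d >> shift) & 0xff) + ((s >> shift) & 0xff);
        if (c > 0xff)
            c = 0xff;
        out |= c << shift;
    }
    return out;
}

/* Hypothetical combiner: dest[i] = sat(dest[i] + src[i]). */
static void combine_add_aligned_sketch(uint32_t *dest,
                                       const uint32_t *src,
                                       size_t width)
{
    /* Head: one pixel at a time until dest is 16-byte aligned. */
    while (width > 0 && ((uintptr_t)dest & 15) != 0) {
        *dest = add_pixel_sat(*dest, *src);
        dest++; src++; width--;
    }
    /* Body: dest is aligned, so each group of 4 pixels is exactly
     * one 16-byte chunk that lies fully inside the buffer. */
    while (width >= 4) {
        for (int i = 0; i < 4; i++)
            dest[i] = add_pixel_sat(dest[i], src[i]);
        dest += 4; src += 4; width -= 4;
    }
    /* Tail: leftover pixels one at a time, so nothing past the end
     * of the buffer is ever written. */
    while (width > 0) {
        *dest = add_pixel_sat(*dest, *src);
        dest++; src++; width--;
    }
}
```

The point of the head loop is that every 16-byte store in the body
then covers only pixels that belong to the destination, so there is no
read-modify-write of neighbouring memory.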
>
> Hmm, having some issues trying to figure out what that code is trying
> to do. I really wish people could use meaningful variable names.
> That is, meaningful to other people too.
>
> > You can try to disable most of the VMX combiners by commenting out
> > the pointers initialization here:
> >
> > http://cgit.freedesktop.org/pixman/tree/pixman/pixman-vmx.c?id=pixman-0.30.2#n1622
> >
> > For debugging purposes keep just one combiner, which can still reliably
> > trigger the problem. Then try to fix this problem. And then apply the
> > same fix to the rest of the VMX code.
> >
> > We may also have troubles accessing memory before the malloc area. The
> > address of the allocated memory block also should be 16 bytes aligned
> > to workaround the problem. So just allocating 16 bytes extra is not
> > enough. You can try using memalign/posix_memalign to test this. In any
> > case, that's only a test to investigate/confirm the problem. It might
> > be not worth wasting time.
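For the record, a test allocation along those lines could look like
the sketch below. alloc_pixels_aligned16 is a hypothetical helper, not
something in pixman, and rounding the size up to a whole number of
16-byte chunks is my own addition so that the last chunk is also
entirely inside the allocation.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical test helper: allocate a pixel buffer that starts on
 * a 16-byte boundary. Returns NULL on failure; release with free(). */
static uint32_t *alloc_pixels_aligned16(size_t n_pixels)
{
    void *p = NULL;
    /* Round up to a multiple of 16 bytes (my assumption, see above). */
    size_t bytes = (n_pixels * sizeof(uint32_t) + 15) & ~(size_t)15;

    if (bytes == 0)
        bytes = 16;
    if (posix_memalign(&p, 16, bytes) != 0)
        return NULL;
    return p;
}
```

With a buffer like this, neither the first nor the last 16-byte chunk
touched by the vector code overlaps memory owned by anything else.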
> >
> > > I am not sure if the vec_ld is implemented in the compiler or libc,
> >
> > The intrinsics are converted to assembly instructions by the compiler.
> >
> > > and I can't remember if I still used the same gcc version when testing
> > > with libc 2.17. I am using gcc 4.6 from Debian wheezy at the moment.
> > > I am pretty sure I tried with 4.7 as well with no change in behaviour.
> >
> > I suspect that the only relevant difference between glibc versions
> > affecting this bug could be the malloc implementation.
> >
> > Some other possible sources of problems are the OpenMP implementation
> > and TLS. But if everything works fine with VMX disabled, then they are
> > probably not at fault here.
>
> OK I found some time to play with it a bit.
>
> If I comment out:
> imp->combine_32[PIXMAN_OP_OVER] = vmx_combine_over_u;
>
> then affine-test passes.
>
> If I also comment out:
> imp->combine_32[PIXMAN_OP_ADD] = vmx_combine_add_u;
>
> then scaling-test also passes.
>
> I can leave the rest enabled, so either those are fine, or they just
> aren't in use in those tests.
>
> Now I did just try turning off openMP, and it appears to solve the problem.
> Interesting. So for some reason without vmx it is fine with openMP,
> but vmx + openMP = crash.
>
> And wow is it ever a lot slower without openMP for running the tests.
>
> I still can't make sense of why upgrading libc from 2.13 to 2.17 makes
> the crashes go away. I don't even have to recompile after changing the
> version of libc, it just stops crashing.
>
> So it seems openmp + vmx code + libc 2.13 = memory corruption and crash.
> Change any one of them, and it doesn't crash.
>
> Maybe this one will just have to remain a mystery.
And to make it even more annoying to track down:
It doesn't fail on a power6, only on a power7. power7 machines are
known to have exposed numerous powerpc memory barrier bugs in code
(including compiler and library code), where earlier generations let
you get away with things the architecture didn't actually allow
but which usually worked.
So it seems to be a bug triggered by: vmx code used with openmp with libc
2.13 on power7. Change any one of those 4 things, and the bug vanishes.
In fact, if I run the failing test under 'strace', the bug also
disappears, probably due to the change in timing. That makes it pretty
darn hard to find. As far as I recall, even running it under gdb made
the problem vanish.
--
Len Sorensen