[Mesa-dev] [Bug 94955] Uninitialized variables leads to random segfaults (valgrind log, apitrace attached)

Mon Apr 18 17:56:12 UTC 2016

https://bugs.freedesktop.org/show_bug.cgi?id=94955

--- Comment #7 from David Lonie <david.lonie at kitware.com> ---
> Comment # 1 on bug 94955 from Brian Paul
> (In reply to David Lonie from comment #0)
> > ==32054== Conditional jump or move depends on uninitialised value(s)
> > ==32054==    at 0x5367CF7: util_framebuffer_state_equal (u_framebuffer.c:58)
> > ==32054==    by 0x5444AFE: llvmpipe_set_framebuffer_state (lp_state_surface.c:54)
> > ==32054==    by 0x53561DA: util_blitter_blit_generic (u_blitter.c:1694)
> > ==32054==    by 0x5356819: util_blitter_blit (u_blitter.c:1813)
> > ==32054==    by 0x544602C: lp_blit (lp_surface.c:117)
> > ==32054==    by 0x51705F7: st_CopyTexSubImage (st_cb_texture.c:2672)
> > ==32054==    by 0x50B2B03: copytexsubimage_by_slice (teximage.c:3459)
> > ==32054==    by 0x50B330D: copyteximage (teximage.c:3644)
> > ==32054==    by 0x50B3476: _mesa_CopyTexImage2D (teximage.c:3680)
> > ==32054==    by 0x4D340E: ??? (in /usr/bin/glretrace)
> > ==32054==    by 0x40CCCC: ??? (in /usr/bin/glretrace)
> > ==32054==    by 0x40D2A7: ??? (in /usr/bin/glretrace)
>
> This one looks easy to fix.  Though, I wasn't able to reproduce the valgrind
> warning here with piglit's copytexsubimage test which definitely hits the same
> code path.

I've poked at it, and this patch seems to do the trick for me:

--- a/src/gallium/auxiliary/util/u_blitter.c
+++ b/src/gallium/auxiliary/util/u_blitter.c
@@ -1573,6 +1573,8 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
    fb_state.nr_cbufs = blit_depth || blit_stencil ? 0 : 1;
    fb_state.cbufs[0] = NULL;
    fb_state.zsbuf = NULL;
+   fb_state.samples = 0;
+   fb_state.layers = 0;

    if (blit_depth || blit_stencil) {
       pipe->bind_blend_state(pipe, ctx->blend[0][0]);
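
To illustrate why the two added assignments matter, here is a minimal,
self-contained sketch. It assumes the comparison helper checks every field of
the framebuffer state, including samples and layers. The struct and function
names below (sketch_fb_state, sketch_fb_state_equal, sketch_make_fb_state) are
hypothetical simplifications, not Mesa's actual definitions. A stack-allocated
state with some fields left unset makes the comparison read indeterminate
values, which is what valgrind flags; zeroing the whole struct first avoids
that regardless of which fields are added later.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_MAX_CBUFS 8

/* Hypothetical, simplified stand-in for a framebuffer state struct. */
struct sketch_fb_state {
   unsigned width, height;
   unsigned samples, layers;   /* the fields the patch initializes */
   unsigned nr_cbufs;
   const void *cbufs[SKETCH_MAX_CBUFS];
   const void *zsbuf;
};

/* Field-by-field comparison: every field read here must be initialized. */
static bool
sketch_fb_state_equal(const struct sketch_fb_state *a,
                      const struct sketch_fb_state *b)
{
   if (a->width != b->width || a->height != b->height)
      return false;
   if (a->samples != b->samples || a->layers != b->layers)
      return false;
   if (a->nr_cbufs != b->nr_cbufs)
      return false;
   for (unsigned i = 0; i < a->nr_cbufs; i++) {
      if (a->cbufs[i] != b->cbufs[i])
         return false;
   }
   return a->zsbuf == b->zsbuf;
}

/* Zero the whole struct before filling in the fields that are known, so
 * no member is ever left indeterminate. */
static struct sketch_fb_state
sketch_make_fb_state(unsigned width, unsigned height)
{
   struct sketch_fb_state fb;

   memset(&fb, 0, sizeof(fb));
   fb.width = width;
   fb.height = height;
   fb.nr_cbufs = 1;
   return fb;
}
```

Two states built with sketch_make_fb_state() and the same dimensions compare
equal deterministically. With an unzeroed struct, the result would depend on
whatever happened to be on the stack, which could at least cause spurious
state updates.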

> Comment # 2 on bug 94955 from Roland Scheidegger
> (In reply to Brian Paul from comment #1)
> > (In reply to David Lonie from comment #0)
> But in any case, I HIGHLY doubt these two are the reason for any random
> segfaults, I certainly don't see any evidence here. So, a backtrace of the
> actual crash would probably be more useful.

Glad the other two seem harmless. If these aren't likely to cause a segfault
I'll keep poking around. The backtrace is difficult to obtain because by the
time it crashes the stack is corrupt and the resulting backtrace is
meaningless.

Because the issue is stack corruption, it is difficult to step through: the
crashes happen somewhat randomly, and when one does occur all debugger state
is lost because the stack is nonsense. That is why I thought the memory errors
would be at fault, but it looks like this is going to be trickier and more
subtle than it seemed.

I'll keep looking and update when I find something else.

> Comment # 3 on bug 94955 from Emil Velikov
> (In reply to David Lonie from comment #0)
> > which confuses my linker/loader ;) Another bug?)
> > 
> Indeed it is - your colleague (?) Chuck Atkins is working on that one. See
> bug#94086.

So he is! Glad it's a known issue, that one took some work to track down ;)

> Comment # 4 on bug 94955 from Roland Scheidegger
> FWIW the trace has some issues, namely it requests a 4.5 context thus needs
> overrides to run. Not sure if this actually causes problems.

The 4.5 override was needed for my apitrace replay context. The actual
segfaults are happening on a machine using a different (I believe 3.2?)
context.

Is there information around that details how to get a better apitrace for you
folks? I have another segfaulting test I could capture for you.

> Comment # 6 on bug 94955 from Roland Scheidegger
> So, I'm going to mark this bug as fixed. Two minor issues have been addressed
> (well actually the fb one could have real consequences resulting in unnecessary
> state updates), and I'm not seeing any random segfaults in any case. Even if
> valgrind is right about the uninitialized values in the jit code that won't
> cause crashes (valgrind would say invalid read/write for anything which could
> crash). Feel free to open a new bug if you see crashes (preferably with a
> backtrace) or misrenderings.

Sounds good to me. Thanks so much for the fast replies and thorough checking of
the valgrind reports, even if they turned out to be bogus!
