[Piglit] Could use your help with a bug involving piglit, waffle, and egl.

Chad Versace chad.versace at linux.intel.com
Mon Sep 16 11:00:24 PDT 2013


CC'ing Piglit.

On 09/14/2013 08:54 AM, Paul Berry wrote:
> I'm investigating a failure in spec/OES_fixed_point/attribute-arrays,
> specifically the command line
> "/home/pberry/.platform/piglit-mesa/piglit/build/bin/oes_fixed_point-attribute-arrays
> -auto -fbo".  It's segfaulting during piglit/waffle initialization due to
> Mesa accessing freed memory.  This only started happening for me recently,
> however I suspect that's because the access to freed memory makes it a
> heisenbug, and the root cause has probably been around for a long time.
>
> What's interesting about this test is that it's a GLES1 test being run in
> -fbo mode, which means that piglit first starts initializing things
> assuming it's going to run the test with FBOs, then at some point it
> figures out that it can't (because FBOs are unsupported), so it tears down
> its configuration and starts a new configuration to test using a window.
>
> While establishing the new configuration, waffle calls eglMakeCurrent().
> Deep in the bowels of Mesa's implementation of this function, it decides
> that it needs to flush the context that was previously current.  But that
> context refers to data structures that were freed when piglit tore down its
> old configuration (specifically, it refers to brw->bufmgr, which was freed
> in response to a call to eglTerminate()).
>
> I've been studying the egl calls made by piglit/waffle during this test and
> I believe they look like this (I may be missing a few but I think I found
> most of them):
>
> Initial setup:
> - eglGetDisplay()
> - eglInitialize() (causes intel_init_bufmgr() to be called, which creates
> bufmgr 1)
> - eglQueryString()
> - eglChooseConfig()
> - eglBindAPI()
> - eglCreateContext() (causes brwCreateContext() to be called, which creates
> context 1)
> - eglGetConfigAttrib()
> - eglCreateWindowSurface()
> - eglMakeCurrent()
> Initial teardown:
> - eglDestroySurface()
> - eglDestroyContext() (interestingly, does not cause intelDestroyContext to
> be called, perhaps because the context is still current?)
> - eglTerminate() (causes intelDestroyScreen() to be called, which frees
> bufmgr 1)
> Second setup:
> - eglGetDisplay()
> - eglInitialize() (causes intel_init_bufmgr() to be called, which creates
> bufmgr 2)
> - eglQueryString()
> - eglChooseConfig()
> - eglBindAPI()
> - eglCreateContext() (causes brwCreateContext() to be called, which creates
> context 2)
> - eglGetConfigAttrib()
> - eglCreateWindowSurface()
> - eglMakeCurrent() (at this point Mesa tries to flush context 1, which
> causes a segfault because this causes it to try to access bufmgr 1, which
> has already been freed)
>
> So, my questions are:
> - Does it look like piglit/waffle is making an allowed sequence of EGL
> calls?  (In other words, is the bug in Mesa or piglit/waffle?)
> - If the bug is in Mesa, what should be happening instead?  I assume that
> at some point Mesa should have made the current context non-current (and
> destroyed it, perhaps), but I'm not sure when that should have happened,
> nor what code should have been responsible for doing it.
>
> Thanks in advance, Chad.  I hope you're enjoying your business travel!

The sequence of EGL calls is legal. The bug is in Mesa. After discovering
the bug many months ago, I posted a test to the Piglit list, but it was
ignored and then forgotten. (Gerrit please!) I'll repost the test in the
next day or so after rebasing it.

See the comments [1] in my test to see why the sequence of EGL calls is legal.

[1] http://cgit.freedesktop.org/~chadversary/piglit/tree/tests/egl/spec/egl-1.4/egl-terminate-then-unbind-context.c?h=egl-terminate-then-unbind#n26

If I correctly understand the EGL spec text quoted in those comments, the
call to eglMakeCurrent that currently segfaults should instead flush the
queued-to-be-destroyed context and then promptly destroy it.
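
To make the expected lifetimes concrete, here is a small standalone C model
of that behavior. It is only a sketch: it does not call EGL or Mesa, every
name in it is hypothetical, and the reference counting is just one assumed
way of keeping the buffer manager alive until the last context is gone, not
a description of how Mesa does or should implement it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; this models object lifetimes only. */
struct bufmgr {
    int refs;       /* the display and every live context hold one */
    bool freed;
};

struct context {
    struct bufmgr *bufmgr;
    bool marked;    /* destruction was requested while still current */
    bool destroyed;
    int flushes;
};

static struct context *current;  /* the thread's current context */

static void bufmgr_unref(struct bufmgr *b)
{
    assert(!b->freed);
    if (--b->refs == 0)
        b->freed = true;
}

static void context_init(struct context *c, struct bufmgr *b)
{
    c->bufmgr = b;
    b->refs++;
    c->marked = false;
    c->destroyed = false;
    c->flushes = 0;
}

static void context_really_destroy(struct context *c)
{
    c->destroyed = true;
    bufmgr_unref(c->bufmgr);
}

/* Models eglDestroyContext: deferred while the context is current. */
static void context_destroy(struct context *c)
{
    if (c->destroyed)
        return;
    c->marked = true;
    if (c != current)
        context_really_destroy(c);
}

/* Models eglTerminate: mark the context, drop the display's reference. */
static void display_terminate(struct bufmgr *b, struct context *c)
{
    context_destroy(c);
    bufmgr_unref(b);
}

/* Models eglMakeCurrent: flush the old context, then destroy it if marked. */
static void make_current(struct context *next)
{
    struct context *prev = current;
    bool switching = (prev != NULL && prev != next);

    if (switching) {
        assert(!prev->bufmgr->freed);  /* the flush touches the bufmgr */
        prev->flushes++;
    }
    current = next;
    if (switching && prev->marked)
        context_really_destroy(prev);
}

/* Replays the sequence from Paul's report against the model. */
static bool replay_sequence(void)
{
    struct bufmgr b1 = { .refs = 1, .freed = false };
    struct bufmgr b2 = { .refs = 1, .freed = false };
    struct context c1, c2;

    current = NULL;
    context_init(&c1, &b1);
    make_current(&c1);             /* initial setup */

    context_destroy(&c1);          /* deferred: c1 is still current */
    display_terminate(&b1, &c1);   /* b1 survives: c1 holds a reference */
    bool alive_after_terminate = !b1.freed && !c1.destroyed;

    context_init(&c2, &b2);
    make_current(&c2);             /* flushes c1, then destroys it */

    return alive_after_terminate && c1.flushes == 1 && c1.destroyed
        && b1.freed && !b2.freed && current == &c2;
}
```

In this model, replay_sequence() returns true only if bufmgr 1 outlives the
teardown, context 1 is flushed exactly once during the second
make_current(), and both are released immediately afterwards.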

