[Mesa-dev] [PATCH] egl/dri2: dri2_make_current: Make sure display is initialized before using it
Nicolas Boichat
drinkcat at chromium.org
Wed Jul 20 06:51:08 UTC 2016
On Mon, Jul 18, 2016 at 4:37 PM, Emil Velikov <emil.l.velikov at gmail.com> wrote:
> On 18 July 2016 at 04:19, Nicolas Boichat <drinkcat at chromium.org> wrote:
>> On Fri, Jul 15, 2016 at 9:03 PM, Emil Velikov <emil.l.velikov at gmail.com> wrote:
>>> On 15 July 2016 at 09:28, Nicolas Boichat <drinkcat at chromium.org> wrote:
>>>> android.opengl.cts.WrapperTest#testGetIntegerv1 CTS test calls
>>>> eglTerminate, followed by eglReleaseThread. In that case, the
>>>> display will not be initialized when eglReleaseThread calls
>>>> MakeCurrent with NULL parameters, to unbind the context, which
>>>> causes a a segfault in drv->API.MakeCurrent (dri2_make_current),
>>>> either in glFlush or in a latter call.
>>>>
>>>> A similar case is observed in this bug:
>>>> https://bugs.freedesktop.org/show_bug.cgi?id=69622, where the test
>>>> call eglTerminate, then eglMakeCurrent(dpy, NULL, NULL, NULL).
>>>>
>>>> This patch:
>>>> 1. Validates that the display is initialized, if ctx/dsurf/rsurf
>>>> are not all NULL.
>>>> 2. Does not call glFlush/unBindContext is there is no display.
>>>> 3. However, it still goes through the normal path as
>>>> drv->API.DestroyContext decrements the reference count on the
>>>> context, and frees the structure.
>>>>
>>>> Cc: "11.2 12.0" <mesa-stable at lists.freedesktop.org>
>>>> Signed-off-by: Nicolas Boichat <drinkcat at chromium.org>
>>>> ---
>>>>
>>>> I did not actually test the case reported on the bug, only the CTS
>>>> test, would be nice if someone can try it out (Rhys? Chad?).
>>>>
>>>> Applies after "egl/dri2: dri2_make_current: Set EGL error if bindContext
>>>> fails", but wanted to keep the 2 patches separate as they are different
>>>> problems and discussions.
>>>>
>>>> src/egl/drivers/dri2/egl_dri2.c | 13 ++++++++++---
>>>> 1 file changed, 10 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
>>>> index 2e97d85..3269e52 100644
>>>> --- a/src/egl/drivers/dri2/egl_dri2.c
>>>> +++ b/src/egl/drivers/dri2/egl_dri2.c
>>>> @@ -1166,7 +1166,8 @@ dri2_destroy_context(_EGLDriver *drv, _EGLDisplay *disp, _EGLContext *ctx)
>>>> struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
>>>>
>>>> if (_eglPutContext(ctx)) {
>>>> - dri2_dpy->core->destroyContext(dri2_ctx->dri_context);
>>>> + if (dri2_dpy)
>>> Not sure if this is the right place/we need this.
>>>
>>> When calling eglDestroyContext (after eglTerminate or not) this cannot
>>> happen since the _EGL_CHECK_CONTEXT macro will dive into the _eglCheck
>>> functions and will bail out since the since the dpy is no initialized.
>>>
>>> The only case we can reach this is from dri2_make_current. See below for more.
>>
>> Yes, that's correct.
>>
>>>> @@ -1189,6 +1190,10 @@ dri2_make_current(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *dsurf,
>>>> __DRIdrawable *ddraw, *rdraw;
>>>> __DRIcontext *cctx;
>>>>
>>>> + /* Check that the display is initialized */
>>>> + if ((ctx != NULL || dsurf != NULL || rsurf != NULL) && !dri2_dpy)
>>>> + return _eglError(EGL_NOT_INITIALIZED, "eglMakeCurrent error");
>>>> +
>>> Was doing to say "the first few lines in eglMakeCurrent) should
>>> already handle these" only to realise that dri2_dpy doesn't wrap
>>> around/subclass the respective _EGLfoo type (_EGLDisplay), like the
>>> dri2_egl_{surface,context...} :-\
>>>
>>> At the same time, dri2_make_current cannot (should not really) do
>>> anything in the !dri2_dpy case, since we cannot:
>>> - call unbind the old ctx bind the new one, or even restore/rebind
>>> the original ctx
>>> - free any of the ctx/surf resources. The latter should already be
>>> gone, as upon eglTerminate walk over the ctx/surf lists and tear them
>>> all down (see _eglReleaseDisplayResources).
>>>
>>> Thus something like the following should be fine imho.
>>>
>>> if (!dri2_dpy)
>>> /* copy/reword some my earlier comment in here */
>>> return 0;
>>
>> Yes, that was my first approach, but we'd still leak the current dri2_ctx.
>>
>> If I trace the reference counting right:
>> 1. eglCreateContext => API.CreateContext =>
>> - _eglInitResource => RefCount = 1
>> - _eglLinkContext => RefCount = 2
>> 2. eglMakeCurrent => API.MakeCurrent => _eglBindContext =>
>> eglGetContext => RefCount = 3
>> 3. eglTerminate => API.Terminate => _eglReleaseDisplayResources =>
>> - _eglUnlinkContext => _eglUnlinkResource => Refcount = 2
>> - API.DestroyContext => RefCount = 1 (so context structure is not freed)
>> 4. eglReleaseThread (or eglMakeCurrent(dpy, NULL, NULL, NULL)) =>
>> API.MakeCurrent(drv, disp, NULL, NULL, NULL) => API.DestroyContext =>
>> Refcount = 0 and structure is freed (with my current patch).
>>
>> Well, the structure is freed, but driDestroyContext is never called
>> (so pcp->driScreenPriv->driver->DestroyContext is not called), which,
>> looks a bit concerning...
>>
> This is exactly what made me question the proposed approach.
>
>> So one way around this is to get eglTerminate to destroy the current> context(s). But then the spec says
>> (https://www.khronos.org/registry/egl/sdk/docs/man/html/eglTerminate.xhtml):
>> If contexts or surfaces associated with display is current to any
>> thread, they are not released until they are no longer current as a
>> result of eglMakeCurrent.
>>
> The way I see/read it:
> - 'leaking' ctx/surf resources if the user has not called
> [eglMakeCurrent(... !NULL, !NULL) + eglTerminate +] eglReleaseThread
> or eglMakeCurrent(...NULL,NULL) [+ eglTerminate] is expected.
> - we need to defer dpy vtbls (and active dri dpy connection?) in
> eglReleaseThread or ctx/surf will leak
>
> IMHO the better approach is to defer the dpy teardown (in the case of
> missing eglMakeCurrent(...NULL,NULL)) to eglReleaseThread. Since that
> one is more evasive (but shouldn't be too crazy) we could get this
> patch, and revert to 'if (!dri2_dpy) return;' once everything else is
> handled.
>
> Not sure when we'll get time for that one, so we want an inline
> XXX/TODO or two that roughly explains the idea.
Turns out this patch is quite leaky in this CTS test:
android.opengl.cts.WrapperTest#testThreadCleanup, which basically runs
eglInitialize, eglMakeCurrent, eglTerminate, eglReleaseThread in
separate thread, 1000 times in a row. (before, it would crash, with
this patch, it leaks a lot of memory)
I managed to add some reference counting to dri2_dpy instead, I'll
upload a patch soon.
> Related: from a very brief look we seems to be leaking the
> _EGLsync/_EGLimage resources.
>
> Regards,
> Emil
More information about the mesa-dev
mailing list