[Intel-gfx] [PATCH] drm/i915: Release ctx->syncobj on final put, not on ctx close
Daniel Vetter
daniel at ffwll.ch
Tue Aug 10 10:13:16 UTC 2021
On Mon, Aug 09, 2021 at 08:47:14PM +0200, Daniel Vetter wrote:
> On Sun, Aug 8, 2021 at 2:56 AM Jason Ekstrand <jason at jlekstrand.net> wrote:
> >
> > On August 6, 2021 15:18:59 Daniel Vetter <daniel.vetter at ffwll.ch> wrote:
> >
> >> gem context refcounting is another exercise in least locking design it
> >> seems, where most things get destroyed upon context closure (which can
> >> race with anything really). Only the actual memory allocation and the
> >> locks survive while holding a reference.
> >>
> >> This tripped up Jason when reimplementing the single timeline feature
> >> in
> >>
> >> commit 00dae4d3d35d4f526929633b76e00b0ab4d3970d
> >> Author: Jason Ekstrand <jason at jlekstrand.net>
> >> Date: Thu Jul 8 10:48:12 2021 -0500
> >>
> >> drm/i915: Implement SINGLE_TIMELINE with a syncobj (v4)
> >>
> >> We could fix the bug by holding ctx->mutex, but it's cleaner to just
> >
> >
> > What bug is this fixing, exactly?
>
> Oh lol I was all busy ranting and not explaining :-) I found it while
> auditing other context stuff, so that other patch has the longer
> commit message with more history, but that patch is also now tied into
> the vm-dercuify, so short summary: You put the syncobj in context
> close (i.e. CTX_DESTRY ioctl or close(drmfd)), not in the final
> kref_put. Which means you're open to a use-after-free if you race
> against an execbuf. ctx->vm is equally broken (but for other ioctl),
> once this fix here is merged I send out the ctx->vm fix because that's
> tied into the vm-dercuify now due to conflicts.
CI caught more, so just explaining what I'm fixing here isn't going to be
enough.
The troubel is that drm_syncobj_put is now called from very awkward
places, and I need to see whether we can fix that. Or whether we need more
work_struct (like we have already for i915_address_space) for the final
release.
-Daniel
> -Daniel
>
> >
> > --Jason
> >
> >>
> >> make the context object actually invariant over its _entire_ lifetime.
> >>
> >> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> >> Fixes: 00dae4d3d35d ("drm/i915: Implement SINGLE_TIMELINE with a syncobj (v4)")
> >> Cc: Jason Ekstrand <jason at jlekstrand.net>
> >> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> >> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> >> Cc: Matthew Brost <matthew.brost at intel.com>
> >> Cc: Matthew Auld <matthew.auld at intel.com>
> >> Cc: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
> >> Cc: "Thomas Hellström" <thomas.hellstrom at intel.com>
> >> Cc: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
> >> Cc: Dave Airlie <airlied at redhat.com>
> >> ---
> >> drivers/gpu/drm/i915/gem/i915_gem_context.c | 6 +++---
> >> 1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >> index 754b9b8d4981..93ba0197d70a 100644
> >> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >> @@ -940,6 +940,9 @@ void i915_gem_context_release(struct kref *ref)
> >> trace_i915_context_free(ctx);
> >> GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
> >>
> >> + if (ctx->syncobj)
> >> + drm_syncobj_put(ctx->syncobj);
> >> +
> >> mutex_destroy(&ctx->engines_mutex);
> >> mutex_destroy(&ctx->lut_mutex);
> >>
> >> @@ -1159,9 +1162,6 @@ static void context_close(struct i915_gem_context *ctx)
> >> if (vm)
> >> i915_vm_close(vm);
> >>
> >> - if (ctx->syncobj)
> >> - drm_syncobj_put(ctx->syncobj);
> >> -
> >> ctx->file_priv = ERR_PTR(-EBADF);
> >>
> >> /*
> >> --
> >> 2.32.0
> >
> >
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the Intel-gfx
mailing list