[Intel-gfx] [PATCH] drm/i915: Fix a VMA UAF for multi-gt platform
Andi Shyti
andi.shyti at linux.intel.com
Mon Jun 5 20:27:16 UTC 2023
Hi Nirmoy,
On Mon, Jun 05, 2023 at 10:10:21PM +0200, Nirmoy Das wrote:
> Ensure correct handling of closed VMAs on multi-gt platforms to prevent
> Use-After-Free. Currently, when GT0 goes idle, closed VMAs are
> exclusively added to GT0's closed_vma list (gt->closed_vma) and
> subsequently freed by i915_vma_parked(), which assumes the entire GPU is
> idle. However, on platforms with multiple GTs, such as MTL, GT1 may
> remain active while GT0 is idle. This causes GT0 to mistakenly consider
> the closed VMAs in its closed_vma list as unnecessary, potentially
> leading to Use-After-Free issues if a job for GT1 attempts to access a
> freed VMA.
>
> Although we do take a wakeref for GT0, it happens later, after
> evaluating VMAs. To mitigate this, it is necessary to hold a GT0 wakeref
> early.
Hooray! This is great, Nirmoy! I will give it a shot.
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> Cc: Chris Wilson <chris.p.wilson at intel.com>
> Cc: Andi Shyti <andi.shyti at linux.intel.com>
> Cc: Andrzej Hajda <andrzej.hajda at intel.com>
> Signed-off-by: Nirmoy Das <nirmoy.das at intel.com>
> ---
> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 5fb459ea4294..adcf8837dfe6 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -2692,6 +2692,7 @@ static int
> eb_select_engine(struct i915_execbuffer *eb)
> {
> struct intel_context *ce, *child;
> + struct intel_gt *gt;
> unsigned int idx;
> int err;
>
> @@ -2715,10 +2716,16 @@ eb_select_engine(struct i915_execbuffer *eb)
> }
> }
> eb->num_batches = ce->parallel.number_children + 1;
> + gt = ce->engine->gt;
>
> for_each_child(ce, child)
> intel_context_get(child);
> intel_gt_pm_get(ce->engine->gt);
> + /* Keep GT0 active on MTL so that i915_vma_parked() doesn't
> + * free VMAs while execbuf ioctl is validating VMAs.
> + */
> + if (gt != to_gt(gt->i915))
You can use gt->info.id here: it is 0 for GT0 and non-zero for any
other GT.
> + intel_gt_pm_get(to_gt(ce->engine->gt->i915));
>
> if (!test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
> err = intel_context_alloc_state(ce);
> @@ -2757,6 +2764,9 @@ eb_select_engine(struct i915_execbuffer *eb)
> return err;
>
> err:
> + if (ce->engine->gt != to_gt(ce->engine->gt->i915))
	if (gt->info.id)

gt is already ce->engine->gt, so you can use it directly here.
> + intel_gt_pm_get(to_gt(ce->engine->gt->i915));
> +
> intel_gt_pm_put(ce->engine->gt);
> for_each_child(ce, child)
> intel_context_put(child);
> @@ -2770,6 +2780,8 @@ eb_put_engine(struct i915_execbuffer *eb)
> struct intel_context *child;
>
> i915_vm_put(eb->context->vm);
> + if (eb->gt != to_gt(eb->gt->i915))
> + intel_gt_pm_put(to_gt(eb->gt->i915));
This wakeref going up and down is a bit ugly... Perhaps we can
add a flag describing the GT type to the info structure. MTL is a
weird multi-gt platform and, indeed, you can't shut down GT0
without affecting GT1.

For now it's OK, though, for testing purposes.
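
To make the gt->info.id suggestion concrete, here is a rough userspace
sketch of the get/put pairing. Everything in it is a stand-in made up for
illustration: struct mock_gt, mock_gt_pm_get()/mock_gt_pm_put() and the
mock_eb_*() helpers are not i915 API, "id" plays the role of gt->info.id
and "primary" plays the role of to_gt(gt->i915). It also shows the unwind
dropping GT0's wakeref with a put; the err: hunk above calls
intel_gt_pm_get() there, which looks like it would leak the wakeref.

```c
#include <assert.h>

/* Userspace stand-in for struct intel_gt; NOT the real i915 type. */
struct mock_gt {
	unsigned int id;		/* stand-in for gt->info.id, 0 == GT0 */
	int wakerefs;			/* stand-in for the GT PM wakeref count */
	struct mock_gt *primary;	/* stand-in for to_gt(gt->i915) */
};

/* Hypothetical helpers modelling intel_gt_pm_get()/intel_gt_pm_put(). */
static void mock_gt_pm_get(struct mock_gt *gt)
{
	gt->wakerefs++;
}

static void mock_gt_pm_put(struct mock_gt *gt)
{
	assert(gt->wakerefs > 0);
	gt->wakerefs--;
}

/*
 * Acquire side: take the engine's GT wakeref and, when that GT is not
 * GT0, also keep GT0 awake so its closed-VMA list is not reaped.
 */
static void mock_eb_get(struct mock_gt *gt)
{
	mock_gt_pm_get(gt);
	if (gt->id)
		mock_gt_pm_get(gt->primary);
}

/*
 * Release side, shared by the error unwind and the normal put path:
 * every get above is matched by exactly one put here.
 */
static void mock_eb_put(struct mock_gt *gt)
{
	if (gt->id)
		mock_gt_pm_put(gt->primary);
	mock_gt_pm_put(gt);
}
```

With a single local gt and the id check, the acquire and release sides
read symmetrically, which makes an unbalanced get/put easier to spot.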
Andi
> intel_gt_pm_put(eb->gt);
> for_each_child(eb->context, child)
> intel_context_put(child);
> --
> 2.39.0