[Intel-gfx] [PATCH v2] drm/i915: Fix a VMA UAF for multi-gt platform
Nirmoy Das
nirmoy.das at intel.com
Wed Jun 7 07:44:40 UTC 2023
On 6/6/2023 10:56 PM, Andi Shyti wrote:
> Hi Nirmoy,
>
> On Tue, Jun 06, 2023 at 10:27:55PM +0200, Nirmoy Das wrote:
>> Ensure correct handling of closed VMAs on multi-gt platforms to prevent
>> Use-After-Free. Currently, closed VMAs are added exclusively to GT0's
>> closed_vma link (gt->closed_vma) and, when GT0 goes idle, are
>> freed by i915_vma_parked(), which assumes the entire GPU is
>> idle. However, on platforms with multiple GTs, such as MTL, GT1 may
>> remain active while GT0 is idle. This causes GT0 to mistakenly consider
>> the closed VMAs in its closed_vma list as unnecessary, potentially
>> leading to Use-After-Free issues if a job for GT1 attempts to access a
>> freed VMA.
>>
>> Although we do take a wakeref for GT0, it happens later, after
>> evaluating VMAs. To mitigate this, it is necessary to hold a GT0 wakeref
>> early.
>>
>> v2: Use gt id to detect multi-tile (Andi)
>> Fix the incorrect error path.
>>
>> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
>> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
>> Cc: Chris Wilson <chris.p.wilson at intel.com>
>> Cc: Andi Shyti <andi.shyti at linux.intel.com>
>> Cc: Andrzej Hajda <andrzej.hajda at intel.com>
>> Cc: Sushma Venkatesh Reddy <sushma.venkatesh.reddy at intel.com>
>> Tested-by: Andi Shyti <andi.shyti at linux.intel.com>
>> Signed-off-by: Nirmoy Das <nirmoy.das at intel.com>
> I wonder if we need any Fixes tag here, maybe this:
>
> Fixes: d93939730347 ("drm/i915: Remove the vma refcount")
No, the vma refcount is not enough, unfortunately. The issue is that
i915_vma_parked() expects only one GT.
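
To make the failure mode concrete, here is a rough userspace sketch,
assuming a toy model in which dropping the last wakeref on a GT parks it
and reaps its closed VMAs. All toy_* names are hypothetical and are not
i915 code; the real logic lives in eb_select_engine() and
eb_put_engine().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these types and helpers are hypothetical, not i915 code. */
struct toy_gt {
	unsigned int id;        /* 0 for the primary GT */
	int wakeref;            /* number of outstanding PM references */
	bool closed_vma_freed;  /* set once parking has reaped closed VMAs */
};

struct toy_device {
	struct toy_gt gt[2];
};

static void toy_device_init(struct toy_device *dev)
{
	for (unsigned int i = 0; i < 2; i++) {
		dev->gt[i].id = i;
		dev->gt[i].wakeref = 0;
		dev->gt[i].closed_vma_freed = false;
	}
}

static void toy_gt_pm_get(struct toy_gt *gt)
{
	gt->wakeref++;
}

/* Dropping the last reference parks the GT, which reaps its closed VMAs. */
static void toy_gt_pm_put(struct toy_gt *gt)
{
	assert(gt->wakeref > 0);
	if (--gt->wakeref == 0)
		gt->closed_vma_freed = true;
}

/* Mirrors the idea in eb_select_engine(): also pin GT0 when targeting GT1+. */
static void toy_select_engine(struct toy_device *dev, struct toy_gt *gt)
{
	toy_gt_pm_get(gt);
	if (gt->id)
		toy_gt_pm_get(&dev->gt[0]);
}

/* Pairs with toy_select_engine(): release in reverse order. */
static void toy_put_engine(struct toy_device *dev, struct toy_gt *gt)
{
	if (gt->id)
		toy_gt_pm_put(&dev->gt[0]);
	toy_gt_pm_put(gt);
}
```

In this model, a submission to GT1 keeps GT0 awake even if every other
GT0 user drops its reference in the meantime, so GT0 cannot park and
reap closed VMAs until toy_put_engine() runs.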
>
>> ---
>> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 12 ++++++++++++
>> 1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> index 3aeede6aee4d..c2a67435acfa 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> @@ -2683,6 +2683,7 @@ static int
>>  eb_select_engine(struct i915_execbuffer *eb)
>>  {
>>  	struct intel_context *ce, *child;
>> +	struct intel_gt *gt;
>>  	unsigned int idx;
>>  	int err;
>>
>> @@ -2706,10 +2707,16 @@ eb_select_engine(struct i915_execbuffer *eb)
>>  		}
>>  	}
>>  	eb->num_batches = ce->parallel.number_children + 1;
>> +	gt = ce->engine->gt;
>>
>>  	for_each_child(ce, child)
>>  		intel_context_get(child);
>>  	intel_gt_pm_get(ce->engine->gt);
>> +	/* Keep GT0 active on MTL so that i915_vma_parked() doesn't
>> +	 * free VMAs while execbuf ioctl is validating VMAs.
>> +	 */
>> +	if (gt->info.id)
>> +		intel_gt_pm_get(to_gt(gt->i915));
>>
>>  	if (!test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
>>  		err = intel_context_alloc_state(ce);
>> @@ -2748,6 +2755,9 @@ eb_select_engine(struct i915_execbuffer *eb)
>>  	return err;
>>
>>  err:
>> +	if (gt->info.id)
>> +		intel_gt_pm_put(to_gt(gt->i915));
>> +
>>  	intel_gt_pm_put(ce->engine->gt);
>>  	for_each_child(ce, child)
>>  		intel_context_put(child);
>> @@ -2761,6 +2771,8 @@ eb_put_engine(struct i915_execbuffer *eb)
>>  	struct intel_context *child;
>>
>>  	i915_vm_put(eb->context->vm);
>> +	if (eb->gt->info.id)
>> +		intel_gt_pm_put(to_gt(eb->gt->i915));
>>  	intel_gt_pm_put(eb->gt);
> I would add a comment up here, just not to scare people when they
> see this.
I can add a pairing comment that mentions eb_select_engine().
> Reviewed-by: Andi Shyti <andi.shyti at linux.intel.com>
Thanks,
Nirmoy
>
> Andi
>
>>  	for_each_child(eb->context, child)
>>  		intel_context_put(child);
>> --
>> 2.39.0