[Intel-gfx] [BrownBag] drm/i915/gtt: Neuter the deferred unbind callback from gen6_ppgtt_cleanup

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Fri May 24 09:08:06 UTC 2019


On 24/05/2019 10:01, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-24 09:57:42)
>>
>> On 24/05/2019 09:51, Tvrtko Ursulin wrote:
>>>
>>> On 24/05/2019 09:36, Chris Wilson wrote:
>>>> Quoting Tvrtko Ursulin (2019-05-24 09:31:45)
>>>>>
>>>>> On 24/05/2019 09:29, Chris Wilson wrote:
>>>>>> Quoting Tvrtko Ursulin (2019-05-24 09:23:40)
>>>>>>>
>>>>>>> On 24/05/2019 09:17, Chris Wilson wrote:
>>>>>>>> Quoting Tvrtko Ursulin (2019-05-24 09:13:14)
>>>>>>>>>
>>>>>>>>> On 24/05/2019 07:45, Chris Wilson wrote:
>>>>>>>>>> Having deferred the vma destruction to a worker where we can
>>>>>>>>>> acquire the
>>>>>>>>>> struct_mutex, we have to avoid chasing back into the now destroyed
>>>>>>>>>> ppgtt. The pd_vma is special in having a custom unbind function
>>>>>>>>>> to scan
>>>>>>>>>> for unused pages despite the VMA itself being notionally part of
>>>>>>>>>> the
>>>>>>>>>> GGTT. As such, we need to disable that callback to avoid a
>>>>>>>>>> use-after-free.
>>>>>>>>>>
>>>>>>>>>> This unfortunately blew up so early during boot that CI declared
>>>>>>>>>> the
>>>>>>>>>> machine unreachable as opposed to being the major failure it
>>>>>>>>>> was. Oops.
>>>>>>>>>>
>>>>>>>>>> Fixes: d3622099c76f ("drm/i915/gtt: Always acquire struct_mutex
>>>>>>>>>> for gen6_ppgtt_cleanup")
>>>>>>>>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>>>>>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>>>>>>>> Cc: Tomi Sarvela <tomi.p.sarvela at intel.com>
>>>>>>>>>> ---
>>>>>>>>>>       drivers/gpu/drm/i915/i915_gem_gtt.c | 28
>>>>>>>>>> ++++++++++++++++++++++++++++
>>>>>>>>>>       1 file changed, 28 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>>>>>>>> b/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>>>>>>>> index 8d8a4b0ad4d9..266baa11df64 100644
>>>>>>>>>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>>>>>>>> @@ -1847,6 +1847,33 @@ static void
>>>>>>>>>> gen6_ppgtt_cleanup_work(struct work_struct *wrk)
>>>>>>>>>>           kfree(work);
>>>>>>>>>>       }
>>>>>>>>>> +static int nop_set_pages(struct i915_vma *vma)
>>>>>>>>>> +{
>>>>>>>>>> +     return -ENODEV;
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +static void nop_clear_pages(struct i915_vma *vma)
>>>>>>>>>> +{
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +static int nop_bind(struct i915_vma *vma,
>>>>>>>>>> +                 enum i915_cache_level cache_level,
>>>>>>>>>> +                 u32 unused)
>>>>>>>>>> +{
>>>>>>>>>> +     return -ENODEV;
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +static void nop_unbind(struct i915_vma *vma)
>>>>>>>>>> +{
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +static const struct i915_vma_ops nop_vma_ops = {
>>>>>>>>>> +     .set_pages = nop_set_pages,
>>>>>>>>>> +     .clear_pages = nop_clear_pages,
>>>>>>>>>> +     .bind_vma = nop_bind,
>>>>>>>>>> +     .unbind_vma = nop_unbind,
>>>>>>>>>> +};
>>>>>>>>>> +
>>>>>>>>>>       static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
>>>>>>>>>>       {
>>>>>>>>>>           struct gen6_hw_ppgtt *ppgtt =
>>>>>>>>>> to_gen6_ppgtt(i915_vm_to_ppgtt(vm));
>>>>>>>>>> @@ -1855,6 +1882,7 @@ static void gen6_ppgtt_cleanup(struct
>>>>>>>>>> i915_address_space *vm)
>>>>>>>>>>           /* FIXME remove the struct_mutex to bring the locking
>>>>>>>>>> under control */
>>>>>>>>>>           INIT_WORK(&work->base, gen6_ppgtt_cleanup_work);
>>>>>>>>>>           work->vma = ppgtt->vma;
>>>>>>>>>> +     work->vma->ops = &nop_vma_ops;
>>>>>>>>>
>>>>>>>>> Could we use some asserts before overriding the vma ops? Like
>>>>>>>>> GEM_BUG_ON(vma->pages)? And something for still bound?
>>>>>>>>
>>>>>>>> It technically still is bound as it is in the GGTT but currently
>>>>>>>> unpinned -- that will be checked on destroy, it's just we also get an
>>>>>>>> unbind callback. vma->pages doesn't exist for this (set to ERR_PTR).
>>>>>>>
>>>>>>> If we are getting the unbind callback and we nop-ed it, who will
>>>>>>> actually do it's job?
>>>>>>
>>>>>> The callback is just a hook for us to prune within the ppgtt.
>>>>>> It still is removed from GGTT by i915_vma_unbind().
>>>>>
>>>>> So it needs GEM_BUG_ON(ppgtt->scan_for_unused_pt) before overriding the
>>>>> unbind?
>>>>
>>>> No. They get freed by the cleanup itself. The scan is just an
>>>> opportunistic prune if either the context/mm is evicted but still alive.
>>>
>>> Then the same assert in gen6_ppgtt_cleanup_work? :)
>>
>> Okay ppgtt is gone so can't do it.. annoying.. Cleanup seems to support
>> your claims but I think we need a BFC (big fat comment) above the vma
>> ops override to explains this. With that:
> 
> It has FIXME! I really do hope this is short term...

Ok, that's good enough.

Regards,

Tvrtko


More information about the Intel-gfx mailing list