[Intel-gfx] [PATCH 7/9] drm/i915: stop using ttm_bo_wait

Christian König ckoenig.leichtzumerken at gmail.com
Tue Dec 6 18:06:22 UTC 2022


On 06.12.22 at 19:03, Matthew Auld wrote:
> On 05/12/2022 19:58, Christian König wrote:
>> On 30.11.22 at 15:06, Daniel Vetter wrote:
>>> On Wed, 30 Nov 2022 at 14:03, Tvrtko Ursulin
>>> <tvrtko.ursulin at linux.intel.com> wrote:
>>>> On 29/11/2022 18:05, Matthew Auld wrote:
>>>>> On Fri, 25 Nov 2022 at 11:14, Tvrtko Ursulin
>>>>> <tvrtko.ursulin at linux.intel.com> wrote:
>>>>>>
>>>>>> + Matt
>>>>>>
>>>>>> On 25/11/2022 10:21, Christian König wrote:
>>>>>>> TTM is just wrapping core DMA functionality here, remove the 
>>>>>>> mid-layer.
>>>>>>> No functional change.
>>>>>>>
>>>>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>>>> ---
>>>>>>>     drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 9 ++++++---
>>>>>>>     1 file changed, 6 insertions(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>>>>> index 5247d88b3c13..d409a77449a3 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>>>>> @@ -599,13 +599,16 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj,
>>>>>>>     static int i915_ttm_truncate(struct drm_i915_gem_object *obj)
>>>>>>>     {
>>>>>>>         struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
>>>>>>> -     int err;
>>>>>>> +     long err;
>>>>>>>
>>>>>>>         WARN_ON_ONCE(obj->mm.madv == I915_MADV_WILLNEED);
>>>>>>>
>>>>>>> -     err = ttm_bo_wait(bo, true, false);
>>>>>>> -     if (err)
>>>>>>> +     err = dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP,
>>>>>>> +                                 true, 15 * HZ);
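For reference, dma_resv_wait_timeout() returns a long: negative on
error such as -ERESTARTSYS, zero when the wait times out, and the
remaining jiffies otherwise, which is why "err" changes from int to
long above. The rest of the hunk is snipped in the quote; a minimal
sketch of the usual pattern for mapping the result back to an errno,
with the -EBUSY choice assumed rather than taken from the patch:

    long err;

    err = dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP,
                                true, 15 * HZ);
    if (err < 0)
        return err;     /* e.g. -ERESTARTSYS when interrupted */
    if (err == 0)
        return -EBUSY;  /* timed out, fail gracefully */
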
>>>>>> This 15 second timeout stuck out a bit for me, and on a slightly
>>>>>> deeper look it seems it will "leak" into a few i915 code paths.
>>>>>> If we look at the difference between the legacy shmem and ttm
>>>>>> backends, I am not sure if the legacy one is blocking or not - but
>>>>>> if it can block I don't think it would have an arbitrary timeout
>>>>>> like this. Matt, your thoughts?
>>>>> Not sure what is meant by leak here, but the legacy shmem must also
>>>>> wait/block when unbinding each VMA, before calling truncate. It's the
>>>> By "leak" I meant if 15s timeout propagates into some code paths 
>>>> visible
>>>> from userspace which with a legacy backend instead have an indefinite
>>>> wait. If we have that it's probably not very good to have this
>>>> inconsistency, or to apply an arbitrary timeout to those path to 
>>>> start with.
>>>>
>>>>> same story for the ttm backend, except slightly more complicated in
>>>>> that there might be no currently bound VMA, and yet the GPU could
>>>>> still be accessing the pages due to async unbinds, kernel moves
>>>>> etc., which the wait here (and in i915_ttm_shrink) is meant to
>>>>> protect against. If the wait times out it should just fail
>>>>> gracefully. I guess we could just use MAX_SCHEDULE_TIMEOUT here?
>>>>> Not sure if it really matters though.
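A minimal sketch of the MAX_SCHEDULE_TIMEOUT variant suggested above
(illustrative, not part of the patch): with an unbounded timeout the
call can no longer return zero, so only the interrupted case is left
to handle:

    err = dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP,
                                true, MAX_SCHEDULE_TIMEOUT);
    if (err < 0)
        return err;     /* only e.g. -ERESTARTSYS remains possible */
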
>>>> Right, it depends on whether it can leak to userspace and diverge
>>>> between backends.
>>> Generally lock_timeout() is a design bug. It's either
>>> lock_interruptible (or maybe lock_killable) or try_lock, but
>>> lock_timeout is just duct-tape. I haven't dug in to figure out what
>>> should be here, but it smells fishy.
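In dma_resv terms, the two shapes Daniel considers acceptable would
look roughly like this (an illustrative sketch; the primitives are
real, the surrounding error mapping is assumed):

    /* Sleep until the lock is ours, but let signals break the wait. */
    ret = dma_resv_lock_interruptible(bo->base.resv, NULL);
    if (ret)
        return ret;     /* -EINTR, or -EDEADLK with an acquire ctx */

    /* Or: never sleep, and let the caller decide how to back off. */
    if (!dma_resv_trylock(bo->base.resv))
        return -EBUSY;
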
>>
>> Independent of this discussion, could I get an rb for removing
>> ttm_bo_wait() from i915?
>>
>> Hiding exactly this timeout inside TTM is what has always made me
>> quite nervous here.
>
> There are also a few places in i915 calling bo_wait_ctx(), which 
> appears to just wrap ttm_bo_wait(). I guess that should also be 
> converted to dma_resv_wait_timeout()? Or what is the story with that?

If you don't want the ctx timeout, then yes, calling 
dma_resv_wait_timeout() instead is the right approach.
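One possible shape of such a conversion, honouring the
ttm_operation_ctx flags (a sketch with an assumed error mapping, not
code from this series):

    long lret;

    /* no_wait_gpu: poll instead of sleeping. */
    if (ctx->no_wait_gpu)
        return dma_resv_test_signaled(bo->base.resv,
                                      DMA_RESV_USAGE_BOOKKEEP) ?
                0 : -EBUSY;

    lret = dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP,
                                 ctx->interruptible,
                                 MAX_SCHEDULE_TIMEOUT);
    if (lret < 0)
        return lret;
    return 0;
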

>
> Anyway,
> Reviewed-by: Matthew Auld <matthew.auld at intel.com>

Thanks, going to push this through drm-misc-next.

Regards,
Christian.

>
>>
>> Regards,
>> Christian.
>>
>>> -Daniel
>>


