[PATCH 1/3] drm/i915: audit bo->resource usage

Christian König christian.koenig at amd.com
Wed Aug 31 11:03:13 UTC 2022


On 31.08.22 at 12:37, Matthew Auld wrote:
> [SNIP]
>>>
>>> That hopefully just leaves i915_ttm_shrink(), which is swapping out 
>>> shmem ttm_tt and is calling ttm_bo_validate() with empty placements 
>>> to force the pipeline-gutting path, which importantly unpopulates 
>>> the ttm_tt for us (since ttm_tt_unpopulate is not exported it 
>>> seems). But AFAICT it looks like that will now also nuke the 
>>> bo->resource, instead of just leaving it in system memory. My 
>>> assumption is that when later calling ttm_bo_validate(), it will 
>>> just do the bo_move_null() in i915_ttm_move(), instead of 
>>> re-populating the ttm_tt and then potentially copying it back to 
>>> local-memory?
>>
>> Well you do ttm_bo_validate() with something like GTT domain, don't 
>> you? This should result in re-populating the tt object, but I'm not 
>> 100% sure if that really works as expected.
>
> AFAIK for domains we either have system memory (which uses ttm_tt and 
> might be shmem underneath) or local-memory. But perhaps i915 is doing 
> something wrong here, or abusing TTM in some way. I'm not sure tbh.
>
> Anyway, I think we have two cases here:
>
> - We have some system memory only object. After doing 
> i915_ttm_shrink(), bo->resource is now NULL. We then call 
> ttm_bo_validate() at some later point, but here we don't need to copy 
> anything, but it also looks like ttm_bo_handle_move_mem() won't 
> populate the ttm_tt for us either, since mem_type == TTM_PL_SYSTEM. It 
> looks like i915_ttm_move() was taking care of this, but now we just 
> call ttm_bo_move_null().
>
> - We have a local-memory only object, which was evicted to shmem, and 
> then swapped out by the shrinker like above. The bo->resource is NULL. 
> However this time when calling ttm_bo_validate() we need to actually 
> do a copy in i915_ttm_move(), as well as re-populate the ttm_tt. 
> i915_ttm_move() was taking care of this, but now we just call 
> ttm_bo_move_null().
>
> Perhaps i915 is doing something wrong in the above two cases?

Mhm, as far as I can see that should still work.

See, previously you should have gotten a transition from SYSTEM->GTT in 
i915_ttm_move() to re-create your backing store. Now you get 
NULL->SYSTEM, which is handled by ttm_bo_move_null(), and then SYSTEM->GTT.

If you just validated to SYSTEM memory before I think the tt object 
wouldn't have been populated either.
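
To make the expected sequence concrete, here is a minimal toy model of the
two-hop transition described above. It is only a sketch under stated
assumptions, not the real TTM or i915 code: toy_bo, toy_move(),
toy_validate() and the TOY_* placements are hypothetical names invented for
illustration, and "populating" is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical placements for this sketch; not the real TTM_PL_* values. */
enum toy_mem { TOY_NONE, TOY_SYSTEM, TOY_GTT };

struct toy_bo {
	enum toy_mem mem;   /* TOY_NONE models bo->resource == NULL */
	bool tt_populated;  /* models whether the tt object has backing pages */
	int null_moves;     /* how many moves took the "move null" shortcut */
};

/* Models a driver move callback with the "!bo->resource" check added. */
static int toy_move(struct toy_bo *bo, enum toy_mem dst)
{
	assert(dst != TOY_NONE);

	if (bo->mem == TOY_NONE) {
		/* Nothing to copy: just adopt the new placement. */
		bo->mem = dst;
		bo->null_moves++;
		return 0;
	}

	/* Assumption: only the SYSTEM->GTT hop re-creates the backing store. */
	if (dst == TOY_GTT)
		bo->tt_populated = true;

	bo->mem = dst;
	return 0;
}

/* Models validation as NULL->SYSTEM first, then SYSTEM->dst if needed. */
static int toy_validate(struct toy_bo *bo, enum toy_mem dst)
{
	int ret;

	if (bo->mem == TOY_NONE) {
		ret = toy_move(bo, TOY_SYSTEM);
		if (ret)
			return ret;
	}

	if (bo->mem == dst)
		return 0;

	return toy_move(bo, dst);
}
```

In this model, validating a resource-less BO to TOY_GTT takes one null move
followed by one real move that populates the tt flag, while validating it
only to TOY_SYSTEM leaves the flag unset, which matches the remark above
about plain SYSTEM validation.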

Regards,
Christian.

>
>>
>> Thanks,
>> Christian.
>>
>>>
>>>>
>>>> I've been considering replacing the ttm_bo_type with a bunch of 
>>>> behavior flags for a bo. I'm hoping that this will clean things up 
>>>> a bit.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>
>>>>>>>       caching = i915_ttm_select_tt_caching(obj);
>>>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c 
>>>>>>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>>>>>>> index 9a7e50534b84bb..c420d1ab605b6f 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>>>>>>> @@ -560,7 +560,7 @@ int i915_ttm_move(struct ttm_buffer_object 
>>>>>>> *bo, bool evict,
>>>>>>>       bool clear;
>>>>>>>       int ret;
>>>>>>> -    if (GEM_WARN_ON(!obj)) {
>>>>>>> +    if (GEM_WARN_ON(!obj) || !bo->resource) {
>>>>>>>           ttm_bo_move_null(bo, dst_mem);
>>>>>>>           return 0;
>>>>>>>       }
>>>>>>
>>>>
>>


