[Intel-gfx] [PATCH 1/3] drm/i915: audit bo->resource usage
Matthew Auld
matthew.auld at intel.com
Wed Aug 31 12:06:25 UTC 2022
On 31/08/2022 12:03, Christian König wrote:
> Am 31.08.22 um 12:37 schrieb Matthew Auld:
>> [SNIP]
>>>>
>>>> That hopefully just leaves i915_ttm_shrink(), which is swapping out
>>>> shmem ttm_tt and is calling ttm_bo_validate() with empty placements
>>>> to force the pipeline-gutting path, which importantly unpopulates
>>>> the ttm_tt for us (since ttm_tt_unpopulate is not exported it
>>>> seems). But AFAICT it looks like that will now also nuke the
>>>> bo->resource, instead of just leaving it in system memory. My
>>>> assumption is that when later calling ttm_bo_validate(), it will
>>>> just do the bo_move_null() in i915_ttm_move(), instead of
>>>> re-populating the ttm_tt and then potentially copying it back to
>>>> local-memory?
>>>
>>> Well you do ttm_bo_validate() with something like GTT domain, don't
>>> you? This should result in re-populating the tt object, but I'm not
>>> 100% sure if that really works as expected.
>>
>> AFAIK for domains we either have system memory (which uses ttm_tt and
>> might be shmem underneath) or local-memory. But perhaps i915 is doing
>> something wrong here, or abusing TTM in some way. I'm not sure tbh.
>>
>> Anyway, I think we have two cases here:
>>
>> - We have some system memory only object. After doing
>> i915_ttm_shrink(), bo->resource is now NULL. We then call
>> ttm_bo_validate() at some later point, but here we don't need to copy
>> anything, but it also looks like ttm_bo_handle_move_mem() won't
>> populate the ttm_tt or us either, since mem_type == TTM_PL_SYSTEM. It
>> looks like i915_ttm_move() was taking care of this, but now we just
>> call ttm_bo_move_null().
>>
>> - We have a local-memory only object, which was evicted to shmem, and
>> then swapped out by the shrinker like above. The bo->resource is NULL.
>> However this time when calling ttm_bo_validate() we need to actually
>> do a copy in i915_ttm_move(), as well as re-populate the ttm_tt.
>> i915_ttm_move() was taking care of this, but now we just call
>> ttm_bo_move_null().
>>
>> Perhaps i915 is doing something wrong in the above two cases?
>
> Mhm, as far as I can see that should still work.
>
> See previously you should got a transition from SYSTEM->GTT in
> i915_ttm_move() to re-create your backing store. Not you get
> NULL->SYSTEM which is handled by ttm_bo_move_null() and then SYSTEM->GTT.
What is GTT here in TTM world? Also I'm not seeing where there is this
SYSTEM->GTT transition? Maybe I'm blind. Just to be clear, i915 is only
calling ttm_bo_validate() once when acquiring the pages, and we don't
call it again, unless it was evicted (and potentially swapped out).
>
> If you just validated to SYSTEM memory before I think the tt object
> wouldn't have been populated either.
>
> Regards,
> Christian.
>
>>
>>>
>>> Thanks,
>>> Christian.
>>>
>>>>
>>>>>
>>>>> I've been considering to replacing the ttm_bo_type with a bunch of
>>>>> behavior flags for a bo. I'm hoping that this will clean things up
>>>>> a bit.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>>
>>>>>>>> caching = i915_ttm_select_tt_caching(obj);
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>>>>>>>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>>>>>>>> index 9a7e50534b84bb..c420d1ab605b6f 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>>>>>>>> @@ -560,7 +560,7 @@ int i915_ttm_move(struct ttm_buffer_object
>>>>>>>> *bo, bool evict,
>>>>>>>> bool clear;
>>>>>>>> int ret;
>>>>>>>> - if (GEM_WARN_ON(!obj)) {
>>>>>>>> + if (GEM_WARN_ON(!obj) || !bo->resource) {
>>>>>>>> ttm_bo_move_null(bo, dst_mem);
>>>>>>>> return 0;
>>>>>>>> }
>>>>>>>
>>>>>
>>>
>
More information about the Intel-gfx
mailing list