[Intel-gfx] [PATCH 1/3] drm/i915: audit bo->resource usage
Christian König
christian.koenig at amd.com
Wed Aug 31 12:35:43 UTC 2022
On 31.08.22 at 14:06, Matthew Auld wrote:
> On 31/08/2022 12:03, Christian König wrote:
>> On 31.08.22 at 12:37, Matthew Auld wrote:
>>> [SNIP]
>>>>>
>>>>> That hopefully just leaves i915_ttm_shrink(), which is swapping
>>>>> out shmem ttm_tt and is calling ttm_bo_validate() with empty
>>>>> placements to force the pipeline-gutting path, which importantly
>>>>> unpopulates the ttm_tt for us (since ttm_tt_unpopulate is not
>>>>> exported it seems). But AFAICT it looks like that will now also
>>>>> nuke the bo->resource, instead of just leaving it in system
>>>>> memory. My assumption is that when later calling
>>>>> ttm_bo_validate(), it will just do the bo_move_null() in
>>>>> i915_ttm_move(), instead of re-populating the ttm_tt and then
>>>>> potentially copying it back to local-memory?
>>>>
>>>> Well you do ttm_bo_validate() with something like GTT domain, don't
>>>> you? This should result in re-populating the tt object, but I'm not
>>>> 100% sure if that really works as expected.
>>>
>>> AFAIK for domains we either have system memory (which uses ttm_tt
>>> and might be shmem underneath) or local-memory. But perhaps i915 is
>>> doing something wrong here, or abusing TTM in some way. I'm not sure
>>> tbh.
>>>
>>> Anyway, I think we have two cases here:
>>>
>>> - We have some system memory only object. After doing
>>> i915_ttm_shrink(), bo->resource is now NULL. We then call
>>> ttm_bo_validate() at some later point, but here we don't need to
>>> copy anything, but it also looks like ttm_bo_handle_move_mem() won't
>>> populate the ttm_tt for us either, since mem_type == TTM_PL_SYSTEM.
>>> It looks like i915_ttm_move() was taking care of this, but now we
>>> just call ttm_bo_move_null().
>>>
>>> - We have a local-memory only object, which was evicted to shmem,
>>> and then swapped out by the shrinker like above. The bo->resource is
>>> NULL. However this time when calling ttm_bo_validate() we need to
>>> actually do a copy in i915_ttm_move(), as well as re-populate the
>>> ttm_tt. i915_ttm_move() was taking care of this, but now we just
>>> call ttm_bo_move_null().
>>>
>>> Perhaps i915 is doing something wrong in the above two cases?
>>
>> Mhm, as far as I can see that should still work.
>>
>> See, previously you should have got a transition from SYSTEM->GTT in
>> i915_ttm_move() to re-create your backing store. Now you get
>> NULL->SYSTEM which is handled by ttm_bo_move_null() and then
>> SYSTEM->GTT.
>
> What is GTT here in TTM world? Also I'm not seeing where there is this
> SYSTEM->GTT transition? Maybe I'm blind. Just to be clear, i915 is
> only calling ttm_bo_validate() once when acquiring the pages, and we
> don't call it again, unless it was evicted (and potentially swapped out).
Well, GTT means TTM_PL_TT.
And calling it only once is perfectly fine. TTM will internally see that
we need two hops to reach TTM_PL_TT, so it first does the NULL->SYSTEM
transition and then SYSTEM->TT.
As far as I can see that should work like it did before.
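To make the two-hop idea concrete, here is a toy model in plain C. It is
only a sketch of the behaviour described in this thread, not TTM code:
toy_place, toy_bo, toy_move() and toy_validate() are hypothetical names,
TOY_NONE stands in for bo->resource == NULL, and the rule that the tt is
populated on the hop into TT is an assumption taken from the discussion
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins for TTM placements; these are not the kernel definitions. */
enum toy_place { TOY_NONE, TOY_SYSTEM, TOY_TT };

struct toy_bo {
	enum toy_place place;	/* TOY_NONE models bo->resource == NULL */
	bool tt_populated;	/* models whether the ttm_tt has backing pages */
};

static const char *toy_name(enum toy_place p)
{
	switch (p) {
	case TOY_NONE:   return "NULL";
	case TOY_SYSTEM: return "SYSTEM";
	case TOY_TT:     return "TT";
	}
	return "?";
}

/* Perform a single hop and log it. */
static void toy_move(struct toy_bo *bo, enum toy_place dst)
{
	assert(bo != NULL);
	printf("%s -> %s\n", toy_name(bo->place), toy_name(dst));

	if (bo->place == TOY_NONE) {
		/* No old resource, nothing to copy: models the move_null shortcut. */
		bo->place = dst;
		return;
	}

	/* Assumption: the backing store is (re)created on the hop into TT. */
	if (dst == TOY_TT)
		bo->tt_populated = true;
	bo->place = dst;
}

/* Move bo to dst, going through SYSTEM first if it has no resource.
 * Returns the number of hops taken. */
static int toy_validate(struct toy_bo *bo, enum toy_place dst)
{
	int hops = 0;

	if (bo->place == dst)
		return 0;

	if (bo->place == TOY_NONE && dst != TOY_SYSTEM) {
		toy_move(bo, TOY_SYSTEM);	/* first hop: NULL -> SYSTEM */
		hops++;
	}
	if (bo->place != dst) {
		toy_move(bo, dst);		/* final hop, e.g. SYSTEM -> TT */
		hops++;
	}
	return hops;
}
```

In this model a single toy_validate(&bo, TOY_TT) call on a bo that starts
at TOY_NONE takes two hops and ends with tt_populated set, while validating
the same bo to TOY_SYSTEM takes one hop and leaves the tt unpopulated.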
Christian.
>
>>
>> If you just validated to SYSTEM memory before I think the tt object
>> wouldn't have been populated either.
>>
>> Regards,
>> Christian.
>>
>>>
>>>>
>>>> Thanks,
>>>> Christian.
>>>>
>>>>>
>>>>>>
>>>>>> I've been considering replacing the ttm_bo_type with a bunch
>>>>>> of behavior flags for a bo. I'm hoping that this will clean
>>>>>> things up a bit.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>>
>>>>>>>>> caching = i915_ttm_select_tt_caching(obj);
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>>>>>>>>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>>>>>>>>> index 9a7e50534b84bb..c420d1ab605b6f 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>>>>>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>>>>>>>>> @@ -560,7 +560,7 @@ int i915_ttm_move(struct ttm_buffer_object
>>>>>>>>> *bo, bool evict,
>>>>>>>>> bool clear;
>>>>>>>>> int ret;
>>>>>>>>> - if (GEM_WARN_ON(!obj)) {
>>>>>>>>> + if (GEM_WARN_ON(!obj) || !bo->resource) {
>>>>>>>>> ttm_bo_move_null(bo, dst_mem);
>>>>>>>>> return 0;
>>>>>>>>> }
>>>>>>>>
>>>>>>
>>>>
>>