[Intel-xe] [PATCH v4 4/6] drm/xe/buddy: remove the virtualized start

Mon Mar 20 08:44:01 UTC 2023

On 3/17/23 3:02 PM, Matthew Auld wrote:
> On 17/03/2023 11:52, Gwan-gyeong Mun wrote:
>>
>>
>> On 3/14/23 10:58 AM, Matthew Auld wrote:
>>> Hopefully not needed anymore. We can add a .compatible() hook once we
>>> need to differentiate between mappable and non-mappable vram. If the
>>> allocation is not contiguous then the start value is kind of
>>> meaningless, so rather just mark as invalid.
>>>
>>> In upstream, TTM wants to eventually remove the ttm_resource.start
>>> usage.
>> Hi Matt,
>> If there is a related rfc or discussion, could you please share a 
>> reference?
> 
> I think it was maybe discussed off list. But there have been patches 
> that try to move in that general direction like: 544432703b2f ("drm/ttm: 
> Add new callbacks to ttm res mgr"). Which gets rid of the main user in 
> ttm for base.start (drivers still make use of it though). Better is to 
> keep it at the resource manager level. The "virtualized" start here is 
> basically trying to workaround the fact that [res.start, res.size] can't 
> really represent a non-contiguous allocations.
> 
> For example, if you look at that patch, ttm was using [res.start, 
> res.size] to help determine if the existing resource was compatible with 
> the new placement. Basically if you want to move a resource from A -> B, 
> where both are the same VRAM, without the above it would always turn the 
> move into a noop, since A & B are the same resource type, and are 
> therefore "compatible".
> 
> However with small-bar, you also have mappable vs non-mappable VRAM, and 
> you might want to move between them, so the "virtualized" start ensures 
> that [res.start, res.end] <= mappable size, even for non-contiguous 
> allocations, if the entire resource is mappable, and vice versa. With 
> the .compatible() hook that is all pushed into the resource manager, so 
> we don't need such tricks. See also the i915 equivalent: da6198afb01d 
> ("drm/i915/ttm: remove the virtualized start hack").
> 
It seems that the part that is difficult to fully understand with this 
patch alone is explained well. Thank you for a detailed description.

And referring to the explanation, the commit where the use of 
ttm_resource.start in ttm_resource_compatible() was deleted was 
confirmed in the commit below.
commit 6d3c900c12d72667341bcff338c252e22728b942 ("drm/ttm: Switch to 
using the new res callback")

Not directly related to this patch, contiguous allocation (using 
TTM_PL_FLAG_CONTIGUOUS flag) allocates resources from one place?

Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun at intel.com>
>>>
>>> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
>>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>>> Reviewed-by: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
>>> ---
>>>   drivers/gpu/drm/xe/xe_bo.c           |  6 ++++++
>>>   drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 29 ++++++++++++++--------------
>>>   2 files changed, 20 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
>>> index 73a7f2cd4ad8..346573f406c2 100644
>>> --- a/drivers/gpu/drm/xe/xe_bo.c
>>> +++ b/drivers/gpu/drm/xe/xe_bo.c
>>> @@ -662,6 +662,12 @@ static int xe_bo_move(struct ttm_buffer_object 
>>> *ttm_bo, bool evict,
>>>                   void *new_addr = gt->mem.vram.mapping +
>>>                       (new_mem->start << PAGE_SHIFT);
>>> +                if (XE_WARN_ON(new_mem->start == 
>>> XE_BO_INVALID_OFFSET)) {
>> Are there really any use cases for moving to memory allocated with 
>> xe_ttm_gtt_mgr_func()?
> 
> The above is only for pinned VRAM objects, which are always contiguous, 
> and therefore have a valid start value. I just added the above sanity 
> check to double check that, and make the error more obvious if we ever 
> trigger it.
> yes, that makes sense.
>>
>> Br,
>> G.G.
>>> +                    ret = -EINVAL;
>>> +                    xe_device_mem_access_put(xe);
>>> +                    goto out;
>>> +                }
>>> +
>>>                   XE_BUG_ON(new_mem->start !=
>>>                         bo->placements->fpfn);
>>> diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c 
>>> b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
>>> index 159ca7105df1..bafcadaed6b0 100644
>>> --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
>>> +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
>>> @@ -54,7 +54,6 @@ static int xe_ttm_vram_mgr_new(struct 
>>> ttm_resource_manager *man,
>>>       struct xe_ttm_vram_mgr_resource *vres;
>>>       u64 size, remaining_size, lpfn, fpfn;
>>>       struct drm_buddy *mm = &mgr->mm;
>>> -    struct drm_buddy_block *block;
>>>       unsigned long pages_per_block;
>>>       int r;
>>> @@ -186,24 +185,24 @@ static int xe_ttm_vram_mgr_new(struct 
>>> ttm_resource_manager *man,
>>>               list_splice_tail(trim_list, &vres->blocks);
>>>       }
>>> -    vres->base.start = 0;
>>> -    list_for_each_entry(block, &vres->blocks, link) {
>>> -        unsigned long start;
>>> +    if (!(vres->base.placement & TTM_PL_FLAG_CONTIGUOUS) &&
>>> +        xe_is_vram_mgr_blocks_contiguous(mm, &vres->blocks))
>>> +        vres->base.placement |= TTM_PL_FLAG_CONTIGUOUS;
>>> -        start = drm_buddy_block_offset(block) +
>>> -            drm_buddy_block_size(mm, block);
>>> -        start >>= PAGE_SHIFT;
>>> +    /*
>>> +     * For some kernel objects we still rely on the start when io 
>>> mapping
>>> +     * the object.
>>> +     */
>>> +    if (vres->base.placement & TTM_PL_FLAG_CONTIGUOUS) {
>>> +        struct drm_buddy_block *block = list_first_entry(&vres->blocks,
>>> +                                 typeof(*block),
>>> +                                 link);
>>> -        if (start > PFN_UP(vres->base.size))
>>> -            start -= PFN_UP(vres->base.size);
>>> -        else
>>> -            start = 0;
>>> -        vres->base.start = max(vres->base.start, start);
>>> +        vres->base.start = drm_buddy_block_offset(block) >> PAGE_SHIFT;
>>> +    } else {
>>> +        vres->base.start = XE_BO_INVALID_OFFSET;
>>>       }
>>> -    if (xe_is_vram_mgr_blocks_contiguous(mm, &vres->blocks))
>>> -        vres->base.placement |= TTM_PL_FLAG_CONTIGUOUS;
>>> -
>>>       *res = &vres->base;
>>>       return 0;