[PATCH v3 2/2] drm/xe: Opportunistically skip TLB invalidation on unbind

Tue Jun 17 17:36:06 UTC 2025


On 16-06-2025 13:03, Ghimiray, Himal Prasad wrote:
> 
> 
> On 16-06-2025 13:03, Matthew Brost wrote:
>> On Mon, Jun 16, 2025 at 12:44:38PM +0530, Ghimiray, Himal Prasad wrote:
>>>
>>>
>>> On 16-06-2025 12:00, Matthew Brost wrote:
>>>> If a range or VMA is invalidated and scratch page is disabled, there
>>>> is no reason to issue a TLB invalidation on unbind; skip the TLB
>>>> invalidation if this condition is true. This is an opportunistic check
>>>> as it is done without the notifier lock, thus it is possible for the
>>>> range to be invalidated after this check is performed.
>>>>
>>>> This should improve performance of the SVM garbage collector; for
>>>> example, xe_exec_system_allocator --r many-stride-new-prefetch went
>>>> from ~20s to ~9.5s on a BMG.
>>>>
>>>> v2:
>>>>    - Use helper for valid check (Thomas)
>>>> v3:
>>>>    - Avoid skipping TLB invalidation if PTEs are removed at a higher
>>>>      level than the range
>>>>    - Never skip TLB invalidations for VMA
>>>>    - Drop Himal's RB
>>>>
>>>> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
>>>> ---
>>>>    drivers/gpu/drm/xe/xe_pt.c | 31 ++++++++++++++++++++++++++++++-
>>>>    1 file changed, 30 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
>>>> index 9c30111e8786..b6df8995e8c1 100644
>>>> --- a/drivers/gpu/drm/xe/xe_pt.c
>>>> +++ b/drivers/gpu/drm/xe/xe_pt.c
>>>> @@ -1995,6 +1995,32 @@ static int unbind_op_prepare(struct xe_tile *tile,
>>>>        return 0;
>>>>    }
>>>> +static bool
>>>> +xe_pt_op_check_range_skip_invalidation(struct xe_vm_pgtable_update_op *pt_op,
>>>> +                       struct xe_svm_range *range)
>>>> +{
>>>> +    struct xe_vm_pgtable_update *update = pt_op->entries;
>>>> +
>>>> +    XE_WARN_ON(!pt_op->num_entries);
>>>> +
>>>> +    /*
>>>> +     * We can't skip the invalidation if we are removing PTEs that span more
>>>> +     * than the range, do some checks to ensure we are removing PTEs that
>>>> +     * are invalid.
>>>> +     */
>>>> +
>>>> +    if (pt_op->num_entries > 1)
>>>> +        return false;
>>>> +
>>>> +    if (update->pt->level == 0)
>>>> +        return true;
>>>> +
>>>> +    if (update->pt->level == 1)
>>>> +        return xe_svm_range_size(range) >= SZ_2M;
>>>
>>> >= or == ? Don't think ranges can be greater than 2 MiB.
>>>
>>
>> This is future-proofing. For example, if we add a SZ_8M entry because
>> profiling an application shows it helps, this code won't break. I also
>> assume we will never fault in 1G ranges, so there's no need for a
>> level-2 or 1G check.
> 
> Makes sense. Looks good to me; I see no issue here.

Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>

> 
>>
>> Matt
>>
>>>> +
>>>> +    return false;
>>>> +}
>>>> +
>>>>    static int unbind_range_prepare(struct xe_vm *vm,
>>>>                    struct xe_tile *tile,
>>>>                    struct xe_vm_pgtable_update_ops *pt_update_ops,
>>>> @@ -2023,7 +2049,10 @@ static int unbind_range_prepare(struct xe_vm *vm,
>>>>                         range->base.itree.last + 1);
>>>>        ++pt_update_ops->current_op;
>>>>        pt_update_ops->needs_svm_lock = true;
>>>> -    pt_update_ops->needs_invalidation = true;
>>>> +    pt_update_ops->needs_invalidation |= xe_vm_has_scratch(vm) ||
>>>> +        xe_vm_has_valid_gpu_mapping(tile, range->tile_present,
>>>> +                        range->tile_invalidated) ||
>>>> +        !xe_pt_op_check_range_skip_invalidation(pt_op, range);
>>>>        xe_pt_commit_prepare_unbind(XE_INVALID_VMA, pt_op->entries,
>>>>                        pt_op->num_entries);
>>>
>>>
>
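
For anyone following the ">= vs ==" discussion above, here is a rough
userspace sketch of the decision the patch makes. It is not the driver
code: struct model_pt_op, model_range_can_skip() and
model_needs_invalidation() are hypothetical stand-ins that carry only the
values the quoted hunks read, and the sketch assumes at least one entry
(the patch warns on zero entries). It may help show why ">=" keeps working
if a range size larger than 2M is ever added.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_2M (2ull << 20)	/* 2 MiB, same value as the kernel's SZ_2M */

/* Hypothetical model of the fields the check reads from pt_op. */
struct model_pt_op {
	unsigned int num_entries;	/* models pt_op->num_entries */
	unsigned int level;		/* models pt_op->entries[0].pt->level */
};

/*
 * Mirrors the shape of xe_pt_op_check_range_skip_invalidation() from the
 * quoted patch: the skip is only allowed when a single entry is removed
 * and that entry cannot cover more than the range itself.
 */
static bool model_range_can_skip(const struct model_pt_op *op,
				 uint64_t range_size)
{
	if (op->num_entries > 1)
		return false;

	/* Level 0: only leaf PTEs inside the range are removed. */
	if (op->level == 0)
		return true;

	/* Level 1: one entry spans 2M, so the range must cover at least that. */
	if (op->level == 1)
		return range_size >= SZ_2M;

	/* Higher levels are never skipped. */
	return false;
}

/*
 * Mirrors the "|=" expression in unbind_range_prepare(): once any op needs
 * an invalidation the flag stays set, and a skip needs all three
 * conditions (no scratch, no valid mapping, safe PTE removal) to hold.
 */
static bool model_needs_invalidation(bool prev, bool has_scratch,
				     bool has_valid_mapping,
				     const struct model_pt_op *op,
				     uint64_t range_size)
{
	return prev || has_scratch || has_valid_mapping ||
	       !model_range_can_skip(op, range_size);
}
```

With this model, a level-1 removal is skippable for a 2M range and for a
hypothetical 8M range, but not for a 64K range, which matches the
future-proofing argument given above.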