[PATCH v4 0/7] Use DRM scheduler for delayed GT TLB invalidations
Tvrtko Ursulin
tursulin at ursulin.net
Fri Aug 22 12:39:18 UTC 2025
On 11/08/2025 20:25, Matthew Brost wrote:
> On Mon, Aug 11, 2025 at 09:31:08AM +0100, Tvrtko Ursulin wrote:
>>
>> On 24/07/2025 20:12, Matthew Brost wrote:
>>>
>>> Use the DRM scheduler for delayed GT TLB invalidations, which properly
>>> fixes the issue raised in [1]. GT TLB fences each have their own dma-fence
>>> context, so even if the invalidations are ordered, dma-resv and the DRM
>>> scheduler cannot squash the fences. This results in O(M*N*N) complexity
>>> in the garbage collector, where M is the number of ranges in the garbage
>>> collector and N is the number of pending GT TLB invalidations. After
>>> this change, the resulting complexity is O(M*C), where C is the number of
>>> TLB invalidation contexts (i.e., the number of (exec queue, GT) tuples)
>>> with an invalidation in flight.
>>
>> Does this series improve the performance of TLB invalidations in
>> general?
>>
>> I am asking because we are currently investigating a problem with Google
>> Chrome, which apparently suffers badly from latencies caused by TLB
>> invalidations piling up when heavy tab switching generates a lot of VM
>> bind ioctl activity.
>
> It doesn't improve the performance of the actual invalidation; rather, it
> improves the time to create a VM unbind job, as it avoids the O(M*N*N)
> complexity explosion. I originally saw this in a case of many SVM unbinds
> (i.e., a free / unmap called on a large piece of memory), but many VM
> unbinds in a short period of time could cause this complexity explosion
> too.
Are you saying it is just CPU utilisation and it does not show up as
submission latency, or can it?

Is the soft lockup splat from
https://patchwork.freedesktop.org/patch/658370/?series=150188&rev=1 due
to drm_sched_job_add_resv_dependencies() literally taking 26 seconds
because of a huge number of fences in the resv, or to something else?
> By making all invalidations from the same (queue, GT) share a dma_fence
> context, dma-resv and drm_sched can coalesce them into a single
> dependency per (queue, GT). That keeps the dependency count bounded and
> eliminates the SVM panics I was seeing under heavy unbind stress.
>
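That makes sense. For my own understanding, the bounded dependency count
relies on the existing per-context de-duplication done while dependencies
are collected. A simplified sketch, from memory and not verbatim upstream
code, of what drm_sched_job_add_dependency() does:

#include <linux/dma-fence.h>
#include <linux/xarray.h>
#include <drm/gpu_scheduler.h>

/* Sketch: if the job already depends on a fence from the same dma-fence
 * context, keep only the later of the two fences, so the dependency
 * count scales with the number of contexts rather than the number of
 * fences added.
 */
int drm_sched_job_add_dependency(struct drm_sched_job *job,
				 struct dma_fence *fence)
{
	struct dma_fence *entry;
	unsigned long index;
	u32 id = 0;
	int ret;

	if (!fence)
		return 0;

	xa_for_each(&job->dependencies, index, entry) {
		if (entry->context != fence->context)
			continue;

		if (dma_fence_is_later(fence, entry)) {
			/* Replace the older fence from this context */
			dma_fence_put(entry);
			xa_store(&job->dependencies, index, fence, GFP_KERNEL);
		} else {
			dma_fence_put(fence);
		}
		return 0;
	}

	/* First fence seen from this context, track it as a new dependency */
	ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
	if (ret != 0)
		dma_fence_put(fence);

	return ret;
}

With every GT TLB fence on its own dma-fence context that loop never
de-duplicates, so a job ends up with one dependency per pending
invalidation, whereas sharing a context per (queue, GT) collapses that to
at most C entries.
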
>>
>> Also, is it related to
>> https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2162 ?
>>
>
> Maybe. In this case, a flood of TLB invalidations overwhelmed the GuC,
> leaving it no time to do anything else (e.g., schedule contexts). It’s a
> multi-process test, so the driver-side complexity explosion likely
> doesn’t apply, as that explosion occurs within a single VM. What we need
> is to hold TLB invalidations in the driver and coalesce them so that we
> issue fewer operations. I have patches for this, but I haven’t cleaned
> them up for posting yet; we’re in the middle of other TLB-invalidation
> refactors.
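Understood. Just to check I am imagining the right thing: coalescing here
means merging overlapping or adjacent invalidation ranges before they are
issued to the GuC, roughly along these lines? A purely hypothetical sketch;
the struct and function names are invented for illustration and are not
from your unposted patches:

#include <linux/list.h>
#include <linux/minmax.h>
#include <linux/slab.h>
#include <linux/types.h>

/* Hypothetical pending-invalidation range, kept on a list sorted by
 * start address.
 */
struct xe_tlb_inval_range {
	struct list_head link;
	u64 start;	/* inclusive */
	u64 end;	/* exclusive */
};

/* Merge overlapping or touching ranges so fewer invalidation requests
 * need to be sent to the GuC.
 */
static void xe_tlb_inval_coalesce(struct list_head *ranges)
{
	struct xe_tlb_inval_range *range, *next;

	list_for_each_entry_safe(range, next, ranges, link) {
		struct xe_tlb_inval_range *prev;

		/* Nothing before the first entry to merge into */
		if (range->link.prev == ranges)
			continue;

		prev = list_prev_entry(range, link);

		/* Overlapping or adjacent: fold this range into the
		 * previous one and drop it.
		 */
		if (range->start <= prev->end) {
			prev->end = max(prev->end, range->end);
			list_del(&range->link);
			kfree(range);
		}
	}
}
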
If you could send something out, or share a branch, even if it is not
completely clean, that would be cool. I should soon have a reproducer for
the Chrome tab-switching issue and could report back on the real-world
effects of both.
Regards,
Tvrtko
>>> Admittedly, it's quite a lot of code, but the series includes extensive
>>> kernel documentation and clear code comments. It introduces a generic
>>> dependency scheduler that can be reused in the future and is logically
>>> much cleaner than the previous open-coded solution for delaying GT TLB
>>> invalidations until a bind job completes.
>>>
>>> v2:
>>> - Various cleanups, covered in detail in the change logs
>>> - Use a per-GT ordered workqueue as DRM scheduler workqueue
>>> - Remove unused ftrace points
>>> v3:
>>> - Address Stuart's feedback
>>> - Fix kernel doc, minor cleanups
>>> v4:
>>> - Rebase
>>>
>>> Matt
>>>
>>> [1] https://patchwork.freedesktop.org/patch/658370/?series=150188&rev=1
>>>
>>> Matthew Brost (7):
>>> drm/xe: Explicitly mark migration queues with flag
>>> drm/xe: Add generic dependency jobs / scheduler
>>> drm/xe: Create ordered workqueue for GT TLB invalidation jobs
>>> drm/xe: Add dependency scheduler for GT TLB invalidations to bind
>>> queues
>>> drm/xe: Add GT TLB invalidation jobs
>>> drm/xe: Use GT TLB invalidation jobs in PT layer
>>> drm/xe: Remove unused GT TLB invalidation trace points
>>>
>>> drivers/gpu/drm/xe/Makefile | 2 +
>>> drivers/gpu/drm/xe/xe_dep_job_types.h | 29 +++
>>> drivers/gpu/drm/xe/xe_dep_scheduler.c | 143 ++++++++++
>>> drivers/gpu/drm/xe/xe_dep_scheduler.h | 21 ++
>>> drivers/gpu/drm/xe/xe_exec_queue.c | 48 ++++
>>> drivers/gpu/drm/xe/xe_exec_queue_types.h | 15 ++
>>> drivers/gpu/drm/xe/xe_gt_tlb_inval_job.c | 274 ++++++++++++++++++++
>>> drivers/gpu/drm/xe/xe_gt_tlb_inval_job.h | 34 +++
>>> drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 8 +
>>> drivers/gpu/drm/xe/xe_gt_types.h | 2 +
>>> drivers/gpu/drm/xe/xe_migrate.c | 42 ++-
>>> drivers/gpu/drm/xe/xe_migrate.h | 13 +
>>> drivers/gpu/drm/xe/xe_pt.c | 178 +++++--------
>>> drivers/gpu/drm/xe/xe_trace.h | 16 --
>>> 14 files changed, 700 insertions(+), 125 deletions(-)
>>> create mode 100644 drivers/gpu/drm/xe/xe_dep_job_types.h
>>> create mode 100644 drivers/gpu/drm/xe/xe_dep_scheduler.c
>>> create mode 100644 drivers/gpu/drm/xe/xe_dep_scheduler.h
>>> create mode 100644 drivers/gpu/drm/xe/xe_gt_tlb_inval_job.c
>>> create mode 100644 drivers/gpu/drm/xe/xe_gt_tlb_inval_job.h
>>>
>>