[PATCH v2 0/4] drm/xe: Fix races on fdinfo

Lucas De Marchi lucas.demarchi at intel.com
Tue Oct 29 21:43:47 UTC 2024


The current reading of engine utilization has some races. This should
fix most of them while also drastically reducing the update rate needed
on "normal apps".

I left tests/xe_drm_fdinfo --r utilization-single-full-load-destroy-queue
running on 2 systems and, after 100 iterations, saw no failures about
execution cycles being 0.

There are still issues calculating the percentage load - while I have
one additional patch to "fix" it on an idle system, I still can
consistently reproduce the issue on an LNL machine by overloading the CPU
with `stress --cpu $(nproc)`. So I will leave that for later since it's
a different issue not related to killing the exec queue.
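
For reference, a minimal sketch of how a userspace tool might derive the
percentage load from two fdinfo samples. It assumes the per-class
`drm-cycles-<class>` and `drm-total-cycles-<class>` keys described in the
DRM client usage stats documentation; the helper names and the sample
values are hypothetical and are not taken from the test or from this
series.

```python
import re

# Matches "drm-cycles-<class>: N" and "drm-total-cycles-<class>: N".
_CYCLES_RE = re.compile(r"drm-(total-)?cycles-(\w+):\s*(\d+)")


def parse_cycles(fdinfo_text):
    """Return {engine_class: (cycles, total_cycles)} for one fdinfo sample."""
    cycles, totals = {}, {}
    for line in fdinfo_text.splitlines():
        m = _CYCLES_RE.match(line)
        if not m:
            continue
        target = totals if m.group(1) else cycles
        target[m.group(2)] = int(m.group(3))
    # Keep only classes that report both counters.
    return {k: (cycles[k], totals[k]) for k in cycles if k in totals}


def utilization(before, after):
    """Percentage load per engine class between two parsed samples."""
    result = {}
    for cls, (c1, t1) in after.items():
        if cls not in before:
            continue
        c0, t0 = before[cls]
        elapsed = t1 - t0
        if elapsed <= 0:
            # No progress on the reference counter: load is undefined.
            continue
        result[cls] = 100.0 * (c1 - c0) / elapsed
    return result


# Hypothetical samples, as would be read from /proc/<pid>/fdinfo/<fd>.
sample0 = "drm-driver:\txe\ndrm-cycles-rcs:\t1000\ndrm-total-cycles-rcs:\t10000\n"
sample1 = "drm-driver:\txe\ndrm-cycles-rcs:\t6000\ndrm-total-cycles-rcs:\t20000\n"
load = utilization(parse_cycles(sample0), parse_cycles(sample1))
```

With this calculation, a cycles delta of 0 between samples shows up as
0% load even if the queue was busy, which is the kind of failure the
test above looks for.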

Lucas De Marchi (4):
  drm/xe: Add trace to lrc timestamp update
  drm/xe: Stop accumulating LRC timestamp on job_free
  drm/xe: Reword exec_queue.lock doc
  drm/xe: Wait on killed exec queues

 drivers/gpu/drm/xe/Makefile          |  1 +
 drivers/gpu/drm/xe/xe_device_types.h | 11 ++++--
 drivers/gpu/drm/xe/xe_drm_client.c   |  7 ++++
 drivers/gpu/drm/xe/xe_exec_queue.c   | 10 ++++++
 drivers/gpu/drm/xe/xe_guc_submit.c   |  2 --
 drivers/gpu/drm/xe/xe_lrc.c          |  3 ++
 drivers/gpu/drm/xe/xe_trace_lrc.c    |  9 +++++
 drivers/gpu/drm/xe/xe_trace_lrc.h    | 52 ++++++++++++++++++++++++++++
 8 files changed, 90 insertions(+), 5 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_trace_lrc.c
 create mode 100644 drivers/gpu/drm/xe/xe_trace_lrc.h

-- 
2.47.0


