[PATCH 0/6] Allow to extend the timeout without jobs disappearing
Luben Tuikov
luben.tuikov at amd.com
Wed Nov 25 03:17:02 UTC 2020
Hi guys,

This series of patches implements a pending list for
jobs which are in the hardware, and a done list for
tasks which are done and need to be freed.

It implements a second thread, dedicated to freeing
tasks from the done list. The main scheduler thread no
longer frees (cleans up) done tasks by polling the head
of the pending list (drm_sched_get_cleanup_job() is
now gone); it only pushes tasks down to the GPU. As
tasks complete and call their DRM callback, their
fences are signalled, the tasks are queued to the done
list, and the done thread is woken up to free them.
This can take place concurrently with the main
scheduler thread pushing tasks down to the GPU.
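
The hand-off between the two lists can be pictured with a
small userspace model. This is only a sketch and not code
from the patches: struct job, struct sched and the sched_*()
helpers are hypothetical names, and the locking, the fences
and the done thread itself are left out so that only the
list movement is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names come from the patches. */
struct job {
	int id;
	struct job *next;
};

struct sched {
	struct job *pending;	/* jobs the GPU currently owns */
	struct job *done;	/* finished jobs waiting to be freed */
	int freed;		/* number of jobs freed so far */
};

/* Main scheduler thread: hand a job to the hardware. */
static void sched_push_job(struct sched *s, int id)
{
	struct job *j;

	assert(s != NULL);
	j = malloc(sizeof(*j));
	if (!j)
		return;
	j->id = id;
	j->next = s->pending;
	s->pending = j;
}

/*
 * Completion callback: unlink the job from the pending list and
 * queue it on the done list. Returns 0 on success, -1 if the job
 * is not pending. Real code would wake the done thread here.
 */
static int sched_job_done(struct sched *s, int id)
{
	struct job **pp;

	for (pp = &s->pending; *pp; pp = &(*pp)->next) {
		if ((*pp)->id == id) {
			struct job *j = *pp;

			*pp = j->next;
			j->next = s->done;
			s->done = j;
			return 0;
		}
	}
	return -1;
}

/* Body of the done thread: free everything on the done list. */
static void sched_free_done(struct sched *s)
{
	while (s->done) {
		struct job *j = s->done;

		s->done = j->next;
		free(j);
		s->freed++;
	}
}
```

In this model sched_push_job() never touches the done list
and sched_free_done() never touches the pending list, which
is the property that lets the two threads run concurrently
once each list has its own lock.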

When a task times out, the timeout callback now
returns a value back to DRM. The reason for this is
that the GPU driver has intimate knowledge of the
hardware and can tell DRM what to do: whether to
attempt to abort the task (by, say, calling a driver
abort function, as the implementation dictates), or
whether the task needs more time. Note that the task
is not moved away from the pending list unless it is
no longer in the GPU. (The pending list holds tasks
which are pending from DRM's point of view, i.e. the
GPU has control over them. That could mean that DMA
is active, CUs are active for the task, etc.)

The idea really is that what DRM wants to know is
whether the task is in the GPU or not. So now
drm_sched_backend_ops::timedout_job() returns
0 if the task is no longer with the GPU, or 1
if the task needs more time.
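
A minimal sketch of how the scheduler side might act on that
return value follows. Only the 0/1 convention is taken from
the description above. The enum names, struct job,
sched_handle_timeout() and example_timedout_job() are
hypothetical, and moving the job to the done list on a 0
return is an assumption about what happens next.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; only the 0/1 convention comes from the cover letter. */
enum {
	JOB_NOT_IN_GPU = 0,		/* job is no longer with the GPU */
	JOB_NEEDS_MORE_TIME = 1,	/* job is still in the GPU, extend */
};

struct job {
	bool in_gpu;		/* does the hardware still own this job? */
	bool on_pending_list;
	bool on_done_list;
	unsigned int timeouts;	/* how many times the timer was re-armed */
};

/* Stand-in for a driver's timedout_job() callback. */
typedef int (*timedout_job_fn)(struct job *job);

/*
 * Scheduler-side reaction to a timeout.
 * Returns true if the timeout was extended (timer re-armed).
 */
static bool sched_handle_timeout(struct job *job, timedout_job_fn timedout_job)
{
	assert(job && timedout_job);

	if (timedout_job(job) == JOB_NEEDS_MORE_TIME) {
		job->timeouts++;	/* job stays on the pending list */
		return true;
	}

	/* Assumption: a job that left the GPU is queued for freeing. */
	job->on_pending_list = false;
	job->on_done_list = true;
	return false;
}

/* Example driver policy: extend the timeout once, then abort. */
static int example_timedout_job(struct job *job)
{
	if (job->in_gpu && job->timeouts == 0)
		return JOB_NEEDS_MORE_TIME;

	job->in_gpu = false;	/* pretend a driver-specific abort succeeded */
	return JOB_NOT_IN_GPU;
}
```

With this example policy, the first timeout leaves the job
on the pending list, and the second one takes it off.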

Tested up to patch 5. Running with patch 6 seems to
make X/GDM just sleep, and I'm looking into this now.

This series applies to drm-misc-next.

Luben Tuikov (6):
  drm/scheduler: "node" --> "list"
  gpu/drm: ring_mirror_list --> pending_list
  drm/scheduler: Job timeout handler returns status
  drm/scheduler: Essentialize the job done callback
  drm/amdgpu: Don't hardcode thread name length
  drm/sched: Make use of a "done" thread

 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h    |   2 +-
 drivers/gpu/drm/scheduler/sched_main.c      | 275 ++++++++++----------
 include/drm/gpu_scheduler.h                 |  43 ++-
 6 files changed, 186 insertions(+), 152 deletions(-)
--
2.29.2.154.g7f7ebe054a