[PATCH 0/5] Allow to extend the timeout without jobs disappearing (v2)

Luben Tuikov luben.tuikov at amd.com
Fri Dec 4 03:17:17 UTC 2020


Hi guys,

This series of patches implements a pending list for
jobs which are in the hardware, and a done list for
tasks which are done and need to be freed.

As tasks complete and call their DRM callback, their
fences are signalled, the tasks are added to the done
list, and the main scheduler thread is woken up. The
main scheduler thread then frees them.
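
To make the flow above concrete, here is a minimal user-space
sketch of the two lists. It is only a model, not code from the
series: every toy_* name is hypothetical, and locking and the
real fence and thread machinery are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names come from the patches. */
struct toy_job {
	struct toy_job *prev, *next;
	bool fence_signalled;
};

struct toy_list {
	struct toy_job head;	/* sentinel node of a circular list */
	size_t len;
};

struct toy_sched {
	struct toy_list pending;	/* jobs the hardware still owns */
	struct toy_list done;		/* finished jobs waiting to be freed */
	bool thread_woken;		/* stands in for waking the main thread */
};

static void toy_list_init(struct toy_list *l)
{
	l->head.prev = l->head.next = &l->head;
	l->len = 0;
}

static void toy_list_add_tail(struct toy_list *l, struct toy_job *j)
{
	j->prev = l->head.prev;
	j->next = &l->head;
	l->head.prev->next = j;
	l->head.prev = j;
	l->len++;
}

static void toy_list_del(struct toy_list *l, struct toy_job *j)
{
	j->prev->next = j->next;
	j->next->prev = j->prev;
	j->prev = j->next = NULL;
	l->len--;
}

static void toy_sched_init(struct toy_sched *s)
{
	toy_list_init(&s->pending);
	toy_list_init(&s->done);
	s->thread_woken = false;
}

/* Job is handed to the hardware: it becomes pending. */
static void toy_submit(struct toy_sched *s, struct toy_job *j)
{
	j->fence_signalled = false;
	toy_list_add_tail(&s->pending, j);
}

/* Completion callback: signal the fence, move pending -> done, wake thread. */
static void toy_job_done(struct toy_sched *s, struct toy_job *j)
{
	j->fence_signalled = true;
	toy_list_del(&s->pending, j);
	toy_list_add_tail(&s->done, j);
	s->thread_woken = true;
}

/* Main scheduler thread: drain the done list; returns the number released. */
static size_t toy_free_done(struct toy_sched *s)
{
	size_t freed = 0;

	while (s->done.len) {
		toy_list_del(&s->done, s->done.head.next);
		freed++;	/* a real scheduler would release the job here */
	}
	s->thread_woken = false;
	return freed;
}
```

The point of the split is that the completion callback only
moves the job and wakes the thread, while the freeing itself
is left to the main scheduler thread.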

When a task times out, the timeout function now returns
a value back to DRM. The reason for this is that the
GPU driver has intimate knowledge of the hardware and
can pass back information to DRM on what to do:
whether to attempt to abort the task (say, by calling a
driver abort function, as the implementation
dictates), or whether the task needs more time. Note
that the task is not moved away from the pending list
unless it is no longer in the GPU. (The pending list
holds tasks which are pending from DRM's point of
view, i.e. the GPU has control over them. That could
mean DMA is active or CUs are active for the task,
etc.)

The idea really is that what DRM wants to know is
whether the task is in the GPU or not. So now
drm_sched_backend_ops::timedout_job() returns
DRM_TASK_STATUS_COMPLETE if the task is no longer with
the GPU, or DRM_TASK_STATUS_ALIVE if the task needs
more time.
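
As a rough illustration of that contract, here is a user-space
sketch. Only the two DRM_TASK_STATUS_* names are taken from
this letter; the enum type name, its numeric values, and every
toy_* identifier are assumptions made for the example and are
not the code in the patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Status names are from the cover letter; type name and values are assumed. */
enum drm_task_status {
	DRM_TASK_STATUS_COMPLETE,	/* task is no longer with the GPU */
	DRM_TASK_STATUS_ALIVE		/* task needs more time */
};

/* Hypothetical stand-in for what a driver knows about a job. */
struct toy_hw_job {
	bool still_running;	/* hardware is still working on the job */
	bool on_pending_list;	/* DRM's view: the GPU has control over it */
	unsigned int timeouts;	/* how often the timeout has fired */
};

/* Hypothetical driver callback in the spirit of timedout_job(). */
static enum drm_task_status toy_timedout_job(struct toy_hw_job *job)
{
	job->timeouts++;
	if (job->still_running)
		return DRM_TASK_STATUS_ALIVE;

	/* A real driver would abort the job or reset the hardware here. */
	return DRM_TASK_STATUS_COMPLETE;
}

/*
 * Hypothetical scheduler-side handling. Returns true when the timer
 * should be re-armed because the job was given more time.
 */
static bool toy_handle_timeout(struct toy_hw_job *job)
{
	switch (toy_timedout_job(job)) {
	case DRM_TASK_STATUS_ALIVE:
		return true;	/* job stays on the pending list */
	case DRM_TASK_STATUS_COMPLETE:
	default:
		job->on_pending_list = false;
		return false;
	}
}
```

In this model a job that reports ALIVE never leaves the pending
list, which is the "extend the timeout without jobs
disappearing" behaviour named in the subject line.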

This series applies to drm-misc-next at 0a260e731d6c.

Tested and works, but I get a lot of
WARN_ON(bo->pin_count) warnings from ttm_bo_release()
for the VCN ring of amdgpu.

Cc: Alexander Deucher <Alexander.Deucher at amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky at amd.com>
Cc: Christian König <christian.koenig at amd.com>
Cc: Daniel Vetter <daniel.vetter at ffwll.ch>

Luben Tuikov (5):
  drm/scheduler: "node" --> "list"
  gpu/drm: ring_mirror_list --> pending_list
  drm/scheduler: Essentialize the job done callback
  drm/scheduler: Job timeout handler returns status (v2)
  drm/sched: Make use of a "done" list (v2)

 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |   8 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c     |  10 +-
 drivers/gpu/drm/lima/lima_sched.c           |   4 +-
 drivers/gpu/drm/panfrost/panfrost_job.c     |   9 +-
 drivers/gpu/drm/scheduler/sched_main.c      | 345 +++++++++++---------
 drivers/gpu/drm/v3d/v3d_sched.c             |  32 +-
 include/drm/gpu_scheduler.h                 |  38 ++-
 9 files changed, 255 insertions(+), 201 deletions(-)

-- 
2.29.2.404.ge67fbf927d


