[PATCH] drm/amdgpu: Replace 'amdgpu_job_submit_direct' with 'drm_sched_entity' in cleaner shader
Christian König
christian.koenig at amd.com
Wed Sep 4 07:56:17 UTC 2024
Am 04.09.24 um 09:12 schrieb Srinivasan Shanmugam:
> This commit replaces the use of amdgpu_job_submit_direct which submits
> the job to the ring directly, with drm_sched_entity in the cleaner
> shader job submission process. The change allows the GPU scheduler to
> manage the cleaner shader job.
>
> - The job is then submitted to the GPU using the
> drm_sched_entity_push_job function, which allows the GPU scheduler to
> manage the job.
>
> This change improves the reliability of the cleaner shader job
> submission process by leveraging the capabilities of the GPU scheduler.
>
> Fixes: f70111466165 ("drm/amdgpu: Add sysfs interface for running cleaner shader")
> Cc: Christian König <christian.koenig at amd.com>
> Cc: Alex Deucher <alexander.deucher at amd.com>
> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam at amd.com>
> Suggested-by: Christian König <christian.koenig at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 32 ++++++++++++-------------
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 1 +
> 2 files changed, 17 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index b779d47a546a..149939c4eaed 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -1397,14 +1397,22 @@ static ssize_t amdgpu_gfx_get_available_compute_partition(struct device *dev,
> static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
> {
> struct amdgpu_device *adev = ring->adev;
> - long timeout = msecs_to_jiffies(1000);
> + struct drm_gpu_scheduler *sched = &ring->sched;
> struct dma_fence *f = NULL;
Please drop that default assignment of f.
> struct amdgpu_job *job;
> struct amdgpu_ib *ib;
> int i, r;
>
> - r = amdgpu_job_alloc_with_ib(adev, NULL, NULL,
> - 64, AMDGPU_IB_POOL_DIRECT,
> + /* Initialize the scheduler entity */
> + r = drm_sched_entity_init(&adev->gfx.entity, DRM_SCHED_PRIORITY_NORMAL,
> + &sched, 1, NULL);
> + if (r) {
> + dev_err(adev->dev, "Failed setting up GFX kernel entity.\n");
> + goto err;
> + }
> +
> + r = amdgpu_job_alloc_with_ib(ring->adev, &adev->gfx.entity, NULL,
> + 64, 0,
> &job);
> if (r)
> goto err;
> @@ -1416,24 +1424,16 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
> ib->ptr[i] = ring->funcs->nop;
> ib->length_dw = ring->funcs->align_mask + 1;
>
> - r = amdgpu_job_submit_direct(job, ring, &f);
> + f = amdgpu_job_submit(job);
> + r = dma_fence_wait(f, false);
Where is f assigned?
> if (r)
> - goto err_free;
> -
> - r = dma_fence_wait_timeout(f, false, timeout);
> - if (r == 0)
> - r = -ETIMEDOUT;
> - else if (r > 0)
> - r = 0;
> -
> - amdgpu_ib_free(adev, ib, f);
> + goto err;
> dma_fence_put(f);
>
> + /* Clean up the scheduler entity */
> + drm_sched_entity_destroy(&adev->gfx.entity);
> return 0;
>
> -err_free:
> - amdgpu_job_free(job);
> - amdgpu_ib_free(adev, ib, f);
> err:
> return r;
> }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> index 5644e10a86a9..3c268476dec8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> @@ -472,6 +472,7 @@ struct amdgpu_gfx {
> struct mutex kfd_sch_mutex;
> u64 kfd_sch_req_count[MAX_XCP];
> bool kfd_sch_inactive[MAX_XCP];
> + struct drm_sched_entity entity;
You can't put that into the amdgpu_gfx structure. The sysfs might be
called concurrently.
I suggest to put it on the stack.
Christian.
> };
>
> struct amdgpu_gfx_ras_reg_entry {
More information about the amd-gfx
mailing list