[PATCH V2] drm/amdgpu: make compute timeouts consistent
Alex Deucher
alexdeucher at gmail.com
Tue Jul 15 13:45:05 UTC 2025
Ping?
On Tue, Jun 24, 2025 at 10:32 PM Alex Deucher <alexander.deucher at amd.com> wrote:
>
> For kernel compute queues, align the timeout with
> other kernel queues (10 sec). This had previously
> been set higher for OpenCL when it used kernel
> queues, but now OpenCL uses KFD user queues which
> don't have a timeout limitation. This also aligns
> with SR-IOV which already used a shorter timeout.
>
> Additionally, the longer timeout negatively impacts
> the user experience with kernel queues for interactive
> applications.
>
> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
> ---
>
> V2: fix documentation as well
>
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 ++----------
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 10 +++++-----
> 2 files changed, 7 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index c8a6b3689deae..58a0ee99eb2f8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4171,18 +4171,10 @@ static int amdgpu_device_get_job_timeout_settings(struct amdgpu_device *adev)
> int ret = 0;
>
> /*
> - * By default timeout for non compute jobs is 10000
> - * and 60000 for compute jobs.
> - * In SR-IOV or passthrough mode, timeout for compute
> - * jobs are 60000 by default.
> + * By default timeout for jobs is 10 sec
> */
> - adev->gfx_timeout = msecs_to_jiffies(10000);
> + adev->compute_timeout = adev->gfx_timeout = msecs_to_jiffies(10000);
> adev->sdma_timeout = adev->video_timeout = adev->gfx_timeout;
> - if (amdgpu_sriov_vf(adev))
> - adev->compute_timeout = amdgpu_sriov_is_pp_one_vf(adev) ?
> - msecs_to_jiffies(60000) : msecs_to_jiffies(10000);
> - else
> - adev->compute_timeout = msecs_to_jiffies(60000);
>
> if (strnlen(input, AMDGPU_MAX_TIMEOUT_PARAM_LENGTH)) {
> while ((timeout_setting = strsep(&input, ",")) &&
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 7e3fa76227033..7bc326d0b4000 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -362,12 +362,12 @@ module_param_named(svm_default_granularity, amdgpu_svm_default_granularity, uint
> * The second one is for Compute. The third and fourth ones are
> * for SDMA and Video.
> *
> - * By default(with no lockup_timeout settings), the timeout for all non-compute(GFX, SDMA and Video)
> - * jobs is 10000. The timeout for compute is 60000.
> + * By default(with no lockup_timeout settings), the timeout for all jobs is 10000.
> */
> -MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default: for bare metal 10000 for non-compute jobs and 60000 for compute jobs; "
> - "for passthrough or sriov, 10000 for all jobs. 0: keep default value. negative: infinity timeout), format: for bare metal [Non-Compute] or [GFX,Compute,SDMA,Video]; "
> - "for passthrough or sriov [all jobs] or [GFX,Compute,SDMA,Video].");
> +MODULE_PARM_DESC(lockup_timeout,
> + "GPU lockup timeout in ms (default: 10000 for all jobs. "
> + "0: keep default value. negative: infinity timeout), format: for bare metal [Non-Compute] or [GFX,Compute,SDMA,Video]; "
> + "for passthrough or sriov [all jobs] or [GFX,Compute,SDMA,Video].");
> module_param_string(lockup_timeout, amdgpu_lockup_timeout, sizeof(amdgpu_lockup_timeout), 0444);
>
> /**
> --
> 2.49.0
>
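For anyone reviewing, here is a minimal userspace sketch of how the lockup_timeout string would be interpreted with the new defaults, going only by the parameter description in the patch above. It is not the kernel code: parse_lockup_timeout(), struct timeouts_ms, the is_virtualized flag, and TIMEOUT_INFINITE are hypothetical names chosen for illustration, and values stay in milliseconds rather than being converted to jiffies.

```c
#include <stdlib.h>
#include <string.h>

#define DEFAULT_TIMEOUT_MS 10000L  /* 10 sec default for all queues after this patch */
#define TIMEOUT_INFINITE   (-1L)   /* hypothetical stand-in for "no timeout" */

/* Hypothetical container; order matches [GFX,Compute,SDMA,Video]. */
struct timeouts_ms {
    long gfx, compute, sdma, video;
};

/*
 * Hypothetical helper, not the kernel function.
 * Returns 0 on success, -1 on a malformed or oversized string.
 * is_virtualized stands in for "passthrough or sriov".
 */
static int parse_lockup_timeout(const char *param, int is_virtualized,
                                struct timeouts_ms *t)
{
    char buf[256];
    long *slot[4] = { &t->gfx, &t->compute, &t->sdma, &t->video };
    char *cursor;
    int index = 0;

    /* Start from the uniform default. */
    t->gfx = t->compute = t->sdma = t->video = DEFAULT_TIMEOUT_MS;

    if (!param || !*param)
        return 0;
    if (strlen(param) >= sizeof(buf))
        return -1;
    strcpy(buf, param);

    cursor = buf;
    while (cursor && index < 4) {
        char *tok = cursor;
        char *comma = strchr(cursor, ',');
        char *end;
        long v;

        if (comma) {
            *comma = '\0';
            cursor = comma + 1;
        } else {
            cursor = NULL;
        }

        v = strtol(tok, &end, 10);
        if (end == tok || *end != '\0')
            return -1;

        if (v < 0)
            *slot[index] = TIMEOUT_INFINITE;  /* negative: infinite timeout */
        else if (v > 0)
            *slot[index] = v;                 /* explicit value in ms */
        /* v == 0: keep the default for this slot */
        index++;
    }

    /*
     * A single value applies to the non-compute queues on bare metal,
     * and to all queues under passthrough or SR-IOV.
     */
    if (index == 1) {
        t->sdma = t->video = t->gfx;
        if (is_virtualized)
            t->compute = t->gfx;
    }
    return 0;
}
```

Under these assumptions, lockup_timeout=5000 on bare metal gives 5000 for GFX, SDMA and Video and leaves Compute at the 10000 default, while lockup_timeout=0,-1,2000,3000 keeps the GFX default, removes the Compute timeout, and sets SDMA and Video to 2000 and 3000.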