[PATCH v3 3/4] drm/amdgpu: use drm_file_err in logging to also dump process information
Tvrtko Ursulin
tvrtko.ursulin at igalia.com
Wed Apr 16 07:26:40 UTC 2025
On 15/04/2025 19:43, Sunil Khatri wrote:
> Add process name and pid information to the userqueue error
> logging so that the logs are more useful when debugging
> failures.
>
> Sample log:
> [ 42.444297] [drm:amdgpu_userqueue_wait_for_signal [amdgpu]] *ERROR* Timed out waiting for fence f=000000001c74d978 for comm:Xwayland pid:3427
> [ 42.444669] [drm:amdgpu_userqueue_suspend [amdgpu]] *ERROR* Not suspending userqueue, timeout waiting for comm:Xwayland pid:3427
> [ 42.824729] [drm:amdgpu_userqueue_wait_for_signal [amdgpu]] *ERROR* Timed out waiting for fence f=0000000074407d3e for comm:systemd-logind pid:1058
> [ 42.825082] [drm:amdgpu_userqueue_suspend [amdgpu]] *ERROR* Not suspending userqueue, timeout waiting for comm:systemd-logind pid:1058
>
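For readers without the rest of the series at hand, here is a minimal userspace sketch of the idea behind the sample log: format the caller's message, then tag it with the owning process name and pid. It is not the kernel's drm_file_err() definition, which lives in another patch of this series and is not shown here. The names demo_file and demo_file_err are hypothetical, and the " for comm:... pid:..." suffix is simply copied from the sample log above.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-client state; NOT struct drm_file. */
struct demo_file {
	const char *comm;	/* process name, e.g. "Xwayland" */
	int pid;		/* process id, e.g. 3427 */
};

/*
 * Hypothetical helper: format the message into buf, drop one trailing
 * newline, then append " for comm:<name> pid:<pid>\n".
 * Returns the resulting string length, or -1 if buf is too small.
 */
static int demo_file_err(char *buf, size_t len, const struct demo_file *file,
			 const char *fmt, ...)
{
	va_list ap;
	int n, m;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= len)
		return -1;

	/* Keep the process tag on the same line as the message. */
	if (n > 0 && buf[n - 1] == '\n')
		buf[--n] = '\0';

	m = snprintf(buf + n, len - (size_t)n, " for comm:%s pid:%d\n",
		     file->comm, file->pid);
	if (m < 0 || (size_t)m >= len - (size_t)n)
		return -1;

	return n + m;
}
```

The point of routing every message through one helper is that call sites only pass the file and the message, so the process tag cannot drift between log lines.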
> Signed-off-by: Sunil Khatri <sunil.khatri at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> index 1867520ba258..05c1ee27a319 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> @@ -43,7 +43,7 @@ amdgpu_userqueue_cleanup(struct amdgpu_userq_mgr *uq_mgr,
> if (f && !dma_fence_is_signaled(f)) {
> ret = dma_fence_wait_timeout(f, true, msecs_to_jiffies(100));
> if (ret <= 0) {
> - DRM_ERROR("Timed out waiting for fence f=%p\n", f);
> + drm_file_err(uq_mgr->file, "Timed out waiting for fence f=%p\n", f);
You decided to leave %p after all?
> return;
> }
> }
> @@ -440,7 +440,8 @@ amdgpu_userqueue_resume_all(struct amdgpu_userq_mgr *uq_mgr)
> }
>
> if (ret)
> - DRM_ERROR("Failed to map all the queues\n");
> + drm_file_err(uq_mgr->file, "Failed to map all the queue\n");
You lost the plural by accident.
I am also not sure "all the queues" makes sense in this context versus
"all queues" but it's inconsequential really.
> +
> return ret;
> }
>
> @@ -598,7 +599,8 @@ amdgpu_userqueue_suspend_all(struct amdgpu_userq_mgr *uq_mgr)
> }
>
> if (ret)
> - DRM_ERROR("Couldn't unmap all the queues\n");
> + drm_file_err(uq_mgr->file, "Couldn't unmap all the queues\n");
> +
> return ret;
> }
>
> @@ -615,7 +617,7 @@ amdgpu_userqueue_wait_for_signal(struct amdgpu_userq_mgr *uq_mgr)
> continue;
> ret = dma_fence_wait_timeout(f, true, msecs_to_jiffies(100));
> if (ret <= 0) {
> - DRM_ERROR("Timed out waiting for fence f=%p\n", f);
> + drm_file_err(uq_mgr->file, "Timed out waiting for fence f=%p\n", f);
> return -ETIMEDOUT;
> }
> }
> @@ -634,13 +636,13 @@ amdgpu_userqueue_suspend(struct amdgpu_userq_mgr *uq_mgr,
> /* Wait for any pending userqueue fence work to finish */
> ret = amdgpu_userqueue_wait_for_signal(uq_mgr);
> if (ret) {
> - DRM_ERROR("Not suspending userqueue, timeout waiting for work\n");
> + drm_file_err(uq_mgr->file, "Not suspending userqueue, timeout waiting\n");
> return;
> }
>
> ret = amdgpu_userqueue_suspend_all(uq_mgr);
> if (ret) {
> - DRM_ERROR("Failed to evict userqueue\n");
> + drm_file_err(uq_mgr->file, "Failed to evict userqueue\n");
> return;
It is pre-existing, but it strikes me as odd that a failure of
amdgpu_userqueue_suspend_all() is logged as a failure to *evict* rather
than to suspend (as the previous log message does). Anyway, I did not
look at the surrounding code, so I am just thinking out loud.
Regards,
Tvrtko
> }
>