[PATCH] drm/amdgpu: correctly report gpu recover status

Quan, Evan Evan.Quan at amd.com
Thu Dec 19 01:48:58 UTC 2019


Hi Christian,

Here is some background for this change:
I'm debugging a random failure on BACO reset.
I used a while loop to run the BACO reset tests continuously and expected it to exit immediately when a failure occurred.
However, because of the wrong return value, it did not. And as you can imagine, the failure scene was ruined.
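
For reference, the kind of test loop I mean can be sketched roughly as below. This is only an illustration, not the actual test: the helper names trigger_gpu_recover() and run_reset_loop() are made up, the debugfs node is assumed to be something like /sys/kernel/debug/dri/0/amdgpu_gpu_recover, and it assumes that with the patch applied a failed recovery shows up in userspace as a failing read().

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: trigger one recovery by reading the debugfs node
 * to EOF. Returns 0 on success, or a negative errno value if open() or
 * read() fails. Assumption: once the show callback returns the recovery
 * status, a failed recovery makes read() fail.
 */
static int trigger_gpu_recover(const char *path)
{
	char buf[256];
	ssize_t n;
	int ret = 0;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -errno;

	/* Drain the file, retrying reads interrupted by a signal. */
	do {
		n = read(fd, buf, sizeof(buf));
	} while (n > 0 || (n < 0 && errno == EINTR));

	if (n < 0)
		ret = -errno;

	close(fd);
	return ret;
}

/*
 * Hypothetical helper: run up to max_iters resets and stop at the first
 * failure so the failure scene is left untouched. Returns the number of
 * successful iterations; *err receives 0 or the negative errno value.
 */
static int run_reset_loop(const char *path, int max_iters, int *err)
{
	int i;

	*err = 0;
	for (i = 0; i < max_iters; i++) {
		*err = trigger_gpu_recover(path);
		if (*err) {
			fprintf(stderr, "iteration %d failed: %s\n",
				i, strerror(-*err));
			break;
		}
	}
	return i;
}
```

A small main() would call run_reset_loop() with the debugfs path and a large iteration count. With the current "return 0" the read never fails, so a loop like this keeps going after a failed reset.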

I can add this "seq_printf(m, "gpu recover %d\n", r);".
But what I care about more (and what is also the easiest way for me) is a correct return value from the API.

Regards,
Evan
> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken at gmail.com>
> Sent: Wednesday, December 18, 2019 5:57 PM
> To: Quan, Evan <Evan.Quan at amd.com>; amd-gfx at lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: correctly report gpu recover status
> 
> Am 18.12.19 um 04:25 schrieb Evan Quan:
> > Knowing whether gpu recovery was performed successfully or not is
> > important for our BACO development.
> >
> > Change-Id: I0e3ca4dcb65a053eb26bc55ad7431e4a42e160de
> > Signed-off-by: Evan Quan <evan.quan at amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 4 +---
> >   1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > index e9efee04ca23..5dff5c0dd882 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > @@ -743,9 +743,7 @@ static int amdgpu_debugfs_gpu_recover(struct seq_file *m, void *data)
> >   	struct amdgpu_device *adev = dev->dev_private;
> >
> >   	seq_printf(m, "gpu recover\n");
> > -	amdgpu_device_gpu_recover(adev, NULL);
> > -
> > -	return 0;
> > +	return amdgpu_device_gpu_recover(adev, NULL);
> 
> NAK, what we could do here is the following:
> 
> r = amdgpu_device_gpu_recover(....);
> seq_printf(m, "gpu recover %d\n", r);
> 
> But returning the error code from the GPU recovery to userspace doesn't make
> too much sense.
> 
> Christian.
> 
> >   }
> >
> >   static const struct drm_info_list amdgpu_debugfs_fence_list[] = {


