[PATCH 1/1] drm/amdgpu: Use device wedged event

André Almeida andrealmeid at igalia.com
Fri Dec 13 15:56:50 UTC 2024



On 13/12/2024 11:36, Raag Jadav wrote:
> On Fri, Dec 13, 2024 at 11:15:31AM -0300, André Almeida wrote:
>> Hi Christian,
>>
>> On 13/12/2024 04:34, Christian König wrote:
>>> On 12.12.24 at 20:09, André Almeida wrote:
>>>> Use DRM's device wedged event to notify userspace that a reset has
>>>> happened. For now, only use the `none` method, meant for telemetry
>>>> capture.
>>>>
>>>> Signed-off-by: André Almeida <andrealmeid at igalia.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++
>>>>    1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index 96316111300a..19e1a5493778 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -6057,6 +6057,9 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>>            dev_info(adev->dev, "GPU reset end with ret = %d\n", r);
>>>>        atomic_set(&adev->reset_domain->reset_res, r);
>>>> +
>>>> +    drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE);
>>>
>>> That looks really good in general. I would just make the
>>> DRM_WEDGE_RECOVERY_NONE depend on the value of "r".
>>>
>>
>> Why depend on `r`? A reset was triggered anyway, regardless of whether it
>> succeeded; shouldn't we tell userspace?
> 
> A failed reset would perhaps result in wedging; at least that's how i915
> is handling it.
> 

Right, and I think this raises the question of which wedge recovery
method I should add for amdgpu... Christian?
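
To make the question concrete, here is a rough sketch of what "depend on
the value of r" could look like. It is not the proposed patch. The
SKETCH_* constants are stand-ins defined locally so the snippet compiles
outside the kernel (their values are assumptions), sketch_pick_wedge_method()
is a hypothetical helper name, and the choice of rebind plus bus reset for
a failed reset is only a guess pending the answer above.

```c
/*
 * Sketch only: local stand-ins for the DRM wedge recovery flags so this
 * compiles outside the kernel. The values here are assumptions; the real
 * DRM_WEDGE_RECOVERY_* definitions live in the DRM headers.
 */
#define SKETCH_WEDGE_RECOVERY_NONE      (1UL << 0)
#define SKETCH_WEDGE_RECOVERY_REBIND    (1UL << 1)
#define SKETCH_WEDGE_RECOVERY_BUS_RESET (1UL << 2)

/*
 * Hypothetical helper: pick the recovery method from the reset result r.
 * r == 0 means the reset succeeded, so nothing is asked of userspace and
 * the event only serves telemetry. Any other value is treated as a failed
 * reset, i.e. the device is assumed to be wedged.
 */
static unsigned long sketch_pick_wedge_method(int r)
{
	if (r == 0)
		return SKETCH_WEDGE_RECOVERY_NONE;

	/* Assumption: which methods amdgpu should advertise is still open. */
	return SKETCH_WEDGE_RECOVERY_REBIND | SKETCH_WEDGE_RECOVERY_BUS_RESET;
}
```

In amdgpu_device_gpu_recover() the result of such a helper would then be
passed to drm_dev_wedged_event(adev_to_drm(adev), ...) in place of the
fixed DRM_WEDGE_RECOVERY_NONE.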


