[RFC PATCH 0/1] drm/amdgpu: Fix NULL-deref in amdgpu_device_fini_sw()

Zhang Boyang zhangboyang.id at gmail.com
Fri Sep 30 21:41:09 UTC 2022


Hi,

There are several reports that a "Fatal error during GPU init" leads to
a NULL-deref in amdgpu_device_fini_sw(). Although the NULL-deref is a
consequence of the init failure rather than its cause, it confuses users.

https://lore.kernel.org/lkml/a8bce489-8ccc-aa95-3de6-f854e03ad557@suddenlinkmail.com/
https://lore.kernel.org/lkml/AT9WHR.3Z1T3VI9A2AQ3@att.net/

This is probably because "adev" is not fully initialized when
amdgpu_device_init() fails. The subsequent amdgpu_device_fini_sw() then
tries to release "adev->reset_domain", which causes the NULL-deref.

This patch fixes the problem by guarding the release with an "if" check.
However, I'm new to this module and don't fully understand the code,
so please review the patch carefully.
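
For illustration, the idea behind the guard can be sketched in plain C
as follows. This is not the actual patch: the mock_* types and functions
are hypothetical stand-ins for the amdgpu ones, and the sketch assumes
the reset domain pointer is still NULL when init fails early.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only; not the real amdgpu types. */
struct mock_reset_domain {
	int refcount;
};

struct mock_device {
	/* Assumed to stay NULL if init failed before creating the domain. */
	struct mock_reset_domain *reset_domain;
};

/* Counts how often the release helper runs, for demonstration purposes. */
static int mock_put_calls;

/* Drops one reference. It dereferences 'domain', so NULL must not be passed. */
static void mock_put_reset_domain(struct mock_reset_domain *domain)
{
	mock_put_calls++;
	if (--domain->refcount == 0)
		free(domain);
}

/* Software teardown that tolerates a partially initialized device. */
static void mock_device_fini_sw(struct mock_device *dev)
{
	/* The guard: only release the reset domain if it was ever set up. */
	if (dev->reset_domain) {
		mock_put_reset_domain(dev->reset_domain);
		dev->reset_domain = NULL;
	}
}
```

Calling mock_device_fini_sw() on a device whose reset_domain is NULL
returns without touching the pointer, while a fully initialized device
still drops its reference as before.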

Best Regards,
Zhang Boyang



