[PATCH v3] drm/amdgpu: Call amdgpu_device_unmap_mmio() if device is unplugged to prevent crash in GPU initialization failure

Christian König christian.koenig at amd.com
Fri Dec 17 08:49:10 UTC 2021


On 17.12.21 at 03:26, Leslie Shi wrote:
> [Why]
> In amdgpu_driver_load_kms, when amdgpu_device_init returns an error during driver modprobe, it
> starts the error handling path immediately and also calls amdgpu_device_unmap_mmio
> to release mapped VRAM. However, in the following release callback, the driver still accesses the
> unmapped memory, such as vcn.inst[i].fw_shared_cpu_addr in vcn_v3_0_sw_fini, so a kernel crash occurs.
>
> [How]
> Call amdgpu_device_unmap_mmio() only if the device is unplugged, to prevent an invalid memory access in
> vcn_v3_0_sw_fini() when GPU initialization fails.
>
> Signed-off-by: Leslie Shi <Yuliang.Shi at amd.com>

Looks sane to me, but Andrey should probably nod as well.

Acked-by: Christian König <christian.koenig at amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f31caec669e7..f6a47b927cfd 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3899,7 +3899,9 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
>   
>   	amdgpu_gart_dummy_page_fini(adev);
>   
> -	amdgpu_device_unmap_mmio(adev);
> +	if (drm_dev_is_unplugged(adev_to_drm(adev)))
> +		amdgpu_device_unmap_mmio(adev);
> +
>   }
>   
>   void amdgpu_device_fini_sw(struct amdgpu_device *adev)
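
For illustration only, the ordering problem described under [Why] can be
modelled in plain userspace C. This is a rough sketch, not driver code:
struct mock_dev, mock_fini_hw(), mock_sw_fini() and mock_teardown() are
hypothetical names, the "mapped" flag merely stands in for the MMIO/VRAM
mapping, and the assumption that the sw_fini path skips the access once the
device is unplugged is mine rather than something stated in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct amdgpu_device; not the real layout. */
struct mock_dev {
	bool unplugged; /* models drm_dev_is_unplugged() */
	bool mapped;    /* models whether the MMIO/VRAM mapping still exists */
};

enum fini_result { FINI_TOUCHED, FINI_SKIPPED, FINI_FAULT };

/* Models the tail of amdgpu_device_fini_hw(), before or after the patch. */
static void mock_fini_hw(struct mock_dev *d, bool patched)
{
	assert(d != NULL);
	if (!patched || d->unplugged)
		d->mapped = false; /* models amdgpu_device_unmap_mmio() */
}

/* Models a sw_fini callback that writes through a CPU pointer into VRAM. */
static enum fini_result mock_sw_fini(const struct mock_dev *d)
{
	assert(d != NULL);
	if (d->unplugged)
		return FINI_SKIPPED; /* assumption: access is skipped once unplugged */
	return d->mapped ? FINI_TOUCHED : FINI_FAULT;
}

/* Runs fini_hw followed by sw_fini, as in the failed-init error path. */
static enum fini_result mock_teardown(bool unplugged, bool patched)
{
	struct mock_dev d = { .unplugged = unplugged, .mapped = true };

	mock_fini_hw(&d, patched);
	return mock_sw_fini(&d);
}
```

In this model, mock_teardown(false, false) corresponds to the failed-init
case before the patch (the mapping is gone when sw_fini runs), while
mock_teardown(false, true) corresponds to the same case with the patch
applied (the mapping is still present when sw_fini touches it).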


