[PATCH 4/4] drm/amdgpu: Move amdgpu_ras_recovery_init to after SMU ready.
Deucher, Alexander
Alexander.Deucher at amd.com
Mon Oct 21 13:24:15 UTC 2019
> -----Original Message-----
> From: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
> Sent: Friday, October 18, 2019 4:49 PM
> To: amd-gfx at lists.freedesktop.org
> Cc: Chen, Guchun <Guchun.Chen at amd.com>; Zhou1, Tao
> <Tao.Zhou1 at amd.com>; Deucher, Alexander
> <Alexander.Deucher at amd.com>; noreply-confluence at amd.com; Quan,
> Evan <Evan.Quan at amd.com>; Grodzovsky, Andrey
> <Andrey.Grodzovsky at amd.com>
> Subject: [PATCH 4/4] drm/amdgpu: Move amdgpu_ras_recovery_init to
> after SMU ready.
>
> On Arcturus, I2C traffic goes through SMU tables, so RAS recovery init
> must be postponed until those tables are ready, which happens in
> amdgpu_device_ip_hw_init_phase2.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 13 +++++++++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 11 -----------
> 2 files changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 17cfdaf..c40e9a5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1850,6 +1850,19 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
> if (r)
> goto init_failed;
>
> + /*
> + * Retired pages will be loaded from EEPROM and reserved here.
> + * This must be called after amdgpu_device_ip_hw_init_phase2, since
> + * for some ASICs the RAS EEPROM code relies on a fully functioning SMU
> + * for I2C communication, which is only true at this point.
> + * recovery_init may fail, but it frees all resources it allocated
> + * itself, and its failure should not stop the amdgpu init process.
> + *
> + * Note: theoretically, this should be called before all VRAM allocations
> + * so that retired pages are never handed out.
> + */
> + amdgpu_ras_recovery_init(adev);
> +
> if (adev->gmc.xgmi.num_physical_nodes > 1)
> amdgpu_xgmi_add_device(adev);
> amdgpu_amdkfd_device_init(adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 2e85a51..1045c3f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1721,17 +1721,6 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
> #endif
>
> /*
> - * retired pages will be loaded from eeprom and reserved here,
> - * it should be called after ttm init since new bo may be created,
> - * recovery_init may fail, but it can free all resources allocated by
> - * itself and its failure should not stop amdgpu init process.
> - *
> - * Note: theoretically, this should be called before all vram allocations
> - * to protect retired page from abusing
> - */
> - amdgpu_ras_recovery_init(adev);
> -
> - /*
> *The reserved vram for firmware must be pinned to the specified
> *place on the VRAM, so reserve it early.
> */
> --
> 2.7.4
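
For readers following the ordering argument in the commit message, here is a
minimal standalone sketch of the constraint. It is not driver code: struct
toy_device and every toy_* function are hypothetical stand-ins, and it assumes
(as the commit message implies) that the old call site in amdgpu_ttm_init ran
before amdgpu_device_ip_hw_init_phase2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device; NOT the real struct amdgpu_device. */
struct toy_device {
	bool i2c_via_smu;          /* true models an Arcturus-like ASIC */
	bool smu_ready;            /* set once the modeled phase-2 init has run */
	bool retired_pages_loaded; /* set when the modeled EEPROM read succeeds */
};

/* Models hardware init phase 2: after this, the SMU is usable. */
static void toy_hw_init_phase2(struct toy_device *dev)
{
	assert(dev != NULL);
	dev->smu_ready = true;
}

/* Models RAS recovery init: reading the EEPROM needs working I2C. */
static int toy_ras_recovery_init(struct toy_device *dev)
{
	assert(dev != NULL);
	if (dev->i2c_via_smu && !dev->smu_ready)
		return -1; /* EEPROM unreachable, nothing loaded */
	dev->retired_pages_loaded = true;
	return 0;
}

/* Assumed old order: recovery init runs before phase 2. */
static bool toy_init_old_order(struct toy_device *dev)
{
	(void)toy_ras_recovery_init(dev); /* failure is non-fatal, as in the patch */
	toy_hw_init_phase2(dev);
	return dev->retired_pages_loaded;
}

/* New order from the patch: recovery init runs after phase 2. */
static bool toy_init_new_order(struct toy_device *dev)
{
	toy_hw_init_phase2(dev);
	(void)toy_ras_recovery_init(dev);
	return dev->retired_pages_loaded;
}
```

In this model, a device with i2c_via_smu set only ends up with its retired
pages loaded under toy_init_new_order; a device that does not route I2C
through the SMU loads them under either order. Because the return value of
recovery init is ignored in both orders, the early failure would be silent,
which is why the call site matters.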