[PATCH 2/4] drm/amdgpu: Clarify error when hitting bad page threshold

Luben Tuikov luben.tuikov at amd.com
Tue Oct 19 18:47:04 UTC 2021


Reviewed-by: Luben Tuikov <luben.tuikov at amd.com>

Regards,
Luben

On 2021-10-19 13:50, Kent Russell wrote:
> Change the error message when the bad_page_threshold is reached,
> explicitly stating that the GPU will not be initialized.
>
> Cc: Luben Tuikov <luben.tuikov at amd.com>
> Cc: Mukul Joshi <Mukul.Joshi at amd.com>
> Signed-off-by: Kent Russell <kent.russell at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index 8270aad23a06..7bb506a0ebd6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -1111,7 +1111,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
>  			*exceed_err_limit = true;
>  			dev_err(adev->dev,
>  				"RAS records:%d exceed threshold:%d, "
> -				"maybe retire this GPU?",
> +				"GPU will not be initialized. Replace this GPU or increase the threshold",
>  				control->ras_num_recs, ras->bad_page_cnt_threshold);
>  		}
>  	} else {


