[PATCH v2] drm/amd/amdgpu: set the default value of noretry to 1 for some dGPUs
Felix Kuehling
felix.kuehling at amd.com
Tue Oct 13 15:32:08 UTC 2020
Do you have more details about those test failures? In theory, that test
should pass with noretry=0. If it fails, I'd rather look into the
problem than hide it with a workaround.
Regards,
Felix
On 2020-10-13 at 11:13 a.m., Chengming Gui wrote:
> noretry = 0 cause some dGPU's kfd page fault tests fail,
> so set noretry to 1 for these special ASICs:
> vega20/navi10/navi14/ARCTURUS
>
> v2:merge raven and default case due to the same setting
>
> Signed-off-by: Chengming Gui <Jack.Gui at amd.com>
> Change-Id: I3be70f463a49b0cd5c56456431d6c2cb98b13872
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 24 ++++++++++++++++--------
> 1 file changed, 16 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index 36604d751d62..3b7b9a5e9749 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -425,20 +425,28 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev)
> struct amdgpu_gmc *gmc = &adev->gmc;
>
> switch (adev->asic_type) {
> - case CHIP_RAVEN:
> - /* Raven currently has issues with noretry
> - * regardless of what we decide for other
> - * asics, we should leave raven with
> - * noretry = 0 until we root cause the
> - * issues.
> + case CHIP_VEGA20:
> + case CHIP_NAVI10:
> + case CHIP_NAVI14:
> + case CHIP_ARCTURUS:
> + /*
> + * noretry = 0 will cause kfd page fault tests fail
> + * for some ASICs, so set default to 1 for these ASICs.
> */
> if (amdgpu_noretry == -1)
> - gmc->noretry = 0;
> + gmc->noretry = 1;
> else
> gmc->noretry = amdgpu_noretry;
> break;
> + case CHIP_RAVEN:
> default:
> - /* default this to 0 for now, but we may want
> + /* Raven currently has issues with noretry
> + * regardless of what we decide for other
> + * asics, we should leave raven with
> + * noretry = 0 until we root cause the
> + * issues.
> + *
> + * default this to 0 for now, but we may want
> * to change this in the future for certain
> * GPUs as it can increase performance in
> * certain cases.