[PATCH] drm/amdgpu: fix sdma doorbell init ordering on APUs

Christian König ckoenig.leichtzumerken at gmail.com
Thu Oct 20 05:59:54 UTC 2022


On 20.10.22 at 05:48, Alex Deucher wrote:
> Commit 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
> uncovered a bug in amdgpu that required a reordering of the driver
> init sequence to avoid accessing a special register on the GPU
> before it was properly set up, leading to a PCI AER error.  This
> reordering uncovered a different hw programming ordering dependency
> in some APUs where the SDMA doorbells need to be programmed before
> the GFX doorbells. To fix this, move the SDMA doorbell programming
> back into the soc15 common code, but use the actual doorbell range
> values directly rather than the values stored in the ring structure
> since those will not be initialized at this point.
>
> This is a partial revert, but with the doorbell assignment
> fixed so the proper doorbell index is set before it's used.
>
> Fixes: e3163bc8ffdfdb ("drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega")
> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
> Cc: skhan at linuxfoundation.org

Acked-by: Christian König <christian.koenig at amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c |  5 -----
>   drivers/gpu/drm/amd/amdgpu/soc15.c     | 21 +++++++++++++++++++++
>   2 files changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> index 298fa11702e7..1122bd4eae98 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> @@ -1417,11 +1417,6 @@ static int sdma_v4_0_start(struct amdgpu_device *adev)
>   		WREG32_SDMA(i, mmSDMA0_CNTL, temp);
>   
>   		if (!amdgpu_sriov_vf(adev)) {
> -			ring = &adev->sdma.instance[i].ring;
> -			adev->nbio.funcs->sdma_doorbell_range(adev, i,
> -				ring->use_doorbell, ring->doorbell_index,
> -				adev->doorbell_index.sdma_doorbell_range);
> -
>   			/* unhalt engine */
>   			temp = RREG32_SDMA(i, mmSDMA0_F32_CNTL);
>   			temp = REG_SET_FIELD(temp, SDMA0_F32_CNTL, HALT, 0);
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> index 183024d7c184..e3b2b6b4f1a6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> @@ -1211,6 +1211,20 @@ static int soc15_common_sw_fini(void *handle)
>   	return 0;
>   }
>   
> +static void soc15_sdma_doorbell_range_init(struct amdgpu_device *adev)
> +{
> +	int i;
> +
> +	/* sdma doorbell range is programmed by hypervisor */
> +	if (!amdgpu_sriov_vf(adev)) {
> +		for (i = 0; i < adev->sdma.num_instances; i++) {
> +			adev->nbio.funcs->sdma_doorbell_range(adev, i,
> +				true, adev->doorbell_index.sdma_engine[i] << 1,
> +				adev->doorbell_index.sdma_doorbell_range);
> +		}
> +	}
> +}
> +
>   static int soc15_common_hw_init(void *handle)
>   {
>   	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> @@ -1230,6 +1244,13 @@ static int soc15_common_hw_init(void *handle)
>   
>   	/* enable the doorbell aperture */
>   	soc15_enable_doorbell_aperture(adev, true);
> +	/* HW doorbell routing policy: doorbell writing not
> +	 * in SDMA/IH/MM/ACV range will be routed to CP. So
> +	 * we need to init SDMA doorbell range prior
> +	 * to CP ip block init and ring test.  IH already
> +	 * happens before CP.
> +	 */
> +	soc15_sdma_doorbell_range_init(adev);
>   
>   	return 0;
>   }
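
For readers following the ordering argument in the comment above, here is a
minimal, self-contained sketch of the routing policy it describes. This is a
toy model, not driver code: the struct, the enum, and all three function names
are hypothetical, and the index and range size used in the checks are made-up
values. The only parts taken from the patch are the "<< 1" conversion of the
SDMA doorbell index and the rule that a doorbell write outside a programmed
range is routed to CP.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the doorbell routing described in the patch. */
enum doorbell_client { CLIENT_CP, CLIENT_SDMA };

struct doorbell_range {
	bool enabled;		/* has the range been programmed yet? */
	unsigned int offset;	/* first doorbell slot owned by SDMA */
	unsigned int size;	/* number of slots in the range */
};

/* Mirrors the "<< 1" applied to sdma_engine[i] in the patch. */
static unsigned int sdma_doorbell_offset(unsigned int engine_index)
{
	return engine_index << 1;
}

/* Stand-in for programming the range; assumed to be a plain store here. */
static void program_sdma_range(struct doorbell_range *r,
			       unsigned int engine_index, unsigned int size)
{
	r->enabled = true;
	r->offset = sdma_doorbell_offset(engine_index);
	r->size = size;
}

/* Anything that does not hit the programmed SDMA range falls through to CP. */
static enum doorbell_client route_doorbell(const struct doorbell_range *r,
					   unsigned int slot)
{
	if (r->enabled && slot >= r->offset && slot < r->offset + r->size)
		return CLIENT_SDMA;
	return CLIENT_CP;
}

/* Shows why the range must be set up before anything rings a doorbell. */
static void check_ordering(void)
{
	struct doorbell_range r = { 0 };
	unsigned int slot = sdma_doorbell_offset(0x100);

	/* Range not programmed yet: the SDMA doorbell lands on CP. */
	assert(route_doorbell(&r, slot) == CLIENT_CP);

	program_sdma_range(&r, 0x100, 20);

	/* Range programmed: the same write now reaches SDMA. */
	assert(route_doorbell(&r, slot) == CLIENT_SDMA);
}
```

Calling check_ordering() from a test main() exercises both cases. In this
model, a doorbell write issued before the range is programmed is
indistinguishable from a CP doorbell, which is the situation the patch avoids
by programming the SDMA range in soc15_common_hw_init().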


