[PATCH] drm/amdgpu/gfx11: need acquire mutex before access CP_VMID_RESET
Xiao, Jack
Jack.Xiao at amd.com
Fri Dec 22 03:04:05 UTC 2023
[AMD Official Use Only - General]
I can see the soft reset can be called from amdgpu_device_gpu_recover and pci error handler, they have called amdgpu_device_lock_reset_domain to acquire a reset lock before soft reset.
Regards,
Jack
-----Original Message-----
From: Christian König <ckoenig.leichtzumerken at gmail.com>
Sent: Wednesday, December 20, 2023 10:06 PM
To: Xiao, Jack <Jack.Xiao at amd.com>; Alex Deucher <alexdeucher at gmail.com>
Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; amd-gfx at lists.freedesktop.org; Zhang, Hawking <Hawking.Zhang at amd.com>
Subject: Re: [PATCH] drm/amdgpu/gfx11: need acquire mutex before access CP_VMID_RESET
Well not the reset lock, but there should only be a single reset queue which this runs on.
Regards,
Christian.
Am 20.12.23 um 10:49 schrieb Xiao, Jack:
> [AMD Official Use Only - General]
>
> It's already protected by the reset lock. In my understanding, soft reset should not run in parallel.
>
> Regards,
> Jack
>
> -----Original Message-----
> From: Alex Deucher <alexdeucher at gmail.com>
> Sent: Wednesday, December 20, 2023 1:04 AM
> To: Xiao, Jack <Jack.Xiao at amd.com>
> Cc: amd-gfx at lists.freedesktop.org; Deucher, Alexander
> <Alexander.Deucher at amd.com>; Zhang, Hawking <Hawking.Zhang at amd.com>
> Subject: Re: [PATCH] drm/amdgpu/gfx11: need acquire mutex before
> access CP_VMID_RESET
>
> On Tue, Dec 19, 2023 at 4:30 AM Jack Xiao <Jack.Xiao at amd.com> wrote:
>> It's required to take the gfx mutex before access to CP_VMID_RESET,
>> for there is a race condition with CP firmware to write the register.
>>
>> Signed-off-by: Jack Xiao <Jack.Xiao at amd.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 20 ++++++++++++++++++++
>> 1 file changed, 20 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> index bdcf96df69e6..ae3370d34d11 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> @@ -4518,6 +4518,22 @@ static int gfx_v11_0_soft_reset(void *handle)
>> }
>> }
>>
> We probably want a CPU mutex or spinlock to protect this as well unless this is already protected by the reset lock.
>
> Alex
>
>> + /* Try to require the gfx mutex before access to CP_VMID_RESET */
>> + for (i = 0; i < adev->usec_timeout; i++) {
>> + /* Request with MeId=2, PipeId=0 */
>> + tmp = REG_SET_FIELD(0, CP_GFX_INDEX_MUTEX, REQUEST, 1);
>> + tmp = REG_SET_FIELD(tmp, CP_GFX_INDEX_MUTEX, CLIENTID, 4);
>> + WREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX, tmp);
>> + if (RREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX) == tmp)
>> + break;
>> + udelay(1);
>> + }
>> +
>> + if (i >= adev->usec_timeout) {
>> + printk("Failed to require the gfx mutex during soft reset\n");
>> + return -EINVAL;
>> + }
>> +
>> WREG32_SOC15(GC, 0, regCP_VMID_RESET, 0xfffffffe);
>>
>> // Read CP_VMID_RESET register three times.
>> @@ -4526,6 +4542,10 @@ static int gfx_v11_0_soft_reset(void *handle)
>> RREG32_SOC15(GC, 0, regCP_VMID_RESET);
>> RREG32_SOC15(GC, 0, regCP_VMID_RESET);
>>
>> + /* release the gfx mutex */
>> + tmp = REG_SET_FIELD(tmp, CP_GFX_INDEX_MUTEX, REQUEST, 0);
>> + WREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX, tmp);
>> +
>> for (i = 0; i < adev->usec_timeout; i++) {
>> if (!RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) &&
>> !RREG32_SOC15(GC, 0, regCP_GFX_HQD_ACTIVE))
>> --
>> 2.41.0
>>
More information about the amd-gfx
mailing list