[PATCH v2 2/2] drm/amdgpu: Don't flush HDP on A+A

Tue Jun 1 14:45:33 UTC 2021

On 2021-06-01 3:05 a.m., Christian König wrote:
> Am 01.06.21 um 02:06 schrieb Eric Huang:
>> With XGMI connection flushing HDP on PCIe is unnecessary,
>> it is also to optimize memory allocation latency.
>>
>> Signed-off-by: Eric Huang <jinhuieric.huang at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_hdp.h | 1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  | 3 ++-
>>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   | 1 +
>>   drivers/gpu/drm/amd/amdgpu/hdp_v4_0.c   | 3 +++
>>   4 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hdp.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_hdp.h
>> index 7ec99d591584..1ca23f2f51d2 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_hdp.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_hdp.h
>> @@ -44,6 +44,7 @@ struct amdgpu_hdp {
>>       struct ras_common_if            *ras_if;
>>       const struct amdgpu_hdp_funcs        *funcs;
>>       const struct amdgpu_hdp_ras_funcs    *ras_funcs;
>> +    bool    no_flush;
>>   };
>>     int amdgpu_hdp_ras_late_init(struct amdgpu_device *adev);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> index aaa2574ce9bc..f31eae2931f3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> @@ -226,7 +226,8 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>> unsigned num_ibs,
>>       if (!(adev->flags & AMD_IS_APU))
>>   #endif
>>       {
>> -        if (ring->funcs->emit_hdp_flush)
>> +        if (ring->funcs->emit_hdp_flush &&
>> +            !adev->hdp.no_flush)
>
> This still emits the flush through MMIO.

As matter of fact, it doesn't, because amdgpu_asic_flush_hdp will check 
the flag again in hdp_v4_0.c. I even think the check here is unnecessary 
for previous asic, because asic other than Aldeberan A+A has to flush 
tlbs according to hardware specific.
>
> What you need to do is to initialize the hdp.no_flush field for all 
> asics and architectures and then use that here in the if above this one.

I don't understand why it should be for all asics. Currently only 
Aldeberan A+A need no TLB flush, we don't have to consider other asics. 
And hdp.no_flush is a common flag for all asics, Is there an issue in 
other asics for this flag on tlb flush?
>
>> amdgpu_ring_emit_hdp_flush(ring);
>>           else
>>               amdgpu_asic_flush_hdp(adev, ring);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> index 2749621d5f63..6e1eab615914 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> @@ -1223,6 +1223,7 @@ static int gmc_v9_0_early_init(void *handle)
>>           adev->gmc.xgmi.supported = true;
>>           adev->gmc.xgmi.connected_to_cpu =
>> adev->smuio.funcs->is_host_gpu_xgmi_supported(adev);
>> +        adev->hdp.no_flush = adev->gmc.xgmi.connected_to_cpu;
>>       }
>>         gmc_v9_0_set_gmc_funcs(adev);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/hdp_v4_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/hdp_v4_0.c
>> index 74b90cc2bf48..e1b2face8656 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/hdp_v4_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/hdp_v4_0.c
>> @@ -40,6 +40,9 @@
>>   static void hdp_v4_0_flush_hdp(struct amdgpu_device *adev,
>>                   struct amdgpu_ring *ring)
>>   {
>> +    if (adev->hdp.no_flush)
>> +        return;
>> +
>
> Just to be clear once more, this approach is a NAK.
>
> Checks like this should not be in the hardware specific function.

As I mention above, TLB flush should be specific for Asic for my opinion.

Regards,
Eric
>
> Regards,
> Christian.
>
>>       if (!ring || !ring->funcs->emit_wreg)
>>           WREG32_NO_KIQ((adev->rmmio_remap.reg_offset + 
>> KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2, 0);
>>       else
>