[PATCH 2/4] drm/amdgpu: workaround for VM fault caused by SDMA set_wptr

Fri Oct 13 09:21:03 UTC 2017

Yes, I tried smp_mb(), but it doesn’t help.
We will keep following up on this issue until the root cause is fixed.
-- 
Sincerely Yours,
Pixel

On 13/10/2017, 5:17 PM, "Christian König" <ckoenig.leichtzumerken at gmail.com> wrote:

>Am 13.10.2017 um 10:26 schrieb Pixel Ding:
>> From: pding <Pixel.Ding at amd.com>
>>
>> The polling memory used to be standalone in VRAM, so the HDP flush
>> introduced latency that hid a VM fault issue. Now that the polling
>> memory uses the WB in system memory and the HDP flush is no longer
>> required, a VM fault occurs, always at the same page.
>>
>> Add the delay back as a workaround until the root cause is found.
>>
>> Tests: VP1 or launch 40 instances of glxinfo at the same time.
>>
>> Signed-off-by: pding <Pixel.Ding at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
>> index b1de44f..5c4bbe1 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
>> @@ -381,6 +381,9 @@ static void sdma_v3_0_ring_set_wptr(struct amdgpu_ring *ring)
>>   	if (ring->use_doorbell) {
>>   		/* XXX check if swapping is necessary on BE */
>>   		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr) << 2;
>> +		/* workaround: VM fault always happens at page 2046 */
>> +		if (amdgpu_sriov_vf(adev))
>> +			udelay(500);
>
>Have you tried using a memory barrier here?
>
>That looks like it will have massive impact on performance.
>
>Regards,
>Christian.
>
>>   		WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr) << 2);
>>   	} else {
>>   		int me = (ring == &ring->adev->sdma.instance[0].ring) ? 0 : 1;
>
>
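For reference, the barrier variant asked about in the quoted review could be sketched roughly as below. This is a user-space C11 model, not the driver code: `struct model_ring`, `model_ring_set_wptr()` and `lower_32()` are hypothetical stand-ins for `struct amdgpu_ring`, `sdma_v3_0_ring_set_wptr()` and `lower_32_bits()`, and `atomic_thread_fence()` stands in for `smp_mb()`. It only models the ordering of the two CPU-side stores (write-back slot first, doorbell second), and per the reply above this ordering alone did not make the VM fault go away.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the parts of the ring touched by set_wptr. */
struct model_ring {
	uint64_t wptr;             /* models ring->wptr */
	_Atomic uint32_t wb_slot;  /* models adev->wb.wb[ring->wptr_offs] */
	_Atomic uint32_t doorbell; /* models the target of WDOORBELL32() */
};

/* Hypothetical equivalent of the kernel's lower_32_bits(). */
static uint32_t lower_32(uint64_t v)
{
	return (uint32_t)v;
}

/* Publish the write pointer, then ring the doorbell, with a full
 * fence in between instead of the udelay(500) from the patch. */
static void model_ring_set_wptr(struct model_ring *ring)
{
	assert(ring != NULL);

	/* Same "<< 2" scaling as in the quoted patch. */
	uint32_t value = lower_32(ring->wptr) << 2;

	atomic_store_explicit(&ring->wb_slot, value, memory_order_relaxed);

	/* User-space analogue of smp_mb(): the write-back store above is
	 * ordered before the doorbell store below. */
	atomic_thread_fence(memory_order_seq_cst);

	atomic_store_explicit(&ring->doorbell, value, memory_order_relaxed);
}
```

Calling `model_ring_set_wptr()` on a ring whose `wptr` is 3 leaves both `wb_slot` and `doorbell` at 12. The model says nothing about how the SDMA engine observes those stores, which is where the real fault would have to be investigated.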