[PATCH] Revert "drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2"

Tue Apr 14 13:51:31 UTC 2020

On 13.04.20 at 20:20, Kent Russell wrote:
> This reverts commit c12b84d6e0d70f1185e6daddfd12afb671791b6e.
> The original patch causes a RAS event and a subsequent kernel hard hang
> when running the KFDMemoryTest.PtraceAccessInvisibleVram test on VG20
> and Arcturus.
>
> dmesg output at hang time:
> [drm] RAS event of type ERREVENT_ATHUB_INTERRUPT detected!
> amdgpu 0000:67:00.0: GPU reset begin!
> Evicting PASID 0x8000 queues
> Started evicting pasid 0x8000
> qcm fence wait loop timeout expired
> The cp might be in an unrecoverable state due to an unsuccessful queues preemption
> Failed to evict process queues
> Failed to suspend process 0x8000
> Finished evicting pasid 0x8000
> Started restoring pasid 0x8000
> Finished restoring pasid 0x8000
> [drm] UVD VCPU state may lost due to RAS ERREVENT_ATHUB_INTERRUPT
> amdgpu: [powerplay] Failed to send message 0x26, response 0x0
> amdgpu: [powerplay] Failed to set soft min gfxclk !
> amdgpu: [powerplay] Failed to upload DPM Bootup Levels!
> amdgpu: [powerplay] Failed to send message 0x7, response 0x0
> amdgpu: [powerplay] [DisableAllSMUFeatures] Failed to disable all smu features!
> amdgpu: [powerplay] [DisableDpmTasks] Failed to disable all smu features!
> amdgpu: [powerplay] [PowerOffAsic] Failed to disable DPM!
> [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <powerplay> failed -5

Do you have more information on what's going wrong here? This is a
really important patch for KFD debugging.
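
For reference, the fast path being reverted splits each request at
adev->gmc.visible_vram_size: the part below that limit is copied through
the BAR mapping, and whatever remains falls through to the register-based
loop under mmio_idx_lock. Below is a minimal userspace sketch of just that
split, assuming plain integer arguments; bar_visible_count is a
hypothetical name used for illustration, not a function in the driver.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of amdgpu): return how many bytes of the
 * range [pos, pos + size) lie below visible_size, i.e. how many bytes the
 * reverted fast path would have copied through the BAR.
 */
static size_t bar_visible_count(uint64_t pos, size_t size,
				uint64_t visible_size)
{
	uint64_t last = pos + size;

	/* Same clamp as min(pos + size, visible_vram_size) in the patch. */
	if (last > visible_size)
		last = visible_size;

	/* Nothing is BAR-accessible if the range starts at or past the limit. */
	return last > pos ? (size_t)(last - pos) : 0;
}
```

With a visible size of 256 bytes, a 16-byte access at offset 248 yields 8
bytes for the BAR path and leaves 8 bytes for the slow path; an access
starting at or beyond offset 256 yields 0 and goes entirely through the
slow path.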

>
> Signed-off-by: Kent Russell <kent.russell at amd.com>

Reviewed-by: Christian König <christian.koenig at amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 ----------------------
>   1 file changed, 26 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index cf5d6e585634..a3f997f84020 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -254,32 +254,6 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
>   	uint32_t hi = ~0;
>   	uint64_t last;
>   
> -
> -#ifdef CONFIG_64BIT
> -	last = min(pos + size, adev->gmc.visible_vram_size);
> -	if (last > pos) {
> -		void __iomem *addr = adev->mman.aper_base_kaddr + pos;
> -		size_t count = last - pos;
> -
> -		if (write) {
> -			memcpy_toio(addr, buf, count);
> -			mb();
> -			amdgpu_asic_flush_hdp(adev, NULL);
> -		} else {
> -			amdgpu_asic_invalidate_hdp(adev, NULL);
> -			mb();
> -			memcpy_fromio(buf, addr, count);
> -		}
> -
> -		if (count == size)
> -			return;
> -
> -		pos += count;
> -		buf += count / 4;
> -		size -= count;
> -	}
> -#endif
> -
>   	spin_lock_irqsave(&adev->mmio_idx_lock, flags);
>   	for (last = pos + size; pos < last; pos += 4) {
>   		uint32_t tmp = pos >> 31;