[PATCH] drm/amdgpu: avoid clearing freed bo with sdma in gpu reset

Wed May 6 10:58:14 UTC 2020

Yes, exactly that one.

Regards,
Christian.

On 06.05.20 at 12:35, Zhou, Tiecheng wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> Thanks, Christian,
>
> Is this the fix that you are mentioning:
>
> commit 1675c3a24d075d484377003789245f48c2114a0b
> Author: Christian König <christian.koenig at amd.com>
> Date:   Fri Feb 21 15:10:31 2020 +0100
>
>      drm/amdgpu: stop disable the scheduler during HW fini
>
>      When we stop the HW for example for GPU reset we should not stop the
>      front-end scheduler. Otherwise we run into intermediate failures during
>      command submission.
>
>      The scheduler should only be stopped in very few cases:
>      1. We can't get the hardware working in ring or IB test after a GPU reset.
>      2. The KIQ scheduler is not used in the front-end and should be disabled during GPU reset.
>      3. In amdgpu_ring_fini() when the driver unloads.
>
>      Signed-off-by: Christian König <christian.koenig at amd.com>
>      Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
>      Acked-by: Nirmoy Das <nirmoy.das at amd.com>
>      Test-by: Dennis Li <dennis.li at amd.com>
>      Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>
> Thanks
> Tiecheng
>
>
> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken at gmail.com>
> Sent: Wednesday, May 6, 2020 5:44 PM
> To: Zhou, Tiecheng <Tiecheng.Zhou at amd.com>; amd-gfx at lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: avoid clearing freed bo with sdma in gpu reset
>
> NAK, the fundamental problem was that we disabled the SDMA paging queue during reset:
>> [  885.694682] [drm] schedpage0 is not ready, skipping
>> [  885.694682] [drm] schedpage1 is not ready, skipping
> This is fixed by now, so the problem should not happen any more.
>
> Regards,
> Christian.
>
>
> On 06.05.20 at 11:36, Tiecheng Zhou wrote:
>> WHY:
>> For V320 passthrough with "modprobe amdgpu lockup_timeout=500", a kernel
>> NULL pointer dereference occurs when using quark with BACO reset, for instance:
>>     hang_vm_compute0_bad_cs_dispatch.lua
>>     hang_vm_dma0_corrupted_header.lua
>>     etc.
>> -----------------------------
>> [  884.792885] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring comp_1.0.0 timeout, signaled seq=3, emitted seq=4
>> [  884.793772] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process quark pid 16939 thread quark pid 16940
>> [  884.859979] amdgpu: [powerplay] set virtualization GFX DPM policy success
>> [  884.861003] amdgpu: [powerplay] activate virtualization GFX DPM policy success
>> [  884.861065] amdgpu: [powerplay] set virtualization VCE DPM policy success
>> [  885.693554] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
>> [  885.694682] [drm] schedpage0 is not ready, skipping
>> [  885.694682] [drm] schedpage1 is not ready, skipping
>> [  885.694720] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-2)
>> [  885.695328] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
>> [  885.695909] PGD 0 P4D 0
>> [  885.696104] Oops: 0000 [#1] SMP PTI
>> [  885.696368] CPU: 2 PID: 16940 Comm: quark Tainted: G           OE     4.19.52+ #6
>> [  885.696945] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>> [  885.697593] RIP: 0010:amdgpu_vm_sdma_commit+0x59/0x130 [amdgpu]
>> ...
>> [  885.705042] Call Trace:
>> [  885.705251]  ? amdgpu_vm_bo_update_mapping+0xdf/0xf0 [amdgpu]
>> [  885.705696]  ? amdgpu_vm_clear_freed+0xcc/0x1b0 [amdgpu]
>> [  885.706112]  ? amdgpu_gem_va_ioctl+0x4a1/0x510 [amdgpu]
>> [  885.706493]  ? __radix_tree_delete+0x7e/0xa0
>> [  885.706822]  ? amdgpu_gem_va_map_flags+0x70/0x70 [amdgpu]
>> [  885.707220]  ? drm_ioctl_kernel+0xaa/0xf0 [drm]
>> [  885.707568]  ? amdgpu_gem_va_map_flags+0x70/0x70 [amdgpu]
>> [  885.707962]  ? drm_ioctl_kernel+0xaa/0xf0 [drm]
>> [  885.708294]  ? drm_ioctl+0x3a7/0x3f0 [drm]
>> [  885.708632]  ? amdgpu_gem_va_map_flags+0x70/0x70 [amdgpu]
>> [  885.709032]  ? unmap_region+0xd9/0x120
>> [  885.709328]  ? amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
>> [  885.709684]  ? do_vfs_ioctl+0xa1/0x620
>> [  885.709971]  ? do_munmap+0x32e/0x430
>> [  885.710232]  ? ksys_ioctl+0x66/0x70
>> [  885.710513]  ? __x64_sys_ioctl+0x16/0x20
>> [  885.710806]  ? do_syscall_64+0x55/0x100
>> [  885.711092]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> ...
>> [  885.719408] ---[ end trace 7ee3180f42e9f572 ]---
>> [  885.719766] RIP: 0010:amdgpu_vm_sdma_commit+0x59/0x130 [amdgpu]
>> ...
>> -----------------------------
>>
>> the NULL pointer (entity->rq == NULL in amdgpu_vm_sdma_commit()) arises as follows:
>> 1. quark sends a bad job that triggers a job timeout;
>> 2. the guest KMD detects the job timeout and goes to gpu recovery, where
>>      ip_suspend for SDMA sets sdma[].sched.ready to false;
>> 3. quark sends an UNMAP operation through amdgpu_gem_va_ioctl, and the guest KMD goes
>>      through amdgpu_gem_va_update_vm and finally reaches amdgpu_vm_sdma_commit,
>>      which goes to amdgpu_job_submit and then drm_sched_job_init;
>> 4. drm_sched_job_init fails at drm_sched_pick_best() since
>>      sdma[].sched.ready is set to false; in the meantime entity->rq
>>      becomes NULL;
>> 5. quark sends other UNMAP operations through amdgpu_gem_va_ioctl, and this time
>>      there is a NULL pointer dereference because entity->rq is NULL.
>>
>> the above sequence occurs only with "modprobe amdgpu lockup_timeout=500".
>> it does not occur with lockup_timeout=10000 (default) because step 2
>> (KMD detects the job timeout) then happens some time after quark sends the UNMAP
>> operations; i.e. the quark UNMAP operations are finished before sdma ip suspend.
>>
>> HOW:
>> take adev->lock_reset in amdgpu_vm_clear_freed() so that it waits for gpu
>> reset to finish and avoids using sdma during the reset.
>>
>> Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou at amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++++
>>    1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index e205ecc75a21..018b88f3b6da 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -2047,6 +2047,8 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>>    	struct dma_fence *f = NULL;
>>    	int r;
>>    
>> +	mutex_lock(&adev->lock_reset);
>> +
>>    	while (!list_empty(&vm->freed)) {
>>    		mapping = list_first_entry(&vm->freed,
>>    			struct amdgpu_bo_va_mapping, list);
>> @@ -2062,6 +2064,7 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>>    		amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>    		if (r) {
>>    			dma_fence_put(f);
>> +			mutex_unlock(&adev->lock_reset);
>>    			return r;
>>    		}
>>    	}
>> @@ -2073,6 +2076,7 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>>    		dma_fence_put(f);
>>    	}
>>    
>> +	mutex_unlock(&adev->lock_reset);
>>    	return 0;
>>    
>>    }
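
Editorial sketch (not part of the original thread): the five-step failure sequence described in the patch's WHY section can be modelled in plain userspace C. Everything below is a hypothetical stand-in: the sim_* types and functions are invented for illustration and are not the real drm_sched or amdgpu structures. The -ENOENT return is chosen to match the "(-2)" in the quoted log; that the real driver produces it on exactly this path is an assumption. Instead of actually dereferencing NULL, the sketch returns a marker value where the real amdgpu_vm_sdma_commit() would oops.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real DRM scheduler types. */
struct sim_sched { bool ready; };             /* models sdma[].sched.ready */
struct sim_rq { struct sim_sched *sched; };   /* one run queue per scheduler */
struct sim_entity {
	struct sim_rq *rq;       /* currently selected run queue, may become NULL */
	struct sim_rq *rq_list;  /* candidate run queues */
	size_t num_rqs;
};

/* Marker returned where the real code would dereference a NULL entity->rq. */
#define SIM_WOULD_OOPS 1

/* Models the pick-best step: first run queue whose scheduler is ready. */
static struct sim_rq *sim_pick_best(const struct sim_entity *e)
{
	for (size_t i = 0; i < e->num_rqs; i++)
		if (e->rq_list[i].sched->ready)
			return &e->rq_list[i];
	return NULL;
}

/* Models step 4: re-select the run queue; on failure rq is left NULL. */
static int sim_job_init(struct sim_entity *e)
{
	e->rq = sim_pick_best(e);
	return e->rq ? 0 : -ENOENT;
}

/* Models steps 3 and 5: a commit that relies on entity->rq being valid. */
static int sim_commit(struct sim_entity *e)
{
	assert(e != NULL);
	if (!e->rq)
		return SIM_WOULD_OOPS;   /* the real code would crash here */
	return sim_job_init(e);
}
```

With two ready schedulers the first sim_commit() succeeds. Once both are marked not ready (step 2), the next sim_commit() fails with -ENOENT and leaves rq NULL (step 4), and every later call hits the marker (step 5). In this model the entity is poisoned by the first failed submission, so the crash goes away only if the schedulers are never seen as not ready, which is consistent with the NAK's point about not disabling the SDMA paging queue during reset.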