[PATCH] drm/amdgpu: set bulk_moveable to false when a per VM is released

Huang Rui ray.huang at amd.com
Mon Sep 10 06:19:16 UTC 2018


On Sun, Sep 09, 2018 at 06:38:13PM +0800, StDenis, Tom wrote:
> On 2018-09-08 5:12 a.m., Huang Rui wrote:
> > On Wed, Sep 05, 2018 at 05:08:26PM +0200, Christian König wrote:
> >> Otherwise we might run into a use after free during bulk move.
> >>
> >> Signed-off-by: Christian König <christian.koenig at amd.com>
> > 
> > Is this patch able to fix the KASAN report below?
> > [   66.143009] ==================================================================
> > [   66.143254] BUG: KASAN: use-after-free in ttm_bo_bulk_move_lru_tail+0x2b/0x100 [ttm]
> > [   66.143263] Read of size 8 at addr ffff8801f193d550 by task gnome-shel:cs0/4194
> > 
> > Tom, may we have your tested-by?
> > 
> > Reviewed-by: Huang Rui <ray.huang at amd.com>
> 
> Hi Ray,
> 
> I tested this patch and it failed to survive a piglit run.  The only
> fix so far has been to completely disable bulk moves with this:
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index ea5e277ae038..ab244a726ad9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -397,7 +397,7 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev,
>          }
>          spin_unlock(&glob->lru_lock);
> 
> -       vm->bulk_moveable = true;
> +//     vm->bulk_moveable = true;
>   }
> 
>   /**
> 

Thanks, Tom.
I enabled KASAN with the compiler instrumentation type set to outline, but the
module fails to load with a general protection fault. Did I miss something?

[   85.348249] calling  drm_core_init+0x0/0xde [drm] @ 1391
[   85.353763] initcall drm_core_init+0x0/0xde [drm] returned 0 after 78 usecs
[   85.376264] calling  ttm_init+0x0/0x1000 [ttm] @ 1391
[   85.381488] initcall ttm_init+0x0/0x1000 [ttm] returned 0 after 92 usecs
[   85.407897] general protection fault: 0000 [#1] SMP KASAN PTI
[   85.413751] CPU: 0 PID: 1391 Comm: modprobe Not tainted 4.19.0-rc1-custom #1
[   85.420900] Hardware name: Gigabyte Technology Co., Ltd. Z170XP-SLI/Z170XP-SLI-CF, BIOS F20 11/04/2016
[   85.430374] RIP: 0010:memset_erms+0x9/0x10
[   85.434559] Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01
[   85.453641] RSP: 0018:ffff8803dea27cf8 EFLAGS: 00010202
[   85.458955] RAX: 1ffffffff8174800 RBX: ffffffffc0ba4040 RCX: 1ffffffff8174808
[   85.466201] RDX: 1ffffffff8174808 RSI: 0000000000000000 RDI: dffffc0000000000
[   85.473462] RBP: 0000000000000000 R08: ffff8803cf752f88 R09: dffffc0000000000
[   85.480751] R10: 0000000000000007 R11: 00000000ef150e75 R12: ffffffffc0bb6000
[   85.488038] R13: 0000000000000002 R14: ffffffffc0ba4040 R15: ffffffffc0bb9a00
[   85.495319] FS:  00007f50d35c9700(0000) GS:ffff8803ee800000(0000) knlGS:0000000000000000
[   85.503535] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   85.509386] CR2: 00007fffa12bc6f8 CR3: 00000003e15c6004 CR4: 00000000003606f0
[   85.516630] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   85.523893] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   85.531183] Call Trace:
[   85.533672]  kasan_unpoison_shadow+0xf/0x30
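
For reference, below is a toy user-space model of the situation the patch
guards against, as far as I understand it. It is only a sketch: every name
in it (toy_vm, toy_bo, toy_vm_bo_add, toy_vm_bo_rmv, toy_vm_move_to_lru_tail)
is made up for illustration and is not the TTM or amdgpu API, and it ignores
locking entirely. The assumption it encodes is that the VM caches the first
and last entry of a bulk range and reuses them while bulk_moveable is set, so
removing a per VM BO has to clear the flag or the cached range could point at
freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_BOS 8

/* Hypothetical stand-in for a buffer object. */
struct toy_bo {
	bool per_vm;	/* toy flag: BO shares the VM root's reservation object */
};

/* Hypothetical stand-in for a VM with a cached bulk-move range. */
struct toy_vm {
	struct toy_bo *bos[TOY_MAX_BOS];	/* BOs currently attached */
	size_t count;
	struct toy_bo *bulk_first, *bulk_last;	/* cached range endpoints */
	bool bulk_moveable;			/* cached range may be reused */
};

/* Attach a BO; in this toy a new per VM BO also invalidates the cache. */
static void toy_vm_bo_add(struct toy_vm *vm, struct toy_bo *bo)
{
	assert(vm->count < TOY_MAX_BOS);
	vm->bos[vm->count++] = bo;
	if (bo->per_vm)
		vm->bulk_moveable = false;
}

/* Detach a BO; the flag reset mirrors the check added by the patch. */
static void toy_vm_bo_rmv(struct toy_vm *vm, struct toy_bo *bo)
{
	size_t i;

	if (bo->per_vm)
		vm->bulk_moveable = false;

	for (i = 0; i < vm->count; i++) {
		if (vm->bos[i] == bo) {
			/* Order is irrelevant here, so swap in the last slot. */
			vm->bos[i] = vm->bos[--vm->count];
			return;
		}
	}
}

/*
 * Reuse the cached range while it is valid, otherwise rebuild it from the
 * BOs that are still attached. Returns true when a rebuild happened.
 */
static bool toy_vm_move_to_lru_tail(struct toy_vm *vm)
{
	size_t i;

	if (vm->bulk_moveable)
		return false;	/* fast path: trusts bulk_first/bulk_last */

	vm->bulk_first = vm->bulk_last = NULL;
	for (i = 0; i < vm->count; i++) {
		if (!vm->bos[i]->per_vm)
			continue;
		if (!vm->bulk_first)
			vm->bulk_first = vm->bos[i];
		vm->bulk_last = vm->bos[i];
	}
	vm->bulk_moveable = true;
	return true;
}
```

Without the reset in toy_vm_bo_rmv(), the fast path would keep trusting
bulk_last after the BO it points to has gone away, which is the shape of
the use-after-free in the KASAN report quoted above.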

Thanks,
Ray

> 
> Tom
> 
> > 
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++++
> >>   1 file changed, 4 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >> index ea5e277ae038..ed1e6abda391 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >> @@ -2513,8 +2513,12 @@ void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
> >>   		      struct amdgpu_bo_va *bo_va)
> >>   {
> >>   	struct amdgpu_bo_va_mapping *mapping, *next;
> >> +	struct amdgpu_bo *bo = bo_va->base.bo;
> >>   	struct amdgpu_vm *vm = bo_va->base.vm;
> >>   
> >> +	if (bo && bo->tbo.resv == vm->root.base.bo->tbo.resv)
> >> +		vm->bulk_moveable = false;
> >> +
> >>   	list_del(&bo_va->base.bo_list);
> >>   
> >>   	spin_lock(&vm->invalidated_lock);
> >> -- 
> >> 2.17.1
> >>
> >> _______________________________________________
> >> amd-gfx mailing list
> >> amd-gfx at lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> 

