[PATCH v3 05/12] drm/ttm: Expose ttm_tt_unpopulate for driver use
Daniel Vetter
daniel.vetter at ffwll.ch
Wed Dec 16 14:21:31 UTC 2020
On Wed, Dec 16, 2020 at 9:04 AM Christian König
<ckoenig.leichtzumerken at gmail.com> wrote:
>
> On 15.12.20 at 21:18, Andrey Grodzovsky wrote:
> > [SNIP]
> >>>
> >>> While we can't control user application accesses to the mapped
> >>> buffers explicitly and hence we use page fault rerouting,
> >>> I am thinking that in this case we may be able to sprinkle
> >>> drm_dev_enter/exit in any such sensitive place where we might
> >>> CPU access a DMA buffer from the kernel?
> >>
> >> Yes, I fear we are going to need that.
> >>
> >>> Things like CPU page table updates, ring buffer accesses and FW
> >>> memcpy? Are there other places?
> >>
> >> Phew, good question. I have no idea.
> >>
> >>> Another point is that at this point the driver shouldn't access any
> >>> such buffers as we are in the process of finishing the device.
> >>> AFAIK there is no page fault mechanism for kernel mappings so I
> >>> don't think there is anything else to do?
> >>
> >> Well there is a page fault handler for kernel mappings, but that one
> >> just prints the stack trace into the system log and calls BUG(); :)
> >>
> >> Long story short we need to avoid any access to released pages after
> >> unplug. No matter if it's from the kernel or userspace.
> >
> >
> > I was just about to start guarding CPU accesses from the kernel to
> > GTT or VRAM buffers with drm_dev_enter/exit, but then I looked more
> > at the code and it seems like ttm_tt_unpopulate just deletes DMA
> > mappings (for the sake of device to main memory access). The kernel
> > page table is not touched until the last bo refcount is dropped and
> > the bo is released
> > (ttm_bo_release->destroy->amdgpu_bo_destroy->amdgpu_bo_kunmap). This
> > holds both for GTT BOs mapped into the kernel by kmap (or vmap) and
> > for VRAM BOs mapped by ioremap. So as I see it, nothing bad will
> > happen after we unpopulate a BO while we still try to use a kernel
> > mapping for it: the system memory pages backing GTT BOs are still
> > mapped and not freed, and for VRAM BOs the same holds for the IO
> > physical ranges mapped into the kernel page table, since iounmap
> > wasn't called yet.
>
> The problem is the system pages would be freed, and if the kernel driver
> still happily writes to them we are pretty much busted because we write
> to freed-up memory.
Similar for vram: if this is an actual hotunplug and then replug, there's
most likely going to be a different device behind the same mmio bar range
(the higher bridges all still have the same windows assigned),
and that's bad news if we keep using it for current drivers. So we
really have to point all these cpu ptes to some other place.
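
To make the guard pattern from earlier in the thread concrete, here is a
rough user-space sketch, not driver code. In the kernel the real calls are
drm_dev_enter(dev, &idx), drm_dev_exit(idx) and drm_dev_unplug(dev); every
mock_* name and guarded_write() below is a hypothetical stand-in invented
for illustration. The mock is single-threaded, so it does not model the
SRCU synchronization that makes unplug wait for sections already inside
enter/exit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for a DRM device; not a kernel type. */
struct mock_dev {
	bool unplugged;         /* set once by mock_dev_unplug() */
	unsigned char *cpu_map; /* models a kernel CPU mapping of a BO */
	size_t map_size;        /* size of that mapping in bytes */
};

/* Models drm_dev_enter(): returns false once the device is gone. */
static bool mock_dev_enter(const struct mock_dev *dev)
{
	return !dev->unplugged;
}

/* Models drm_dev_exit(): nothing to release in this single-threaded mock. */
static void mock_dev_exit(const struct mock_dev *dev)
{
	(void)dev;
}

/* Models unplug: mark the device gone and drop the CPU mapping. */
static void mock_dev_unplug(struct mock_dev *dev)
{
	dev->unplugged = true;
	dev->cpu_map = NULL;
}

/*
 * Every CPU access to the buffer goes through the enter/exit pair, so
 * after unplug the write is refused instead of touching released memory.
 */
static int guarded_write(struct mock_dev *dev, size_t off,
			 const void *src, size_t len)
{
	int ret;

	assert(dev != NULL);

	if (!mock_dev_enter(dev))
		return -ENODEV;

	if (off <= dev->map_size && len <= dev->map_size - off) {
		memcpy(dev->cpu_map + off, src, len);
		ret = 0;
	} else {
		ret = -EINVAL; /* access would run past the mapping */
	}

	mock_dev_exit(dev);
	return ret;
}
```

Callers would treat -ENODEV as "device gone, skip the access". This only
covers accesses the kernel makes itself; it does nothing for the cpu ptes
mentioned above, which still need to be pointed somewhere else.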
-Daniel
>
> Christian.
>
> > I loaded the driver with vm_update_mode=3,
> > meaning all VM updates are done using the CPU, and haven't seen any
> > oopses after removing the device. I guess I can test it more by
> > allocating GTT and VRAM BOs
> > and trying to read/write to them after the device is removed.
> >
> > Andrey
> >
> >
> >>
> >> Regards,
> >> Christian.
> >>
> >>>
> >>> Andrey
> >>
> >>
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch