[PATCH v3 05/12] drm/ttm: Expose ttm_tt_unpopulate for driver use

Wed Dec 16 17:20:20 UTC 2020

On Wed, Dec 16, 2020 at 6:12 PM Daniel Vetter <daniel.vetter at ffwll.ch> wrote:
>
> On Wed, Dec 16, 2020 at 5:18 PM Christian König
> <christian.koenig at amd.com> wrote:
> >
> > Am 16.12.20 um 17:13 schrieb Andrey Grodzovsky:
> > >
> > > On 12/16/20 9:21 AM, Daniel Vetter wrote:
> > >> On Wed, Dec 16, 2020 at 9:04 AM Christian König
> > >> <ckoenig.leichtzumerken at gmail.com> wrote:
> > >>> Am 15.12.20 um 21:18 schrieb Andrey Grodzovsky:
> > >>>> [SNIP]
> > >>>>>> While we can't control user application accesses to the mapped
> > >>>>>> buffers explicitly and hence we use page fault rerouting,
> > >>>>>> I am thinking that in this case we may be able to sprinkle
> > >>>>>> drm_dev_enter/exit in any such sensitive place where we might
> > >>>>>> CPU access a DMA buffer from the kernel?
> > >>>>> Yes, I fear we are going to need that.
> > >>>>>
> > >>>>>> Things like CPU page table updates, ring buffer accesses and FW
> > >>>>>> memcpy? Are there other places?
> > >>>>> Puh, good question. I have no idea.
> > >>>>>
> > >>>>>> Another point is that at this point the driver shouldn't access any
> > >>>>>> such buffers as we are in the process of finishing the device.
> > >>>>>> AFAIK there is no page fault mechanism for kernel mappings so I
> > >>>>>> don't think there is anything else to do?
> > >>>>> Well there is a page fault handler for kernel mappings, but that one
> > >>>>> just prints the stack trace into the system log and calls BUG(); :)
> > >>>>>
> > >>>>> Long story short we need to avoid any access to released pages after
> > >>>>> unplug. No matter if it's from the kernel or userspace.
> > >>>>
> > >>>> I was just about to start guarding CPU accesses from the kernel to
> > >>>> GTT or VRAM buffers with drm_dev_enter/exit, but then I looked more
> > >>>> at the code and it seems like ttm_tt_unpopulate just deletes DMA
> > >>>> mappings (for the sake of device to main memory access). The kernel
> > >>>> page table is not touched until the last bo refcount is dropped and
> > >>>> the bo is released
> > >>>> (ttm_bo_release->destroy->amdgpu_bo_destroy->amdgpu_bo_kunmap). This
> > >>>> is the case both for GTT BOs mapped to the kernel by kmap (or vmap)
> > >>>> and for VRAM BOs mapped by ioremap. So as I see it, nothing bad will
> > >>>> happen if we unpopulate a BO while we still try to use a kernel
> > >>>> mapping for it: the system memory pages backing GTT BOs are still
> > >>>> mapped and not freed, and for VRAM BOs the same holds for the IO
> > >>>> physical ranges mapped into the kernel page table, since iounmap
> > >>>> wasn't called yet.
> > >>> The problem is the system pages would be freed and if the kernel driver
> > >>> still happily writes to them we are pretty much busted because we write
> > >>> to freed up memory.
> > >
> > >
> > > > OK, I see I missed ttm_tt_unpopulate->..->ttm_pool_free which will
> > > > release the GTT BO pages. But then isn't there a problem in
> > > > ttm_bo_release, since ttm_bo_cleanup_memtype_use, which also leads to
> > > > the pages being released, comes before bo->destroy, which unmaps the
> > > > pages from the kernel page table? Won't we end up writing to freed
> > > > memory in this time interval? Don't we need to postpone freeing the
> > > > pages until after the kernel page table unmapping?
> >
> > BOs are only destroyed when there is a guarantee that nobody is
> > accessing them any more.
> >
> > The problem here is that the pages as well as the VRAM can be
> > immediately reused after the hotplug event.
> >
> > >
> > >
> > > >> Similar for vram, if this is actual hotunplug and then replug, there's
> > > >> going to be a different device behind the same mmio bar range most
> > > >> likely (the higher bridges all still have the same windows assigned),
> > >
> > >
> > > > No idea how this actually works, but if we haven't called iounmap yet
> > > > doesn't it mean that those physical ranges that are still mapped into
> > > > the page table should be reserved and cannot be reused for another
> > > > device? As a guess, maybe another subrange from the higher bridge's
> > > > total range will be allocated.
> >
> > Nope, the PCIe subsystem doesn't care about any ioremap still active for
> > a range when it is hotplugged.
> >
> > >
> > >> and that's bad news if we keep using it for current drivers. So we
> > >> really have to point all these cpu ptes to some other place.
> > >
> > >
> > > > We can't just unmap it without syncing against any in-kernel accesses
> > > > to those buffers, and since the page faulting technique we use for
> > > > user mapped buffers seems not to be possible for kernel mapped buffers
> > > > I am not sure how to do it gracefully...
> >
> > We could try to replace the kmap with a dummy page under the hood, but
> > that is extremely tricky.
> >
> > Especially since BOs which are just 1 page in size could point to the
> > linear mapping directly.
>
> I think it's just more work. Essentially
> - convert as much as possible of the kernel mappings to vmap_local,
> which Thomas Zimmermann is rolling out. That way a dma_resv_lock will
> serve as a barrier, and ofc any new vmap needs to fail or hand out a
> dummy mapping.
> - handle fbcon somehow. I think shutting it all down should work out.

Oh, also for fbdev I think it's best to switch over to
drm_fbdev_generic_setup(). That should handle all the lifetime fun
correctly already, minus the vram complication. So at least fewer
oopses for other reasons :-)
-Daniel

> - worst case keep the system backing storage around for shared dma-buf
> until the other non-dynamic driver releases it. for vram we require
> dynamic importers (and maybe it wasn't such a bright idea to allow
> pinning of importer buffers, might need to revisit that).
>
> Cheers, Daniel
>
> >
> > Christian.
> >
> > >
> > > Andrey
> > >
> > >
> > >> -Daniel
> > >>
> > >>> Christian.
> > >>>
> > >>>> I loaded the driver with vm_update_mode=3, meaning all VM updates
> > >>>> are done using the CPU, and haven't seen any oopses after removing
> > >>>> the device. I guess I can test it more by allocating GTT and VRAM
> > >>>> BOs and trying to read/write to them after the device is removed.
> > >>>>
> > >>>> Andrey
> > >>>>
> > >>>>
> > >>>>> Regards,
> > >>>>> Christian.
> > >>>>>
> > >>>>>> Andrey
> > >>>>>
> > >>>>
> > >>
> >
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
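
For reference, the drm_dev_enter/exit guarding discussed in the quoted
thread can be sketched roughly as below. This is a userspace mock under
stated assumptions, not kernel code: struct mock_dev and all mock_*
functions are hypothetical names made up for illustration. The real
helpers are drm_dev_enter(dev, &idx) and drm_dev_exit(idx), and the real
drm_dev_unplug() waits for open sections to finish, where the mock only
asserts that none are open.

```c
/* Userspace mock, NOT kernel code: every mock_* name is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct mock_dev {
	bool unplugged;  /* set once the device is gone */
	int readers;     /* currently open enter/exit sections */
	char ring[16];   /* stands in for a kernel-mapped buffer */
};

/* Same shape as drm_dev_enter(): returns false once the device is unplugged. */
static bool mock_dev_enter(struct mock_dev *dev)
{
	if (dev->unplugged)
		return false;
	dev->readers++;
	return true;
}

/* Closes a section opened by a successful mock_dev_enter(). */
static void mock_dev_exit(struct mock_dev *dev)
{
	assert(dev->readers > 0);
	dev->readers--;
}

/* The mock does not wait for readers; it only checks that none are left. */
static void mock_dev_unplug(struct mock_dev *dev)
{
	assert(dev->readers == 0);
	dev->unplugged = true;
}

/*
 * Guarded CPU write into the buffer. Returns the number of bytes written,
 * or -1 if the request is too large or the device is already unplugged.
 */
static int mock_ring_write(struct mock_dev *dev, const char *src, size_t len)
{
	if (len > sizeof(dev->ring))
		return -1;
	if (!mock_dev_enter(dev))
		return -1;  /* device gone: do not touch the mapping */
	memcpy(dev->ring, src, len);
	mock_dev_exit(dev);
	return (int)len;
}
```

A write before mock_dev_unplug() goes through; the same write afterwards
is refused and leaves the buffer untouched. In a real driver the guarded
section would wrap things like CPU page table updates, ring buffer
accesses and FW memcpy.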