pages pinned for BO lifetime and security

Daniel Vetter daniel at ffwll.ch
Wed Jul 22 21:47:00 UTC 2020


On Wed, Jul 22, 2020 at 9:27 PM Chia-I Wu <olvaffe at gmail.com> wrote:
>
> On Wed, Jul 22, 2020 at 4:28 AM Daniel Vetter <daniel at ffwll.ch> wrote:
> >
> > On Wed, Jul 22, 2020 at 1:12 PM Christian König
> > <christian.koenig at amd.com> wrote:
> > >
> > > Am 22.07.20 um 09:32 schrieb Daniel Vetter:
> > > > On Wed, Jul 22, 2020 at 9:19 AM Christian König
> > > > <christian.koenig at amd.com> wrote:
> > > >> Am 22.07.20 um 02:22 schrieb Gurchetan Singh:
> > > >>
> > > >> +Christian who added DMABUF_MOVE_NOTIFY which added the relevant blurb:
> > > >>
> > > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fdma-buf%2FKconfig%23n46&data=02%7C01%7Cchristian.koenig%40amd.com%7C916f6a9b64884e21fd7f08d82e115c8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637310000122752271&sdata=WKs3KSr3K1DdVmDaIZ%2FnV8VEBPWMGMSGweeay0HIOxw%3D&reserved=0
> > > >>
> > > >> Currently, the user seems to amdgpu for P2P dma-buf and it seems to plumb ttm (*move_notify) callback to dma-buf.  We're not sure if it's a security issue occurring across DRM drivers, or one more specific to the new amdgpu use case.
> > > >>
> > > >> On Tue, Jul 21, 2020 at 1:03 PM Chia-I Wu <olvaffe at gmail.com> wrote:
> > > >>> Hi list,
> > > >>>
> > > >>> virtio-gpu is moving in the direction where BO pages are pinned for
> > > >>> the lifetime for simplicity.  I am wondering if that is considered a
> > > >>> security issue in general, especially aCan you elaborate a little bit what these other problems might be?  Memory fragmentation?fter running into the
> > > >>> description of the new DMABUF_MOVE_NOTIFY config option.
> > > >>
> > > >> Yes, that is generally considered a deny of service possibility and so far Dave and Daniel have rejected all tries to upstream stuff like this as far as I know.
> > > > Uh we have merged pretty much all arm-soc drivers without real
> > > > shrinkers. Whether that was a good idea or not is maybe different
> > > > question - now that we do have pretty good helpers maybe we should
> > > > poke this a bit more. But then SoCs Suck (tm).
> > >
> > > I was under the impression that those SoC drivers still use the GEM
> > > helpers which unpinns stuff when it is not in use. But I might be wrong.
> >
> > It's kinda mostly there, even some helpers for shrinking but a)
> > helpers on, not all drivers use it b) for purgeable objects only, not
> > generally for inactive stuff - there's no active use tracking c) cma
> > helpers (ok that one is only for vc4 as the render driver) don't even
> > have that. I had some slow burner series to get us towards dma_resv
> > locking in shmem helpers and then maybe even a common shrinker helper
> > with some "actually kick it out now" callback, but yeah never got
> > there.
> My quick survey of the SoC drivers also told me that they tend to
> demonstrate a) or b).
>
> About b), I was thinking maybe that's because the systems the drivers
> run on are historically swap-less.  There is no place to write the
> dirty pages back to and thus less incentive to support shrinking
> inactive objects.
>
> You both mentioned that the lack of swap is irrelevant (or at least
> not the only factor).  Can you elaborate a little bit on that?
> Shrinking inactive objects returns the pages to swap cache... hmm, I
> guess that helps memory defragmentation?

Yup, kernel can then move it around as it sees fit.

Also, zswap is a thing nowadays, and I think used by default on android now.
-Daniel

> >
> > So maybe per-device object shrinker helper would be something neat we
> > could lift out of ttm (when it's happening), maybe with a simple
> > callback somewhere in it's lru tracking. Probably best if the shrinker
> > lru is outright separate from anything else or it just gets messy.
> > -Daniel
> >
> > > > But for real gpus they do indeed all have shrinkers, and not just "pin
> > > > everything forever" model. Real gpus = stuff you might run on servers
> > > > or multi-app and all that stuff, not with a simple "we just kill all
> > > > background jobs if memory gets low" model like on android and other
> > > > such things.
> > > >
> > > >> DMA-buf an pinning for scanout are the only exceptions since the implementation wouldn't have been possible otherwise.
> > > >>
> > > >>> Most drivers do not have a shrinker, or whether a BO is purgeable is
> > > >>> entirely controlled by the userspace (madvice).  They can be
> > > >>> categorized as "a security problem where userspace is able to pin
> > > >>> unrestricted amounts of memory".  But those drivers are normally found
> > > >>> on systems without swap.  I don't think the issue applies.
> > > >>
> > > >> This is completely independent of the availability of swap or not.
> > > >>
> > > >> Pinning of pages in large quantities can result in all kind of problems and needs to be prevented even without swap.
> > > > Yeah you don't just kill swap, you kill a ton of other kernel services
> > > > with mass pinning. I think even the pinning of scanout buffers for
> > > > i915 from system memory is somewhat questionable (but I guess small
> > > > enough to not matter in practice).
> > >
> > > Yeah, we had a really hard time explaining that internally as well.
> > >
> > > Christian.
> > >
> > > >> Otherwise you can ran into problems even with simple I/O operations for example.
> > > >>
> > > >>> Of the desktop GPU drivers, i915's shrinker certainly supports purging
> > > >>> to swap.  TTM is a bit hard to follow.  I can't really tell if amdgpu
> > > >>> or nouveau supports that.  virtio-gpu is more commonly found on
> > > >>> systems with swaps so I think it should follow the desktop practices?
> > > >>
> > > >> What we do at least in the amdgpu, radeon, i915 and nouveau is to only allow it for scanout and that in turn is limited by the physical number of CRTCs on the board.
> > > >>
> > > >>> Truth is, the emulated virtio-gpu device always supports page moves
> > > >>> with VIRTIO_GPU_CMD_RESOURCE_{ATTACH,DETACH}_BACKING.  It is just that
> > > >>> the driver does not make use of them.  That makes this less of an
> > > >>> issue because the driver can be fixed anytime (finger crossed that the
> > > >>> emulator won't have bugs in these untested paths).  This issue becomes
> > > >>> more urgent because we are considering adding a new HW command[1]
> > > >>> where page moves will be disallowed.  We definitely don't want a HW
> > > >>> command that is inherently insecure, if BO pages pinned for the
> > > >>> lifetime is considered a security issue on desktops.
> > > >>
> > > >> Yeah, that's probably not such a good idea :)
> > > > Well if the pinning is just for the duration of the hw command, it's
> > > > fine, just like batch buffers. But if it's long term pinning then that
> > > > doesn't sound like a good idea. RDMA has this as their inherit hw
> > > > programming model (except if your hw is really fancy and has hw page
> > > > fault handling on the rdma nic), and they hard limit such pins to what
> > > > you can mlock (or something similar within rdma).
> Yeah, the driver pins the pages for the lifetime of any BO at the moment.
>
> There are existing HW commands that the driver can use to notify the
> emulator about page migrations.  But because the driver pins pages
> forever, those commands are unused.  We were considering taking that
> further by disallowing those commands in some cases.  I am glad to
> learn that it is a bad idea :)
>
> > > > -Daniel
> > > >
> > > >> Regards,
> > > >> Christian.
> > > >>
> > > >>> [1] VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB
> > > >>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fvirgl%2Fdrm-misc-next%2F-%2Fblob%2Fvirtio-gpu-next%2Finclude%2Fuapi%2Flinux%2Fvirtio_gpu.h%23L396&data=02%7C01%7Cchristian.koenig%40amd.com%7C916f6a9b64884e21fd7f08d82e115c8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637310000122752271&sdata=AWtwzk%2BP7P7031ibTumr2J%2FQB%2Fzssg1Imag%2FstysR5A%3D&reserved=0
> > > >>
> > > >> _______________________________________________
> > > >> dri-devel mailing list
> > > >> dri-devel at lists.freedesktop.org
> > > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&data=02%7C01%7Cchristian.koenig%40amd.com%7C916f6a9b64884e21fd7f08d82e115c8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637310000122752271&sdata=I0VHr96oWzpWyiNXjD1jG9%2Fp2%2F848CdPc4%2FTf6SkWm8%3D&reserved=0
> > > >
> > > >
> > >
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the dri-devel mailing list