[PATCH v1 0/5] Add memory shrinker to VirtIO-GPU DRM driver

Wed Mar 9 00:56:43 UTC 2022

On Tue, Mar 8, 2022 at 3:36 PM Dmitry Osipenko
<dmitry.osipenko at collabora.com> wrote:
>
> On 3/9/22 01:24, Rob Clark wrote:
> > On Tue, Mar 8, 2022 at 11:28 AM Dmitry Osipenko
> > <dmitry.osipenko at collabora.com> wrote:
> >>
> >> On 3/8/22 19:29, Rob Clark wrote:
> >>> On Tue, Mar 8, 2022 at 5:17 AM Dmitry Osipenko
> >>> <dmitry.osipenko at collabora.com> wrote:
> >>>>
> >>>> Hello,
> >>>>
> >>>> This patchset introduces memory shrinker for the VirtIO-GPU DRM driver.
> >>>> During OOM, the shrinker will release BOs that are marked as "not needed"
> >>>> by userspace using the new madvise IOCTL. The userspace in this case is
> >>>> the Mesa VirGL driver, it will mark the cached BOs as "not needed",
> >>>> allowing kernel driver to release memory of the cached shmem BOs on lowmem
> >>>> situations, preventing OOM kills.
> >>>
> >>> Will host memory pressure already trigger shrinker in guest?
> >>
> >> The host memory pressure won't trigger shrinker in guest here. This
> >> series will help only with the memory pressure within the guest using a
> >> usual "virgl context".
> >>
> >> Having a host shrinker in a case of "virgl contexts" should be a
> >> difficult problem to solve.
> >
> > Hmm, I think we just need the balloon driver to trigger the shrinker
> > in the guest kernel?  I suppose a driver like drm/virtio might want to
> > differentiate between host and guest pressure (ie. consider only
> > objects that have host vs guest storage), but even without that,
> > freeing up memory in the guest when host is under memory pressure
> > seems worthwhile.  Maybe I'm over-simplifying?
>
> Might be the opposite, i.e. me over-complicating :) The variant with
> memory ballooning actually could be good and will work for all kinds of
> virtio contexts universally. There will be some back-n-forth between
> host and guest, but perhaps it will work okay. Thank you for the suggestion.
>
> >>> This is
> >>> something I'm quite interested in for "virtgpu native contexts" (ie.
> >>> native guest driver with new context type sitting on top of virtgpu),
> >>
> >> In a case of "native contexts" it should be doable, at least I can't see
> >> any obvious problems. The madvise invocations could be passed to the
> >> host using a new virtio-gpu command by the guest's madvise IOCTL
> >> handler, instead-of/in-addition-to handling madvise in the guest's
> >> kernel, and that's it.
> >
> > I think we don't want to do that, because MADV:WILLNEED would be by
> > far the most frequent guest<->host synchronous round trip.  So from
> > that perspective tracking madvise state in guest kernel seems quite
> > attractive.
>
> This is a valid concern. I'd assume that the overhead should be
> tolerable, but I don't have any actual perf numbers.

jfwiw, MADV:WILLNEED is a *very* hot path for gl drivers, based on
some measurements I did a while back with various apps/benchmarks..
easily more than 10x the next most frequent ioctl (for MADV:WONTNEED
and MADV:WILLNEED each, so more than 20x combined.. but MADV:WONTNEED
can be async).

But if the balloon triggering shrinker approach works out, that would
be pretty great.. it seems like the easy option and doesn't require
adding new host kernel uabi :-)

BR,
-R

> > If we really can't track madvise state in the guest for dealing with
> > host memory pressure, I think the better option is to introduce
> > MADV:WILLNEED_REPLACE, ie. something to tell the host kernel that the
> > buffer is needed but the previous contents are not (as long as the GPU
> > VA remains the same).  With this the host could allocate new pages if
> > needed, and the guest would not need to wait for a reply from host.
>
> If variant with the memory ballooning will work, then it will be
> possible to track the state within guest-only. Let's consider the
> simplest variant for now.
>
> I'll try to implement the balloon driver support in the v2 and will get
> back to you.
>
> >>> since that isn't using host storage
> >>
> >> s/host/guest ?
> >
> > Yes, sorry, I meant that it is not using guest storage.
>
> Thank you for the clarification.