[Intel-gfx] FW: [PATCH 1/5] drm/i915: Allow i915 to manage the vma offset nodes instead of drm core

Mon Sep 23 12:39:56 UTC 2019

On Thu, Sep 19, 2019 at 3:05 PM Daniel Vetter <daniel at ffwll.ch> wrote:
>
> On Wed, Sep 11, 2019 at 2:19 PM Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > Quoting Balestrieri, Francesco (2019-09-11 13:03:25)
> > > On 04/09/2019, 13.33, "Intel-gfx on behalf of Daniel Vetter" <intel-gfx-bounces at lists.freedesktop.org on behalf of daniel at ffwll.ch> wrote:
> > >
> > >     On Mon, Aug 26, 2019 at 2:21 PM Abdiel Janulgue
> > >     > -       ret = create_mmap_offset(obj);
> > >     > -       if (ret == 0)
> > >     > -               *offset = drm_vma_node_offset_addr(&obj->base.vma_node);
> > >     > +       mmo = kzalloc(sizeof(*mmo), GFP_KERNEL);
> > >
> > >     I got thrown off a bunch of times here reading the code, but I think I
> > >     got this right now.
> > >
> > >     Why exactly do we want multiple vma offsets? Yes this makes it a
> > >     drop-in replacement for the old cpu mmap ioctl, which was a bit
> > >     dubious design. But if we go all new here, I really wonder about why
> > >     this is necessary. No other discrete driver needs this, they all fix
> > >     the mmap mode for the lifetime of an object, because flushing stuff is
> > >     as expensive as just reallocating (or at least close enough).
> > >
> > >     I think us once again going our separate route here needs a lot more
> > >     justification than just "we've accidentally ended up with uapi like
> > >     this 10 years ago".
> >
> > That's exactly the whole point, to replace the uapi we accidentally
> > ended up with 10 years ago with the api that doesn't cause valgrind to
> > complain, is easily extensible and supports all legacy usecases which
> > should be a very good position to be in to support unknown future
> > usecases as well. Letting userspace control their mmappings is very
> > powerful, and we definitely do not want to be limiting their
> > flexibility.
> >
> > That no other driver even seems to allow multiple mmaps, and so has
> > not developed a desire to manage multiple vma per object does not seem
> > to be a reason to limit ourselves. The infrastructure all supports it;
> > the only thing that is at odds is the desire to force the lowest common
> > denominator as the de facto standard.
>
> Just because something is possible (or looks possible at first)
> doesn't make it good uapi. That's how we get stuff like gtt mmap on
> userptr bo, and then regrets. This entire thing here gets sold as
> "uapi cleanup for lmem". Which makes sense, except most of the uapi
> isn't really cleaned up at all:
>
> - We still have relocations (and we even made them more powerful by
> pipelining them, at a time when all our mesa drivers finally managed
> to move to softpin). We could ditch all the reloc stuff for gen12+ and
> lmem instead. Both amdgpu and nv50+ have done that.
>
> - We're still adding more uapi that's all adding variable state.
> Meanwhile everything else (gallium/iris, vk) moves to invariant state
> object models, where you create stuff once with the right properties
> and call it a day. lmem does add quite a few new state bits here,
> would be a lot simpler if we could make them as invariant as possible.
> For placement we might need to allow changes to the priority, but not
> to the placement list itself. For mmap, you just select the mmap mode
> that you want (because once you fix the caching mode, placement list
> and tiling bits, there's really not a choice anymore), and that's the
> one you get with the single mmap offset.
>
> Instead we just hang onto all the accumulated complexity, add more,
> and the only cleanup I'm seeing is that we bake in the multi-mmap
> model to save one single line in userspace for the valgrind
> annotation. Which everyone has already anyway, that is not really
> going to go away.
>
> I think using lmem to clean up the uapi makes tons of sense, but if we
> bother with that it should be a real cleanup. Not just cosmetics for
> the one valgrind annotation line in userspace we have right now.

Chatted with Chris, and one useful thing that multi-mmap gives us (and
which I ignored) is much easier testing of coherency between all the
different views. And we need that because the kernel needs that anyway
(at least cpu wc view against all others is needed for swap-out), and
we need to be able to test it well because our hw is historically
rather buggy in this area. So bummer and lots of sighs :-/
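
To make the "coherency between views" idea concrete, here is a minimal
sketch in plain userspace C. It is only an analogy: it maps one ordinary
temporary file twice with mmap(2) and checks that writes through one view
are visible through the other. It does not use any i915 ioctl, and the
helper name count_view_mismatches() is hypothetical. A real test would map
a single GEM object through the different mmap modes (e.g. wc and the
others) on actual hardware.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not i915 code: create two shared mappings ("views")
 * of the same backing file, write through each view in turn, and count the
 * bytes that read back differently through the other view.
 *
 * Returns the number of mismatching bytes, or -1 if setup failed.
 */
long count_view_mismatches(size_t len)
{
	char path[] = "/tmp/views-XXXXXX";
	unsigned char *a, *b;
	long bad = 0;
	size_t i;
	int fd;

	assert(len > 0);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path); /* the file lives on only through the open fd */

	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return -1;
	}

	/* Two independent views of the same pages. */
	a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);
	if (a == MAP_FAILED || b == MAP_FAILED) {
		if (a != MAP_FAILED)
			munmap(a, len);
		if (b != MAP_FAILED)
			munmap(b, len);
		return -1;
	}

	/* Write a position-dependent pattern through view a, read via b. */
	for (i = 0; i < len; i++)
		a[i] = (unsigned char)(i * 31u + 7u);
	for (i = 0; i < len; i++)
		bad += b[i] != (unsigned char)(i * 31u + 7u);

	/* Now the other direction: write through view b, read via a. */
	memset(b, 0xa5, len);
	for (i = 0; i < len; i++)
		bad += a[i] != 0xa5;

	munmap(a, len);
	munmap(b, len);
	return bad;
}
```

With two ordinary CPU mappings of the page cache, a call such as
count_view_mismatches(4096) is expected to report no mismatches. The
interesting cases on real hardware are views with different caching
attributes, which this sketch cannot exercise.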
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch