[PATCH v4] drm/atomic-helpers: remove legacy_cursor_update hacks

Daniel Vetter daniel at ffwll.ch
Thu Apr 7 06:03:48 UTC 2022


On Thu, 31 Mar 2022 at 22:41, Ville Syrjälä
<ville.syrjala at linux.intel.com> wrote:
>
> On Thu, Mar 31, 2022 at 10:25:23PM +0200, Daniel Vetter wrote:
> > On Thu, Mar 31, 2022 at 11:11:00PM +0300, Ville Syrjälä wrote:
> > > On Thu, Mar 31, 2022 at 10:02:53PM +0200, Daniel Vetter wrote:
> > > > On Thu, Mar 31, 2022 at 10:52:54PM +0300, Ville Syrjälä wrote:
> > > > > On Thu, Mar 31, 2022 at 09:11:29PM +0200, Daniel Vetter wrote:
> > > > > > On Thu, Mar 31, 2022 at 06:48:43PM +0300, Ville Syrjälä wrote:
> > > > > > > On Thu, Mar 31, 2022 at 05:14:29PM +0200, Daniel Vetter wrote:
> > > > > > > > On Thu, Mar 31, 2022 at 04:25:13PM +0300, Ville Syrjälä wrote:
> > > > > > > > > On Thu, Mar 31, 2022 at 03:05:45PM +0200, Maxime Ripard wrote:
> > > > > > > > > > From: Daniel Vetter <daniel.vetter at ffwll.ch>
> > > > > > > > > >
> > > > > > > > > > The stuff never really worked, and leads to lots of fun because it
> > > > > > > > > > out-of-order frees atomic states. Which upsets KASAN, among other
> > > > > > > > > > things.
> > > > > > > > > >
> > > > > > > > > > For async updates we now have a more solid solution with the
> > > > > > > > > > ->atomic_async_check and ->atomic_async_commit hooks. Support for that
> > > > > > > > > > for msm and vc4 landed. nouveau and i915 have their own commit
> > > > > > > > > > routines, doing something similar.
> > > > > > > > > >
> > > > > > > > > > For everyone else it's probably better to remove the use-after-free
> > > > > > > > > > bug, and encourage folks to use the async support instead. The
> > > > > > > > > > affected drivers which register a legacy cursor plane and don't either
> > > > > > > > > > use the new async stuff or their own commit routine are: amdgpu,
> > > > > > > > > > atmel, mediatek, qxl, rockchip, sti, sun4i, tegra, virtio, and vmwgfx.
> > > > > > > > > >
> > > > > > > > > > Inspired by an amdgpu bug report.
> > > > > > > > > >
> > > > > > > > > > v2: Drop RFC, I think with amdgpu converted over to use
> > > > > > > > > > atomic_async_check/commit done in
> > > > > > > > > >
> > > > > > > > > > commit 674e78acae0dfb4beb56132e41cbae5b60f7d662
> > > > > > > > > > Author: Nicholas Kazlauskas <nicholas.kazlauskas at amd.com>
> > > > > > > > > > Date:   Wed Dec 5 14:59:07 2018 -0500
> > > > > > > > > >
> > > > > > > > > >     drm/amd/display: Add fast path for cursor plane updates
> > > > > > > > > >
> > > > > > > > > > we don't have any driver anymore where we have userspace expecting
> > > > > > > > > > solid legacy cursor support _and_ they are using the atomic helpers in
> > > > > > > > > > their fully glory. So we can retire this.
> > > > > > > > > >
> > > > > > > > > > v3: Paper over msm and i915 regression. The complete_all is the only
> > > > > > > > > > thing missing afaict.
> > > > > > > > > >
> > > > > > > > > > v4: Rebased on recent kernel, added extra link for vc4 bug.
> > > > > > > > > >
> > > > > > > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=199425
> > > > > > > > > > Link: https://lore.kernel.org/all/20220221134155.125447-9-maxime@cerno.tech/
> > > > > > > > > > Cc: mikita.lipski at amd.com
> > > > > > > > > > Cc: Michel Dänzer <michel at daenzer.net>
> > > > > > > > > > Cc: harry.wentland at amd.com
> > > > > > > > > > Cc: Rob Clark <robdclark at gmail.com>
> > > > > > > > > > Cc: "Kazlauskas, Nicholas" <nicholas.kazlauskas at amd.com>
> > > > > > > > > > Tested-by: Maxime Ripard <maxime at cerno.tech>
> > > > > > > > > > Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> > > > > > > > > > Signed-off-by: Maxime Ripard <maxime at cerno.tech>
> > > > > > > > > > ---
> > > > > > > > > >  drivers/gpu/drm/drm_atomic_helper.c          | 13 -------------
> > > > > > > > > >  drivers/gpu/drm/i915/display/intel_display.c | 13 +++++++++++++
> > > > > > > > > >  drivers/gpu/drm/msm/msm_atomic.c             |  2 ++
> > > > > > > > > >  3 files changed, 15 insertions(+), 13 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > > > > > index 9603193d2fa1..a2899af82b4a 100644
> > > > > > > > > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > > > > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > > > > > @@ -1498,13 +1498,6 @@ drm_atomic_helper_wait_for_vblanks(struct drm_device *dev,
> > > > > > > > > >       int i, ret;
> > > > > > > > > >       unsigned int crtc_mask = 0;
> > > > > > > > > >
> > > > > > > > > > -      /*
> > > > > > > > > > -       * Legacy cursor ioctls are completely unsynced, and userspace
> > > > > > > > > > -       * relies on that (by doing tons of cursor updates).
> > > > > > > > > > -       */
> > > > > > > > > > -     if (old_state->legacy_cursor_update)
> > > > > > > > > > -             return;
> > > > > > > > > > -
> > > > > > > > > >       for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
> > > > > > > > > >               if (!new_crtc_state->active)
> > > > > > > > > >                       continue;
> > > > > > > > > > @@ -2135,12 +2128,6 @@ int drm_atomic_helper_setup_commit(struct drm_atomic_state *state,
> > > > > > > > > >                       continue;
> > > > > > > > > >               }
> > > > > > > > > >
> > > > > > > > > > -             /* Legacy cursor updates are fully unsynced. */
> > > > > > > > > > -             if (state->legacy_cursor_update) {
> > > > > > > > > > -                     complete_all(&commit->flip_done);
> > > > > > > > > > -                     continue;
> > > > > > > > > > -             }
> > > > > > > > > > -
> > > > > > > > > >               if (!new_crtc_state->event) {
> > > > > > > > > >                       commit->event = kzalloc(sizeof(*commit->event),
> > > > > > > > > >                                               GFP_KERNEL);
> > > > > > > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > > > > > index bf7ce684dd8e..bde32f5a33cb 100644
> > > > > > > > > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > > > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > > > > > @@ -8855,6 +8855,19 @@ static int intel_atomic_commit(struct drm_device *dev,
> > > > > > > > > >                               state->base.legacy_cursor_update = false;
> > > > > > > > > >       }
> > > > > > > > > >
> > > > > > > > > > +     /*
> > > > > > > > > > +      * FIXME: Cut over to (async) commit helpers instead of hand-rolling
> > > > > > > > > > +      * everything.
> > > > > > > > > > +      */
> > > > > > > > >
> > > > > > > > > Intel cursors can't even do async updates so this is rather
> > > > > > > > > nonsensical. What we need is some kind of reasonable mailbox
> > > > > > > > > support.
> > > > > > > >
> > > > > > > > This is not the async plane update you're thinking of. i915 really should
> > > > > > > > switch over more to atomic helpers.
> > > > > > >
> > > > > > > The comment should be clarified then. As is I have no real idea
> > > > > > > what it's trying to say.
> > > > > >
> > > > > > Use drm_atomic_commit and only overwrite atomic_commit_tail.
> > > > >
> > > > > You mean drm_atomic_helper_commit() I suppose?
> > > > >
> > > > > > But I'm not
> > > > > > really clear why the comment isn't clear - i915 is the only driver not
> > > > > > using them, maybe should start to take a look when they're so unfamiliar
> > > > > > :-P
> > > > >
> > > > > There are a lot of problems with that:
> > > > > - no uapi/hw state split so if there's anything that looks
> > > > >   at the state it's very likely going to get things wrong
> > > > > - it doesn't know anything about i915's own state objects
> > > > > - uses wrong workqueues
> > > > >
> > > > > Those are the ones that immediately came to mind when I looked
> > > > > at it. Probably I missed some others as well.
> > > >
> > > > Yeah and we could have fixed them in the shared code, and 5+ years ago I
> > > > even had patches, but i915 gem team had the idea they know better. That
> > > > part really needs to be fixed, and if that means redoing a bunch of
> > > > display work because display team didn't push back on gem team reinventing
> > > > all the wheels ... tough luck.
> > > >
> > > > I suggest you get started.
> > >
> > > I offered to do the hw/uapi split in the core. You refused it.
> >
> > That should work with with atomic_commit_tail too, or at least I'm not
> > seeing why not.
>
> I haven't gone through all of it to see how much it's poking inside
> the plane/crtc states and assuming the uapi state matches the hardware
> state. Maybe there is not much.
>
> >
> > > I raised my objection to the introduction of the single lock
> > > scheme for private state objects. No one was interested.
> >
> > Yeah add it to the list of things i915 reinvents I guess. And as long as
> > you're out, you kinda don't get much of a vote. Which is why being out of
> > this isn't a bright idea.
>
> It was reinvented _after_ the locking change. We were using
> private objects before.

So the atomic state subclassing was deprecated in 2017 in the
kerneldoc and drivers started moving over to drm_private_state
(including i915):

commit da6c05969785a0f4108a089ef33c55f46ae21775
Author: Daniel Vetter <daniel.vetter at ffwll.ch>
Date:   Thu Dec 14 21:30:53 2017 +0100

   drm/atomic: document how to handle driver private objects

And I actually replied in that old thread from 2018 where the
drm_private_state locking was added and explained how to do read-only
state in drm_private_state, but you didn't follow up on that:

https://lore.kernel.org/dri-devel/20180306073709.GT22212@phenom.ffwll.local/

Other drivers have been using that approach without issue meanwhile.

There's also been a pile of discussions on #dri-devel irc since then
with more involved ideas, but for that entire time I have not heard a
single time again that this is causing issues for intel. Nor any kinds
of proposals how to fix it.

Instead you figured that's not for you and started moving i915 away
from drm_private_state. Without cc'ing dri-devel, without kicking off
the discussion again and maybe clarifying what the problem is exactly,
and without explaining in the commit message that you're moving away
from drm_private_state because of some locking issue in that thing
that you don't want to address on dri-devel.

At least if I got this all right and found the right commit:

commit fd1a9bba73fa10e1601a43264283fe4696d6f82c
Author: Ville Syrjälä <ville.syrjala at linux.intel.com>
Date:   Mon Jan 20 19:47:25 2020 +0200

   drm/i915: Convert bandwidth state to global state

I'm probably way more pissed on this than what's appropriate, but I
just spend the last 3 years fixing up a giantic dumpster fire in the
i915-gem side, because a few people were way too keen on reinveinting
wheels in their little driver corner instead of working with dri-devel
to get it all sorted out.

Cheers, Daniel




> > > I don't think this situation is on me.
> >
> > These were your examples why you can't adopt the atomic commit helpers,
> > and aside from the workqueue one (where i915 really should not be the only
> > driver doing it differently, that makes no sense at all) they're not real
> > reasons to not do this.
> >
> > The real reasons are
> > - cursor code not yet adopted async plane commit
>
> As long as people talk about async commits while apparently
> meaning something totally different it's hard to even be on
> the same page.
>
> > - i915_sw_fence
> > - ... including the boosting and everything else that i915 gem team hand
> >   rolled in there
> >
> > And those need to be fixed eventually. And this time around not by doing
> > more i915 hand rolling of stuff.
> > -Daniel
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
>
> --
> Ville Syrjälä
> Intel



--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the dri-devel mailing list