[Intel-gfx] [PATCH] drm: avoid spurious EBUSY due to nonblocking atomic modesets
Daniel Vetter
daniel.vetter at ffwll.ch
Wed Apr 8 15:34:35 UTC 2020
On Wed, Apr 8, 2020 at 4:03 PM Ville Syrjälä
<ville.syrjala at linux.intel.com> wrote:
>
> On Tue, Feb 25, 2020 at 05:34:00PM +0200, Ville Syrjälä wrote:
> > On Tue, Feb 25, 2020 at 04:09:26PM +0100, Daniel Vetter wrote:
> > > On Tue, Feb 25, 2020 at 3:48 PM Ville Syrjälä
> > > <ville.syrjala at linux.intel.com> wrote:
> > > >
> > > > On Tue, Feb 25, 2020 at 12:50:24PM +0100, Daniel Vetter wrote:
> > > > > When doing an atomic modeset with ALLOW_MODESET drivers are allowed to
> > > > > pull in arbitrary other resources, including CRTCs (e.g. when
> > > > > reconfiguring global resources).
> > > > >
> > > > > But in nonblocking mode userspace has then no idea this happened,
> > > > > which can lead to spurious EBUSY calls, both:
> > > > > - when that other CRTC is currently busy doing a page_flip the
> > > > > ALLOW_MODESET commit can fail with an EBUSY
> > > > > - on the other CRTC a normal atomic flip can fail with EBUSY because
> > > > > of the additional commit inserted by the kernel without userspace's
> > > > > knowledge
> > > > >
> > > > > For blocking commits this isn't a problem, because everyone else will
> > > > > just block until all the CRTC are reconfigured. Only thing userspace
> > > > > can notice is the dropped frames without any reason for why frames got
> > > > > dropped.
> > > > >
> > > > > Consensus is that we need new uapi to handle this properly, but no one
> > > > > has any idea what exactly the new uapi should look like. As a stop-gap
> > > > > plug this problem by demoting nonblocking commits which might cause
> > > > > issues by including CRTCs not in the original request to blocking
> > > > > commits.
> > > > >
> > > > > v2: Add comments and a WARN_ON to enforce this only when allowed - we
> > > > > don't want to silently convert page flips into blocking plane updates
> > > > > just because the driver is buggy.
> > > > >
> > > > > v3: Fix inverted WARN_ON (Pekka).
> > > > >
> > > > > References: https://lists.freedesktop.org/archives/dri-devel/2018-July/182281.html
> > > > > Bugzilla: https://gitlab.freedesktop.org/wayland/weston/issues/24#note_9568
> > > > > Cc: Daniel Stone <daniel at fooishbar.org>
> > > > > Cc: Pekka Paalanen <pekka.paalanen at collabora.co.uk>
> > > > > Cc: stable at vger.kernel.org
> > > > > Reviewed-by: Daniel Stone <daniels at collabora.com>
> > > > > Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> > > > > ---
> > > > > drivers/gpu/drm/drm_atomic.c | 34 +++++++++++++++++++++++++++++++---
> > > > > 1 file changed, 31 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> > > > > index 9ccfbf213d72..4c035abf98b8 100644
> > > > > --- a/drivers/gpu/drm/drm_atomic.c
> > > > > +++ b/drivers/gpu/drm/drm_atomic.c
> > > > > @@ -1362,15 +1362,43 @@ EXPORT_SYMBOL(drm_atomic_commit);
> > > > > int drm_atomic_nonblocking_commit(struct drm_atomic_state *state)
> > > > > {
> > > > > struct drm_mode_config *config = &state->dev->mode_config;
> > > > > - int ret;
> > > > > + unsigned requested_crtc = 0;
> > > > > + unsigned affected_crtc = 0;
> > > > > + struct drm_crtc *crtc;
> > > > > + struct drm_crtc_state *crtc_state;
> > > > > + bool nonblocking = true;
> > > > > + int ret, i;
> > > > > +
> > > > > + /*
> > > > > + * For commits that allow modesets drivers can add other CRTCs to the
> > > > > + * atomic commit, e.g. when they need to reallocate global resources.
> > > > > + *
> > > > > + * But when userspace also requests a nonblocking commit then userspace
> > > > > + * cannot know that the commit affects other CRTCs, which can result in
> > > > > + * spurious EBUSY failures. Until we have better uapi plug this by
> > > > > + * demoting such commits to blocking mode.
> > > > > + */
> > > > > + for_each_new_crtc_in_state(state, crtc, crtc_state, i)
> > > > > + requested_crtc |= drm_crtc_mask(crtc);
> > > > >
> > > > > ret = drm_atomic_check_only(state);
> > > > > if (ret)
> > > > > return ret;
> > > > >
> > > > > - DRM_DEBUG_ATOMIC("committing %p nonblocking\n", state);
> > > > > + for_each_new_crtc_in_state(state, crtc, crtc_state, i)
> > > > > + affected_crtc |= drm_crtc_mask(crtc);
> > > > > +
> > > > > + if (affected_crtc != requested_crtc) {
> > > > > + /* adding other CRTC is only allowed for modeset commits */
> > > > > + WARN_ON(!state->allow_modeset);
> > > >
> > > > Not sure that's really true. What if the driver needs to eg.
> > > > redistribute FIFO space or something between the pipes? Or do we
> > > > expect drivers to now examine state->allow_modeset to figure out
> > > > if they're allowed to do certain things?
> > >
> > > Maybe we need more fine-grained flags here, but adding other states
> > > (and blocking a commit flow) is exactly the uapi headaches this patch
> > > tries to solve here. So if our driver currently adds crtc states to
> > > reallocate fifo between pipes for an atomic flip then yes we're
> > > breaking userspace. Well, everyone figured out by now that you get
> > > random EBUSY and dropped frames for no apparent reason at all, and
> > > work around it. But happy, they are not.
> >
> > I don't think we do this currently for the FIFO, but in theory we
> > could.
> >
> > The one thing we might do currently is cdclk reprogramming, but that
> > can only happen without a full modeset when there's only a single
> > active pipe. So we shouldn't hit this right now. But that restriction
> > is going to disappear in the future, at which point we may want to
> > do this even with multiple active pipes.
>
> Looks like we're hitting something like this on some CI systems.
> After a bit of pondering I guess we could fix that by not sending
> out any flips events until all crtcs have finished the commit. Not
> a full solution though as it can't help if there are multiple threads
> trying to commit independently on different CRTC and one thread
> happens to need a full modeset on all CRTCs. But seems like it
> should solve the the single threaded CI fails we're seeing.
Well it's more annoying, since I typed this patch last we gained some
igt which actually check that we can push through commits
indepedently, which now fail on latest gen and so CI tells me "you
can't merge this patch".
Also note that this here is just a userspace api issue, ordering
commits more carefully can't fix anything here.
I guess maybe we should just merge this patch meanwhile and tell CI
that the breakage is expected.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
More information about the Intel-gfx
mailing list