Regression with mainline kernel on rpi4

Daniel Vetter daniel at ffwll.ch
Thu Oct 14 13:15:36 UTC 2021


On Wed, Oct 13, 2021 at 05:01:03PM +0200, Maxime Ripard wrote:
> On Thu, Sep 30, 2021 at 11:19:59AM +0200, Daniel Vetter wrote:
> > On Tue, Sep 28, 2021 at 10:34:46AM +0200, Maxime Ripard wrote:
> > > Hi Daniel,
> > > 
> > > On Sat, Sep 25, 2021 at 12:50:17AM +0200, Daniel Vetter wrote:
> > > > On Fri, Sep 24, 2021 at 3:30 PM Maxime Ripard <maxime at cerno.tech> wrote:
> > > > >
> > > > > On Wed, Sep 22, 2021 at 01:25:21PM -0700, Linus Torvalds wrote:
> > > > > > On Wed, Sep 22, 2021 at 1:19 PM Sudip Mukherjee
> > > > > > <sudipm.mukherjee at gmail.com> wrote:
> > > > > > >
> > > > > > > I added some debugs to print the addresses, and I am getting:
> > > > > > > [   38.813809] sudip crtc 0000000000000000
> > > > > > >
> > > > > > > This is from struct drm_crtc *crtc = connector->state->crtc;
> > > > > >
> > > > > > Yeah, that was my personal suspicion, because while the line number
> > > > > > implied "crtc->state" being NULL, the drm data structure documentation
> > > > > > and other drivers both imply that "crtc" was the more likely one.
> > > > > >
> > > > > > I suspect a simple
> > > > > >
> > > > > >         if (!crtc)
> > > > > >                 return;
> > > > > >
> > > > > > in vc4_hdmi_set_n_cts() is at least part of the fix for this all, but
> > > > > > I didn't check if there is possibly something else that needs to be
> > > > > > done too.
> > > > >
> > > > > Thanks for the decode_stacktrace.sh and the follow-up
> > > > >
> > > > > Yeah, it looks like we have several things wrong here:
> > > > >
> > > > >   * we only check that connector->state is set, and not
> > > > >     connector->state->crtc indeed.
> > > > >
> > > > >   * We also check only in startup(), so at open() and not later on when
> > > > > >     the sound streaming actually starts. This has been there for a while,
> > > > >     so I guess it's never really been causing a practical issue before.
> > > > 
> > > > You also have no locking
> > > 
> > > Indeed. Do we just need locking to prevent a concurrent audio setup and
> > > modeset, or do you have another corner case in mind?
> > > 
> > > Also, generally, what locks should we make sure we have locked when
> > > accessing the connector and CRTC state? drm_mode_config.connection_mutex
> > > and drm_mode_config.mutex, respectively?
> > > 
> > > > plus looking at ->state objects outside of atomic commit machinery
> > > > makes no sense because you're not actually in sync with the hw state.
> > > > Relevant bits need to be copied over at commit time, protected by some
> > > > spinlock (and that spinlock also needs to be held over whatever other
> > > > stuff you're setting to make sure we don't get a funny out-of-sync
> > > > state anywhere).
> > > 
> > > If we already have a lock protecting against having both an ASoC and KMS
> > > function running, it's not clear to me what the spinlock would prevent
> > > here?
> > 
> > Replicating the irc chat here. With
> > 
> > commit 6c5ed5ae353cdf156f9ac4db17e15db56b4de880
> > Author: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
> > Date:   Thu Apr 6 20:55:20 2017 +0200
> > 
> >     drm/atomic: Acquire connection_mutex lock in drm_helper_probe_single_connector_modes, v4.
> > 
> > this is already taken care of for drivers and should be all good from a
> > locking pov.
> 
> So, if I understand this properly, this supersedes your comment on the
> spinlock for the hw state, but not the comment that we need some locking
> to synchronize between the audio and KMS path (and CEC?). Right?

Other way round. There's 3 things involved here:
1. kms output probe code
2. kms atomic commit code
3. calls from asoc side

The above referenced commit makes sure 1&2 are synchronized. The problem
is that 2&3 are not synchronized, and from 3, no matter how much locking
you have, you cannot look at kms state. I.e. not allowed to look at
crtc->state for example, irrespective of whether you're holding
drm_modeset_lock or not. This is because the atomic nonblocking commit is
done without holding any locks, protection is purely down to ownership
rules of state structures and ordering (through drm_crtc_commit) of
in-flight nonblocking atomic commits.

That's why you need a separate lock _and_ a copy of the relevant state, so
that 2&3 stay in sync.
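
To make the "separate lock and copy" idea concrete, here is a minimal
userspace sketch, not driver code. A pthread mutex stands in for the
driver-private lock, and every name in it (model_hdmi, model_commit_enable,
model_audio_get_clock, ...) is hypothetical and not taken from vc4 or the DRM
API. The commit side copies the bits the audio side needs while holding the
lock; the audio side only ever reads that copy and never touches ->state.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a state object owned by the commit machinery. */
struct model_crtc_state {
	unsigned int pixel_clock_khz;
};

/* Hypothetical driver-private data shared by the KMS and audio paths. */
struct model_hdmi {
	pthread_mutex_t lock;		/* protects the two fields below */
	bool output_enabled;
	unsigned int pixel_clock_khz;	/* copy made at commit time */
};

void model_hdmi_init(struct model_hdmi *hdmi)
{
	assert(hdmi != NULL);
	pthread_mutex_init(&hdmi->lock, NULL);
	hdmi->output_enabled = false;
	hdmi->pixel_clock_khz = 0;
}

/* Commit path (2): it owns new_state, so it may read it and copy from it. */
void model_commit_enable(struct model_hdmi *hdmi,
			 const struct model_crtc_state *new_state)
{
	pthread_mutex_lock(&hdmi->lock);
	hdmi->pixel_clock_khz = new_state->pixel_clock_khz;
	hdmi->output_enabled = true;
	pthread_mutex_unlock(&hdmi->lock);
}

void model_commit_disable(struct model_hdmi *hdmi)
{
	pthread_mutex_lock(&hdmi->lock);
	hdmi->output_enabled = false;
	hdmi->pixel_clock_khz = 0;
	pthread_mutex_unlock(&hdmi->lock);
}

/*
 * Audio path (3): reads only the private copy, under the same lock.
 * Returns 0 and fills *clock_khz if the output is enabled, -1 otherwise.
 */
int model_audio_get_clock(struct model_hdmi *hdmi, unsigned int *clock_khz)
{
	int ret = -1;

	pthread_mutex_lock(&hdmi->lock);
	if (hdmi->output_enabled) {
		*clock_khz = hdmi->pixel_clock_khz;
		ret = 0;
	}
	pthread_mutex_unlock(&hdmi->lock);
	return ret;
}
```

The audio side gets a clean "not enabled" error instead of chasing a NULL
pointer, and it cannot see a half-updated mix of old and new values because
both sides hold the same lock around the copied fields.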

In practice you only care about modeset changes from 2 vs anything from 3,
and most userspace does modeset atomic commits as blocking commits, which
means you won't notice that your locking has gaps.

btw same problem exists between atomic and (vblank) irq handler. There you
need an irqsafe spinlock and you also have to copy (because the irq handler
just cannot access ->state in any safe way, because it doesn't own that
structure).

This is maybe a bit the confusing thing with atomic commit: ->state isn't
protected by locks, but through ownership rules. Only for atomic check is
->state protected by locks, but once we're committed we switch over to
ownership rules for protection. swap_states() is that point of no return.
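
As a rough illustration of that handover, here is a toy userspace model of
the swap step. It is only a sketch: the types and the function
(model_state, model_obj, model_commit, model_swap_state) are made up for this
example and are much simpler than the real atomic helpers. The point it tries
to show is that after the swap, the old state belongs to the in-flight commit
and nobody else has any business dereferencing it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified state object. */
struct model_state {
	int mode_id;
};

/* Hypothetical object (think of a CRTC) with its current state pointer. */
struct model_obj {
	struct model_state *state;
};

/* Hypothetical in-flight commit, holding the states it owns. */
struct model_commit {
	struct model_obj *obj;
	struct model_state *new_state;	/* built and checked under locks */
	struct model_state *old_state;	/* filled in by the swap */
};

/*
 * Point of no return in this model: install the new state on the object
 * and hand the previous one over to the commit. From here on the commit
 * owns old_state until it has finished with it.
 */
void model_swap_state(struct model_commit *commit)
{
	assert(commit != NULL && commit->obj != NULL);
	assert(commit->new_state != NULL);

	commit->old_state = commit->obj->state;
	commit->obj->state = commit->new_state;
}
```

Code outside the commit (audio callbacks, irq handlers) appears nowhere in
this model as an owner of either state, which is why it has to work from its
own copy instead.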

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

