Parallel modesets and private state objects broken, where to go with MST?

Fri Mar 25 13:56:40 UTC 2022

Hi Lyude,

On Mon, Mar 14, 2022 at 06:16:36PM -0400, Lyude Paul wrote:
> Hi! First a little bit of background: I've recently been trying to get rid of
> all of the non-atomic payload bandwidth management code in the MST helpers in
> order to make it easier to implement DSC and fallback link rate retraining
> support down the line. Currently bandwidth information is stored in two
> places, partially in the MST atomic state and partially in the mst manager's
> payload table (which exists outside of the atomic state and has its own
> locking). The portions in the atomic state are used to try to determine if a
> given display configuration can fit within the given bandwidth limitations
> during the atomic check phase, and are implemented through the use of private
> state objects.

Yeah, this looks awfully similar to what vc4 is doing.

The way the hardware is setup is that there's a device (HVS) that does
the composition of all planes, over 3 FIFO, and each FIFO can be output
to a CRTC with a mux. The HVS then needs to raise its internal clock
rate depending on the load (based on the resolution and number of
planes, mostly) of each FIFO.

Thus we have a private object (vc4_hvs_state) that lists all the FIFOs,
with their associated loads, and we pull that state in for each commit.

> My current goal has been to move as much of this as possible over to the
> atomic state and entirely get rid of the payload table along with it's locks.
> My main reason for doing this is that it both decomplicates things quite a
> bit, and I'm really also hoping that getting rid of that payload code will
> make it clearer to others how it works - and stop the influx of bandaid
> patches (e.g. adding more and more special cases to MST to fix poorly
> understood issues being hit in one specific driver and nowhere else) that I've
> had to spend way more time then I'd like trying to investigate and review.
> 
> So, the actual problem: following a conversation with Daniel Vetter today I've
> gotten the impression that private modesetting objects are basically just
> broken with parallel modesets? I'm still wrapping my head around all of this
> honestly, but from what I've gathered: CRTC atomic infra knows how to do waits
> in the proper places for when other CRTCs need to be waited on to continue a
> modeset, but there's no such tracking with private objects. If I understand
> this correctly, that means that even if two CRTC modesets require pulling in
> the same private object state for the MST mgr: we're only provided a guarantee
> that the atomic checks pulling in that private object state won't
> concurrently. But when it comes to commits, it doesn't sound like there's any
> actual tracking for this and as such - two CRTC modesets which have both
> pulled in the MST private state object are not prevented from running
> concurrently.

Yeah, in our case the issue was two-fold:

 - The first one is that since the load is shared over each CRTC, we
   need, for each commit, to have the global load. This is non-obvious
   because you might get some new states that only affect a single CRTC,
   or a plane but not its CRTC, so making sure we always get the entire
   picture was a bit challenging.

   You'll need to pull the state at each commit in atomic_check, and
   then basically remove the old state of whatever is in your commit,
   and add the new stuff. Just iterating over all the connectors in a
   state for example won't work.

 - Then, indeed, there's a race issue. IIRC, the basic idea is that if
   two (non-blocking) commits don't share any resources (so like both
   have a single plane, CRTC and connector, but none shared), there's no
   execution ordering guaranteed by default. But there is one in the
   structures: you still have your old and new states.

   So if you commit your changes like this:

   * Initial State
   * State 1
   * State 2

   Your old private object in state 1 will be the initial state one, the
   new will be the one from state 1. And for state 2, the old will be
   state 1, the new will be state 2.

   But if state 2 gets committed first, then your old state is weird
   already, because it's actually the next one. It get worse when state
   1 gets committed since the old state is the initial state, but since
   state 2 has been committed the initial state has been freed already.

> This unfortunately throws an enormous wrench into the MST atomic conversion
> I've been working on - as I was under the understanding while writing the code
> for this that all objects in an atomic state are blocked from being used in
> any new atomic commits (not checks, as parallel checks should be fine in my
> case) until there's no commits active with said object pulled into the atomic
> state. I certainly am not aware of any way parallel modesetting could actually
> be supported on MST, so it's not really a feature we want to deal with at all
> besides stopping it from happening. This also unfortunately means that the
> current atomic modesetting code we have for MST is potentially broken,
> although I assume we've never hit any real world issues with it because of the
> non-atomic locking we currently have for the payload table.
> 
> So, Daniel had mentioned that supposedly you've been dealing with similar
> issues with VC4 and might have already made progress on coming up with ways to
> deal with it.

This is all upstream already :)

You can have a look at vc4_atomic_commit_setup, that pull the HVS state
in, and associates the drm_crtc_commit to the HVS state.

Then, vc4_atomic_commit_tail, we'll wait for the old HVS state
drm_crtc_commit before moving forward.

It's not really challenging: the basic idea is that whenever your commit
start you would wait for the drm_crtc_commit to be completed, forcing
the ordering.

This requires some code in atomic_commit path though, so it might be
difficult to integrate properly in the drivers that would use MST.

It would be an interesting discussion, I have a similar issue with HDMI
scrambling support: I'd like to have most of the logic generic, but
making sure all the HDMI drivers (but only them) use that generic code
properly is challenging.

Maxime
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 228 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20220325/5983b045/attachment.sig>