[PATCH 14/17] drm/msm: add atomic support

Tue May 27 11:48:41 PDT 2014

On Tue, May 27, 2014 at 1:50 PM, Daniel Vetter <daniel at ffwll.ch> wrote:
> On Tue, May 27, 2014 at 11:58:33AM -0400, Rob Clark wrote:
>> On Mon, May 26, 2014 at 1:54 PM, Daniel Vetter <daniel at ffwll.ch> wrote:
>> > Ok, I think I should have read your msm implementation a _lot_ earlier.
>> > Explains your desing choices neatly.
>> >
>> > Two observations:
>> >
>> > - A GO bit makes nuclear pageflips ridiculously easy to implement,
>> >   presuming the hardware actually works. And it's the sane model, so imo a
>> >   good one to wrap the atomic helpers around.
>> >
>> >   But reality is often a lot more ugly, especially if you're employed by
>> >   Intel. Which is why I'm harping so much on this helpers-vs-core
>> >   interface issues ... We really need the full state transition in one
>> >   piece to do anything useful.
>> >
>> > - msm doesn't have any resource sharing going on for modeset operations
>> >   (which I mean lighting up crtcs to feed pixel streams to connectors).
>> >   Which means the simplistic "loop over all crtcs and call the old
>> >   ->setcrtc" approach actually works.
>>
>> we do actually have some shared resources on mdp5 generation (the
>> "smp" blocks, basically a shared buffer which we need to allocate fifo
>> space from for each plane)..
>>
>> I'm mostly ignoring this at the moment, because we don't support
>> enough crtc's yet to run out of smp blocks.  But hooking in at the
>> current ->atomic_commit() would be enough for me, I think.  Tbh, it is
>> something that should be handled at the ->atomic_check() stage so I
>> hadn't given much though to ->atomic_commit() stage.
>>
>> That all said, I've no problem with adding one more hook, after
>> ->atomic_commit() and lock magic, before loop over planes/crtcs, so
>> drivers that need to can replace the loops and do their own thing.
>> Current state should be reasonably good for sane hw.  I'm just waiting
>> for patches for i915 to add whatever it needs.
>
> I don't think that will be enough for you. Example, slightly hypothetical:
>
> - Configuration A: crtc 0 displays a small screen, crtc 1 a giant screen.
>   Together they just barely fit into your fifo space.
>
> - Configuration B: crtc 0 drives a giant screen, crtc 1 a tiny one.
>   Presume different outputs here so that no implicit output stealing
>   happens upon mode switching.
>
> Atomic switch should work, but don't since if you just loop over crtcs you
> have the intermediate stage where you try to drive 2 giant screens, which
> will run out of fifo resources. And I think it's really common to have
> such implicit resource sharing, maybe except for the most beefy desktop
> gpu which simply can't run out of memory bw with today's screen.

Well, this situation is a bit problematic in other similar cases..
moving plane between crtc's which needs to wait on a vblank is another
vaguely similar case.  I was kinda thinking of punting on that one
(ie. -EBUSY and userspace tries again on next frame).  Maybe for
modeset that doesn't work out so well, since frame N+1 you'll still be
at configuration A and have the same problem.

Would be kinda nice if helpers could order things according to what
decreases resource requirements vs what increases.  Although we could
probably get a lot of mileage out of just making the 'atomic loop over
things helper' apply config for crtcs/planes which will be disabled
first, before the ones which will be enabled at the end of the commit.
 Hmm.

Either way, if you have to replace 'atomic loop over things helper'
with your own i915 specific thing, it shouldn't make much difference
to you.  And I don't really think we need to solve all the worlds
problems in the first version.  But seems like there could be some
value in making helpers a bit more aware of shared resource
constraints.

> So afaics you really need to push a bit of smarts into the crtc helpers to
> make atomic possible, which then leaves you with interaction issues
> between the atomic stuff for GO bit capable hw for plane updates and the
> modeset ordering hell.
>
>> >   The problem I see here is that if you have hardware with more elaborate
>> >   needs (e.g. shared dplls), but otherwise sanity for plane updates (i.e.
>> >   a GO bit) then your current atomic helpers will make it rather hard to
>> >   mix this. So I think we should pimp the crtc helpers a bit to be atomic
>> >   compliant (i.e. kill all outputs first before enabling anything new) and
>> >   try to integrate this with the atomic helpers used for GO bit style
>> >   updates.
>>
>> Not really, I don't think.  You can still do whatever shared resource
>> setup in ->atomic_commit().  For drivers which want to defer commit
>> (ie. doing commit after fb's ready from cpu, rather than from gpu), it
>> would probably be more convenient to split up atomic_commit() so
>> drivers have a post lock hook.  I think it is just a few line patch,
>> and I think that it probably isn't really needed until i915 is ready.
>> I'm pretty sure we can change this to do what i915 needs, while at the
>> same time keeping it simple for 'GO bit' drivers.
>>
>> The bit about crtc helpers, I'll have to think about.  I'm not sure
>> that we want to require that 'atomic' (modeset or pageflip) completely
>> *replaces* current state.  So disabling unrelated crtcs doesn't seem
>> like the right thing.  But we have some time to decide about the
>> semantics of an atomic modeset before we expose to userspace.
>
> I'm not talking about replacing unrelated crtcs. It's also not about
> underspecified state or about enabling/changing global resources.
>
> It is _only_ about ordering of the various operations: If both the
> current and the desired new configuration are at the limit of what the hw
> can do you can't switch to the new configuration by looping over all
> crtcs. The fact that this doesn't work is the entire point of atomic
> modesets. And if we have helpers which aren't cut out for the task at hand
> for any hw where we need to have atomic modesets to make such changes
> possible without userspace going nuts, then the helpers are imo simply not
> useful
>
>> >   i915 has dpll sharing on ivb/hsw, but doesn't use the the crtc helpers
>> >   anymore. But the radone eyefinity (or whatever the damn thing is called)
>> >   I have lying around here fits the bill: It has 5 DP+ outputs but only 2
>> >   dplls. So you can drive e.g. 3 DP screens and then switch to 2 HDMI
>> >   screens atomically and it should all pan out.
>> >
>> >   But since your current helpers just loop over all crtcs without any
>> >   regard to ordering constraints this will fall over if the 2 HDMI outputs
>> >   get enabled before the 3 DP outputs get disabled all disabled.
>>
>> the driver should have rejected the config before it gets to commit
>> stage, if it were an impossible config.
>
> The configuration _is_ possible. We simply have to be somewhat careful
> with transitioning to it, since not _all_ intermediate states are
> possible. Your current helpers presume that's the case, which afaik isn't
> the case on any hw where we have global limits. For modesets.
>
> Nuclear pageflips are a completely different thing, as long as your hw has
> a GO bit.
>
>> > So with all that in mind I have 3 sanity checks for the atomic interfaces
>> > and helpers:
>> >
>> > 1. The atomic helpers should make enabling sane hw with a GO bit easy for
>> > nuclear pageflips. Benchmark would be sane hw like msm or omap.
>> >
>> > 2. It should cooperate with the crtc helpers such that hw with some shared
>> > resources (like dplls) works for atomic modesets. That pretty much means
>> > "disable everything (including crtc/connetor pipelines that only changed
>> > some parts) before enabling anything". Benchmark would be a platform with
>> > shared dplls.
>>
>> btw, not something I'd thought about before, but shared dplls seem
>> common enough, that it is something core could help with.  (Assuming
>> they are all related to pixel clk and not some derived thing that core
>> wouldn't know about.)
>
> Yup, that's what I'm trying to say ;-) But it was just an example, I
> believe that atm your helpers don't help for any shared modeset resources
> at all.

no, not at all (other than the ww_mutex stuff which should be useful
for shared resources and more fine grained locking within driver).  It
wasn't really a goal.

But having some knowledge about shared resources seems like it could
make core helpers more useful when I get closer to exploiting the
limits of the hw I have..  I suspect i915 is just further down that
path than the other drivers.

>> I think we do need to decide what partial state updates me in the
>> context of modeset or pageflip.  I kinda think the right thing is
>> different for modeset vs pageflip.  Maybe we want to define an atomic
>> flag to mean "disable/discard everything else"..  at any rate, we only
>> need to sort that before exposing the API to userspace.
>
> Yeah, I still strongly support this split in the api itself. For i915 my
> plan is to have separate configuration structures for modeset state and
> pageflip/plane config state. When we do an atomic modeset we compute both
> (maybe with some shortcut if we know that the pipe config didn't change at
> all). Then comes the big switch:
>
> - If we have the same pipe config, we simply to a vblank synce nuclear
>   flip to the new config.
>
> - If pipe configs change it will be much more elaborate:
>
>   1. Do a vblank synced nuclear flip to black on all pipes that need to go
>   off (whether they get disabled or reconfigured doesn't matter for now).
>
>   2. Disable pipes.
>
>   3. Commit new state on the sw side.
>
>   4. Enable all pipes with the new config, but only scanning out black.
>
>   5. Do a vblank-synced nuclear flip to the new config. This would also be
>   the one that would signale the drm events that the atomic update
>   completed.
>
>   For fastboot we might need some hacks to fuse this all together, e.g for
>   some panel fitter changes we don't need to disable the pipe completely.
>   But that's the advanced stuff really.
>
> I think modelling the crtc helpers after this model could work. But that
> means that the crtc helpers and the nuclear flip atomic helpers for GO
> bit capable hw need to be rather tightly integrated, while still allowing
> drivers to override the nuclear flip parts.
>
>> > 3. The core->driver interface should be powerful enough to support
>> > insanity like i915, but no more. Which means all the code that's share
>> > (i.e. the set_prop code I've been harping all over the place about) should
>> > be done in the core, without any means for drivers to override. Currently
>> > the drm core also takes care of a bunch of things like basic locking and
>> > refcounting, and that's kinda the spirit for this. i915 is the obvious
>> > benchmark here.
>>
>> The more I think about it, the more I think we should leave set_prop
>> as it is (although possibly with drm_atomic_state ->
>> drm_{plane,crtc,etc}_state).
>>
>> In the past, before primary plane, I really needed this.  And I expect
>> having a convenient way to 'sniff' changing properties as they go by
>> will be useful in some cases.
>
> I actually really like the addition of the state object to ->set_prop. At
> least for i915 (which already has a fair pile of additional properties)
> that looks like the perfect way to prep the stage.
>
> But at least for the transition we should keep the impact minimal. So no
> manadatory callbacks and don't enforce the usage of the state object until
> the drm core is fully converted to always follow a set_prop with a
> ->commit. Since currently we have internal mode_set calls in all i915
> set_prop functions and we need to convert them all. But we can't do that
> until all the core stuff uses the atomic interface for all legacy ioctl
> ops.
>

fwiw, at least all set_prop ioctl stuff from core follows up with a
atomic_commit now.  There are one or two more places doing mode_set
un-atomically.  And I'm not sure if you call set_prop anywhere
internally from i915.

BR,
-R