[RFC 0/9] nuclear pageflip

Fri Sep 14 14:46:35 PDT 2012

On Fri, Sep 14, 2012 at 5:14 PM, Jesse Barnes <jbarnes at virtuousgeek.org> wrote:
> On Wed, 12 Sep 2012 21:58:31 +0300
> Ville Syrjälä <ville.syrjala at linux.intel.com> wrote:
>
>> On Wed, Sep 12, 2012 at 01:00:19PM -0500, Clark, Rob wrote:
>> > On Wed, Sep 12, 2012 at 12:27 PM, Ville Syrjälä
>> > <ville.syrjala at linux.intel.com> wrote:
>> > > On Wed, Sep 12, 2012 at 10:48:16AM -0500, Rob Clark wrote:
>> > >> But I think we could still do this w/ one ioctl per crtc for atomic-pageflip.
>> > >
>> > > We could, if we want to sacrifice the synced multi display case. I just
>> > > think it might be a real use case at some point. IVI feels like the most
>> > > likely short term candidate to me, but perhaps someone would finally
>> > > introduce some new style phone/tablet thingy. I have seen
>> > > concepts/prototypes of such multi display gadgets in the past, but the
>> > > industry apparently got a bit stuck on the rectangular slab with
>> > > touchscreen on one side design.
>> >
>> > I could be wrong, but I think in IVI the screens would normally be too
>> > far apart to matter?
>>
>> I was thinking of something like a display on the dash that normally
>> sits low with only a small sliver visible, and extends upwards when
>> you fire up a movie player for example. Internally it could be made
>> up of two displays for power savings purposes.
>>
>> > Anyways, it is really only a problem if you can't do two ioctl()s
>> > within one vblank period. If it actually turns out to be a real
>> > problem,
>>
>> Well exactly that's the problem this whole atomic pageflip stuff is
>> trying to tackle, no? ;)
>>
>> > we could always add later an ioctl that takes an array of
>> > 'struct drm_mode_crtc_atomic_page_flip's.  I'm not sure if this is
>> > really useful or not.. but maybe I'm thinking too much about how
>> > weston does its rendering of different outputs independently.
>>
>> I'm just now thinking of surfaceflinger's way of doing things, with
>> its prepare and commit phases. If you need to issue two ioctls to handle
>> cloned displays, then you can end up in a somewhat funky situation.
>>
>> Let's say you have a video overlay in use (one on each display naturally),
>> and you increase the downscaling factor enough so that you now have
>> enough memory bandwidth to support only one overlay. With independent
>> check ioctls for each display, you never have the full device state
>> available in the kernel, so each check succeeds during the prepare
>> phase. So you decide that you can keep using the video overlays.
>>
>> You then proceed to commit the state, but after the first display has
>> been committed you get an error when trying to commit the second one.
>> What can you do now? The only option is to keep displaying the old
>> frame on the other displays for some time longer, and then on the
>> next frame you can switch to GPU composition. But on the next frame you
>> perhaps no longer need to use GPU composition, but since you can't
>> verify that in the prepare phase, you have no other option but to use
>> GPU composition.
>>
>> So when you run into a configuration that can be supported only
>> partially, you get animation stalls on some displays due to skipped
>> frames, and you always have to fall back to GPU composition for the
>> next frame.
>>
>> If on the other hand you would check the whole state in one ioctl,
>> you'd get the error in the first prepare phase, and could fall back
>> to GPU composition immediately.
>>
>> Am I too much of a perfectionist in considering such things? I don't
>> think so, but perhaps other people disagree.
>
> I don't think there's any harm in having multiple ioctls for different
> things.
>
> I was initially hoping the nuclear page flip would be very simple.
> Intended for simply updating buffers of several planes associated with
> a single display.  That makes the inner loop of something like Wayland
> or SF simple, but obviously doesn't cover every case (in fact I had
> avoided dealing with moving planes initially).
>
> Rob's patchset goes further than that, but obviously not as far as you
> propose.
>
> OTOH, keeping things really simple and not very featureful means there
> are fewer points of failure, which is what I think callers would expect
> from a flip API...
>
> So where does that leave us?  I'd propose we have a very simple,
> stripped down, single crtc flip ioctl, along with a big atomic mode set
> ioctl, and then perhaps a fancier multi-crtc flip ioctl.

I think (hope) the consensus coming out of this thread is something
along these lines:

 - We use properties for specifying what to change to be future
compatible with new crtc features, but also to allow exposing
hw-specific properties and tie them into the atomicity of the
pageflip.  The KMS overlays are a lowest common denominator for all
the various overlay types out there and it should be possible to write
a piece of chipset specific compositor code to use features that can't
be expressed through KMS overlays.

 - We have two types of properties: dynamic and non-dynamic ones.
Dynamic properties can always be changed in the next frame (fb bos, hw
cursor position, overlay position, for example).  Non-dynamic
properties typically involve changing the way bandwidth is allocated,
and changing them may fail.

 - We need a test ioctl that can verify whether changing non-dynamic
properties will work.  Using the atomic modeset for that with a
test-only flag seems like a good option since that already has the
logic to analyze bandwidth allocation across all crtcs.  On the other
hand, it may make more sense to use the multiflip ioctl as well here.
What we need to check is whether the change made by a multiflip is
possible, so it seems natural to communicate that change to the kernel
using the same ioctl and data structs as the multiflip itself.  The
bandwidth calculation is a global decision and involves all crtcs and
the current state, so the kernel can decide just fine if a multiflip
is possible or not, based on the current state and the requested
multiflip.

 - Atomic multiflip for one crtc is essential for avoiding flicker and
artifacts, but it is ill-defined for multiple crtcs simultaneously.  Even
in the genlock case, the failure mode is hardly noticeable (one crtc
may drop a frame in case the compositor is racing with vsync, in which
case multiflip just means both crtcs drop a frame).  For flipping
multiple fbs and planes on one crtc, however, atomicity means that we
can combine gpu rendering and overlays in a reliable way, without
having to worry about flicker when sprites turn on a frame later after
we've already erased the surface contents from the main fb.  We need
to be able to render the scene graph split across various planes at
certain positions and know for certain that when we flip, that's the
configuration that ends up on the output.

 - Pageflip events can be controlled by a flag (as for the current
pageflip ioctl) or perhaps disabled by setting user_data to 0, but the
user data is passed in with each nuclear pageflip ioctl and each ioctl
generates one event (if requested) which returns the user data that
was passed in at ioctl time.  This is how it currently works; the
event mechanism is already in place and I see no reason to change this
behaviour.  Surely, we're not concerned about 8 extra bytes in the
ioctl struct?  The atomic modeset ioctl (in test mode or not) never
generates an event, so there's no need for user data there.

 - Pageflip for multiple crtcs may be useful in case of gen-locked
crtcs, but it is a corner case and not likely to be present or relevant
in mainstream hw.  With the properties being an extensible mechanism,
we could probably expose gen-locked crtcs through the properties or
such and in the worst case make a new ioctl as Jesse suggests.
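
To make the test-only idea a bit more concrete, here is a rough sketch
in C.  Everything in it is hypothetical and made up for illustration:
struct flip_prop, struct flip_request, FLIP_TEST_ONLY, PROP_OVERLAY_BW,
check_flip() and the bandwidth budget are not existing DRM API.  It only
tries to show why a check that starts from the full device state can
reject a configuration that two independent per-crtc checks would each
accept.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one property change, tagged with the crtc it affects. */
struct flip_prop {
    uint32_t crtc_id;
    uint32_t prop_id;
    uint64_t value;
};

/* Hypothetical flag: validate the request but do not commit it. */
#define FLIP_TEST_ONLY 0x1

/* Hypothetical request: a list of property changes plus user data
 * that would be handed back in the flip event. */
struct flip_request {
    uint32_t flags;
    uint32_t count;
    const struct flip_prop *props;
    uint64_t user_data;
};

/* Hypothetical non-dynamic property: bandwidth units an overlay needs. */
enum { PROP_OVERLAY_BW = 1 };

#define MAX_CRTC 4
#define TOTAL_BW 100    /* made-up device-wide bandwidth budget */

/* Committed overlay bandwidth per crtc, i.e. the "current state". */
uint64_t committed_bw[MAX_CRTC];

/* Returns 0 if the request fits in the global budget, -1 otherwise.
 * The committed state is updated only when FLIP_TEST_ONLY is not set. */
int check_flip(const struct flip_request *req)
{
    uint64_t bw[MAX_CRTC];
    uint64_t total = 0;
    size_t i;

    /* Start from the committed state so the check sees every crtc. */
    for (i = 0; i < MAX_CRTC; i++)
        bw[i] = committed_bw[i];

    /* Apply the requested changes to a scratch copy. */
    for (i = 0; i < req->count; i++) {
        const struct flip_prop *p = &req->props[i];

        if (p->crtc_id >= MAX_CRTC || p->prop_id != PROP_OVERLAY_BW)
            return -1;
        if (p->value > TOTAL_BW)
            return -1;
        bw[p->crtc_id] = p->value;
    }

    /* The bandwidth decision is global: sum over all crtcs. */
    for (i = 0; i < MAX_CRTC; i++)
        total += bw[i];
    if (total > TOTAL_BW)
        return -1;

    if (!(req->flags & FLIP_TEST_ONLY))
        for (i = 0; i < MAX_CRTC; i++)
            committed_bw[i] = bw[i];

    return 0;
}
```

With this toy budget, a test-only request asking for 60 units on crtc 0
passes, and so does one asking for 60 units on crtc 1, since neither
sees the other.  A single request carrying both changes fails up front,
which is the point where the compositor would want to fall back to GPU
composition.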

Kristian