[RFC 0/9] nuclear pageflip

Thu Sep 13 07:29:25 PDT 2012

On Thu, Sep 13, 2012 at 08:39:54AM -0500, Rob Clark wrote:
> On Thu, Sep 13, 2012 at 3:40 AM, Ville Syrjälä
> <ville.syrjala at linux.intel.com> wrote:
> > On Wed, Sep 12, 2012 at 02:40:56PM -0500, Rob Clark wrote:
> >> On Wed, Sep 12, 2012 at 1:58 PM, Ville Syrjälä
> >> <ville.syrjala at linux.intel.com> wrote:
> >> > On Wed, Sep 12, 2012 at 01:00:19PM -0500, Clark, Rob wrote:
> >> >> On Wed, Sep 12, 2012 at 12:27 PM, Ville Syrjälä
> >> >> <ville.syrjala at linux.intel.com> wrote:
> >> >> > On Wed, Sep 12, 2012 at 10:48:16AM -0500, Rob Clark wrote:
> >> >> >> But I think we could still do this w/ one ioctl per crtc for atomic-pageflip.
> >> >> >
> >> >> > We could, if we want to sacrifice the synced multi display case. I just
> >> >> > think it might be a real use case at some point. IVI feels like the most
> >> >> > likely short term candidate to me, but perhaps someone would finally
> >> >> > introduce some new style phone/tablet thingy. I have seen
> >> >> > concepts/prototypes of such multi display gadgets in the past, but the
> >> >> > industry apparently got a bit stuck on the rectangular slab with
> >> >> > touchscreen on one side design.
> >> >>
> >> >> I could be wrong, but I think with IVI the screens would normally be too
> >> >> far apart to matter?
> >> >
> >> > I was thinking of something like a display on the dash that normally
> >> > sits low with only a small sliver visible, and extends upwards when
> >> > you fire up a movie player for example. Internally it could be made
> >> > up of two displays for power savings purposes.
> >> >
> >> >> Anyways, it is really only a problem if you can't do two ioctl()s
> >> >> within one vblank period. If it actually turns out to be a real
> >> >> problem,
> >> >
> >> > Well exactly that's the problem this whole atomic pageflip stuff is
> >> > trying to tackle, no? ;)
> >> >
> >> >> we could always add later an ioctl that takes an array of
> >> >> 'struct drm_mode_crtc_atomic_page_flip's.  I'm not sure if this is
> >> >> really useful or not.. but maybe I'm thinking too much about how
> >> >> weston does its rendering of different outputs independently.
> >> >
> >> > I'm just now thinking of surfaceflinger's way of doing things, with
> >> > its prepare and commit phases. If you need to issue two ioctls to handle
> >> > cloned displays, then you can end up in a somewhat funky situation.
> >> >
> >> > Let's say you have a video overlay in use (one on each display naturally),
> >> > and you increase the downscaling factor enough so that you now have
> >> > enough memory bandwidth to support only one overlay. With independent
> >> > check ioctls for each display, you never have the full device state
> >> > available in the kernel, so each check succeeds during the prepare
> >> > phase. So you decide that you can keep using the video overlays.
> >> >
> >> > You then proceed to commit the state, but after the first display has
> >> > been committed you get an error when trying to commit the second one.
> >> > What can you do now? The only option is to keep displaying the old
> >> > frame on the other displays for some time longer, and then on the
> >> > next frame you can switch to GPU composition. But on the next frame you
> >> > perhaps no longer need to use GPU composition, but since you can't
> >> > verify that in the prepare phase, you have no other option but to use
> >> > GPU composition.
> >> >
> >> > So when you run into a configuration that can be supported only
> >> > partially, you get animation stalls on some displays due to skipped
> >> > frames, and you always have to fall back to GPU composition for the
> >> > next frame.
> >> >
> >> > If on the other hand you would check the whole state in one ioctl,
> >> > you'd get the error in the first prepare phase, and could fall back
> >> > to GPU composition immediately.
> >> >
> >> > Am I too much of a perfectionist in considering such things? I don't
> >> > think so, but perhaps other people disagree.
> >>
> >> Ok, if you have a case where the state of the two crtc's are not
> >> actually independent, then I think you have a valid point.
> >>
> >> I'm not quite sure what userspace would do about it, though.. for the
> >> general case where vsync isn't locked, and you can't even necessarily
> >> assume vsync period is the same, then I don't really think you want to
> >> couple rendering to the displays.
> >
> > I would say this is going to be the most common use case if you consider
> > just the number of shipping devices. It's pretty much what every Android
> > phone/tablet with a HDMI port has to do.
> 
> bleh, surfaceflinger kinda sucks then..

Why? This use case is not enforced by surfaceflinger, it's just the use
case most devices would have.

I don't think there's anything wrong with the way surfaceflinger is designed
with the prepare and commit phases. How else would you do it?
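
To make the earlier bandwidth example a bit more concrete, here is a rough
sketch of why the prepare phase wants the whole device state in one check.
Everything in it is made up for illustration (the struct, the function names
and the bandwidth units); it is not taken from any driver or from the patches.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-display request: bandwidth its overlay would need. */
struct display_req {
	unsigned int overlay_bw;	/* arbitrary units, 0 = no overlay */
};

/* Per-display check: sees one display only, so it cannot account for
 * what the other displays are asking for. */
static bool check_one(const struct display_req *req, unsigned int total_bw)
{
	return req->overlay_bw <= total_bw;
}

/* Whole-state check: sums the demand of every display in the request. */
static bool check_all(const struct display_req *reqs, size_t count,
		      unsigned int total_bw)
{
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < count; i++)
		sum += reqs[i].overlay_bw;

	return sum <= total_bw;
}
```

With two overlays that each fit on their own but not together, check_one()
passes for both displays and the failure only shows up at commit time, whereas
check_all() fails already in prepare, so the fallback to GPU composition can
happen for the same frame.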

> >> From userspace API, I guess something like:
> >>
> >> struct drm_mode_crtc_atomic_page_flip {
> >>       uint32_t flags;
> >>       uint32_t count_crtcs;
> >>       uint64_t crtc_ids_ptr;  /* array of uint32_t */
> >>       uint64_t count_props_ptr; /* array of uint32_t, # of prop's per crtc */
> >>       uint64_t props_ptr;  /* ptr to array of drm_mode_obj_set_property */
> >>       uint64_t user_data;
> >> };
> >
> > Starting to look much like my drm_mode_atomic struct :)
> >
> > Let's compare:
> >
> > struct drm_mode_atomic {
> >         __u32 flags;
> >         __u32 count_objs;
> >         __u64 objs_ptr;
> >         __u64 count_props_ptr;
> >         __u64 props_ptr;
> >         __u64 prop_values_ptr;
> >         __u64 blob_values_ptr;
> > };
> 
> well, you do miss userdata, I think

Sure, because I didn't add the event stuff yet.

> > One difference seems to be that you have a mix of SOA and AOS concepts
> > in yours, whereas mine is pure SOA.
> 
> well, I was trying to re-use drm_mode_obj_set_property because that
> seemed to keep the code simple, and makes it more similar to existing
> set-property stuff.

I sort of modelled mine after the way things are done by the various
getter ioctls. I don't like messing up my brain by mixing SOA
and AOS in the same place.
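
As a sketch of what I mean by pure SOA, assume the arrays line up the way
they do for the getter ioctls: one entry per object in objs and count_props,
and flat props/values arrays that are consumed in object order. The helper
below is hypothetical and works on plain arrays rather than user pointers,
just to show the indexing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Look up the value given for (obj, prop) in an SOA-style request.
 * Assumed layout: objs[i] owns the next count_props[i] entries of the
 * flat props[] and values[] arrays. */
static bool soa_lookup(const uint32_t *objs, const uint32_t *count_props,
		       size_t count_objs, const uint32_t *props,
		       const uint64_t *values, uint32_t obj, uint32_t prop,
		       uint64_t *out)
{
	size_t i, j, k = 0;	/* k indexes the flat props/values arrays */

	for (i = 0; i < count_objs; i++) {
		for (j = 0; j < count_props[i]; j++, k++) {
			if (objs[i] == obj && props[k] == prop) {
				*out = values[k];
				return true;
			}
		}
	}
	return false;
}
```

Each object ID appears exactly once, and there is only one running index to
keep track of, which is most of the reason I prefer this layout.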

> OTOH, maybe that doesn't really matter because
> userspace would normally be going through libdrm and not seeing the
> ioctl structs directly, so we could make the ioctl structs as ugly as
> we want.
> 
> I'm a bit on the fence about how the pageflip ioctl should look, in
> particular about pageflip on multiple CRTCs.  I'd like to know the
> involved CRTC(s) upfront, as that seems to make error checking easier
> on the driver (in case of already pending flip).  Although I'll think
> a bit about alternatives.

I don't see much point in iterating that information directly in begin().
You anyway get the same information during the set() operations. I
suppose you could save some effort by avoiding some state allocation in
begin(). But really, this seems like optimizing for an uncommon error
case. You should not encounter it in practice unless your user space is
doing something silly.

> Also, if you pageflip on multiple CRTC's, should there be multiple
> vblank events, and multiple userdata's?

That's a bit of an open question. I was considering several options:

1) One event after the full operation is complete
 Not a good idea when you want to throttle based on a fast refreshing
 display but the system has also slow refreshing displays.

2) One event per pipe
 Seems reasonable. We could use a bitmask so that the user can ask for
 the event to be delivered only for specific pipes. I'm not sure whether
 multiple user_datas would be better, or just the one with user space
 taking care of refcounting it in case it's a pointer to some object.

3) One event per scanout engine
 Not sure this makes sense since the idea is anyway to do things
 atomically within the pipe.
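
For option 2, a minimal sketch of the bitmask idea could look like the
following. The mask semantics (bit n set means an event is requested for
pipe n) and both helper names are assumptions for the sake of the example;
nothing like this exists in the patches yet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: bit n set means "deliver a completion event for pipe n". */
static bool want_event(uint32_t event_mask, unsigned int pipe)
{
	return pipe < 32 && (event_mask & (UINT32_C(1) << pipe));
}

/* Number of events user space should expect for one operation, given
 * which pipes the operation actually touched. */
static unsigned int expected_events(uint32_t event_mask, uint32_t touched_pipes)
{
	uint32_t m = event_mask & touched_pipes;
	unsigned int n = 0;

	while (m) {
		m &= m - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```

If a single user_data is shared by all the events, the count from
expected_events() is what user space would use as the initial reference
count for the object it points to.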

> Really the atomic-pageflip ioctl is pretty much a small part of the
> entire patchset, so I'm not really too much against changing it.  Most
> of the point of the patchset was to try to enhance properties to do
> what we need, and come up w/ a sane way to decouple state from the
> various kms objects to easily handle atomic commit or rollback.

I agree 100% with this goal. Once the state would be split cleanly, we
could do stuff like keep the user and fbcon states totally separate.
Restoring the fbcon state would then be trivial. Currently the fbcon
state as such doesn't really exist. IIRC you had some patches to reset
various bits of state at certain points just to work around this issue.

> > Also your struct has a bunch of redundant data since
> > drm_mode_obj_set_property keeps repeating the object ID needlessly, and
> > crtc_ids_ptr is also just repeating information already present in
> > drm_mode_obj_set_property. drm_mode_obj_set_property further forces you
> > to pass in the object type, which is pretty much useless.
> 
> well, not completely, the object-id/type could also be a plane
> attached to a crtc..

Sure, but even that gets repeated needlessly for every prop.

> > Your version can't pass in blobs, which is going to be a problem
> > when you want to push in gamma tables/ramps and whatnot. Also I've been
> > considering that for atomic modesetting I'd pass display modes as blobs
> > too, since the display mode object IDs are currently hidden from user
> > space, and also the IDs are a bit volatile since the modes can be
> > shuffled around during EDID probing. I suppose one could try to fix
> > these display mode ID issues, and when you need to push in gamma tables
> > or other heavy weight data, you invent new types of drm objects to
> > describe them, and you pass them by ID as well.
> 
> hmm, ok, I was wondering why you were supporting blobs, when existing
> setprop APIs were not.  Well, maybe it wouldn't be a bad idea to
> start by making existing setprop stuff support blobs.

I wouldn't care about the existing APIs. Why would anyone use them when
the atomic API provides the same capability and more?

-- 
Ville Syrjälä
Intel OTC