[PATCH 3/4] drm: introduce DRM_CAP_ATOMIC_ASYNC_PAGE_FLIP

Tue Aug 30 10:24:50 UTC 2022

On Tue, Aug 30, 2022 at 11:40:10AM +0300, Pekka Paalanen wrote:
> On Tue, 30 Aug 2022 11:08:22 +0300
> Ville Syrjälä <ville.syrjala at linux.intel.com> wrote:
> 
> > On Mon, Aug 29, 2022 at 04:01:44PM +0000, Simon Ser wrote:
> > > On Friday, August 26th, 2022 at 10:19, Ville Syrjälä <ville.syrjala at linux.intel.com> wrote:
> > >   
> > > > On Wed, Aug 24, 2022 at 03:08:55PM +0000, Simon Ser wrote:  
> > > > > This new kernel capability indicates whether async page-flips are
> > > > > supported via the atomic uAPI. DRM clients can use it to check
> > > > > for support before feeding DRM_MODE_PAGE_FLIP_ASYNC to the kernel.  
> > > > 
> > > > I think we'd need to clarify the semantics of the async flag
> > > > for atomic commits.
> > > > 
> > > > Eg. on Intel hw only pure page flips are possible async, if you do
> > > > anything else (change plane size/pos/scaling/etc.) you will need
> > > > to do a sync update. Technically not even all page flips (from the
> > > > uapi POV) might be possible as the exact scanout source address
> > > > is specified via two registers, only one of which can be update
> > > > async. So technically the two framebuffers might be laid out
> > > > just slightly differently which could prevent an async flip.
> > > > Also only some subset of planes actually support async flips.  
> > > 
> > > Also IIRC some format modifiers don't support async flip at all (CCS)?  
> > 
> > Yeah, that too. Also planar YUV formats aren't allowed.
> > 
> > >   
> > > > And on hw where multiple planes support it on the same crtc, only one
> > > > plane can do it at a time. Well, more accurately we can only select
> > > > one plane at a time to give us the "flip done" interrupt. I guess
> > > > if the user wants to async flip multiple planes at the same time
> > > > we could do them serially as opposed to in parallel to make sure
> > > > all the flips actually happened before we signal completion of the
> > > > entire commit. Async flips of multiple planes probably won't
> > > > happen atomically anyway so doing them serially seems fine.
> > > > 
> > > > ATM in i915 we probably don't have sufficient state checks in
> > > > place to catch all the restrictions, and instead in part we rely
> > > > on the limited scope of the legacy async flip ioctl to make sure
> > > > the operation doesn't attempt something the hw can't do.  
> > > 
> > > Yeah, that makes sense.
> > > 
> > > In the documentation patch discussion [1], it appears it's not clear what
> > > drivers should do when async flip isn't possible with the legacy uAPI.
> > > 
> > > For the atomic uAPI, we need to pick on of these two approaches:
> > > 
> > > 1. Let the kernel fall back to a sync flip if async isn't possible. This
> > >    simplifies user-space, but then user-space has no reliable way to figure out
> > >    what really happened (sync or async?). That could be fixed with a new
> > >    read-only CRTC prop indicating whether the last flip was async or not.
> > >    However, maybe someone will come up in the future with user-space which
> > >    needs to only apply an update if async flip is possible, in which case this
> > >    approach falls short.
> 
> There is the pageflip completion timestamp in the DRM event. If
> userspace knows the phase and period of the scanout cycle, the
> completion timestamp should tell quite reliably if the update was
> tearing.
> 
> For the phase, one can query KMS for the last vblank timestamp. This
> should work also for VRR I assume.
> 
> For the period, fixed-frequency video mode has it straight. VRR gives
> only a range or a minimum period.
> 
> > > 2. Make the kernel return EINVAL if async flip isn't possible. This adds more
> > >    complexity to user-space, but enables a more reliable and deterministic
> > >    uAPI. This is also more consistent with the rest of the existing atomic
> > >    uAPI.  
> > 
> > The current behaviour is somewhat a combination of the two. We return
> > an error if async flip is not possible at all given the current state.
> > 
> > When async flip is possible we return success, but may still internally
> > fall back to a sync flip for the first flip. That is required on some
> > borked hardware that can't switch from sync flips to async flips without
> > doing an extra sync flip. Also on some other hardware we intentionally
> > fall back to a sync flip for the first async flip, so that we can
> > reprogram some display FIFO stuff (aimed to make the subsequent async
> > flips faster).
> 
> Oh, so userspace should expect to run async for long periods of time,
> and not use async this frame, sync next, then async again depending on
> per-frame timings.
> 
> That seems important to note.

Yeah, our hw can't really do the "tear if late, otherwise do a sync flip"
behaviour. GLX_swap_control_tear I think was the glx extension for that.

> 
> It's almost like the async flag should be a KMS property instead of a
> commit ioctl flag.

I guess that would be one way to implement it.

> 
> > > Note, current user-space would only need to opportunistically enable async
> > > flip. IOW, I think that for current user-space plans "async if possible,
> > > otherwise sync" is good enough. That behavior maps well to the Vulkan present
> > > modes as well (which says that "this mode *may* result in visible tearing", but
> > > doesn't guarantee it).  
> > 
> > The current behaviour is to fall back to a blit if the async
> > flip fails. So you still get the same effective behaviour, just
> > not as efficient. I think that's a reasonable way to handle it.
> 
> That's purely an Xorg thing, right?

Yes.

> 
> Should Wayland compositors implement the same thing is a good question.

Is the whole tear+wayland situation more or less up in the air still?

> 
> > > Another possible shortcoming of the proposed new uAPI here is that user-space
> > > cannot submit a single atomic commit which updates multiple CRTCs, and
> > > individually select which CRTC does an async flip. This could be fixed with
> 
> I would think that you can just do per-CRTC atomic commits in that
> case. You would do per-CRTC atomic commits anyway when the vblanks do
> not coincide. I expect tearing updates to have unpredictable latency,
> especially if they can silently fall back to sync flips, so doing
> multi-CRTC async flips is not useful.
> 
> > > a "ASYNC_FLIP" CRTC prop which the kernel always resets to 0 on commit. I'm not
> > > sure we want/need to cross that bridge right now, it would be easy enough to
> > > add as a second step if some user-space would benefit from it.  
> > 
> > Technically it should really be per-plane since that is what does
> > the flip. But I have a feeling that allowing a mix of async and
> > sync in the same commit is just going to make everything more
> > complicated without really helping anything (async flips won't
> > happen atomically anyway with anything else).
> > 
> > One (crazy?) idea I had for the atomic api is that we could even
> > reject most of the properties already on the uapi level before anyone
> > gets to examine the final state. Ie. as soon as the atomic ioctl sees
> > eg. a gamma LUT property change it would just immediately return
> > an error if async flip is also requested.
> 
> I agree with these two paragraphs.
> 
> What about limiting async flag to atomic commits that update only a
> single KMS plane (regardless of how many planes are active)? Maybe that
> would be a good first step?

Sure, that might help a bit. But I'm rather more worried about
all the state that you can now include in the atomic commit but
aren't actually allowed to change. Eg. probably all crtc/connector
properties, and most plane properties. The legacy ioctl blocks most
of that implicitly, but the atomic ioctl does not. Hence my idea
about rejecting basically everything except the plane fb property
changes already at the ioctl level. Would avoid that stuff
leaking in by accident if the driver forgets to check absolutely
everything.

-- 
Ville Syrjälä
Intel