[PATCH 4/5] drm/amdgpu: Wait for end of last waited-for vblank before programming flip

Wed Jun 15 09:23:09 UTC 2016

On Wed, Jun 15, 2016 at 05:03:41PM +0900, Michel Dänzer wrote:
> On 14.06.2016 17:06, Daniel Vetter wrote:
> > On Tue, Jun 14, 2016 at 04:25:28PM +0900, Michel Dänzer wrote:
> >> On 14.06.2016 14:53, Daniel Vetter wrote:
> >>> On Tue, Jun 14, 2016 at 11:09:10AM +0900, Michel Dänzer wrote:
> >>>> On 06/13/16 23:06, Daniel Vetter wrote:
> >>>>> On Mon, Jun 13, 2016 at 05:58:29PM +0900, Michel Dänzer wrote:
> >>>>>> On 06/13/16 17:06, Daniel Vetter wrote:
> >>>>>>>
> >>>>>>> Afaiui the rules are:
> >>>>>>> - The timestamp for vblank event needs to agree with whatever oml_sync
> >>>>>>>   requries.
> >>>>>>> - The event delivery itself needs to be consistent with what page_flip
> >>>>>>>   takes, i.e. if userspace sees an event and immediately issues a
> >>>>>>>   page_flip then it should not be able to hit the same vblank with that
> >>>>>>>   pageflip.
> >>>>>>> [...]
> >>>>>>
> >>>>>>> I assume you're goal is to not delay page_flips unecessarily, without
> >>>>>>> breaking requirement 2 here. Imo a simpler fix would be to delay the
> >>>>>>> vblank handling to end of vblank. Fixes everything without hacks, [...]
> >>>>>>
> >>>>>> Except it breaks the original purpose of the wait for vblank
> >>>>>> functionality, which is to wait for the beginning of a vertical blank
> >>>>>> period. [0] You're focusing too much on page flips and suggesting to
> >>>>>> throw out the vblank baby with the bathwater. I really don't see the big
> >>>>>> issue which would justify that.
> >>>>>>
> >>>>>>
> >>>>>> [0] As an analogy, how useful would e.g. calendar notifications be if
> >>>>>> they arrived at the end of the events they're about? "Hey, that meeting
> >>>>>> you were supposed to attend? It just finished!"
> >>>>>
> >>>>> Ok, what exactly is the use-case for waiting for vblanks _without_
> >>>>> scheduling a flip afterwards? At least in drm the rule is that ABI is what
> >>>>> userspace observes and actually cares about.
> >>>>
> >>>> E.g.: In cases where page flipping cannot be used, Xorg / the DDX driver
> >>>> waits for the target vertical blank period before emitting the drawing
> >>>> commands for a buffer swap operation. If the vblank notification only
> >>>> arrives when the vertical blank period is already over, this is very
> >>>> likely to result in tearing.
> >>>>
> >>>> Some X compositors and AFAIK even applications such as media players can
> >>>> use DRM_IOCTL_WAIT_VBLANK similarly. Obviously it's not intended to be
> >>>> used directly like that, but nonetheless it is.
> >>>
> >>> Is there really anything using it like that outside of -ati?
> >>
> >> Yes. With DRI3/Present, it's driver independent code in
> >> xserver/present/. With DRI2, it's theoretically up to the DDX driver,
> >> but all drivers seem to have basically the same logic; the modesetting
> >> driver certainly does.
> > 
> > Hm, didn't know that everyone does that. Seemed to silly an idea to waste
> > all that gpu bandwidth by waiting for vblank ...
> 
> Actually it's kind of the other way around: One of the reasons for using
> sync-to-vblank is to save power for rendering stuff which cannot be seen
> anyway.
> 
> Also note that only the specific client using DRI2 or Present for
> presentation is blocked, nothing else.
> 
> 
> >>> I didn't know that we pass vblank waits to X clients. Either way annoying,
> >>> since it means you need to keep things working like this for amd drivers
> >>> forever.
> >>
> >> Please don't try to single out our drivers, it seems like rather your
> >> driver is the odd man out in this case.
> > 
> > Hm, why/where is i915 special?
> 
> By only triggering the vblank interrupt at the end of a vertical blank
> period, breaking the original DRM_IOCTL_WAIT_VBLANK functionality.

Ok, turns out we never did that, but we did move the drm event for page
flip completion around. So indeed vblank wait drm event should fire at
start of vblank if possible. I still maintain that (by default) you
shouldn't allow a page flip to overtake a drm event.

> >>> How bad would it really be to just always delay the page_flip past vblank?
> >>
> >> In the variable refresh rate case it could be pretty bad, basically
> >> dropping to the minimum refresh rate.
> > 
> > Variable refresh rate is entirely undefined right now in kms.
> 
> I don't think that really matters for the issue I'm thinking of.
> 
> My understanding is that variable refresh rate basically works by
> transmitting the frame contents using the maximum refresh rate timing,
> and then dynamically extending the vertical blank period until either
> the next flip arrives, or the time since the last frame was transmitted
> corresponds to the minimum refresh rate. So if we always wait for the
> end of vertical blank before programming a flip, we'll probably end up
> degrading to the minimum refresh rate whenever we're already in vertical
> blank by the time we're ready to program the flip. Which will happen
> with any app which cannot sustain a framerate > the maximum refresh
> rate. So we'll end up running at either the minimum or maximum refresh
> rate (or possibly even worse, flip-flopping between the two), defeating
> the purpose of variable refresh rate.
> 
> 
> Also, as I explained before, even without variable refresh rate, always
> delaying flip programming past vblank can cause flips to be
> unnecessarily delayed by one frame.

Yes, but right now you show them too early, breaking existing stuff. My
stance is that you need an explicit flag to tell the kernel which one you
want (userspace tells you already by either asking for a specific frame,
or for "as soon as possible"). I don't like second-guessing userspace.
And yes that means that for variable vrefresh you probably need to run it
at 60Hz by default, except when userspace set a special mode flag which
allows variable vrefresh, telling the kernel that it knows things can get
slow if it doesn't ask for asap.

> >>> If that's not good enough I'd say we should add a
> >>> faster-than-vblank-but-still-synced page_flip flag. Then userspace could
> >>> tell you exactly whether you should always wait (no flags), or never wait
> >>> (with this new flag).
> >>
> >> That would be an inferior solution compared to my series, e.g.: If
> >> userspace calls DRM_IOCTL_MODE_PAGE_FLIP before the target vblank seqno
> >> is reached, it cannot use the new flag, otherwise the flip might take
> >> effect too early. However, if we are then already in the target vblank
> >> period when the fences have signalled and we are ready to program the
> >> flip, we have to wait for the end of vblank first, and the flip will be
> >> delayed by one frame.
> >>
> >> If we're going to change the userspace interface, it would be better to
> >> re-purpose the reserved field of struct drm_mode_crtc_page_flip for
> >> explicitly specifying the target vblank seqno (via a new cap and flag).
> >> Then the kernel and userspace would no longer need to second-guess each
> >> other.
> >>
> >> But even if we take that route, this series would be desirable for
> >> getting us most of the way there for existing userspace.
> > 
> > Well that's what I mean. You'r patches here shoehorn what you want into
> > existing api, trying to second-guess userspaces intentions inferred from
> > what it does. Ime that tends to end in trouble. It would be much better to
> > have a clear uabi for this. And my flag was just a suggestion.
> > 
> > I just had a discussion on irc with Dave about another topic were Dave
> > complained about some of the inferred abi rules SNA uses (entirely
> > differently, in probe code). And I thought it was a nice idea, since hey
> > it works and its easier. But really it's not, since in the end kms is
> > supposed to be somewhat generic. And usually your nice idea for inferring
> > behaviour then tends to break down somewhere. At least I think the generic
> > kms rules are:
> > 
> > - vblank events fire at lockstep [...].
> > - pageflip immediately after a vblank wait needs to hit the next vblank.
> > - pageflip in a loop needs to result in at most 1 flip per vblank, and if
> >   you're too fast then the kernel should return -EBUSY. [...]
> 
> My series doesn't break any of these rules (or any others I'm aware of).
> It avoids unnecessarily delaying flips in some cases within the confines
> of these rules.
> 
> 
> > Imo everything else (in this case: make the flip complete on the same
> > frame as the vblank, if you hit the vblank window) needs special flags,
> > with clear meaning of what they do. The specific flip target sounds like a
> > good idea, except that current userspace can't be fixed, so we need to
> > make it work without any flag.
> 
> Hey, that was my point above! So you agree that (something like) this
> series will be needed anyway, even if new flags are added to make it
> more explicit? :)

Yeah I think there's a bit of confusion going on here ;-) Of course I'm
not against fixing this, and I agree that fixing it by delaying the vblank
drm event (like I proposed at first) is not good. What I think would be
best to fix this:

- For all current userspace (i.e. no flags or anything) force vrefresh to
  to be fixed, and delay page flips which hit the vblank window to be
  after that. This way amd drivers are consistent with every other kms
  drivers, and work like current userspace seems to expect: Wait for
  vblank, then assume any flips will only hit after the next vblank.

- For clients which don't care about which frame to be displayed on, add a
  flag to page_flip (and atomic_commit) that asks for asap, but still
  without tearing. With atomic we even want to be able to overwrite older
  flips, to reduce latency as much as possible for clients who can render
  more frames than vrefresh.

- For variable vrefresh I think we need yet another flag (crtc property,
  or mode flag) so that userspace can tell the kernel when it's ok to have
  vblank times with massive jitter. Again default should be current mode.
  Userspace can then enable variable vrefresh when it only has clients
  which don't care when exactly they show up. As soon as something shows
  up which asks for precise timestamps and displaying of the flip (using
  OML_sync_control or Present), userspace can disable the variable
  vrefresh mode again to get back to a lockstep 60Hz (or whatever it is).

- Anyone using WAIT_VBLANK behind the compositors back might be in even
  more trouble, but imo we shouldn't give too much concern to that case.
  Instead apply more pressure for everyone to switch over to Present or
  OML_sync_control (and we need that for EGL, too).

That way there's no second-guessing of userspace intentions by watching
vblank waits, it should be possible to extend for variable refresh and
benchmark/low-latency mode, and we don't need special driver hacks in core
code either. Thoughts on whether this plan would fit you too?

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch