[PATCH 4/5] drm/amdgpu: Wait for end of last waited-for vblank before programming flip

Thu Jun 16 02:15:14 UTC 2016

On 15.06.2016 18:23, Daniel Vetter wrote:
> On Wed, Jun 15, 2016 at 05:03:41PM +0900, Michel Dänzer wrote:
>> On 14.06.2016 17:06, Daniel Vetter wrote:
>>> On Tue, Jun 14, 2016 at 04:25:28PM +0900, Michel Dänzer wrote:
>>>> On 14.06.2016 14:53, Daniel Vetter wrote:
>>>>>
>>>>> I didn't know that we pass vblank waits to X clients. Either way annoying,
>>>>> since it means you need to keep things working like this for amd drivers
>>>>> forever.
>>>>
>>>> Please don't try to single out our drivers, it seems like rather your
>>>> driver is the odd man out in this case.
>>>
>>> Hm, why/where is i915 special?
>>
>> By only triggering the vblank interrupt at the end of a vertical blank
>> period, breaking the original DRM_IOCTL_WAIT_VBLANK functionality.
> 
> Ok, turns out we never did that, but we did move the drm event for page
> flip completion around. So indeed vblank wait drm event should fire at
> start of vblank if possible. I still maintain that (by default) you
> shouldn't allow a page flip to overtake a drm event.

We don't allow that AFAIK. If you think we do, please describe the
scenario you're thinking of.

The only thing I can think of is that you may be concerned about cases
where userspace calls DRM_IOCTL_MODE_PAGE_FLIP several times in a row
without calling DRM_IOCTL_WAIT_VBLANK in between. If so, I can change
patch 2 to also track page flip completions in
file_priv->last_vblank_wait. That would ensure that two consecutive
flips can never complete in the same vertical blank period.

>>>>> How bad would it really be to just always delay the page_flip past vblank?
>>>>
>>>> [...]
>>
>> [...] always delaying flip programming past vblank can cause flips to be
>> unnecessarily delayed by one frame.
> 
> Yes, but right now you show them too early, breaking existing stuff.

No, we don't! Neither before this series nor after it, nor at any point
during it.

The DRI2 code in our DDX drivers actually checks for this and complains
if a flip completes earlier than expected. We did break this before and
received bug reports about it. Mario fixed it by changing the
MASTER_UPDATE_MODE register to 3, which restricts flips to take effect
only at the beginning of vertical blank. This series makes it safe to
change it back to 0 (allowing flips to take effect anywhere in vertical
blank) and then does so.

>>>>> If that's not good enough I'd say we should add a
>>>>> faster-than-vblank-but-still-synced page_flip flag. Then userspace could
>>>>> tell you exactly whether you should always wait (no flags), or never wait
>>>>> (with this new flag).
>>>>
>>>> That would be an inferior solution compared to my series, e.g.: If
>>>> userspace calls DRM_IOCTL_MODE_PAGE_FLIP before the target vblank seqno
>>>> is reached, it cannot use the new flag, otherwise the flip might take
>>>> effect too early. However, if we are then already in the target vblank
>>>> period when the fences have signalled and we are ready to program the
>>>> flip, we have to wait for the end of vblank first, and the flip will be
>>>> delayed by one frame.
>>>>
>>>> If we're going to change the userspace interface, it would be better to
>>>> re-purpose the reserved field of struct drm_mode_crtc_page_flip for
>>>> explicitly specifying the target vblank seqno (via a new cap and flag).
>>>> Then the kernel and userspace would no longer need to second-guess each
>>>> other.
>>>>
>>>> But even if we take that route, this series would be desirable for
>>>> getting us most of the way there for existing userspace.
>>>
>>> Well that's what I mean. You'r patches here shoehorn what you want into
>>> existing api, trying to second-guess userspaces intentions inferred from
>>> what it does. Ime that tends to end in trouble. It would be much better to
>>> have a clear uabi for this. And my flag was just a suggestion.
>>>
>>> [...] At least I think the generic kms rules are:
>>>
>>> - vblank events fire at lockstep [...].
>>> - pageflip immediately after a vblank wait needs to hit the next vblank.
>>> - pageflip in a loop needs to result in at most 1 flip per vblank, and if
>>>   you're too fast then the kernel should return -EBUSY. [...]
>>
>> My series doesn't break any of these rules (or any others I'm aware of).
>> It avoids unnecessarily delaying flips in some cases within the confines
>> of these rules.
>>
>>
>>> Imo everything else (in this case: make the flip complete on the same
>>> frame as the vblank, if you hit the vblank window) needs special flags,
>>> with clear meaning of what they do. The specific flip target sounds like a
>>> good idea, except that current userspace can't be fixed, so we need to
>>> make it work without any flag.
>>
>> Hey, that was my point above! So you agree that (something like) this
>> series will be needed anyway, even if new flags are added to make it
>> more explicit? :)
> 
> Yeah I think there's a bit of confusion going on here ;-) Of course I'm
> not against fixing this, and I agree that fixing it by delaying the vblank
> drm event (like I proposed at first) is not good. What I think would be
> best to fix this:
> 
> - For all current userspace (i.e. no flags or anything) force vrefresh to
>   to be fixed, and delay page flips which hit the vblank window to be
>   after that. This way amd drivers are consistent with every other kms
>   drivers, and work like current userspace seems to expect: Wait for
>   vblank, then assume any flips will only hit after the next vblank.

This series preserves this behaviour. A flip is only allowed to complete
during the current vertical blank period if userspace either:

* expected it to complete in this vertical blank (or an earlier one).
  In other words, if the flip doesn't complete in this vertical blank,
  it is delayed (further) compared to userspace expectations.

* hasn't called DRM_IOCTL_WAIT_VBLANK at all (so apparently it doesn't
  care about when the flip completes).

It sounds like you're saying we aren't allowed to fix cases where flips
are completing later than expected by userspace, because other drivers
haven't fixed those cases yet. Quite frankly, that sucks. Nothing other
than possible hardware restrictions prevents other drivers from fixing
this as well.

> - For clients which don't care about which frame to be displayed on, add a
>   flag to page_flip (and atomic_commit) that asks for asap, but still
>   without tearing.

I explained above why such a flag wouldn't help in all cases. When I get
a chance, I'll work on a series allowing the target vblank seqno to be
specified explicitly in DRM_IOCTL_MODE_PAGE_FLIP. That should cover all
cases.

> - For variable vrefresh I think we need yet another flag (crtc property,
>   or mode flag) so that userspace can tell the kernel when it's ok to have
>   vblank times with massive jitter. Again default should be current mode.
>   Userspace can then enable variable vrefresh when it only has clients
>   which don't care when exactly they show up. As soon as something shows
>   up which asks for precise timestamps and displaying of the flip (using
>   OML_sync_control or Present), userspace can disable the variable
>   vrefresh mode again to get back to a lockstep 60Hz (or whatever it is).

Colour me skeptical about that being a good approach for variable
refresh rate, but that's another discussion which doesn't directly
affect this series.

-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer