[PATCH v4 3/4] drm: Document variable refresh properties

Wed Oct 31 14:41:01 UTC 2018

On 10/31/18 10:12 AM, Michel Dänzer wrote:
> On 2018-10-31 2:38 p.m., Kazlauskas, Nicholas wrote:
>> On 10/30/18 11:34 AM, Kazlauskas, Nicholas wrote:
>>> On 10/30/18 5:29 AM, Michel Dänzer wrote:
>>>> On 2018-10-29 7:03 p.m., Ville Syrjälä wrote:
>>>>> On Mon, Oct 29, 2018 at 05:37:49PM +0100, Michel Dänzer wrote:
>>>>>> On 2018-10-26 7:59 p.m., Ville Syrjälä wrote:
>>>>>>> On Fri, Oct 26, 2018 at 05:34:15PM +0000, Kazlauskas, Nicholas wrote:
>>>>>>>> On 10/26/18 10:53 AM, Ville Syrjälä wrote:
>>>>>>>>>
>>>>>>>>> Speaking of timestamps. What is the expected behaviour of vblank
>>>>>>>>> timestamps when vrr is enabled?
>>>>>>>>
>>>>>>>> When vrr is enabled the duration of the vertical front porch will be
>>>>>>>> extended until flip or timeout occurs. The vblank timestamp will vary
>>>>>>>> based on duration of the vertical front porch. The min/max duration for
>>>>>>>> the front porch can be specified by the driver via the min/max range.
>>>>>>>>
>>>>>>>> No changes to vblank timestamping handling should be necessary to
>>>>>>>> accommodate variable refresh rate.
>>>>>>>
>>>>>>> The problem is that the timestamp is supposed to correspond to the first
>>>>>>> active pixel. And since we don't know how long the front porch will be
>>>>>>> we can't realistically report the true value. So I guess just assuming
>>>>>>> min front porch length is as good as anything else?
>>>>>>
>>>>>> That (and documenting that the timestamp corresponds to the earliest
>>>>>> possible first active pixel, not necessarily the actual one, with VRR)
>>>>>> might be good enough for the actual vblank event timestamps.
>>>>>>
>>>>>>
>>>>>> However, I'm not so sure about the timestamps of page flip completion
>>>>>> events. Those could be very misleading if the flip completes towards the
>>>>>> timeout, which could result in bad behaviour of applications which use
>>>>>> them for animation timing.
>>>>>>
>>>>>> Maybe the timestamp could be updated appropriately (yes, I'm hand-waving
>>>>>> :) in drm_crtc_send_vblank_event?
>>>>>
>>>>> Hmm. Updated how? Whether it's a page flip event or vblank even we won't
>>>>> know when the first active pixel will come. Although I suppose if
>>>>> there is some kind of vrr slew rate limit we could at least account
>>>>> for that to report a more correct "this is the earliest you migth be
>>>>> able to see your frame" timestamp.
>>>>>
>>>>> Oh, or are you actually saying that shceduling a new flip before the
>>>>> timeout is actually going to latch that flip immediately? I figured
>>>>> that the flip would get latched on the next start of vblank regardless,
>>>>> and the act of scheduling a flip will just kick the hardware to start
>>>>> scanning the previously latched frame earlier.
>>>>>
>>>>> scanout A | ... vblank | scanout A | ... vblank | scanout B | ... vblank
>>>>>                      ^                ^        ^               ^
>>>>>                      |                |        flip C          latch C
>>>>>                      flip B           latch B
>>>>
>>>> This would kind of defeat the point of VRR, wouldn't it? If a flip was
>>>> scheduled after the start of vblank, the vblank would always time out,
>>>> resulting in the minimum refresh rate.
>>>>
>>>>
>>>>> scanout A | ... vblank | scanout B | ... vblank | scanout C | ... vblank
>>>>>                      ^ ^                      ^ ^
>>>>>                      | latch B                | latch C
>>>>>                      flip B                   flip C
>>>>
>>>> So this is what happens.
>>>>
>>>> Regardless, when the flip is latched, AMD hardware generates a "pflip"
>>>> interrupt, and its handler calls drm_crtc_send_vblank_event (and in the
>>>> case of DC drm_crtc_accurate_vblank_count before that). So the time when
>>>> drm_crtc_send_vblank_event is called might be a good approximation of
>>>> when scanout of the next frame starts.
>>>>
>>>> Another possibility might be to wait for the hardware vline counter to
>>>> wrap around to 0 before calling drm_crtc_accurate_vblank_count, then the
>>>> calculations should be based on 0 instead of crtc_vtotal.
>>>
>>>
>>> I understand the issue you're describing now. The timestamp is supposed
>>> to signify the end of the current vblank. The call to get scanout
>>> position is supposed to return the number of lines until scanout (a
>>> negative value) or the number of lines since the next scanout began (a
>>> positive value).
>>>
>>> The amdgpu driver calculates the number of lines based on a hardware
>>> register status position which returns an increasing value from 0 that
>>> indicates the current vpos/hpos for the display.
>>>
>>> For any vpos below vbl_start we know the value is correct since the next
>>> vblank hasn't begun yet. But for anythign about vbl_start it's probably
>>> wrong since it applies a corrective offset based on the fixed value of
>>> crtc_vtotal. It's even worse when the value is above crtc_vtotal since
>>> it'll be calculating the number of lines since the last scanout since
>>> it'll be a positive value.
>>>
>>> So the issue becomes determining when the vfront porch will end.
>>>
>>> When the flip address gets written the vfront porch will end at the
>>> start of the next line leaving only the back porch plus part of the
>>> line. But we don't know when the flip will occur, if at all. It hasn't
>>> occurred yet in this case.
>>>
>>> Waiting for the wrap around to 0 might be the best choice here since
>>> there's no guarantee the flip will occur.
>>
>> I put some more thought into this and I don't think this is as bad as I
>> had originally thought.
>>
>> I think the vblank timestamp is supposed to be for the first active
>> pixel of the next scanout. The usage of which is for clients to time
>> their content/animation/etc.
>>
>> The client likely doesn't care when they issue their flip, just that
>> their content is matched for that vblank timestamp. The fixed refresh
>> model works really well for that kind of application.
>>
>> Utilizing variable refresh rate would be a mistake in that case since
>> the client would then have to time based on when they flip which is a
>> lot harder to "predict" precisely.
> 
> It's only a "mistake" as long as the timestamps are misleading. :) As
> discussed before, accurate presentation timestamps are one requirement
> for achieving perfectly smooth animation.

For most 3D games the game world/rendering can advance by an arbitrary 
timestep - and faster will be smoother, which is what the native mode 
would give you.

For fixed interval content/animation where you can't do that variable 
refresh rate could vastly improve the output. But like discussed before 
that would need a way to specify the interval/presentation timestamp to 
control that front porch duration. The timestamp will be misleading in 
any other case.

> 
> 
>> I did some more investigation into when amdgpu gets the scanout position
>> and what values we get back out of it. The timestamp is updated shortly
>> after the crtc irq vblank which is typically within a few lines of
>> vbl_start. This makes sense, since we can provide the prediction
>> timestamp early. Waiting for the vblank to wrap around to 0 doesn't
>> really make sense here since that would mean we already hit timeout or
>> the flip occurred
> 
> Sounds like you're mixing up the two cases of "actual" vblank events
> (triggered by the "vblank" interrupt => drm_(crtc_)handle_vblank) and
> flip completion events (triggered by the PFLIP interrupt with our
> hardware => drm_crtc_send_vblank_event).
> 
> Actual vblank events need to be delivered to userspace at the start of
> vblank, so we indeed can't wait until the timestamp is accurate for
> them. We just need to document the exact semantics of their timestamp
> with VRR.
> 
> For page flip completion events though, the timestamp needs to be
> accurate and correspond to when the flipped frame starts being scanned
> out, otherwise we'll surely break at least some userspace relying on
> this information.
> 
> 
Yeah, I was. I guess what's sent is the estimated vblank timestamp 
calculated at the start of the interrupt. And since that's just a guess 
rather than what's actually going to happen it's going to be wrong in a 
lot of cases.

I could see the wrap-around method working if the vblank timestamp was 
somehow updated in amdgpu or in drm_crtc_send_vblank_event. This would 
be a relatively simple fix but would break anything in userspace that 
relied on the timestamp for vblank interrupt and the flip completion 
being the same value. But I guess that's the case for any fix for this 
potential problem.

Nicholas Kazlauskas