[PATCH] drm: reduce default drm vblank off delay to 50ms

Thu Nov 1 22:56:57 PDT 2012

On 30.10.12 20:28, Jesse Barnes wrote:
> On Tue, 30 Oct 2012 20:20:44 +0100
> Daniel Vetter <daniel at ffwll.ch> wrote:
>
>> On Tue, Oct 30, 2012 at 8:09 PM, Jesse Barnes <jbarnes at virtuousgeek.org> wrote:
>>> People keep whining about this, but no one seems to send a patch.  This
>>> *ought* to be safe now that we've dealt with the hw races in Mario's
>>> updated code, and fixed the bugs we know about in VT switch, DPMS, and
>>> multi-head configuraions.
>>>
>>> Signed-off-by: Jesse Barnes <jbarnes at virtuousgeek.org>
>>
>> Afaik the fundamental race of enabling the vblank is still there, so
>> this is just duct-tape. And our hw has the required registers (on
>> gen5+ at least) to close this race for real and abolish all "disable
>> vblank irq later to paper over races and smooth things out). Hence I
>> think we should dtrt and so
>>
>> Nacked-by: Daniel Vetter <daniel at ffwll.ch>
>>
>> Also discussed with Jesse on irc, we've had fun ;-)
>
> That's ridiculous.  Just because we have a race we can't fix wrt
> reading hw regs, doesn't mean we can't reduce the timeout.
>
> I nack your nack.
>

Jesse, thanks for cc'ing me, much appreciated :)

Psychtoolbox should be fine with a 50 msecs vblank off delay. I think i 
tested with values somewhere between 50 - 100 msecs at the time the drm 
patches were made. The way i schedule swaps for a specific target time 
'twhen' is to:

1. glXGetSyncValuesOML(msc,ust) as a baseline vblank count and time.
2. Some basic math to calculate targetMSC based on 'twhen' and step 1.
3. glXSwapBuffersMscOML(targetMSC);

So it should tolerate wrong vblank counts due to races, as long as 
vblank doesn't get disabled between 1. and 3. A default vblank off delay 
lower than maybe 1 refresh duration would make me nervous, given that 
most kms drivers are not race-free and it is possible for a userspace 
app or the x-server to get stalled for a couple of msecs.

E.g., radeon wouldn't allow this race-free, because the drm code assumes 
that the hw vblank counter increments at leading edge of vblank, whereas 
radeon hw does it at the traling edge, iirc. nouveau doesn't have any hw 
vblank counter support, so it is "race free" until somebody fixes the 
driver ;-) - ie., it would work with apps that use steps 1/2/3 above, 
but broken for any other app.

More trusting apps could get into more trouble with 50 msecs vs. 5000 
msecs ...

What kind of race in your code do you mean? What does your commit 
message "...now that we've dealt with the hw races in Mario's updated 
code..." refer to?

I remember that Matthew Garret sent out a patch set a couple of months 
ago which aimed to make this configurable per kms driver, so each driver 
could opt into low vblank off delay if it was ready to do it properly. 
That would make a lot of sense.

That said, for the current intel-ddx it wouldn't even matter for X11/GLX 
applications if the kernel returned random numbers instead of vblank 
counts, because the OML_sync_control swap scheduling in the ddx is 
totally broken, and even glXSwapBuffers is a wreck for any swapinterval 
setting except 0 and 1 since triple-buffering was introduced a year ago 
and introduced some bug. Psychtoolbox now detects this since a few 
weeks, prints out critical warnings to the user and tries to work around 
with glXSwapBuffers trickery.

I submitted a patch to fix the rather trivial but nasty bug. Jesse, if 
you find some time, could you review it and convince Chris to merge it?
Should be on the intel-gfx list and in your inbox, a simple . Subject line
"[PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling.", sent at 
7 October 2012.

I'll forward it again to you. Chris Wilson already gave me an off-list 
review for that patch series after i pinged him. Unfortunately a rather 
terse one, cited here:

"On Mon, 15 Oct 2012 16:46:52 +0200, Mario Kleiner 
<mario.kleiner at tuebingen.mpg.de> wrote:
 > Hi Chris,
 >
 > can you please check & merge at least the first two patches of the
 > series into the intel ddx?

The first is along the right path, but I think you want to base the
split on divisor==0. The second is broken in the DRI2 layer, as the
SwapLimit mechanism exposes the multi-client race to a single client
(i.e. there is no serialisation between the InvalidateEvent and the
GetBuffers, and the InvalidateEvent is always broadcast too early.) The
third is just plain wrong, as pageflip may be a blocking ioctl (and will
impose client blocking) so anybody who wants to override SwapBuffersWait
will also want to prevent the inherent wait due to the flip. In any
event that option is to only paper over the loose DRI2 specification and
the lack of client control...
-Chris
"

If i understood the comment wrt. to patch [2/3] "The first is along the 
right path, but I think you want to base the split on divisor==0." 
correctly, then that would be just as broken as the current 
implementation, with some code obfuscation and confusion added, but i 
don't know, maybe i just misunderstood his comment? I asked for 
clarification, but he didn't respond to my followup e-mails. Maybe he 
was just too busy to reply, maybe i'm just too pushy and annoying.

I will try to make some friendship with piglit and see if i can write 
some tests when i find some time to catch such bugs earlier in the future.

Also, i take from Chris comments about patch [1/3] above that properly 
implemented triple-buffering with correct timestamping and 
synchronisation is impossible with the current X-Servers, because the 
DRI2 invalidate event mechanism is racy and broken to the point of not 
being fixable? Basically we would have to wait for DRI3?

Sorry for getting a bit off-topic here, but i think it's important to 
keep all layers working in a meaningful way if you want to support some 
rather serious applications of computer graphics.

thanks,
-mario