[RFC] DRI2 synchronization and swap bits
Mario Kleiner
mario.kleiner at tuebingen.mpg.de
Sun Nov 1 12:46:45 PST 2009
Hello everybody
My name is Mario Kleiner and i'm new to this list, so i apologize
beforehand should i violate some rules of netiquette, state the
totally obvious, or if this post is somehow considered off-topic or
way too long. Please tell me if so, and how to do better next time.
First some background to why i am posting, then some proposals more
to the point of this RFC.
I read this RFC and i'm very excited about the prospect of having
well working support for the OML_sync_control extension in DRI2 on
Linux/X11. I have been hoping for this to happen for years, so a big
thank you in advance! This is why i hope to provide some input from
the perspective of future "power-users" of functions like
glXGetSyncValuesOML(), glXSwapBuffersMscOML() and glXWaitForSbcOML().
I'm the co-developer of a popular free-software toolkit (Psychtoolbox)
that is used mostly in the neuroscience / cognitive science community
by scientists to find out how the different senses (visual, auditory,
haptic, ...) work and how they work together. Our requirements for
graphics are often much more demanding than those of a videogame, a
typical VR environment or a media player.
Our users often have very strict requirements for scheduling frame-
accurate and tear-free visual stimulus display, synchronizing
bufferswaps across display-heads, and low-latency returns from swap-
completion. Often they need swap-completion timestamps which are
available with the shortest possible delay after a successful swap
and accurately tied to the vblank at which scanout of a swapped frame
started. The need for timestamps with sub-millisecond accuracy is not
uncommon. Therefore, well working OML_sync_control support would be
basically a dream come true and a very compelling feature for Linux
as a platform for cognitive science.
I spent the last 12 hours reading the CompositeSwap page on the DRI
Wiki, Jesse Barnes' git tree, and the drivers/gpu/drm/drm_irq.c
file in the linux-next git tree at kernel.org, which i
assume (correctly?) is the current state of the art wrt. the DRM, and
have some thoughts and wishes.
1. Wrt. "2) DRI2WaitMSC/SBC a) Concern about blocking the client on
the server side as opposed to a client side wait."
I'm not sure about the extra latency involved by blocking the client
on the server side, instead of a client side wait, but i can assure
you that for our applications, 1 millisecond extra delay between swap-
completion and unblocking can make a significant difference. Quite
often certain actions need to be triggered in sync with swap
completion. Examples are starting recording equipment for brain
activity (fMRI, EEG, MEG, eye-trackers) or other physiological
responses, starting sound playback or recording, sending trigger
packets over a network, driving special digital/analog I/O boards,
driving motion simulators etc. So low-latency unblocking would be
much appreciated from our side.
2. On the CompositeSwap page in the DRI Wiki, there is this comment:
"...It seems that composited apps should never need to know about
real world screen vblank issues, ... ....When dealing with a
redirected window it seems it would be acceptable to come up with an
entirely fake number for all existing extensions that care about
vblanks.."
I don't like this idea of entirely fake numbers and would like to vote
for a solution that is as close as possible to the non-redirected
case. Most of our applications run in non-redirected, full-screen,
undecorated, page-flipped windows, ie., without a compositor being
involved. I can think of a couple of future use cases, though, where
reasonably well working redirected/composited windows would be very
useful for us, but only if we get meaningful timestamps and vblank
counts that are tied to the actual display onset.
3. The Wiki also mentions "The direct rendered cases outlined in the
implementation notes above are complete, but there's a bug in the
async glXSwapBuffers that sometimes causes clients to hang after
swapping rather than continue." Looking through the code of
<http://cgit.freedesktop.org/~jbarnes/xf86-video-intel/tree/src/i830_dri.c?id=a0e2e624c47516273fa3d260b86d8c293e2519e4>
i can see that in I830DRI2SetupSwap() and I830DRI2SetupWaitMSC(), in
the "if (divisor == 0) { ...}" path, the functions return after
DRM_VBLANK_EVENT submission without assigning
*event_frame = vbl.reply.sequence; This looks problematic to me, as
the xserver is later submitting event_frame in the call to
DRI2AddFrameEvent() inside DRI2SwapBuffers() as a cookie to find the
right events for clients to wait on? Could this be a reason for
clients hanging after a swap? I found a few other spots where i either
misunderstood something or there are small bugs.
What is the appropriate way to report these?
4. According to spec, the different OML_sync_control functions do
return a UST timestamp which is supposed to reflect the exact time of
when the MSC last incremented, i.e., at the start of scanout of a new
video frame. SBC and MSC are supposed to increment atomically/
simultaneously at swap completion, so the UST in the (UST,SBC,MSC)
triplet is supposed to mark the time of transition of either MSC or
MSC and SBC at swap completion. This makes a lot of sense to me; it
is exactly the type of timestamp that our toolkit critically depends on.
Ideally the UST timestamp should be corrected to reflect start of
scanout, but a UST that is consistently taken at vblank interrupt
time would do as well. In the current implementation these are *not*
the semantics we'd get for UST timestamps.
The I830DRI2GetMSC() call uses a call to drmWaitVBlank() and its
returned vbl.reply.tval_sec and vbl.reply.tval_usec values for
computing UST.
I830DRI2SetupSwap() and I830DRI2SetupWaitMSC() ask drmWaitVBlank() to
drm_queue_vblank_event() vblank events. Later on, UST is computed
from the timestamp contained in the dequeued events.
If you look at the drm_wait_vblank() and drm_queue_vblank_event()
functions in the current drm_irq.c inside the linux-next tree, you can
expect the following undesirable behaviour:
I830DRI2GetMSC -> drmWaitVBlank -> drm_wait_vblank: Falls through
DRM_WAIT_ON, because the wait condition is not satisfied and calls
do_gettimeofday(&now) for the UST timestamp. This timestamping is not
synchronized to the vblank at all!
I830DRI2SetupSwap() or I830DRI2SetupWaitMSC() -> drmWaitVBlank ->
drm_wait_vblank -> drm_queue_vblank_event for a certain
vblwait->request.sequence number. If this target sequence number has
not yet been reached, the event gets queued and later on timestamped
via do_gettimeofday() in drm_handle_vblank_events(), which is called
from the vblank irq handler --> Exactly the behaviour we want! If,
however, the vblwait->request.sequence number has already been reached
in drm_queue_vblank_event(), then the routine will retire the event
immediately and apply a do_gettimeofday() timestamp at that moment,
which will result in a wrong UST timestamp.
Unreliable UST timestamps would make the whole OML_sync_control
extension almost useless for us and probably other applications that
require good sync, e.g., between video and audio streams, so i'd ask you
politely for improvements here.
I guess one (simple from the viewpoint of a non-kernel hacker?) way
would be to always timestamp the vblank in the drm_handle_vblank()
routine, immediately after incrementing the vblank_count, probably
protecting both the timestamp acquisition and vblank increment by one
spinlock, so both get updated atomically? Then one could maybe
extend drm_vblank_count() to read out and return the vblank count and
the corresponding timestamp simultaneously under protection of the lock?
Or any other way to provide the timestamp together with the vblank
count in an atomic fashion to the calling code in
drm_wait_vblank(), drm_queue_vblank_event() and
drm_handle_vblank_events()?
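
A rough user-space sketch of this proposal, assuming a pthread mutex as a stand-in for the kernel spinlock and a simulated clock in microseconds. struct model_vblank_state and the model_* functions are hypothetical names chosen for illustration; this is not a patch against drm_irq.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Vblank count and its timestamp, guarded by one lock so that readers
 * always see a matching pair. */
struct model_vblank_state {
    pthread_mutex_t lock;   /* stand-in for a kernel spinlock */
    uint32_t count;
    int64_t  ust_us;        /* time at which count last incremented */
};

void model_vblank_init(struct model_vblank_state *s)
{
    assert(s != NULL);
    pthread_mutex_init(&s->lock, NULL);
    s->count = 0;
    s->ust_us = 0;
}

/* What the vblank irq handler would do: increment the count and record
 * the timestamp in the same critical section. */
void model_handle_vblank(struct model_vblank_state *s, int64_t irq_time_us)
{
    pthread_mutex_lock(&s->lock);
    s->count++;
    s->ust_us = irq_time_us;
    pthread_mutex_unlock(&s->lock);
}

/* Extended "vblank count" query: returns the count and stores the
 * timestamp that belongs to exactly that count. */
uint32_t model_vblank_count_and_time(struct model_vblank_state *s,
                                     int64_t *ust_us)
{
    uint32_t count;

    assert(ust_us != NULL);
    pthread_mutex_lock(&s->lock);
    count = s->count;
    *ust_us = s->ust_us;
    pthread_mutex_unlock(&s->lock);
    return count;
}
```

With something along these lines, code that retires an event immediately could report the stored timestamp of the vblank in question instead of the time of the call.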
If you read up to here, thanks a lot for your attention and apologies
for the long post.
-mario
*********************************************************************
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany
e-mail: mario.kleiner at tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax: +49 (0)7071/601-616
www: http://www.kyb.tuebingen.mpg.de/~kleinerm
*********************************************************************
"For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled."
(Richard Feynman)