[Intel-gfx] [PATCH 05/19] drm: Have the vblank counter account for the time between vblank irq disable and drm_vblank_off()
Mario Kleiner
mario.kleiner.de at gmail.com
Tue Sep 2 21:33:18 CEST 2014
Hi Ville,
went through the vblank rework patch set, mostly looks good to me. I
couldn't find any bugs in the code. A first quick test-run on my old
Intel GMA-950 (Gen 3'ish i think?) also didn't show apparent problems
with the OML_sync_control functions. I'll try to test more carefully
with that card and maybe with a few more cards in the next days, if i
can get my hands on something more recent.
The problematic bits:
Patch 3/19 [Don't clear vblank timestamp...] in combination with [5/19
below]:
I agree that not clearing the timestamps during drm_vblank_off() is
probably the better thing to do for userspace. The idea behind clearing
the timestamps was that a ust timestamp of zero signals to userspace
that the timestamp is invalid/undefined at the moment, so the client
should retry the query if it needs a valid timestamp. This worked in
practice insofar as a value of zero can't happen normally, unless a
client would query a timestamp during the first microsecond since
machine powerup. But i guess returning the last valid (msc, ust) pair to
a client during vblank off may be better for things like compositors
etc. I also wonder if we ever documented this ust == 0 -> -EAGAIN behaviour?
The problem with patch 5/19 is gpus/drivers which don't provide precise
instantaneous vblank timestamps - which are afaik everything except
intel, radeon and nouveau. On such drivers, the old code would return a
zero ust timestamp during queries after the first drm_vblank_get() and
until the first vblank irq happens and initializes the timestamps to
something valid. The zero ust would signal "please retry" to the client.
With patch 5/19 you'd get an updated vblank counter with an outdated
vblank timestamp - whatever is stored in the ringbuffer from the past,
because drm_update_vblank_count() can't update the timestamp without
support for the optional vblank-timestamp driver function. A mismatched
msc, ust would be very confusing to clients.
The only way one could get valid msc + ust on such drivers would be to
enable vblank irq's and then wait for the next vblank irq to actually
update the timestamp, at the cost of a couple of msecs waiting time.
So either have drm_update_vblank_count() itself sleep until next vblank
"if (!rc) ..." at the very end, as a rc == 0 would signal an
imprecise/wrong vblank timestamp. Or have all callers of it do this, if
locking makes it neccessary. Or only care about it for the
drm_vblank_off() special case, e.g., if !vblank->enabled there, then
drm_vblank_get() -> wait for a valid timestamp and count to show up ->
drm_vblank_put() -> vblank_disable_and_save().
For Patch 11/19 [Add dev->vblank_disable_immediate flag]: Can we make it
so that a drm_vblank_offdelay module parameter of zero always overrides
the flag? The only reason why a user wants to set drm_vblank_offdelay to
zero is if that user absolutely needs precise and reliable vblank
counts/timestamps and finds out that something is broken with his
driver+gpu, so uses this as an override to temporarily fix a broken
driver. That doesn't work if the vblank_disable_immediate flag overrides
the override from the user - the user couldn't opt out of the trouble.
This might not be such an issue with Intel cards, as you have test
suites and a QA team, and i assume/hope you tested every single intel
gpu shipped in the last decade or so if the whole vblank off/on logic
really is perfectly race-free now? At least it seems to work with that
one gen-3 card i quickly tested. But for most other drivers with small
teams and no dedicated QA this could end badly quickly for the user
without any manual override.
The docs should probably clarify that a hw vblank counter isn't enough
for the vblank_disable_immediate flag to be set. Their vblank
off/on/hardware counter query implementation must be completely race
free. iirc this means the hw counter query must behave as if the vblank
counter always increments at the leading edge of vblank. E.g., radeon
has hw counter queries, but the counter increments either at the
trailing edge, or somewhere in the middle of vblank, so there it
wouldn't work without races, causing off-by-one errors sometimes.
For Patch 14/19 [Don't update vblank timestamp when the counter didn't
change]
That would go wrong if a driver doesn't implement a proper vblank
counter query. E.g., nouveau has precise vblank timestamping since Linux
3.14, but still no functional hw counter query.
Almost all embedded gpu drivers currently implement completely bogus hw
vblank counter queries, because that driver hook is mandatory. I think
it would make sense if we would make that hook optional, allow a NULL
function pointer and adapt to the lack of that query, e.g., by never
disabling vblank irq's, except in drm_vblank_off() when a kms-driver
insists on disabling its irq during modeset/dpms off/suspend etc.
With these remarks somehow taken into account you have my
Reviewed-by: Mario Kleiner <mario.kleiner.de at gmail.com>
for the whole series, if you want.
thanks,
-mario
On 08/06/2014 01:49 PM, ville.syrjala at linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala at linux.intel.com>
>
> If the vblank irq has already been disabled (via the disable timer) when
> we call drm_vblank_off() sample the counter and timestamp one last time.
> This will make the sure that the user space visible counter will account
> for time between vblank irq disable and drm_vblank_off().
>
> Reviewed-by: Matt Roper <matthew.d.roper at intel.com>
> Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
> ---
> drivers/gpu/drm/drm_irq.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index af96517..1f86f6c 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -140,6 +140,19 @@ static void vblank_disable_and_save(struct drm_device *dev, int crtc)
> */
> spin_lock_irqsave(&dev->vblank_time_lock, irqflags);
>
> + /*
> + * If the vblank interrupt was already disbled update the count
> + * and timestamp to maintain the appearance that the counter
> + * has been ticking all along until this time. This makes the
> + * count account for the entire time between drm_vblank_on() and
> + * drm_vblank_off().
> + */
> + if (!dev->vblank[crtc].enabled) {
> + drm_update_vblank_count(dev, crtc);
> + spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags);
> + return;
> + }
> +
> dev->driver->disable_vblank(dev, crtc);
> dev->vblank[crtc].enabled = false;
>
More information about the Intel-gfx
mailing list