Funky new vblank counter regressions in Linux 4.4-rc1
Ville Syrjälä
ville.syrjala at linux.intel.com
Thu Nov 19 10:20:51 PST 2015
On Thu, Nov 19, 2015 at 06:46:28PM +0100, Mario Kleiner wrote:
> Hi Alex and Michel and Ville,
>
> it's "fix vblank stuff" time again ;-)
>
> Ville's changes to the DRM's drm_handle_vblank() /
> drm_update_vblank_count() code in Linux 4.4 not only made that code more
> elegant, but also removed the robustness against the vblank irq quirks
> in AMD hw and similar hardware. So now i get tons of off-by-one errors and
>
> "[ 432.345] (WW) RADEON(1): radeon_dri2_flip_event_handler: Pageflip
> completion event has impossible msc 24803 < target_msc 24804" XOrg
> messages from that kernel.
Argh. Sorry about that.
>
> One of the reasons for trouble is that AMD hw quirk where the hw fires
> an extra vblank irq shortly after vblank irq's get enabled, not
> synchronized to vblank, but typically in the middle of active scanout,
> so we get a redundant call to drm_handle_vblank in the middle of scanout.
I think that should be fine as such. The code should ignore redudntant
vbl irqs. Well, assuming you have a reliable hw counter or you use the
timestamp guesstimate mechanism and your scanout position is reported
accurately. But I guess you have a bit of problem with both.
>
> To fix that i have a minor patch to make drm_update_vblank_count() again
> robust against such redundant calls, which i will send out later to the
> mailing list. Diff attached for reference.
>
> The second quirk of AMD hw is that the vblank interrupt fires a few
> scanlines before start of vblank, so drm_handle_vblank ->
> drm_update_vblank_count() -> dev->driver->get_vblank_counter() gets
> called before the start of the vblank for which the new vblank count
> should be queried.
Does it fire too soon, or is the scanout position register value(s)
just offset by a few lines perhaps?
We have that with i915 and I simply fix up the value when reading it
out. Fortunately for us the offset is constant (or at least seems to
be) for a given platform/connector combo.
>
> The third problem is that the DRM vblank handling always had the
> assumption that hardware vblank counters would essentially increment at
> leading edge of vblank - basically in sync with the firing of the vblank
> irq, so that a hw counter readout from within the vblank irq handler
> would always deliver the new incremented value. If this assumption is
> violated then the counting by use of the hw counter gets unreliable,
> because depending on random small delays in irq handling the code may
> end up sampling the hw counter pre- or post-increment, leading to
> inconsistent updating and funky bugs. It just so happens that AMD
> hardware doesn't increment the hw counter at leading edge of vblank, so
> stuff falls apart.
>
> So to fix those two problems i'm tinkering with cooking the hw vblank
> counter value returned by radeon_get_vblank_counter_kms() to make it
> appear as if the counter incremented at leading edge of vblank in sync
> with vblank irq.
Yeah, I do that in i915 too. Well, on some platforms where the
counter increments at the leading edge of active.
>
> It almost sort of works on the rs600 code path, but i need a bit of info
> from you:
>
> 1. There's this register from the old specs for m76.pdf, which is not
> part of the current register defines for radeon-kms:
>
> "D1CRTC_STATUS_VF_COUNT - RW - 32 bits - [GpuF0MMReg:0x60A8]"
>
> It contains the lower 16 bits of framecounter and the 13 bits of
> vertical scanout position. It seems to give the same readings as the 24
> bit R_0060A4_D1CRTC_STATUS_FRAME_COUNT we use for the hw counter. This
> would come handy.
>
> Does Evergreen and later have a same/similar register and where is it?
>
> 2. The hw framecounter seems to increment when the vertical scanout
> position wraps back from (VTOTAL - 1) to 0, at least on the one DCE-3
> gpu i tested so far. Is this so on all asics? And is the hw counter
> increment happening exactly at the moment that vertical scanout position
> jumps back to zero, ie. both events are driven by the same signal? Or is
> the framecounter increment just happening somewhere inside either
> scanline VTOTAL-1 or scanline 0?
>
>
> If we can fix this and get it into rc2 or rc3 then we could avoid a bad
> regression and with a bit of luck at the same time improve by being able
> to set dev->vblank_disable_immediate = true then and allow vblank irqs
> to get turned off more aggressively for a bit of extra power saving.
>
> thanks,
> -mario
> -- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -172,9 +172,11 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
> unsigned long flags)
> {
> struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
> - u32 cur_vblank, diff;
> + u32 cur_vblank, diff = 0;
> bool rc;
> struct timeval t_vblank;
> + const struct timeval *t_old;
> + u64 diff_ns;
> int count = DRM_TIMESTAMP_MAXRETRIES;
> int framedur_ns = vblank->framedur_ns;
>
> @@ -195,13 +197,15 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
> rc = drm_get_last_vbltimestamp(dev, pipe, &t_vblank, flags);
> } while (cur_vblank != dev->driver->get_vblank_counter(dev, pipe) && --count > 0);
>
> - if (dev->max_vblank_count != 0) {
> - /* trust the hw counter when it's around */
> - diff = (cur_vblank - vblank->last) & dev->max_vblank_count;
> - } else if (rc && framedur_ns) {
> - const struct timeval *t_old;
> - u64 diff_ns;
> -
> + /*
> + * Always use vblank timestamping based method if supported to reject
> + * redundant vblank irqs. E.g., AMD hardware needs this to not screw up
> + * due to some irq handling quirk.
Hmm. I'm thinking it would be better to simply not claim that there's a hw
counter if it isn't reliable.
> + *
> + * This also sets the diff value for use as fallback below in case the
> + * hw does not support a suitable hw vblank counter.
> + */
> + if (rc && framedur_ns) {
> t_old = &vblanktimestamp(dev, pipe, vblank->count);
> diff_ns = timeval_to_ns(&t_vblank) - timeval_to_ns(t_old);
>
> @@ -212,11 +216,28 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
> */
> diff = DIV_ROUND_CLOSEST_ULL(diff_ns, framedur_ns);
>
> - if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ)
> - DRM_DEBUG_VBL("crtc %u: Redundant vblirq ignored."
> - " diff_ns = %lld, framedur_ns = %d)\n",
> - pipe, (long long) diff_ns, framedur_ns);
> - } else {
> + if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ) {
> + DRM_DEBUG_VBL("crtc %u: Redundant vblirq ignored."
> + " diff_ns = %lld, framedur_ns = %d)\n",
> + pipe, (long long) diff_ns, framedur_ns);
> +
> + /* Treat this redundant invocation as no-op. */
> + WARN_ON_ONCE(cur_vblank != vblank->last);
> + return;
> + }
> + }
> +
> + /*
> + * hw counters, if marked as supported via max_vblank_count != 0,
> + * *must* increment at leading edge of vblank and in sync with
> + * the firing of the hw vblank irq, otherwise we have a race here
> + * between gpu and us, where small variations in our execution
> + * timing can cause off-by-one miscounting of vblanks.
> + */
> + if (dev->max_vblank_count != 0) {
> + /* trust the hw counter when it's around */
> + diff = (cur_vblank - vblank->last) & dev->max_vblank_count;
> + } else if (!(rc && framedur_ns)) {
> /* some kind of default for drivers w/o accurate vbl timestamping */
> diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0;
> }
--
Ville Syrjälä
Intel OTC
More information about the dri-devel
mailing list