Funky new vblank counter regressions in Linux 4.4-rc1
Ville Syrjälä
ville.syrjala at linux.intel.com
Mon Nov 23 07:51:09 PST 2015
On Mon, Nov 23, 2015 at 04:23:21PM +0100, Mario Kleiner wrote:
> On 11/20/2015 04:34 PM, Ville Syrjälä wrote:
> > On Fri, Nov 20, 2015 at 04:24:50PM +0100, Mario Kleiner wrote:
>
> ...
>
> >> What we do in radeon-kms is similar. If DRM_CALLED_FROM_VBLIRQ and we
> >> are no more than 1% of the display height away from start of vblank we
> >> fudge scanout position in a way so that the timestamp gets computed for
> >> the soon-to-begin vblank, not the old one.
> >
> > The problem with basing that fudging purely on DRM_CALLED_FROM_VBLIRQ is
> > that if you call .get_scanout_positon() from a non-irq context between
> > the irq firing and start of vblank, you'll think you're still in the
> > previous frame. Hence my suggestion to note down the frame counter when
> > called from the irq, and then keep doing the fudging until you've truly
> > crossed the start of vblank.
> >
>
> Ok, but why would that be a bad thing? I think we want it to think it is
> in the previous frame if it is called outside the vblank irq context.
> The only reason we fudge it to the next frames vblank if i vblank irq is
> because we know the vblank irq handler we are executing atm. was meant
> to execute within the upcoming vblank for the next frame, so we fudge
> the scanout positions and thereby timestamp to correspond to that new
> frame. But if something called outside irq context it should get a
> scanout position/timestamp that corresponds to "reality".
It would be a bad thing since it would cause the timestamp to jump
backwards, and that would also cause the frame count guesstimate to go
backwards.
>
> It would maybe be a problem to get different answers for a query at the
> same scanout position if we used .get_scanout_position() for something
> else than for calculating vblank timestamps, but we don't do that atm.
>
> Maybe i'm overlooking something here related to the somewhat rewritten
> code from the last year or so? But in the original design this would be
> exactly what i intended?
>
> ...
>
> >> So it's good enough for typical desktop
> >> applications/compositors/games/media players, and a nice improvement
> >> over the previous state, but not quite sufficient for applications that
> >> need long time consistent vblank counts for long waits of multiple
> >> seconds or even minutes. So it's still good to have hw counters if we
> >> can get some that are reliable enough.
> >
> > Ah, I didn't realize you care about small errors in the counter for
> > such long periods of vblank off.
> >
>
> Actually, you are right, i was stupid - not enough sleep last friday. I
> do care about such small errors, but the long vblank off periods don't
> matter at all the way my software works. I query the current count and
> timestamp (glXGetSyncValuesOML), calculate a target count based on those
> and then schedule a swap for that target count via glXSwapBuffersMscOML.
> That swapbuffers call will keep vblank irqs on until the kms pageflip is
> queued. So i only care about vblank counts/timestamps being consistent
> for short amounts of time, typically 1 video refresh from vblank query
> to queueing a vblank event, and then from reception of that
> event/queuing the pageflip to pageflip completion event. So even if the
> system would be heavily loaded and my code would have big preemption
> delays i think counts that are consistent over a few seconds would be
> enough to keep things working. Otherwise it wouldn't work now either
> with a vblank off after 5 seconds and nouveau not having vblank hw counters.
>
> -mario
>
>
> >>
> >> -mario
> >>
> >>
> >>>>
> >>>> -mario
> >>>>
> >>>>>>
> >>>>>> It almost sort of works on the rs600 code path, but i need a bit of info
> >>>>>> from you:
> >>>>>>
> >>>>>> 1. There's this register from the old specs for m76.pdf, which is not
> >>>>>> part of the current register defines for radeon-kms:
> >>>>>>
> >>>>>> "D1CRTC_STATUS_VF_COUNT - RW - 32 bits - [GpuF0MMReg:0x60A8]"
> >>>>>>
> >>>>>> It contains the lower 16 bits of framecounter and the 13 bits of
> >>>>>> vertical scanout position. It seems to give the same readings as the 24
> >>>>>> bit R_0060A4_D1CRTC_STATUS_FRAME_COUNT we use for the hw counter. This
> >>>>>> would come handy.
> >>>>>>
> >>>>>> Does Evergreen and later have a same/similar register and where is it?
> >>>>>>
> >>>>>> 2. The hw framecounter seems to increment when the vertical scanout
> >>>>>> position wraps back from (VTOTAL - 1) to 0, at least on the one DCE-3
> >>>>>> gpu i tested so far. Is this so on all asics? And is the hw counter
> >>>>>> increment happening exactly at the moment that vertical scanout position
> >>>>>> jumps back to zero, ie. both events are driven by the same signal? Or is
> >>>>>> the framecounter increment just happening somewhere inside either
> >>>>>> scanline VTOTAL-1 or scanline 0?
> >>>>>>
> >>>>>>
> >>>>>> If we can fix this and get it into rc2 or rc3 then we could avoid a bad
> >>>>>> regression and with a bit of luck at the same time improve by being able
> >>>>>> to set dev->vblank_disable_immediate = true then and allow vblank irqs
> >>>>>> to get turned off more aggressively for a bit of extra power saving.
> >>>>>>
> >>>>>> thanks,
> >>>>>> -mario
> >>>>>
> >>>>>> -- a/drivers/gpu/drm/drm_irq.c
> >>>>>> +++ b/drivers/gpu/drm/drm_irq.c
> >>>>>> @@ -172,9 +172,11 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
> >>>>>> unsigned long flags)
> >>>>>> {
> >>>>>> struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
> >>>>>> - u32 cur_vblank, diff;
> >>>>>> + u32 cur_vblank, diff = 0;
> >>>>>> bool rc;
> >>>>>> struct timeval t_vblank;
> >>>>>> + const struct timeval *t_old;
> >>>>>> + u64 diff_ns;
> >>>>>> int count = DRM_TIMESTAMP_MAXRETRIES;
> >>>>>> int framedur_ns = vblank->framedur_ns;
> >>>>>>
> >>>>>> @@ -195,13 +197,15 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
> >>>>>> rc = drm_get_last_vbltimestamp(dev, pipe, &t_vblank, flags);
> >>>>>> } while (cur_vblank != dev->driver->get_vblank_counter(dev, pipe) && --count > 0);
> >>>>>>
> >>>>>> - if (dev->max_vblank_count != 0) {
> >>>>>> - /* trust the hw counter when it's around */
> >>>>>> - diff = (cur_vblank - vblank->last) & dev->max_vblank_count;
> >>>>>> - } else if (rc && framedur_ns) {
> >>>>>> - const struct timeval *t_old;
> >>>>>> - u64 diff_ns;
> >>>>>> -
> >>>>>> + /*
> >>>>>> + * Always use vblank timestamping based method if supported to reject
> >>>>>> + * redundant vblank irqs. E.g., AMD hardware needs this to not screw up
> >>>>>> + * due to some irq handling quirk.
> >>>>>
> >>>>> Hmm. I'm thinking it would be better to simply not claim that there's a hw
> >>>>> counter if it isn't reliable.
> >>>>>
> >>>>>> + *
> >>>>>> + * This also sets the diff value for use as fallback below in case the
> >>>>>> + * hw does not support a suitable hw vblank counter.
> >>>>>> + */
> >>>>>> + if (rc && framedur_ns) {
> >>>>>> t_old = &vblanktimestamp(dev, pipe, vblank->count);
> >>>>>> diff_ns = timeval_to_ns(&t_vblank) - timeval_to_ns(t_old);
> >>>>>>
> >>>>>> @@ -212,11 +216,28 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
> >>>>>> */
> >>>>>> diff = DIV_ROUND_CLOSEST_ULL(diff_ns, framedur_ns);
> >>>>>>
> >>>>>> - if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ)
> >>>>>> - DRM_DEBUG_VBL("crtc %u: Redundant vblirq ignored."
> >>>>>> - " diff_ns = %lld, framedur_ns = %d)\n",
> >>>>>> - pipe, (long long) diff_ns, framedur_ns);
> >>>>>> - } else {
> >>>>>> + if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ) {
> >>>>>> + DRM_DEBUG_VBL("crtc %u: Redundant vblirq ignored."
> >>>>>> + " diff_ns = %lld, framedur_ns = %d)\n",
> >>>>>> + pipe, (long long) diff_ns, framedur_ns);
> >>>>>> +
> >>>>>> + /* Treat this redundant invocation as no-op. */
> >>>>>> + WARN_ON_ONCE(cur_vblank != vblank->last);
> >>>>>> + return;
> >>>>>> + }
> >>>>>> + }
> >>>>>> +
> >>>>>> + /*
> >>>>>> + * hw counters, if marked as supported via max_vblank_count != 0,
> >>>>>> + * *must* increment at leading edge of vblank and in sync with
> >>>>>> + * the firing of the hw vblank irq, otherwise we have a race here
> >>>>>> + * between gpu and us, where small variations in our execution
> >>>>>> + * timing can cause off-by-one miscounting of vblanks.
> >>>>>> + */
> >>>>>> + if (dev->max_vblank_count != 0) {
> >>>>>> + /* trust the hw counter when it's around */
> >>>>>> + diff = (cur_vblank - vblank->last) & dev->max_vblank_count;
> >>>>>> + } else if (!(rc && framedur_ns)) {
> >>>>>> /* some kind of default for drivers w/o accurate vbl timestamping */
> >>>>>> diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0;
> >>>>>> }
> >>>>>
> >>>>>
> >>>
> >
--
Ville Syrjälä
Intel OTC
More information about the dri-devel
mailing list