[Intel-gfx] [PATCH igt v2] tests/kms_cursor_legacy: Boost timing sensitive subtests to RT prio
Imre Deak
imre.deak at intel.com
Mon Sep 12 20:57:54 UTC 2016
On Mon, 2016-09-12 at 21:04 +0100, Chris Wilson wrote:
> On Mon, Sep 12, 2016 at 05:47:57PM +0300, Imre Deak wrote:
> > Even in an otherwise quiescent system there may be user/kernel threads
> > independent of the test that add enough latency to make timing
> > sensitive subtests fail. Boost the priority of such subtests to avoid
> > these failures.
> >
> > This got rid of sporadic failures in basic-cursor-vs-flip-legacy and
> > basic-cursor-vs-flip-varying-size with 'missed 1 frame' error message
> > on APL and BSW.
> >
> > v2:
> > - Boost the priority in flip_vs_cursor_crc() too.
> >
> > CC: Chris Wilson <chris at chris-wilson.co.uk>
> > CC: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
> > Signed-off-by: Imre Deak <imre.deak at intel.com>
>
> But we shouldn't need to. The basic test is:
>
> align to vblank
> request non-blocking flip
> update cursor
In these subtests we run these cursor updates in a loop.
> check vblank hasn't advanced
>
> We are not doing any busy loops here and there should be nothing else
> running on the system. So what caused the context switch? Who are we
> fighting against?
The cursor thread is one source of the delay; other than that, it could
be anything running in the background. In my traces it looked like
something related to CI remote logging that caused >16ms delay for both
the user flip thread and the subsequent MMIO work. Imo there is no
guarantee that such delays won't happen between threads running at the
same priority, hence the need for higher priority for timing sensitive
stuff. Note that we see this problem on BSW with 2 CPUs.
> If the only thing that is causing the issue is the
> kernel thread used for the mmioflip (which won't be scheduled for
> another 16ms until the next vblank), we have another bug to track
> down.
The MMIO flip work is scheduled right after we request the flip (since
we do the request after the previous flip completed) and I saw it being
delayed >16ms for the above reasons. Besides this I also saw the user
space flip thread being delayed the same way.
> Imo, this patch is just papering over an issue that as it stands will be
> present in real userspace (i.e. causing jerkiness in X, weston, cros etc).
I can't see any other way than adjusting priorities to guarantee the
timely completion of some work. Otherwise you'll only get best effort
scheduling and that doesn't seem to be enough in these subtests.
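
For illustration, a minimal sketch of what boosting a timing sensitive
section to RT priority could look like using the standard Linux scheduler
calls. This is not the code from the patch: the helper names boost_to_rt()
and restore_sched() are hypothetical, and the choice of SCHED_FIFO at its
lowest priority is an assumption. The calls fail with EPERM unless the
process has CAP_SYS_NICE (or a suitable RLIMIT_RTPRIO).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: move the calling thread
 * to SCHED_FIFO at the lowest RT priority, saving the previous policy and
 * parameters so they can be restored later.
 * Returns 0 on success or a negative errno value (e.g. -EPERM).
 */
static int boost_to_rt(struct sched_param *saved_param, int *saved_policy)
{
	struct sched_param rt;
	int policy;

	policy = sched_getscheduler(0);	/* 0 means the calling thread */
	if (policy < 0)
		return -errno;
	if (sched_getparam(0, saved_param) < 0)
		return -errno;
	*saved_policy = policy;

	memset(&rt, 0, sizeof(rt));
	/* Assumption: the lowest RT priority already preempts SCHED_OTHER. */
	rt.sched_priority = sched_get_priority_min(SCHED_FIFO);
	if (sched_setscheduler(0, SCHED_FIFO, &rt) < 0)
		return -errno;

	return 0;
}

/* Hypothetical helper: restore the policy saved by boost_to_rt(). */
static int restore_sched(const struct sched_param *saved_param,
			 int saved_policy)
{
	if (sched_setscheduler(0, saved_policy, saved_param) < 0)
		return -errno;

	return 0;
}
```

A subtest would call boost_to_rt() just before its timing sensitive loop
and restore_sched() right after it, so that setup and teardown still run
at normal priority.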
--Imre