[Intel-gfx] [PATCH] drm/i915/selftests: Increase timeout in requests perf selftest

Thomas Hellström thomas.hellstrom at linux.intel.com
Thu Oct 21 05:36:43 UTC 2021


On Wed, 2021-10-20 at 13:34 -0700, John Harrison wrote:
> On 10/11/2021 10:57, Matthew Brost wrote:
> > perf_parallel_engines is micro benchmark to test i915 request
> > scheduling. The test creates a thread per physical engine and
> > submits
> > NOP requests and waits the requests to complete in a loop. In
> > execlists
> > mode this works perfectly fine as powerful CPU has enough cores to
> > feed
> > each engine and process the CSBs. With GuC submission the uC gets
> > overwhelmed as all threads feed into a single CTB channel and the
> > GuC
> > gets bombarded with CSBs as contexts are immediately switched in
> > and out
> > on the engines due to the zero runtime of the requests. When the
> > GuC is
> > overwhelmed scheduling of contexts is unfair due to the nature of
> > the
> > GuC scheduling algorithm. This behavior is understood and deemed
> > acceptable as this micro benchmark isn't close to real world use
> > case.
> > Increasing the timeout of wait period for requests to complete.
> > This
> > makes the test understand that is ok for contexts to get starved in
> > this
> > scenario.
> > 
> > A future patch / cleanup may just delete these micro benchmark
> > tests as
> > they basically mean nothing. We care about real workloads not made
> > up
> > ones.
> > 
> > Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> Reviewed-by: John Harrison <John.C.Harrison at Intel.com>

Also
Reviewed-by: Thomas Hellström <thomas.hellstrom at linux.intel.com>

I think one important thing to keep in mind here is that this selftest
actually *did* find a flaw, Albeit it upon analysis turned out not to
be significant.

But given that, user-space clients like, for example, gem_exec_suspend
seems to be able to trigger similar delays as well at least to some
extend with a huge amount of small requests submitted from user-space
we shold probably verify at some point that this isn't exploitable by a
malicious client starving other clients on the same system.

/Thomas


> 
> > ---
> >   drivers/gpu/drm/i915/selftests/i915_request.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c
> > b/drivers/gpu/drm/i915/selftests/i915_request.c
> > index d67710d10615..6496671a113c 100644
> > --- a/drivers/gpu/drm/i915/selftests/i915_request.c
> > +++ b/drivers/gpu/drm/i915/selftests/i915_request.c
> > @@ -2805,7 +2805,7 @@ static int p_sync0(void *arg)
> >                 i915_request_add(rq);
> >   
> >                 err = 0;
> > -               if (i915_request_wait(rq, 0, HZ / 5) < 0)
> > +               if (i915_request_wait(rq, 0, HZ) < 0)
> >                         err = -ETIME;
> >                 i915_request_put(rq);
> >                 if (err)
> > @@ -2876,7 +2876,7 @@ static int p_sync1(void *arg)
> >                 i915_request_add(rq);
> >   
> >                 err = 0;
> > -               if (prev && i915_request_wait(prev, 0, HZ / 5) < 0)
> > +               if (prev && i915_request_wait(prev, 0, HZ) < 0)
> >                         err = -ETIME;
> >                 i915_request_put(prev);
> >                 prev = rq;
> 




More information about the dri-devel mailing list