[Intel-gfx] ✗ Fi.CI.IGT: failure for Default request/fence expiry + watchdog (rev3)

Daniel Vetter daniel at ffwll.ch
Mon Mar 22 13:41:36 UTC 2021


On Mon, Mar 22, 2021 at 01:37:58PM +0000, Tvrtko Ursulin wrote:
> 
> On 19/03/2021 01:17, Patchwork wrote:
> 
> Okay with 20s default expiration the hangcheck tests on Tigerlake pass and
> we are left with these failures:
> 
> >       IGT changes
> > 
> > 
> >         Possible regressions
> > 
> >   *
> > 
> >     igt at gem_ctx_ringsize@idle at bcs0:
> > 
> >       o shard-skl: PASS
> >         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9870/shard-skl10/igt@gem_ctx_ringsize@idle@bcs0.html>
> >         -> INCOMPLETE
> >         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19806/shard-skl7/igt@gem_ctx_ringsize@idle@bcs0.html>
> 
> Too many runnable requests on a slow Skylake SKU with command parsing
> active. Too many to finish withing the 20s default expiration that is. This
> is actually the same root cause as the below tests tries to explicitly
> demonstrate:
> 
> >   *
> > 
> >     {igt at gem_watchdog@far-fence at bcs0} (NEW):
> > 
> >       o shard-glk: NOTRUN -> FAIL
> >         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19806/shard-glk7/igt@gem_watchdog@far-fence@bcs0.html>
> >   *
> > 
> >     {igt at gem_watchdog@far-fence at vcs0} (NEW):
> > 
> >       o shard-apl: NOTRUN -> FAIL
> >         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19806/shard-apl1/igt@gem_watchdog@far-fence@vcs0.html>
> >         +2 similar issues
> >   *
> > 
> >     {igt at gem_watchdog@far-fence at vecs0} (NEW):
> > 
> >       o shard-kbl: NOTRUN -> FAIL
> >         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19806/shard-kbl7/igt@gem_watchdog@far-fence@vecs0.html>
> >         +2 similar issues
> 
> The vulnerability default expiration adds compared to the current state is
> applicable to heaviliy loaded systems where GPU is shared between multiple
> clients.
> 
> Otherwise series seems to work. Failing tests can be blacklisted going
> forward. Ack to merge and merge itself, after review, I leave to maintainers
> since personally I am not supportive of this mechanism.

Yeah I think we have some leftovers to look at after this has landed on
igt side, since with 20s we're rather long on the timeout side, and some
of the tests need to be resurrected with the preempt-ctx execbuf mode I
think.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the Intel-gfx mailing list