[igt-dev] [Intel-gfx] [PATCH i-g-t v6] tests/perf_pmu: Verify engine busyness accuracy
Chris Wilson
chris at chris-wilson.co.uk
Mon Feb 19 10:26:34 UTC 2018
Quoting Tvrtko Ursulin (2018-02-19 09:57:20)
>
> On 19/02/2018 09:27, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-02-19 09:19:47)
> >>
> >> Do you have a link to BSW hang? Is that obviously related to PMU?
> >
> > It's only occurring in this test, just looks like an issue with the
> > spinner:
> >
> > [bsw] https://intel-gfx-ci.01.org/tree/drm-tip/kasan_2/fi-bsw-n3050/igt@perf_pmu@busy-accuracy-2-bcs0.html
>
> ...
> <0>[ 681.022677] perf_pmu-1516 1..s1 282520414us : execlists_submission_tasklet: bcs0 in[0]: ctx=3.1, seqno=a
> <0>[ 681.022838] perf_pmu-1516 1..s1 282520580us : execlists_submission_tasklet: bcs0 cs-irq head=5 [5?], tail=0 [0?]
> <0>[ 681.023001] perf_pmu-1516 1..s1 282520594us : execlists_submission_tasklet: bcs0 csb[0]: status=0x00000001:0x00000000, active=0x1
> <0>[ 681.023168] kworker/-338 1.... 298087910us : reset_common_ring: bcs0 seqno=a
> <0>[ 681.023321] ksoftirq-17 1..s. 298088483us : execlists_submission_tasklet: bcs0 in[0]: ctx=3.1, seqno=a
> <0>[ 681.023482] ksoftirq-17 1..s. 298088575us : execlists_submission_tasklet: bcs0 cs-irq head=0 [0], tail=1 [1]
> <0>[ 681.023644] ksoftirq-17 1..s. 298088579us : execlists_submission_tasklet: bcs0 csb[1]: status=0x00000018:0x00000003, active=0x1
> <0>[ 681.023811] ksoftirq-17 1..s. 298088581us : execlists_submission_tasklet: bcs0 out[0]: ctx=3.1, seqno=a
>
> Everything stops.
>
> > [kbl] https://intel-gfx-ci.01.org/tree/drm-tip/kasan_2/fi-kbl-7560u/igt@perf_pmu@busy-accuracy-2-bcs0.html
>
> ...
> <0>[ 506.745332] perf_pmu-1544 3..s1 107905835us : execlists_submission_tasklet: bcs0 in[0]: ctx=3.1, seqno=a
> <0>[ 506.745397] <idle>-0 2..s1 107905980us : execlists_submission_tasklet: bcs0 cs-irq head=2 [1?], tail=3 [3?]
> <0>[ 506.745440] <idle>-0 2..s1 107905983us : execlists_submission_tasklet: bcs0 csb[3]: status=0x00000001:0x00000000, active=0x1
> <0>[ 506.745498] kworker/-30 3.... 120840583us : reset_common_ring: bcs0 seqno=a
> <0>[ 506.745547] ksoftirq-29 3..s. 120840688us : execlists_submission_tasklet: bcs0 in[0]: ctx=3.1, seqno=a
> <0>[ 506.745598] in:imklo-499 2..s1 120840710us : execlists_submission_tasklet: bcs0 cs-irq head=0 [0], tail=1 [1]
> <0>[ 506.745637] in:imklo-499 2..s1 120840712us : execlists_submission_tasklet: bcs0 csb[1]: status=0x00000018:0x00000003, active=0x1
> <0>[ 506.745676] in:imklo-499 2..s1 120840713us : execlists_submission_tasklet: bcs0 out[0]: ctx=3.1, seqno=a
>
> Everything stops here.
>
> I have no idea what's happening here. In both cases I would expect the test
> to have exited after the GPU hang (or at least attempted to exit!), since it
> would detect it overran the timeout.
>
> Could it be stuck in gem_sync after the reset? Or somewhere else?
I think it's that we will be throwing the calibration off if it hangs.
If busy_ns = 10s, won't that generate a target idle time of 500s?
-Chris
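
For reference, the arithmetic behind that estimate can be sketched roughly
as below. This is only an illustration, not the code in perf_pmu.c: the
helper name idle_ns_for_target() is hypothetical, and it assumes the
busy-accuracy-2 subtest aims for 2% busyness, so that the idle period is
scaled from the measured busy period.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not taken from perf_pmu.c): return the idle time
 * needed so that busy / (busy + idle) equals target_busy_pct percent.
 */
static uint64_t idle_ns_for_target(uint64_t busy_ns,
				   unsigned int target_busy_pct)
{
	/* An out-of-range target has no meaningful idle time. */
	if (target_busy_pct == 0 || target_busy_pct > 100)
		return 0;

	return busy_ns * (100 - target_busy_pct) / target_busy_pct;
}
```

With busy_ns = 10s (the spinner only ending at the hang-triggered reset)
and a 2% target, this gives 490s of idle time, i.e. on the order of the
500s mentioned above.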