[Intel-gfx] [PATCH] drm/i915: Park signaling thread while wrapping the seqno
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Fri Oct 26 11:05:01 UTC 2018
On 26/10/2018 11:08, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-10-26 10:58:22)
>>
>> On 24/10/2018 11:49, Chris Wilson wrote:
>>> A danger encountered when resetting the seqno (using
>>> debugfs/i915_next_seqno) is that as we change the breadcrumb stored in
>>> the HWSP, it may be inspected by the signaler thread leading to
>>> confusion in our sanity checks.
>>>
>>> <0> [136.331342] i915/sig-347 3..s1 136336154us : execlists_submission_tasklet: rcs0 awake?=1, active=5
>>> <0> [136.331373] i915/sig-347 3d.s2 136336155us : process_csb: rcs0 cs-irq head=5, tail=0
>>> <0> [136.331402] i915/sig-347 3d.s2 136336155us : process_csb: rcs0 csb[0]: status=0x00000018:0x00000002, active=0x5
>>> <0> [136.331434] i915/sig-347 3d.s2 136336156us : process_csb: rcs0 out[0]: ctx=2.1, global=219 (fence 46:8455) (current 219), prio=0
>>> <0> [136.331466] i915/sig-347 3d.s2 136336156us : process_csb: rcs0 completed ctx=2
>>> <0> [136.332027] gem_exec-1049 0.... 136336246us : reset_all_global_seqno.part.5: rcs0 seqno 219 (current 219) -> -43
>>> <0> [136.332056] gem_exec-1049 0.... 136336251us : reset_all_global_seqno.part.5: bcs0 seqno 183 (current 183) -> -43
>>> <0> [136.332085] gem_exec-1049 0.... 136336255us : reset_all_global_seqno.part.5: vcs0 seqno 191 (current 191) -> -43
>>> <0> [136.332114] gem_exec-1049 0.... 136336259us : reset_all_global_seqno.part.5: vcs1 seqno 180 (current 180) -> -43
>>> <0> [136.332143] gem_exec-1049 0.... 136336262us : reset_all_global_seqno.part.5: vecs0 seqno 212 (current 212) -> -43
>>> <0> [136.332174] i915/sig-347 3.... 136336280us : intel_breadcrumbs_signaler: intel_breadcrumbs_signaler:673 GEM_BUG_ON(!i915_request_completed(rq))
>>
>> Why this can't happen with normal seqno wrap? And if it can, do we need
>> a fixes tag?
>
> It can only happen if we go backwards in seqno.
>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_request.c | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index 28819f8c4da6..71107540581d 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -136,6 +136,8 @@ static int reset_all_global_seqno(struct drm_i915_private *i915, u32 seqno)
>>> intel_engine_get_seqno(engine),
>>> seqno);
>>>
>>> + kthread_park(engine->breadcrumbs.signaler);
>>> +
>>
>> Needed even if increasing the seqno? It wouldn't work to do it under the
>> i915_seqno_passed below?
>
> It's not needed, just the onion felt better. Cost is not really an
> issue, we've already idled the GPU so flushing the kthread is not
> worrying.
>
> We might be able to replace the onion with a
> static void __signaler_flush(struct intel_engine_cs *engine)
> {
> struct task_stuct *tsk = engine->breadcrumbs.signaler;
>
> kthread_park(tsk);
> kthread_unpark(tsk);
> }
> call from inside !seqno_passed, but I feel more confident if the
> signaler cannot be kicked while the HWSP is in flux.
Okay, makes sense.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
Regards,
Tvrtko
More information about the Intel-gfx
mailing list