[Intel-gfx] [PATCH 02/20] drm/i915/gt: Couple up old virtual breadcrumb on new sibling

Chris Wilson chris at chris-wilson.co.uk
Tue May 12 08:49:25 UTC 2020


Quoting Tvrtko Ursulin (2020-05-12 09:41:01)
> 
> On 11/05/2020 08:57, Chris Wilson wrote:
> > The second try at staging the transfer of the breadcrumb. In part one,
> > we realised we could not simply move to the second engine as we were
> > only holding the breadcrumb lock on the first. So in commit 6c81e21a4742
> > ("drm/i915/gt: Stage the transfer of the virtual breadcrumb"), we
> > removed it from the first engine and marked up this request to reattach
> > the signaling on the new engine. However, this failed to take into
> > account that we only attach the breadcrumb if the new request is added
> > at the start of the queue, which if we are transferring, it is because
> > we know there to be a request to be signaled (and hence we would not be
> > attached). In this second try, we remove from the first list under its
> > lock, take ownership of the link, and then take the second lock to
> > complete the transfer.
> 
> Overall just an optimisation not to call i915_request_enable_breadcrumb, 
> I mean not add to the list indirectly?

The request that we need to add already has its breadcrumb enabled. The
request is on the veng->context.signals list; it's just that the veng is
still on the siblings[0] signalers list, and we are no longer guaranteed
to generate an interrupt on that engine.
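
For illustration only, the staged transfer described in the commit message
(unlink under the first lock, own the link, then take the second lock) can be
modelled in userspace. This is a rough sketch, not the i915 code: the toy_*
types and functions are hypothetical, a pthread mutex stands in for the
breadcrumbs irq_lock, and the list helpers are simplified copies of the
kernel's list_head idiom.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Unlink and reinitialise, so the node can be tested and re-added safely. */
static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical stand-in for one engine's breadcrumbs. */
struct toy_breadcrumbs {
	pthread_mutex_t lock;        /* assumption: models the irq_lock */
	struct list_head signalers;  /* contexts with requests to signal */
};

/* Hypothetical stand-in for the virtual engine's context. */
struct toy_context {
	struct list_head signal_link; /* link on one engine's signalers list */
};

static void toy_breadcrumbs_init(struct toy_breadcrumbs *b)
{
	pthread_mutex_init(&b->lock, NULL);
	list_init(&b->signalers);
}

/*
 * Move ce from one engine's signalers list to another without ever
 * holding both locks at once. Returns true if a transfer took place.
 */
static bool toy_transfer(struct toy_context *ce,
			 struct toy_breadcrumbs *from,
			 struct toy_breadcrumbs *to)
{
	bool was_signaling;

	/* Stage 1: unlink under the old lock; we now own the link. */
	pthread_mutex_lock(&from->lock);
	was_signaling = !list_empty(&ce->signal_link);
	if (was_signaling)
		list_del_init(&ce->signal_link);
	pthread_mutex_unlock(&from->lock);

	if (!was_signaling)
		return false;

	/* Stage 2: attach under the new lock, regardless of queue position. */
	pthread_mutex_lock(&to->lock);
	list_add_tail(&ce->signal_link, &to->signalers);
	pthread_mutex_unlock(&to->lock);
	return true;
}
```

The point of the sketch is that the re-attach in stage 2 is unconditional
once the link has been taken in stage 1; it does not depend on a later
request happening to enable its breadcrumb.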

There's an explosion in the current code due to the lists not moving
as expected on enabling the breadcrumb on the next request, because of

                if (pos == &ce->signals) /* catch transitions from empty list */
                        list_move_tail(&ce->signal_link, &b->signalers);
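
To see why that check misses the transfer case, here is a small userspace
model, given as a sketch under stated assumptions rather than the driver
code. The toy_* names are hypothetical, the context's signals list is
reduced to a sorted array of seqnos, plain integer comparison replaces the
wrap-safe seqno comparison, and "attached" is reduced to a single engine
index.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_SIGNALS 8
#define TOY_DETACHED (-1)

/* Hypothetical model of a context with requests awaiting a signal. */
struct toy_context {
	unsigned int signals[TOY_MAX_SIGNALS]; /* pending seqnos, ascending */
	int count;
	int attached_engine; /* engine whose signalers list holds us */
};

static void toy_context_init(struct toy_context *ce)
{
	memset(ce, 0, sizeof(*ce));
	ce->attached_engine = TOY_DETACHED;
}

/*
 * Insert seqno in order. The context is attached to the engine only
 * when the new request lands at the head of the queue.
 */
static void toy_enable_breadcrumb(struct toy_context *ce,
				  unsigned int seqno, int engine)
{
	int pos = ce->count;

	assert(ce->count < TOY_MAX_SIGNALS);

	/* Walk back from the tail to find the insertion point. */
	while (pos > 0 && ce->signals[pos - 1] > seqno)
		pos--;

	memmove(&ce->signals[pos + 1], &ce->signals[pos],
		(size_t)(ce->count - pos) * sizeof(ce->signals[0]));
	ce->signals[pos] = seqno;
	ce->count++;

	if (pos == 0) /* mirrors the "transition from empty list" check */
		ce->attached_engine = engine;
}

/* First stage of the earlier scheme: leave the old engine, keep requests. */
static void toy_detach(struct toy_context *ce)
{
	ce->attached_engine = TOY_DETACHED;
}
```

With request 1 enabled on engine 0, then toy_detach(), then request 2
enabled on engine 1, the new request is inserted behind the pending one, so
the head-of-queue check does not fire and the context stays detached from
both engines.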

The explosion is on a dead list, but on a couple of occasions it has
looked like:

<4> [373.551331] RIP: 0010:i915_request_enable_breadcrumb+0x144/0x380 [i915]
<4> [373.551341] Code: c7 c2 20 f1 42 c0 48 c7 c7 77 85 28 c0 e8 44 bc f2 ec bf 01 00 00 00 e8 5a 8e f2 ec 31 f6 bf 09 00 00 00 e8 6e 09 e3 ec 0f 0b <3b> 45 80 0f 89 5d ff ff ff 48 8b 6d 08 4c 39 e5 75 ee 49 8b 4d 38
<4> [373.551356] RSP: 0018:ffffb64d0114b9f8 EFLAGS: 00010083
<4> [373.551363] RAX: 00000000000036b2 RBX: ffffa310385096c0 RCX: 0000000000000003
<4> [373.551372] RDX: 00000000000036b2 RSI: 000000002ac5cf63 RDI: 00000000ffffffff
<4> [373.551379] RBP: dead000000000122 R08: ffffa31047075a50 R09: 00000000fffffffe
<4> [373.551385] R10: 0000000053a90a70 R11: 000000005e84b7e5 R12: ffffa3103fde38c0
<4> [373.551392] R13: ffffa3103fde3888 R14: ffffa30ff0982328 R15: ffffa30ff0982000
<4> [373.551401] FS:  00007f19f3359e40(0000) GS:ffffa3104ed00000(0000) knlGS:0000000000000000
<4> [373.551410] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [373.551414] CR2: 00007f19f2aac778 CR3: 0000000232b0c004 CR4: 00000000003606e0
<4> [373.551421] Call Trace:
<4> [373.551466]  ? dma_i915_sw_fence_wake+0x40/0x40 [i915]
<4> [373.551506]  ? dma_i915_sw_fence_wake+0x40/0x40 [i915]
<4> [373.551515]  __dma_fence_enable_signaling+0x60/0x160
<4> [373.551558]  ? dma_i915_sw_fence_wake+0x40/0x40 [i915]
<4> [373.551564]  dma_fence_add_callback+0x44/0xd0
<4> [373.551605]  __i915_sw_fence_await_dma_fence+0x6f/0xc0 [i915]
<4> [373.551665]  __i915_request_commit+0x442/0x5b0 [i915]
<4> [373.551721]  i915_gem_do_execbuffer+0x17fb/0x2eb0 [i915]

kasan/kcsan do not complain; it's just a broken list.
-Chris

