[Intel-gfx] [PATCH 1/3] drm/i915/execlists: Force write serialisation into context image vs execution
mika.kuoppala at linux.intel.com
Thu Nov 8 12:13:42 UTC 2018
Chris Wilson <chris at chris-wilson.co.uk> writes:
> Quoting Mika Kuoppala (2018-11-08 12:00:39)
>> Chris Wilson <chris at chris-wilson.co.uk> writes:
>> > Ensure that the writes into the context image are completed prior to the
>> > register mmio to trigger execution. Although previously we were assured
>> > by the SDM that all writes are flushed before an uncached memory
>> > transaction (our mmio write to submit the context to HW for execution),
>> > we have empirical evidence to believe that this is not actually the
>> > case.
>> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656
>> I would have marked this also as References.
> I'm confident in my local results indicating some success here, albeit
> in not exactly the same quick death, but still out-of-bounds execution.
>> > References: https://bugs.freedesktop.org/show_bug.cgi?id=108315
>> > References: https://bugs.freedesktop.org/show_bug.cgi?id=106887
>> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
>> > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> > ---
>> > drivers/gpu/drm/i915/intel_lrc.c | 14 +++++++++++++-
>> > 1 file changed, 13 insertions(+), 1 deletion(-)
>> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> > index 22b57b8926fc..f7892ddb3f13 100644
>> > --- a/drivers/gpu/drm/i915/intel_lrc.c
>> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> > @@ -380,7 +380,8 @@ static u64 execlists_update_context(struct i915_request *rq)
>> > reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring, rq->tail);
>> > - /* True 32b PPGTT with dynamic page allocation: update PDP
>> > + /*
>> > + * True 32b PPGTT with dynamic page allocation: update PDP
>> > * registers and point the unallocated PDPs to scratch page.
>> > * PML4 is allocated during ppgtt init, so this is not needed
>> > * in 48-bit mode.
>> > @@ -388,6 +389,17 @@ static u64 execlists_update_context(struct i915_request *rq)
>> > if (!i915_vm_is_48bit(&ppgtt->vm))
>> > execlists_update_context_pdps(ppgtt, reg_state);
>> > + /*
>> > + * Make sure the context image is complete before we submit it to HW.
>> > + *
>> > + * Ostensibly, writes (including the WCB) should be flushed prior to
>> > + * an uncached write such as our mmio register access, the empirical
>> > + * evidence (esp. on Braswell) suggests that the WC write into memory
>> > + * may not be visible to the HW prior to the completion of the UC
>> > + * register write and that we may begin execution from the context
>> > + * before its image is complete leading to invalid PD chasing.
>> > + */
>> > + wmb();
>> Let's put it into use and gather more evidence.
>> Reviewed-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> Aye. Sure about r-b? I'm quite happy to take an a-b since we're just
> postulating to gather evidence.
Agreed that a-b is more accurate. r-b would indicate I know what the
heck is going on there under the hood.
More information about the Intel-gfx