[Intel-gfx] [PATCH 1/3] drm/i915/execlists: Force write serialisation into context image vs execution

Mika Kuoppala mika.kuoppala at linux.intel.com
Thu Nov 8 12:13:42 UTC 2018


Chris Wilson <chris at chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2018-11-08 12:00:39)
>> Chris Wilson <chris at chris-wilson.co.uk> writes:
>> 
>> > Ensure that the writes into the context image are completed prior to the
>> > register mmio to trigger execution. Although previously we were assured
>> > by the SDM that all writes are flushed before an uncached memory
>> > transaction (our mmio write to submit the context to HW for execution),
>> > we have empirical evidence to believe that this is not actually the
>> > case.
>> >
>> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656
>> 
>> I would have marked this also as References.
>
> I'm confident in my local results indicating some success here, albeit
> in not exactly the same quick death, but still out-of-bounds execution.
>  
>> > References: https://bugs.freedesktop.org/show_bug.cgi?id=108315
>> > References: https://bugs.freedesktop.org/show_bug.cgi?id=106887
>> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
>> > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/intel_lrc.c | 14 +++++++++++++-
>> >  1 file changed, 13 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> > index 22b57b8926fc..f7892ddb3f13 100644
>> > --- a/drivers/gpu/drm/i915/intel_lrc.c
>> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> > @@ -380,7 +380,8 @@ static u64 execlists_update_context(struct i915_request *rq)
>> >  
>> >       reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring, rq->tail);
>> >  
>> > -     /* True 32b PPGTT with dynamic page allocation: update PDP
>> > +     /*
>> > +      * True 32b PPGTT with dynamic page allocation: update PDP
>> >        * registers and point the unallocated PDPs to scratch page.
>> >        * PML4 is allocated during ppgtt init, so this is not needed
>> >        * in 48-bit mode.
>> > @@ -388,6 +389,17 @@ static u64 execlists_update_context(struct i915_request *rq)
>> >       if (!i915_vm_is_48bit(&ppgtt->vm))
>> >               execlists_update_context_pdps(ppgtt, reg_state);
>> >  
>> > +     /*
>> > +      * Make sure the context image is complete before we submit it to HW.
>> > +      *
>> > +      * Ostensibly, writes (including the WCB) should be flushed prior to
>> > +      * an uncached write such as our mmio register access, the empirical
>> > +      * evidence (esp. on Braswell) suggests that the WC write into memory
>> > +      * may not be visible to the HW prior to the completion of the UC
>> > +      * register write and that we may begin execution from the context
>> > +      * before its image is complete leading to invalid PD chasing.
>> > +      */
>> > +     wmb();
>> 
>> Let's put it into use and gather more evidence.
>> 
>> Reviewed-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>
>
> Aye. Sure about r-b? I'm quite happy to take an a-b since we're just
> postulating to gather evidence.

Agreed that a-b is more accurate. r-b would indicate I know what the
heck is going on there under the hood.
-Mika


More information about the Intel-gfx mailing list