[Intel-gfx] [PATCH 1/2] drm/i915/bxt: work around HW coherency issue when accessing GPU seqno

Jesse Barnes jbarnes at virtuousgeek.org
Wed Jun 10 08:33:26 PDT 2015


On 06/10/2015 08:26 AM, Imre Deak wrote:
> On ke, 2015-06-10 at 08:10 -0700, Jesse Barnes wrote:
>> On 06/10/2015 03:59 AM, Imre Deak wrote:
>>> I think the discussion here is about two separate things:
>>> 1. Possible ordering issue between the seqno store and the completion
>>> interrupt
>>> 2. Coherency issue that leaves the CPU with a stale view of the seqno
>>> indefinitely, which this patch works around
>>>
>>> I'm confident that in my case the problem is not due to ordering. If it
>>> was "only" ordering then the value would show up eventually. This is not
>>> the case though, __wait_for_request will see the stale value
>>> indefinitely even though it gets woken up periodically afterwards by the
>>> lost IRQ logic (with hangcheck disabled).
>>
>> Yeah, based on your workaround it sounds like the write from the CS is
>> landing in memory but failing to invalidate the associated CPU
>> cacheline.  I assume mapping the HWSP as uncached also works around this
>> issue?
> 
> I assume it would, but it would of course have a bigger overhead. Based
> on my testing the coherency problem happens only occasionally, so for
> the rest of the times we still would want to benefit from cached reads.
> See especially __i915_spin_request().

Yeah, pretty sure we want it cached given how often we read from it.  I
was just curious if the UC mapping would address this just to narrow
things down even further.

>> I wonder if this is just an issue with GGTT mappings on BXT.  If we had
>> per context HSWPs using PPGTT (and maybe even 48 bit PPGTT) mappings,
>> the snoop may be performed correctly...  Looks like we don't have a
>> store_dword variant for explicit coherent or incoherent buffer writes
>> (though it does test PPGTT writes at least); that would make this easy
>> to test.
> 
> Well, I tried to play around with the GGTT PTE/PAT flags to see if we're
> not setting something or there is a bit that is significant despite the
> specification. The spec says it's only the snoop flag that matters and
> everything maps to PAT index 0. That looks correct in our code. In any
> case I would expect that the MI_STORE_DATA_IMM would be coherent since
> this is explicitly stated in the spec.
> 
> I also added a new testcase to gem_storedw_batches_loop that tests the
> same thing via PPGTT mappings, it fails the same way.

Ok interesting, so it sounds like a more general problem than just the GGTT.

Thanks,
Jesse


More information about the Intel-gfx mailing list