[Intel-gfx] [PATCH 1/2] drm/i915/bxt: work around HW coherency issue when accessing GPU seqno
Jesse Barnes
jbarnes at virtuousgeek.org
Wed Jun 10 08:33:26 PDT 2015
On 06/10/2015 08:26 AM, Imre Deak wrote:
> On ke, 2015-06-10 at 08:10 -0700, Jesse Barnes wrote:
>> On 06/10/2015 03:59 AM, Imre Deak wrote:
>>> I think the discussion here is about two separate things:
>>> 1. Possible ordering issue between the seqno store and the completion
>>> interrupt
>>> 2. Coherency issue that leaves the CPU with a stale view of the seqno
>>> indefinitely, which this patch works around
>>>
>>> I'm confident that in my case the problem is not due to ordering. If it
>>> was "only" ordering then the value would show up eventually. This is not
>>> the case though, __wait_for_request will see the stale value
>>> indefinitely even though it gets woken up periodically afterwards by the
>>> lost IRQ logic (with hangcheck disabled).
>>
>> Yeah, based on your workaround it sounds like the write from the CS is
>> landing in memory but failing to invalidate the associated CPU
>> cacheline. I assume mapping the HWSP as uncached also works around this
>> issue?
>
> I assume it would, but it would of course have a bigger overhead. Based
> on my testing the coherency problem happens only occasionally, so for
> the rest of the times we still would want to benefit from cached reads.
> See especially __i915_spin_request().
Yeah, pretty sure we want it cached given how often we read from it. I
was just curious if the UC mapping would address this just to narrow
things down even further.
>> I wonder if this is just an issue with GGTT mappings on BXT. If we had
>> per context HSWPs using PPGTT (and maybe even 48 bit PPGTT) mappings,
>> the snoop may be performed correctly... Looks like we don't have a
>> store_dword variant for explicit coherent or incoherent buffer writes
>> (though it does test PPGTT writes at least); that would make this easy
>> to test.
>
> Well, I tried to play around with the GGTT PTE/PAT flags to see if we're
> not setting something or there is a bit that is significant despite the
> specification. The spec says it's only the snoop flag that matters and
> everything maps to PAT index 0. That looks correct in our code. In any
> case I would expect that the MI_STORE_DATA_IMM would be coherent since
> this is explicitly stated in the spec.
>
> I also added a new testcase to gem_storedw_batches_loop that tests the
> same thing via PPGTT mappings, it fails the same way.
Ok interesting, so it sounds like a more general problem than just the GGTT.
Thanks,
Jesse
More information about the Intel-gfx
mailing list