[Intel-gfx] [PATCH] drm/i915/icl: Forcibly evict stale csb entries
Mika Kuoppala
mika.kuoppala at linux.intel.com
Fri Dec 7 12:37:33 UTC 2018
Chris Wilson <chris at chris-wilson.co.uk> writes:
> Quoting Mika Kuoppala (2018-12-05 13:46:12)
>> static void nop_submission_tasklet(unsigned long data)
>> @@ -1015,6 +1025,19 @@ static void process_csb(struct intel_engine_cs *engine)
>> } while (head != tail);
>>
>> execlists->csb_head = head;
>> +
>> + /*
>> + * Gen11 has proven to fail wrt global observation point between
>> + * entry and tail update, failing on the ordering and thus
>> + * we see an old entry in the context status buffer.
>> + *
>> + * Forcibly evict the entries ahead of the next gpu csb update,
>> + * to increase the odds that we get fresh entries with non
>> + * working hardware. The cost of doing so comes out mostly in
>> + * the wash as hardware, working or not, will need to do the
>> + * invalidation before.
>> + */
>> + invalidate_csb_entries(&buf[0], &buf[GEN8_CSB_ENTRIES - 1]);
>
> If it works, this is a stroke of genius.
>
> If we hypothesize that the GPU did write the CSB entries before the head
> pointer and inserted a Global Observation point beforehand, then we
> theorize that they merely forgot the cache-coherency protocol: the writes
> to system memory are correct, but arrive unordered in the CPU cache.
>
> By using the clflush to evict our used cacheline, on the next pass we
> will pull that CSB entry cacheline back in from memory (ordered by
> the rmb used for the ringbuffer) and so, if the HW engineers'
> insistence that they did remember their wmb holds, the CSB entries will
> be coherent with the head pointer.
>
> So we remove one piece of the puzzle at what should be negligible cost.
> Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
Thank you for the review and kind words, pushed.
-Mika
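
For readers following the thread, the eviction idea discussed above can be sketched in userspace C. This is only an illustration under stated assumptions, not the driver code: `evict_csb_entries`, `flush_line`, `lines_spanned`, `CSB_DWORDS` and `CACHELINE` are hypothetical names and values chosen here, and the sketch uses the `_mm_clflush` intrinsic where SSE2 is available (falling back to a no-op elsewhere). It flushes the cachelines holding the first and last entries, so that the next read has to fetch them from memory again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_clflush */
#endif

#define CSB_DWORDS 12 /* assumption: size of the status buffer in dwords */
#define CACHELINE  64 /* assumption: 64-byte cachelines */

/* Write back and invalidate the cacheline containing p. */
static inline void flush_line(const void *p)
{
#if defined(__SSE2__)
	_mm_clflush(p);
#else
	(void)p; /* no-op stand-in on other architectures in this sketch */
#endif
}

/*
 * Evict the cachelines holding the first and last entry. If the whole
 * buffer fits in one or two cachelines, this covers every entry.
 */
static void evict_csb_entries(const uint32_t *first, const uint32_t *last)
{
	assert(first <= last);
	flush_line(first);
	flush_line(last);
}

/* Number of cachelines touched by the byte range [first, last]. */
static size_t lines_spanned(const void *first, const void *last)
{
	uintptr_t a = (uintptr_t)first / CACHELINE;
	uintptr_t b = (uintptr_t)last / CACHELINE;

	return (size_t)(b - a) + 1;
}
```

Flushing only the two end points is sufficient in this sketch because a 48-byte buffer spans at most two 64-byte cachelines; a larger buffer would need a loop over every line. clflush writes dirty data back before invalidating, so CPU-side contents are not lost.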