[Intel-gfx] [PATCH] drm/i915/icl: Forcibly evict stale csb entries

Fri Dec 7 12:37:33 UTC 2018

Chris Wilson <chris at chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2018-12-05 13:46:12)
>>  static void nop_submission_tasklet(unsigned long data)
>> @@ -1015,6 +1025,19 @@ static void process_csb(struct intel_engine_cs *engine)
>>         } while (head != tail);
>>  
>>         execlists->csb_head = head;
>> +
>> +       /*
>> +        * Gen11 has proven to fail wrt global observation point between
>> +        * entry and tail update, failing on the ordering and thus
>> +        * we see an old entry in the context status buffer.
>> +        *
>> +        * Forcibly evict the entries before the next gpu csb update,
>> +        * to increase the odds that we get fresh entries with non
>> +        * working hardware. The cost of doing so comes out mostly in
>> +        * the wash as hardware, working or not, will need to do the
>> +        * invalidation before.
>> +        */
>> +       invalidate_csb_entries(&buf[0], &buf[GEN8_CSB_ENTRIES - 1]);
>
> If it works, this is a stroke of genius.
>
> If we hypothesize that the GPU did write the CSB entries before the head
> pointer and inserted a Global Observation point beforehand, then we can
> theorize that they merely forgot the cc protocol: the writes to system
> memory are correctly ordered, but arrive unordered in the cpu cache.
>
> By using the clflush to evict our used cacheline, on the next pass we
> will pull that CSB entry cacheline back in from memory (ordered by
> the rmb used for the ringbuffer) and so, if the HW engineers'
> insistence that they did remember their wmb holds, the CSB entries will
> be coherent with the head pointer.
>
> So we remove one piece of the puzzle at what should be negligible cost.
> Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>

Thank you for the review and kind words; pushed.
-Mika
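
For readers following along, the eviction discussed above can be sketched
in user space roughly as follows. This is only an illustration under
stated assumptions, not the driver code: `struct csb_sketch`,
`process_csb_sketch()`, `read_barrier()` and `CSB_ENTRIES` are
hypothetical names, each entry is reduced to a single dword, and the body
of `invalidate_csb_entries()` is assumed to be a clflush of the first and
last entry (the helper is not shown in the quoted hunk). On builds without
SSE2 the flush compiles to a no-op.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>	/* _mm_clflush(), _mm_lfence() */
#endif

#define CSB_ENTRIES 6	/* hypothetical size, for illustration only */

struct csb_sketch {
	uint32_t buf[CSB_ENTRIES];	/* entries written by the "hardware" */
	volatile uint32_t tail;		/* last entry written by the "hardware" */
	uint32_t head;			/* last entry consumed by the CPU */
};

/* Stand-in for the rmb between reading the tail and reading entries. */
static inline void read_barrier(void)
{
#if defined(__SSE2__)
	_mm_lfence();
#else
	__atomic_thread_fence(__ATOMIC_ACQUIRE);
#endif
}

/*
 * Assumed shape of the helper: evict the cachelines holding the first
 * and last entry so the next pass has to refetch them from memory.
 */
static inline void invalidate_csb_entries(const uint32_t *first,
					  const uint32_t *last)
{
#if defined(__SSE2__)
	_mm_clflush((const void *)first);
	_mm_clflush((const void *)last);
#else
	(void)first;
	(void)last;
#endif
}

/* Consume entries in (head, tail], then evict them; returns the count. */
static unsigned int process_csb_sketch(struct csb_sketch *csb, uint32_t *out)
{
	uint32_t head = csb->head;
	const uint32_t tail = csb->tail;
	unsigned int n = 0;

	assert(head < CSB_ENTRIES && tail < CSB_ENTRIES);
	read_barrier();	/* order the entry reads after the tail read */

	while (head != tail) {
		if (++head == CSB_ENTRIES)
			head = 0;
		out[n++] = csb->buf[head];
	}
	csb->head = head;

	/* Drop our (possibly stale) cached copy before the next update. */
	invalidate_csb_entries(&csb->buf[0], &csb->buf[CSB_ENTRIES - 1]);

	return n;
}
```

The flush is placed after the entries have been consumed, so it only
affects what the next pass observes; as in the review above, whether that
is enough depends on the hardware having ordered its writes to memory.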