[Intel-gfx] [RFC] drm/i915/tgl: Advanced preparser support for GPU relocs

Fri Aug 23 07:27:25 UTC 2019

Quoting Daniele Ceraolo Spurio (2019-08-23 03:09:09)
> TGL has an improved CS pre-parser that can now pre-fetch commands across
> batch boundaries. This improves performance when lots of small batches
> are used, but has an impact on self-modifying code. If we want to modify
> the content of a batch from another ring/batch, we need to either
> guarantee that the memory location is updated before the pre-parser gets
> to it or we need to turn the pre-parser off around the modification.
> In i915, we use self-modifying code only for GPU relocations.
> 
> The pre-parser fetches across memory synchronization commands as well,
> so the only way to guarantee that the writes land before the parser gets
> to it is to have more instructions between the sync and the destination
> than the parser FIFO depth, which is not an optimal solution.
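
For illustration only, the padding approach described in the quoted text
could look roughly like the sketch below. Everything here is an assumption:
sketch_emit_preparser_pad() and SKETCH_MI_NOOP are hypothetical names, not
i915 functions, and the FIFO depth is passed in as a parameter because the
thread does not state it. The only detail taken from the driver is that
MI_NOOP encodes as a zero dword.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MI_NOOP encodes as a zero dword; the name is local to this sketch. */
#define SKETCH_MI_NOOP 0u

/*
 * Hypothetical helper: pad the command stream after a sync point with
 * fifo_depth_dw no-op dwords, so that the next real command sits beyond
 * the (assumed) pre-parser FIFO depth.
 *
 * Returns the new write position, or NULL if the space between cs and
 * end cannot hold the padding.
 */
static uint32_t *sketch_emit_preparser_pad(uint32_t *cs, const uint32_t *end,
                                           size_t fifo_depth_dw)
{
	size_t i;

	assert(cs != NULL && end != NULL);

	/* Refuse to write past the end of the buffer. */
	if (end < cs || (size_t)(end - cs) < fifo_depth_dw)
		return NULL;

	for (i = 0; i < fifo_depth_dw; i++)
		*cs++ = SKETCH_MI_NOOP;

	return cs;
}
```

The cost grows with the FIFO depth on every sync point, which is why the
quoted text calls this approach not optimal.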

Well, our ABI is that memory is coherent before the breadcrumb of *each*
batch. That is a fundamental requirement for our signaling to userspace.
Please tell me that there is a context flag to turn this off, or else we
need to emit 32x flushes or whatever it takes.
-Chris