[Intel-gfx] [RFC] drm/i915/tgl: Advanced preparser support for GPU relocs

Daniele Ceraolo Spurio daniele.ceraolospurio at intel.com
Fri Aug 23 17:01:03 UTC 2019



On 8/23/19 9:31 AM, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2019-08-23 16:56:54)
>>
>>
>> On 8/23/19 8:52 AM, Chris Wilson wrote:
>>> Quoting Daniele Ceraolo Spurio (2019-08-23 16:39:14)
>>>>
>>>>
>>>> On 8/23/19 8:28 AM, Chris Wilson wrote:
>>>>> Quoting Chris Wilson (2019-08-23 16:10:48)
>>>>>> Quoting Daniele Ceraolo Spurio (2019-08-23 16:05:45)
>>>>>>>
>>>>>>>
>>>>>>> On 8/23/19 7:26 AM, Chris Wilson wrote:
>>>>>>>> Quoting Chris Wilson (2019-08-23 08:27:25)
>>>>>>>>> Quoting Daniele Ceraolo Spurio (2019-08-23 03:09:09)
>>>>>>>>>> TGL has an improved CS pre-parser that can now pre-fetch commands across
>>>>>>>>>> batch boundaries. This improves performances when lots of small batches
>>>>>>>>>> are used, but has an impact on self-modifying code. If we want to modify
>>>>>>>>>> the content of a batch from another ring/batch, we need to either
>>>>>>>>>> guarantee that the memory location is updated before the pre-parser gets
>>>>>>>>>> to it or we need to turn the pre-parser off around the modification.
>>>>>>>>>> In i915, we use self-modifying code only for GPU relocations.
>>>>>>>>>>
>>>>>>>>>> The pre-parser fetches across memory synchronization commands as well,
>>>>>>>>>> so the only way to guarantee that the writes land before the parser gets
>>>>>>>>>> to it is to have more instructions between the sync and the destination
>>>>>>>>>> than the parser FIFO depth, which is not an optimal solution.
>>>>>>>>>
>>>>>>>>> Well, our ABI is that memory is coherent before the breadcrumb of *each*
>>>>>>>>> batch. That is a fundamental requirement for our signaling to userspace.
>>>>>>>>> Please tell me that there is a context flag to turn this off, or we else
>>>>>>>>> we need to emit 32x flushes or whatever it takes.
>>>>>>>>
>>>>>>> Are you referring to the specific case where we have a request modifying
>>>>>>> an object that is then used as a batch in the next request? Because
>>>>>>> coherency of objects that are not executed as batches is not impacted.
>>>>>>
>>>>>> "Fetches across memory sync" sounds like a major ABI break. The batches
>>>>>> are a hard serialisation barrier, with memory coherency guaranteed prior
>>>>>> to the signaling at the end of one batch and clear caches guaranteed at
>>>>>> the start of the next.
>>>>>
>>>>> We have relocs, oa and sseu all using self-modifying code. I expect we
>>>>> will have PTE modifications and much more done via the GPU in the near
>>>>> future. All rely on the CS_STALL doing exactly what it says on the tin.
>>>>> -Chris
>>>>>
>>>>
>>>> I guess the easiest solution is then to keep the parser off outside of
>>>> user batches. We can default to off and then restore what the user has
>>>> programmed before the BBSTART. It's not a breach of contract if we say
>>>> that if you opt-in to the parser then you need to make sure your batches
>>>> are not self-modifying, right?
>>>
>>> Is it just the MI_ARB_ONOFF bits, and is that still a privileged
>>> command? i.e. can userspace change mode by itself, or it is a
>>> context-param?
>>
>> It's the ARB_CHECK, not the ARB_ONOFF, so yes, it is not privileged and
>> userspace can modify it itself. It would've been easier if it was a
>> context param :)
> 
> Does it go across a context switch? That might be an easy solution for
> our internal requests (already true for oa/sseu where we use one context
> to modify another). I do worry though if there might be leakage
> across our flush-invalidate barriers between userspace batches.

The pre-fetching? no, AFAIK that's confined to the context, so moving 
the relocs to another context would work; the status of the parser is 
also ctx save/restored. What leakage case are you worried about? The 
memory synchronization between contexts is unchanged, a context can only 
mess up its own instructions. AFAIU the only thing that's possible is 
that if a batch modifies the next batch in the same context then the CS 
won't see the update.

Daniele

> -Chris
> 


More information about the Intel-gfx mailing list