[Mesa-dev] [PATCH] drirc: add option to disable ARB_draw_indirect

Eric Anholt eric at anholt.net
Mon Dec 18 18:17:38 UTC 2017


Nicolai Hähnle <nicolai.haehnle at amd.com> writes:

> On 15.12.2017 12:37, Rob Clark wrote:
>> On Fri, Dec 15, 2017 at 4:41 AM, Nicolai Hähnle <nicolai.haehnle at amd.com> wrote:
>>> On 15.12.2017 00:56, Rob Clark wrote:
>>>>
>>>> On Wed, Dec 6, 2017 at 3:31 PM, Ian Romanick <idr at freedesktop.org> wrote:
>>>>>
>>>>> On 12/05/2017 08:25 AM, Ilia Mirkin wrote:
>>>>>>
>>>>>> On Tue, Dec 5, 2017 at 8:18 AM, Emil Velikov <emil.l.velikov at gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Hi Rob,
>>>>>>>
>>>>>>> On 5 December 2017 at 12:54, Rob Clark <robdclark at gmail.com> wrote:
>>>>>>>>
>>>>>>>> This is a bit sad/annoying.  But with current GPU firmware (at least
>>>>>>>> on
>>>>>>>> a5xx) we can support both draw-indirect and base-instance.  But we
>>>>>>>> can't
>>>>>>>> support draw-indirect with a non-zero base-instance specified.  So add
>>>>>>>> a
>>>>>>>> driconf option to hide the extension from games that are known to use
>>>>>>>> both.
>>>>>>>>
>>>>>>>> Signed-off-by: Rob Clark <robdclark at gmail.com>
>>>>>>>> ---
>>>>>>>> Tbh, I'm also not really sure what to do when/if we got updated
>>>>>>>> firmware
>>>>>>>> which handled draw-indirect with base-instance, since we'd need to
>>>>>>>> make
>>>>>>>> this option conditional on fw version.  For STK that probably isn't a
>>>>>>>> big deal since it doesn't use draw-indirect in a particularly useful
>>>>>>>> way
>>>>>>>> (the indirect buffer is generated on CPU).
>>>>>>>>
>>>>>>> Couldn't freedreno just return 0 for PIPE_CAP_DRAW_INDIRECT (aka
>>>>>>> disable the extension) as it detects buggy FW?
>>>>>>> This is what radeons have been doing as they encounter iffy firmware or
>>>>>>> LLVM.
>>>>>>>
>>>>>>> AFAICT freedreno doesn't do GL 4.0 or GLES 3.1 so one should be safe.
>>>>>>
>>>>>>
>>>>>> Rob is this -><- close to ES 3.1, so that's not a great option.
>>>>>
>>>>>
>>>>> And I don't suppose there's a way to get updated firmware?  i965 has
>>>>> similar sorts of cases where higher versions are disabled due to missing
>>>>> kernel features.
>>>>>
>>>>
>>>> so after r/e the instruction set for the CP microcontrollers and
>>>> writing a disassembler and assembler[1], and figuring out how the fw
>>>> handles CP_DRAW_INDIRECT and CP_DRAW_INDX_INDIRECT packets, I've come
>>>> to the conclusion that the issue isn't actually with draw-indirect vs
>>>> base-instance (at least not w/ the fw from my pixel2 which md5sum
>>>> claims is the same as what is in linux-firmware.. it is possible that
>>>> I was using an earlier version of the fw before when I came to this
>>>> conclusion).  On the plus side, the PFP/ME microcontrollers that parse
>>>> the cmdstream are pretty neat and I learned some useful stuff along
>>>> the way.
>>>>
>>>> But thinking a bit about how stk is using GL_MAP_PERSISTENT_BIT to map
>>>> and update the draw-indirect buffers, it seems to me there are plenty
>>>> of ways this can go wrong w/ tilers (and even more when you throw
>>>> re-ordering into the mix).  Possibly I should disable reordering when
>>>> the indirect buffer is mapped w/ PERSISTENT bit, although for games
>>>> like stk this is probably counter-productive vs just hiding the
>>>> draw-indirect extension.. for games that actually use the GPU to write
>>>> the draw-indirect buffer it shouldn't be a problem.  So I think a
>>>> driconf patch like this probably still ends up being useful in the
>>>> end.
>>>
>>>
>>> Can you detail a bit what you think could go wrong? I believe that the
>>> intention of the GL spec is that reordering in tilers should be possible at
>>> least for buffers that are mapped PERSISTENT but not COHERENT.
>>>
>>> You may only have to block reordering if the buffer is mapped both
>>> PERSISTENT *and* COHERENT -- and even then, reordering is probably possible.
>>>
>>> Granted, the spec is unclear as usual when it comes to these memory
>>> synchronization issues -- the description of the MAP_COHERENT_BIT in section
>>> 6.2 does not mention WAR hazards (in particular, Write by client after Read
>>> by server) -- but perhaps that can be fixed.
>>>
>>> To go into a bit more detail, what I suspect you're worried about is
>>> applications doing stuff like:
>>>
>>> 1. Write to indirect buffer (persistently & coherently mapped)
>>> 2. Draw*Indirect
>>> 3. Write to the same location in the indirect buffer
>>> 4. Draw*Indirect
>>>
>>> ... but this is bound to fail with "normal" GPUs (like ours) as well.
>>> Perhaps you have a different scenario in mind?
>> 
>> yeah, this was basically the scenario I had in mind.. although I'm
>> perhaps more aggressive in deferring rendering, to the point of
>> re-ordering draws if unnecessary fbo switches are made.  Normally I
>> track which buffers are read and written in a given batch (draw pass)
>> in order to preserve correctness (and in some cases shadowing or doing
>> a staging transfer to update buffers/textures to avoid splitting a
>> batch).  Perhaps it is only an issue w/ persistent+coherent, but w/
>> cpu updating buffer without driver knowing when is kind of
>> sub-optimal.
>> 
>> I'm thinking I do need to keep track when there are outstanding
>> coherent+persistent transfers mapped and switch off some of the
>> cleverness.
>
> I'm not convinced that this is actually necessary. You probably only 
> need to break for things like glFlushMappedBufferRange() and glFenceSync().
>
> The reasoning is basically this: unless one of those synchronizing 
> commands is used, every desktop GPU will effectively re-order a sequence of:
>
> 1. Write #1 to persistent-mapped buffer
> 2. Draw #1
> 3. Write #2 to persistent-mapped buffer
> 4. Draw #2
>
> to:
>
> 1. Write #1 to persistent-mapped buffer
> 2. Write #2 to persistent-mapped buffer
> 3. Draw #1
> 4. Draw #2
>
> Correct me if I'm wrong, but the way I understand tiled GPUs, you might 
> in addition do some multi-passing that ends up as:
>
> 1. Write #1 to persistent-mapped buffer
> 2. Write #2 to persistent-mapped buffer
> 3.0. Draw #1, tile 0
> 4.0. Draw #2, tile 0
> 3.1. Draw #1, tile 1
> 4.1. Draw #2, tile 1
> ... etc. ...
>
> So as far as interactions with persistent-mapped buffers are concerned, 
> tiling just doesn't make a difference.

Yeah, that's right.

I don't see how being a tiler would affect persistent/coherent mappings.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 832 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20171218/312dd726/attachment-0001.sig>


More information about the mesa-dev mailing list