[Mesa-dev] [PATCH] drirc: add option to disable ARB_draw_indirect

Nicolai Hähnle nicolai.haehnle at amd.com
Fri Dec 15 18:08:03 UTC 2017


On 15.12.2017 12:37, Rob Clark wrote:
> On Fri, Dec 15, 2017 at 4:41 AM, Nicolai Hähnle <nicolai.haehnle at amd.com> wrote:
>> On 15.12.2017 00:56, Rob Clark wrote:
>>>
>>> On Wed, Dec 6, 2017 at 3:31 PM, Ian Romanick <idr at freedesktop.org> wrote:
>>>>
>>>> On 12/05/2017 08:25 AM, Ilia Mirkin wrote:
>>>>>
>>>>> On Tue, Dec 5, 2017 at 8:18 AM, Emil Velikov <emil.l.velikov at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi Rob,
>>>>>>
>>>>>> On 5 December 2017 at 12:54, Rob Clark <robdclark at gmail.com> wrote:
>>>>>>>
>>>>>>> This is a bit sad/annoying.  But with current GPU firmware (at least
>>>>>>> on
>>>>>>> a5xx) we can support both draw-indirect and base-instance.  But we
>>>>>>> can't
>>>>>>> support draw-indirect with a non-zero base-instance specified.  So add
>>>>>>> a
>>>>>>> driconf option to hide the extension from games that are known to use
>>>>>>> both.
>>>>>>>
>>>>>>> Signed-off-by: Rob Clark <robdclark at gmail.com>
>>>>>>> ---
>>>>>>> Tbh, I'm also not really sure what to do when/if we got updated
>>>>>>> firmware
>>>>>>> which handled draw-indirect with base-instance, since we'd need to
>>>>>>> make
>>>>>>> this option conditional on fw version.  For STK that probably isn't a
>>>>>>> big deal since it doesn't use draw-indirect in a particularly useful
>>>>>>> way
>>>>>>> (the indirect buffer is generated on CPU).
>>>>>>>
>>>>>> Couldn't freedreno just return 0 for PIPE_CAP_DRAW_INDIRECT (aka
>>>>>> disable the extension) as it detects buggy FW?
>>>>>> This is what radeons have been doing as they encounter iffy firmware or
>>>>>> LLVM.
>>>>>>
>>>>>> AFAICT freedreno doesn't do GL 4.0 or GLES 3.1 so one should be safe.
>>>>>
>>>>>
>>>>> Rob is this -><- close to ES 3.1, so that's not a great option.
>>>>
>>>>
>>>> And I don't suppose there's a way to get updated firmware?  i965 has
>>>> similar sorts of cases where higher versions are disabled due to missing
>>>> kernel features.
>>>>
>>>
>>> so after r/e the instruction set for the CP microcontrollers and
>>> writing a disassembler and assembler[1], and figuring out how the fw
>>> handles CP_DRAW_INDIRECT and CP_DRAW_INDX_INDIRECT packets, I've come
>>> to the conclusion that the issue isn't actually with draw-indirect vs
>>> base-instance (at least not w/ the fw from my pixel2 which md5sum
>>> claims is the same as what is in linux-firmware.. it is possible that
>>> I was using an earlier version of the fw before when I came to this
>>> conclusion).  On the plus side, the PFP/ME microcontrollers that parse
>>> the cmdstream are pretty neat and I learned some useful stuff along
>>> the way.
>>>
>>> But thinking a bit about how stk is using GL_MAP_PERSISTENT_BIT to map
>>> and update the draw-indirect buffers, it seems to me there are plenty
>>> of ways this can go wrong w/ tilers (and even more when you throw
>>> re-ordering into the mix).  Possibly I should disable reordering when
>>> the indirect buffer is mapped w/ PERSISTENT bit, although for games
>>> like stk this is probably counter-productive vs just hiding the
>>> draw-indirect extension.. for games that actually use the GPU to write
>>> the draw-indirect buffer it shouldn't be a problem.  So I think a
>>> driconf patch like this probably still ends up being useful in the
>>> end.
>>
>>
>> Can you detail a bit what you think could go wrong? I believe that the
>> intention of the GL spec is that reordering in tilers should be possible at
>> least for buffers that are mapped PERSISTENT but not COHERENT.
>>
>> You may only have to block reordering if the buffer is mapped both
>> PERSISTENT *and* COHERENT -- and even then, reordering is probably possible.
>>
>> Granted, the spec is unclear as usual when it comes to these memory
>> synchronization issues -- the description of the MAP_COHERENT_BIT in section
>> 6.2 does not mention WAR hazards (in particular, Write by client after Read
>> by server) -- but perhaps that can be fixed.
>>
>> To go into a bit more detail, what I suspect you're worried about is
>> applications doing stuff like:
>>
>> 1. Write to indirect buffer (persistently & coherently mapped)
>> 2. Draw*Indirect
>> 3. Write to the same location in the indirect buffer
>> 4. Draw*Indirect
>>
>> ... but this is bound to fail with "normal" GPUs (like ours) as well.
>> Perhaps you have a different scenario in mind?
> 
> yeah, this was basically the scenario I had in mind.. although I'm
> perhaps more aggressive in deferring rendering, to the point of
> re-ordering draws if unnecessary fbo switches are made.  Normally I
> track which buffers are read and written in a given batch (draw pass)
> in order to preserve correctness (and in some cases shadowing or doing
> a staging transfer to update buffers/textures to avoid splitting a
> batch).  Perhaps it is only an issue w/ persistent+coherent, but w/
> cpu updating buffer without driver knowing when is kind of
> sub-optimal.
> 
> I'm thinking I do need to keep track when there are outstanding
> coherent+persistent transfers mapped and switch off some of the
> cleverness.

I'm not convinced that this is actually necessary. You probably only 
need to break for things like glFlushMappedBufferRange() and glFenceSync().

The reasoning is basically this: unless one of those synchronizing 
commands is used, every desktop GPU will effectively re-order a sequence of:

1. Write #1 to persistent-mapped buffer
2. Draw #1
3. Write #2 to persistent-mapped buffer
4. Draw #2

to:

1. Write #1 to persistent-mapped buffer
2. Write #2 to persistent-mapped buffer
3. Draw #1
4. Draw #2

Correct me if I'm wrong, but the way I understand tiled GPUs, you might 
in addition do some multi-passing that ends up as:

1. Write #1 to persistent-mapped buffer
2. Write #2 to persistent-mapped buffer
3.0. Draw #1, tile 0
4.0. Draw #2, tile 0
3.1. Draw #1, tile 1
4.1. Draw #2, tile 1
... etc. ...

So as far as interactions with persistent-mapped buffers are concerned, 
tiling just doesn't make a difference.


> All the same, the way stk uses draw-indirect is not useful and it
> would be better to shut it off, at least for me.  Although perhaps
> instead of adding things like this to driconf extension by extension,
> it would be more useful to have a way to provide a default
> MESA_EXTENSION_OVERRIDE via driconf?

Yeah, that makes sense.

Cheers,
Nicolai


> 
> BR,
> -R
> 



More information about the mesa-dev mailing list