[Mesa-dev] Implementation of VK_KHR_draw_indirect_count extension for anv

Danylo Piliaiev danylo.piliaiev at gmail.com
Tue Sep 18 09:08:39 UTC 2018

On 9/17/18 7:03 PM, Jason Ekstrand wrote:
> On Mon, Sep 17, 2018 at 10:08 AM Danylo Piliaiev
> <danylo.piliaiev at gmail.com> wrote:
>     On 9/17/18 5:34 PM, Jason Ekstrand wrote:
>>     On Mon, Sep 17, 2018 at 8:34 AM Danylo Piliaiev
>>     <danylo.piliaiev at gmail.com> wrote:
>>         Hi Jason,
>>         I have implemented the extension and it works. However, before
>>         sending the patch I decided to see how it interacts with
>>         another extension, VK_EXT_conditional_render,
>>         and got confused:
>>         The spec does not disallow calling functions of
>>         VK_KHR_draw_indirect_count inside a conditional rendering block.
>>         So let's say the conditional rendering predicate evaluates to
>>         FALSE and we call vkCmdDrawIndirectCountKHR. It sees that a
>>         predicate has already been emitted and must take it into
>>         account: since that predicate is FALSE, all subsequent
>>         predicates should also result in FALSE. The issue is that
>>         I don't see an easy way to do this.
>>         My current implementation uses the following predicate (the
>>         same as in the GL implementation):
>>                /* While draw_index < maxDrawCount the predicate's result will be
>>                 *  (draw_index == maxDrawCount) ^ TRUE = TRUE
>>                 * When draw_index == maxDrawCount the result is
>>                 *  (TRUE) ^ TRUE = FALSE
>>                 * After this all results will be:
>>                 *  (FALSE) ^ FALSE = FALSE
>>                 */
>>                anv_batch_emit(&cmd_buffer->batch, GENX(MI_PREDICATE), mip) {
>>                   mip.LoadOperation    = LOAD_LOAD;
>>                   mip.CombineOperation = COMBINE_XOR;
>>                   mip.CompareOperation = COMPARE_SRCS_EQUAL;
>>                }
>>         But if the initial predicate state is FALSE, then when
>>         draw_index equals maxDrawCount the result will be
>>         (FALSE) ^ TRUE = TRUE
>>         which isn't what we want. Without a "not equal" operation
>>         or MI_MATH I don't see how to fix this.
>>     First off, thanks for looking into the combination of these two
>>     features.  Getting them to work together nicely is half of the
>>     difficulty of these two extensions.
>>     On platforms which support MI_MATH, I think we're probably better
>>     off just using it.  For Ivy Bridge, the only thing I could think
>>     to do when both are in use would be to do two MI_PREDICATEs for
>>     every draw call.  The first would be what you describe above and
>>     the second would be the MI_PREDICATE for the conditional render
>>     with COMBINE_AND.  When the condition is true, the AND would have
>>     no effect and you would get the behavior above.  If the condition
>>     is false, the above logic for implementing draw_indirect_count
>>     wouldn't matter because it would get ANDed with false.  On
>>     Haswell and later, it's likely more efficient to just use MI_MATH
>>     and avoid re-loading the draw count and condition on every draw
>>     call. (We could just leave the draw count in CS_GPR0, for
>>     instance.)  Does that work?
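
The two-predicates-per-draw idea can be checked with a small CPU model.
This is a sketch under assumptions: the first update is modelled as
state = state XOR (draw_index == draw_count), the second as
state = state AND condition, and the predicate is assumed to start out
TRUE. The function name toy_draws_with_condition is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns a bitmask of the draws that would execute (bit i = draw i)
 * when every draw is preceded by two predicate updates:
 *   1. XOR with (draw_index == draw_count), the draw-count logic
 *   2. AND with the conditional rendering condition
 * Toy model only; no real hardware state is touched. */
static uint32_t toy_draws_with_condition(bool condition, uint32_t draw_count,
                                         uint32_t max_draw_count)
{
    assert(max_draw_count <= 32); /* the mask must fit in 32 bits */

    bool state = true; /* assumed starting predicate state */
    uint32_t executed = 0;

    for (uint32_t draw_index = 0; draw_index < max_draw_count; draw_index++) {
        state = state != (draw_index == draw_count); /* first update: XOR */
        state = state && condition;                  /* second update: AND */
        if (state)
            executed |= UINT32_C(1) << draw_index;
    }
    return executed;
}
```

In this model a true condition leaves the draw-count behaviour unchanged,
and a false condition forces the predicate back to FALSE after every
draw, so nothing executes.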
>     Looks like a plan. I'll try to go this path.
>     Also, there is another interaction which wasn't thought of before:
>     several vkCmdDrawIndirectCountKHR calls in one conditional render
>     block. But using MI_MATH should solve it.
> In that case, we'll have to basically re-do the conditional bit for 
> every draw call.  There may be some interesting interactions here with 
> secondary command buffers as well.  I don't remember what we decided 
> about inheriting conditions in secondaries.  Again, if we decide we 
> need MI_MATH, then we'll just drop support for one or both extensions 
> on Ivy Bridge.
About the secondary command buffers:

If inherited conditional rendering is supported, it means that secondary
buffers can be executed inside a conditional rendering block, and commands
that can be affected by conditional rendering are affected by it in the
secondary buffer as well as in the primary. Is that right?
However, at that point the secondary buffer has already been recorded
without any commands for the condition. Since our implementation depends
on commands emitted into the buffer, making the secondary's commands depend
on the condition is either highly tricky (would the secondary buffer need
certain points where conditions can be injected?) or just impossible.
Also, the secondary buffer may have been recorded inside a conditional
render block of its own, so its commands could be affected by two
conditions, if I understand correctly.

Is it doable to implement?

>>     Since you're already looking at it, it may be best to implement
>>     the two extensions together as one patch series so we can be sure
>>     we have the interactions right.  If we can't get them to play
>>     nicely together, we may have to disable one of them on Ivy Bridge
>>     and I'd rather not enable an extension and then take the
>>     functionality away later.
>     I agree, the extensions are too interwoven, which I realized when
>     I implemented the most basic version of EXT_conditional_render.
>     I'll also make sure to test all of these on Ivy Bridge.
>>         I don't see anything related in the Vulkan or GL specs, nor do
>>         I see anything in the Piglit and CTS tests.
>>         Maybe I'm missing something obvious, could you help me here?
>>     There's nothing preventing the two from being used together.  If
>>     we don't have piglit tests that exercise the GL versions
>>     together, that would be bad.  Have you found good Vulkan CTS
>>     tests for either of those two extensions?
>>     VK_KHR_draw_indirect_count should have tests since it's a KHR
>>     extension but we may need to write the tests for
>>     EXT_conditional_render.
>     There are no tests of how these features work together in Piglit
>     or Vulkan CTS. Also, my previous observations hold for GL, so it
>     also ought to be fixed (I'll write a test for Piglit first to
>     confirm this).
>     There are tests for VK_KHR_draw_indirect_count in Vulkan CTS, which
>     my current implementation passes. There aren't any tests for
>     EXT_conditional_render; however, I used an example from
>     https://github.com/SaschaWillems/Vulkan to test my initial
>     implementation of it.
>     Should tests for EXT_conditional_render go into Vulkan CTS?
> Yes, tests should go into the Vulkan CTS whenever possible.
>     Also, since the scope of the work grew quite a lot and I'll soon be
>     on vacation, the implementation won't be ready until at least the
>     second week of October (just making sure no one will think I ran
>     away scared =) )
> That's ok, I understand.
> --Jason
>>     --Jason
>>         You can find current implementation in
>>         https://gitlab.freedesktop.org/GL/mesa/commit/9d1c7ae0db618c6f7281d5f667c96612ff0bb2c2
>>         - Danil
>>         On 9/12/18 6:30 PM, Danylo Piliaiev wrote:
>>>         Hi,
>>>         Thank you for the directions!
>>>         On 9/12/18 6:13 PM, Jason Ekstrand wrote:
>>>>         Danylo,
>>>>         You're free to implement anything not already implemented. 
>>>>         Here are some other (probably simpler) extensions that I
>>>>         think can be reasonably implemented on Intel HW:
>>>>          - VK_EXT_conservative_rasterization
>>>>          - VK_EXT_conditional_render
>>>         Didn't see them, will take a closer look later.
>>>>         As far as VK_KHR_draw_indirect_count goes, I haven't
>>>>         implemented it yet because the "proper" implementation is
>>>>         actually kind-of painful though not impossible. In general,
>>>>         there are two ways it can be done:
>>>>         ## 1. The cheap and easy way
>>>>         The spec explicitly allows for the cheap and easy way by
>>>>         requiring the caller to pass in a maxDrawCount.  The idea
>>>>         here would be to emit maxDrawCount draw calls but have
>>>>         each one of them predicated on draw_id <
>>>>         draw_count_from_buffer.  This one probably wouldn't take
>>>>         much to wire up but it does mean doing maxDrawCount
>>>>         3DPRIMITIVE commands no matter how many of them are
>>>>         actually needed.
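
The cost of the cheap and easy way can be illustrated with a small CPU
model. This is a sketch only: toy_draw_stats and toy_predicated_draws are
hypothetical names, and the loop merely counts commands instead of
emitting 3DPRIMITIVE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for the toy model. */
struct toy_draw_stats {
    uint32_t emitted;  /* draw commands placed in the batch */
    uint32_t executed; /* draws whose predicate was true */
};

/* Models emitting max_draw_count draws, each one predicated on
 * draw_id < draw_count_from_buffer. */
static struct toy_draw_stats
toy_predicated_draws(uint32_t draw_count_from_buffer, uint32_t max_draw_count)
{
    struct toy_draw_stats stats = { 0, 0 };

    for (uint32_t draw_id = 0; draw_id < max_draw_count; draw_id++) {
        bool predicate = draw_id < draw_count_from_buffer;

        stats.emitted++;      /* the command is always in the batch */
        if (predicate)
            stats.executed++; /* but it only runs while the predicate holds */
    }
    return stats;
}
```

In this model the number of emitted commands always equals
max_draw_count, while the number of executed draws is the smaller of the
two counts.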
>>>         I saw such an implementation for i965. It looked
>>>         straightforward and I thought it would translate easily into
>>>         a Vulkan implementation. I didn't know that it's possible to
>>>         do it another way on Intel.
>>>>         ## 2. The hard but maybe more correct way
>>>>         The Intel command streamer does have the ability, if used
>>>>         carefully, to loop.  The difficulty here isn't in looping;
>>>>         that can be done fairly easily on gen8+ by emitting a
>>>>         predicated MI_BATCH_BUFFER_START that's predicated off of
>>>>         the looping condition which jumps to the top of the loop. 
>>>>         The really difficult bit is taking your loop counter and
>>>>         using it to indirectly access the array of draw
>>>>         information. In order to do this, you have to have a
>>>>         self-modifying batch buffer.  In short, you would emit MI
>>>>         commands which read the draw information into registers and
>>>>         also emit MI commands (which would probably come before the
>>>>         first set) which write the actual address into the location
>>>>         in the batch where the first set of MI commands has their
>>>>         address to read from.  This would be a painful-to-debug
>>>>         mess of GPU hangs but could actually be kind-of fun to
>>>>         implement.
>>>         The correct way looks interesting, I'll need some time to
>>>         understand the details.
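
The shape of the looping approach can be sketched as a CPU toy model.
Everything here is an assumption made for illustration: the "batch" is
reduced to a single operand that a store rewrites before a later load
uses it, toy_vertex_counts stands in for the array of draw information,
and no real MI commands are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the array of draw information in memory. */
static const uint32_t toy_vertex_counts[4] = { 3, 6, 9, 12 };

/* Runs draw_count iterations of a loop in which the operand of the
 * "load" step is rewritten by an earlier "store" step on each pass.
 * Returns the sum of the vertex counts that were loaded. */
static uint32_t toy_looped_draws(uint32_t draw_count)
{
    assert(draw_count <= 4); /* stay inside toy_vertex_counts */

    uint32_t load_operand = 0; /* lives "in the batch", patched per pass */
    uint32_t loop_counter = 0;
    uint32_t total = 0;

    /* The loop condition models the predicated jump back to the top. */
    while (loop_counter < draw_count) {
        /* "Store" step: write the address for this pass into the batch. */
        load_operand = loop_counter;
        /* "Load" step: read the draw information through the patched operand. */
        total += toy_vertex_counts[load_operand];
        loop_counter++;
    }
    return total;
}
```

Unlike the predicated approach, the work done in this model scales with
the actual draw count rather than with maxDrawCount.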
>>>>         I hope I haven't scared you away from working on anv; I
>>>>         just wanted to make it clear what you're getting yourself
>>>>         into.  Both ways are totally implementable and I think
>>>>         you'd pretty much have to do the first method on gen7 if we
>>>>         really care about supporting it there.  The second is
>>>>         totally doable, it'll just involve some headaches when it's
>>>>         broken.  If you want to continue with this project after
>>>>         reading my scary e-mail, I recommend starting with method
>>>>         1 to get your feet wet and then we can look into method 2
>>>>         once you have that working.
>>>         I'll follow your recommendation and start with the
>>>         first method.
>>>         - Danil
>>>>         --Jason
>>>>         On Wed, Sep 12, 2018 at 6:36 AM Danylo Piliaiev
>>>>         <danylo.piliaiev at gmail.com> wrote:
>>>>             Hello everyone,
>>>>             I would like to try to implement one of the Vulkan
>>>>             extensions -
>>>>             VK_KHR_draw_indirect_count for anv,
>>>>             unless someone is already working on it.
>>>>             It's a relatively minor extension and I saw that the
>>>>             same functionality
>>>>             is already implemented
>>>>             for ARB_indirect_parameters in i965.
>>>>             Also I would appreciate any tips if there are any known
>>>>             possible tricky
>>>>             parts.
>>>>             - Danil
>>>>             _______________________________________________
>>>>             mesa-dev mailing list
>>>>             mesa-dev at lists.freedesktop.org
>>>>             <mailto:mesa-dev at lists.freedesktop.org>
>>>>             https://lists.freedesktop.org/mailman/listinfo/mesa-dev
