[Mesa-dev] [PATCH v3] r600g: Implement GL_ARB_draw_indirect for EG/CM

Marek Olšák maraeo at gmail.com
Sat Feb 7 01:27:23 PST 2015


I'm not sure if I fully understand, but it seems the problem is that
the start/index_bias is copied to SQ_VTX_BASE_VTX and not
VGT_INDX_OFFSET. I guess you can avoid using BASE_VTX by setting
fetch_type=NO_INDEX_OFFSET, right? If yes, then there is a simple
solution: The COPY_DW packet. It can be used to copy one dword between
memory and registers (it supports copying mem->mem, reg->reg,
reg->mem, mem->reg). It can be used to copy start/index_bias into
VGT_INDX_OFFSET. All you have to do is to disable SQ_VTX_BASE_VTX use
in the shader.
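
To make the idea concrete, here is a toy software model of a "copy one dword" operation between memory and registers. It is only a sketch of the semantics described above: the names (toy_gpu, toy_copy_dw, TOY_VGT_INDX_OFFSET) and the register slot are made up for illustration and are not the real packet encoding or register offsets.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: names, sizes and the register slot are illustrative
 * assumptions, not the real PM4 packet layout or register offsets. */
enum copy_loc { LOC_REG, LOC_MEM };

struct toy_gpu {
    uint32_t regs[16]; /* stand-in for the register file */
    uint32_t mem[16];  /* stand-in for GPU-visible memory, in dwords */
};

#define TOY_VGT_INDX_OFFSET 3 /* hypothetical slot, not the real offset */

/* Copy exactly one dword; all four src/dst combinations are allowed
 * (mem->mem, reg->reg, reg->mem, mem->reg). */
static void toy_copy_dw(struct toy_gpu *gpu,
                        enum copy_loc src_loc, unsigned src,
                        enum copy_loc dst_loc, unsigned dst)
{
    uint32_t value = (src_loc == LOC_REG) ? gpu->regs[src] : gpu->mem[src];

    if (dst_loc == LOC_REG)
        gpu->regs[dst] = value;
    else
        gpu->mem[dst] = value;
}

/* Models the proposed use: move the start/index_bias dword of an indirect
 * draw from memory into the toy VGT_INDX_OFFSET register. */
static uint32_t toy_load_index_bias(struct toy_gpu *gpu, unsigned bias_dword)
{
    toy_copy_dw(gpu, LOC_MEM, bias_dword, LOC_REG, TOY_VGT_INDX_OFFSET);
    return gpu->regs[TOY_VGT_INDX_OFFSET];
}
```

In the real driver the copy would be done by the GPU while it processes the command stream, so the CPU never has to read the indirect buffer.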

What do you think?

Marek

On Sat, Feb 7, 2015 at 1:42 AM, Glenn Kennard <glenn.kennard at gmail.com> wrote:
> On Fri, 06 Feb 2015 17:08:46 +0100, Marek Olšák <maraeo at gmail.com> wrote:
>
>> Please bump the size of vgt_state for the SQ_VTX_BASE_VTX_LOC
>> register. It's set by r600_init_atom in r600_state.c and
>> evergreen_state.c
>>
>> Please bump R600_MAX_DRAW_CS_DWORDS. It's an upper bound of how many
>> dwords draw_vbo can emit.
>>
>
> Thanks, will fix.
>
>> I don't understand what get_vfetch_type is good for. Could you please
>> explain it in the code? Also, I don't understand what constant buffer
>> fetches have to do with VertexID.
>>
>
> Will add some more blurb to get_vfetch_type; in particular, I can point at
> the appropriate parts of the GPU documentation.
>
> As for the interaction of buffer fetches and VertexID, I'll attempt to
> explain:
>
> R_03CFF0_SQ_VTX_BASE_VTX_LOC is basically not delivered to the vertex
> shader at all. Instead, the hardware pokes the 64 unique values (one per
> wavefront thread, "64 state" in the documentation) into a hidden state
> register in the fetch units, which the shader cannot read, at least not in
> any way that I've been able to find.
>
> Setting FETCH_MODE=SQ_VTX_FETCH_VERTEX_DATA (=0) on a VFETCH instruction
> then tells the fetch unit to add the BASE_VTX and start instance offsets
> before reading the value - see r600_asm.c:r600_create_vertex_fetch_shader()
> which open-codes 0 as the fetch mode for vertex fetches.
>
> This creates a problem for GLSL gl_VertexID, since the shader cannot apply
> the offset itself. Let's look at the shader for the
> tests/spec/arb_draw_indirect/vertexid.c piglit test case:
>
>                 "#version 140\n"
>                 "\n"
>                 "in vec4 piglit_vertex;\n"
>                 "out vec3 c;\n"
>                 "\n"
>                 "const vec3 colors[] = vec3[](\n"
>                 "       vec3(1, 0, 0),\n"
>                 "       vec3(1, 0, 0),\n"
>                 "       vec3(1, 0, 0),\n"
>                 "       vec3(1, 0, 0),\n"
>                 "\n"
> ...
>                 "       vec3(1, 0, 1),\n"
>                 "       vec3(1, 0, 1),\n"
>                 "       vec3(1, 0, 1),\n"
>                 "       vec3(1, 0, 1)\n"
>                 ");\n"
>                 "void main() {\n"
>                 "       c = colors[gl_VertexID];\n"
>                 "       gl_Position = piglit_vertex;\n"
>                 "}\n"
>
> Here, colors is a constant array, and the base offset needs to be applied
> to look up the correct color value - the GL 4.5 spec is quite clear that it
> should be applied to gl_VertexID. Since the hardware offers no way to add
> the base offset to gl_VertexID, I do the next best thing and enable the
> offset on the array fetch operation instead.
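
A small C model of the workaround described above, assuming the shader only ever sees the raw, un-offset vertex id and the fetch unit adds a hidden base offset when asked to. The array contents and all function names are hypothetical; this only illustrates why the two paths agree for a plain array lookup.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_COLORS 8

/* Hypothetical stand-in for the constant colors[] array in the shader. */
static const int colors[NUM_COLORS] = { 10, 11, 12, 13, 14, 15, 16, 17 };

/* What the spec asks for: gl_VertexID already includes the base offset. */
static int color_per_spec(uint32_t raw_id, uint32_t base)
{
    uint32_t gl_vertex_id = raw_id + base;
    return colors[gl_vertex_id];
}

/* Model of a fetch: the hidden base is added only if the fetch enables it. */
static int fetch_with_offset(const int *array, uint32_t index,
                             uint32_t hidden_base, int offset_enabled)
{
    return array[offset_enabled ? index + hidden_base : index];
}

/* Workaround: index with the raw id, let the fetch apply the offset. */
static int color_via_fetch_offset(uint32_t raw_id, uint32_t base)
{
    return fetch_with_offset(colors, raw_id, base, 1);
}
```

With the offset enabled, color_via_fetch_offset() and color_per_spec() return the same element for every raw id; with it disabled, the lookup lands on the wrong color whenever the base is non-zero.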
>
> The detection logic is quite hacky, since it really needs to check whether
> the array expression depends in any way on gl_VertexID, which requires
> looking at def-use chains, which aren't available in r600_asm.c. SB could
> probably compute the bit instead, but that sort of violates its "don't
> change program meaning" principle, not to mention the different behavior
> with SB disabled.
>
> All the actual shaders that I've found using gl_VertexID in conjunction with
> indirect draws only use one constant array. I figure partial support at
> least approximately matches what the binary driver supports, which doesn't
> produce the correct value for gl_VertexID for indirect draws in various
> cases either - in particular, if the shader tries to compare gl_VertexID
> against some other expression you get an incorrect value.
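
The comparison case can be modelled the same way. This sketch assumes, as described above, that the offset only ever reaches fetches and never ALU code; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Spec behaviour: the comparison sees the offset-adjusted id. */
static bool matches_per_spec(uint32_t raw_id, uint32_t base, uint32_t wanted)
{
    return raw_id + base == wanted;
}

/* Fetch-only workaround: the comparison is ALU code, so it only ever sees
 * the raw id and the base never takes part in it. */
static bool matches_with_workaround(uint32_t raw_id, uint32_t base,
                                    uint32_t wanted)
{
    (void)base; /* hidden in the fetch unit, invisible to the shader */
    return raw_id == wanted;
}
```

The two agree only when the base is zero, which is why a shader comparing gl_VertexID against another expression gets the wrong answer under this scheme.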
>
>
> The driver does something totally different for direct draws: it adds the
> base offset and start offset manually and feeds that to the hardware, with
> BASE_VTX always set to 0, which allows it to work for all cases. That is
> not an option for indirect draws if you want any sort of performance out of
> them.
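
A sketch of the difference between the two paths. The struct follows the DrawElementsIndirectCommand layout from the GL spec (field names are snake_cased here); toy_hw_draw and toy_direct_draw are hypothetical and only model "add the offsets on the CPU, leave BASE_VTX at 0".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One indexed indirect draw command as it sits in the GPU buffer. */
struct draw_elements_indirect_cmd {
    uint32_t count;
    uint32_t instance_count;
    uint32_t first_index;
    int32_t  base_vertex;
    uint32_t base_instance;
};

/* Hypothetical view of what the CPU programs for a draw. */
struct toy_hw_draw {
    int32_t offset;   /* combined offset computed on the CPU */
    int32_t base_vtx; /* stays 0 on the direct path */
};

/* Direct draw: the parameters are known on the CPU, so they can simply be
 * added up front and the hidden per-fetch base is never needed. */
static struct toy_hw_draw toy_direct_draw(int32_t start, int32_t index_bias)
{
    struct toy_hw_draw hw = { .offset = start + index_bias, .base_vtx = 0 };
    return hw;
}
```

For an indirect draw the same values live only in the buffer (base_vertex is the dword at byte offset 12 of each command), so doing this addition on the CPU would mean reading the buffer back first.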
>
>
> So to sum up, I don't see the hardware being fully capable of following the
> spec for gl_VertexID in conjunction with indirect drawing in all cases, at
> least not without some very slow fallbacks that read the draw parameters
> back to the CPU, which would be useless. One option would be to just drop
> the attempt at supporting gl_VertexID from this patch if it's deemed too
> hacky.

