[Mesa-dev] [PATCH 30/53] r600: create LDS info constants buffer and write LDS registers.
Marek Olšák
maraeo at gmail.com
Mon Nov 30 15:38:25 PST 2015
On Mon, Nov 30, 2015 at 1:30 PM, Marek Olšák <maraeo at gmail.com> wrote:
> On Mon, Nov 30, 2015 at 7:20 AM, Dave Airlie <airlied at gmail.com> wrote:
>> From: Dave Airlie <airlied at redhat.com>
>>
>> This creates a constant buffer with the information about
>> the layout of the LDS memory that is given to the vertex, tess
>> control and tess evaluation shaders.
>>
>> This also programs the LDS size and the LS_HS_CONFIG registers,
>> on evergreen only.
>>
>> Signed-off-by: Dave Airlie <airlied at redhat.com>
>> ---
>> src/gallium/drivers/r600/evergreen_state.c | 128 +++++++++++++++++++++++++++
>> src/gallium/drivers/r600/r600_pipe.h | 24 ++++-
>> src/gallium/drivers/r600/r600_state_common.c | 13 +++
>> 3 files changed, 162 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
>> index c01e8e3..edc6f28 100644
>> --- a/src/gallium/drivers/r600/evergreen_state.c
>> +++ b/src/gallium/drivers/r600/evergreen_state.c
>> @@ -3763,3 +3763,131 @@ void evergreen_init_state_functions(struct r600_context *rctx)
>>
>> evergreen_init_compute_state_functions(rctx);
>> }
>> +
>> +/**
>> + * This calculates the LDS size for tessellation shaders (VS, TCS, TES).
>> + *
>> + * The information about LDS and other non-compile-time parameters is then
>> + * written to the const buffer.
>> +
>> + * const buffer contains -
>> + * uint32_t input_patch_size
>> + * uint32_t input_vertex_size
>> + * uint32_t num_tcs_input_cp
>> + * uint32_t num_tcs_output_cp;
>> + * uint32_t output_patch_size
>> + * uint32_t output_vertex_size
>> + * uint32_t output_patch0_offset
>> + * uint32_t perpatch_output_offset
>> + * and the same constbuf is bound to LS/HS/VS(ES).
>> + */
>> +void evergreen_setup_tess_constants(struct r600_context *rctx, const struct pipe_draw_info *info, unsigned *num_patches, uint32_t *lds_alloc)
>> +{
>> + struct pipe_constant_buffer constbuf = {0};
>> + struct r600_pipe_shader_selector *tcs = rctx->tcs_shader ? rctx->tcs_shader : rctx->tes_shader;
>> + struct r600_pipe_shader_selector *ls = rctx->vs_shader;
>> + unsigned num_tcs_input_cp = info->vertices_per_patch;
>> + unsigned num_tcs_outputs;
>> + unsigned num_tcs_output_cp;
>> + unsigned num_tcs_patch_outputs;
>> + unsigned num_tcs_inputs;
>> + unsigned input_vertex_size, output_vertex_size;
>> + unsigned input_patch_size, pervertex_output_patch_size, output_patch_size;
>> + unsigned output_patch0_offset, perpatch_output_offset, lds_size;
>> + uint32_t values[16];
>> + uint32_t tmp;
>> +
>> + if (!rctx->tes_shader)
>> + return;
>> +
>> + *num_patches = 1;
>
> *num_patches should be set before the early return above.
>
>> +
>> + num_tcs_inputs = util_last_bit64(ls->lds_outputs_written_mask);
>> +
>> + if (rctx->tcs_shader) {
>> + num_tcs_outputs = util_last_bit64(tcs->lds_outputs_written_mask);
>> + num_tcs_output_cp = tcs->info.properties[TGSI_PROPERTY_TCS_VERTICES_OUT];
>> + num_tcs_patch_outputs = util_last_bit64(tcs->lds_patch_outputs_written_mask);
>> + } else {
>> + num_tcs_outputs = num_tcs_inputs;
>> + num_tcs_output_cp = num_tcs_input_cp;
>> + num_tcs_patch_outputs = 2; /* TESSINNER + TESSOUTER */
>> + }
>> +
>> + /* size in bytes */
>> + input_vertex_size = num_tcs_inputs * 16;
>> + output_vertex_size = num_tcs_outputs * 16;
>> +
>> + input_patch_size = num_tcs_input_cp * input_vertex_size;
>> +
>> + pervertex_output_patch_size = num_tcs_output_cp * output_vertex_size;
>> + output_patch_size = pervertex_output_patch_size + num_tcs_patch_outputs * 16;
>> +
>> + output_patch0_offset = rctx->tcs_shader ? input_patch_size * *num_patches : 0;
>> + perpatch_output_offset = output_patch0_offset + pervertex_output_patch_size;
>> +
>> + lds_size = output_patch0_offset + output_patch_size * *num_patches;
>> +
>> + values[0] = input_patch_size;
>> + values[1] = input_vertex_size;
>> + values[2] = num_tcs_input_cp;
>> + values[3] = num_tcs_output_cp;
>> +
>> + values[4] = output_patch_size;
>> + values[5] = output_vertex_size;
>> + values[6] = output_patch0_offset;
>> + values[7] = perpatch_output_offset;
>> +
>> + /* docs say HS_NUM_WAVES - CEIL((LS_HS_CONFIG.NUM_PATCHES *
>> + LS_HS_CONFIG.HS_NUM_OUTPUT_CP) / (NUM_GOOD_PIPES * 16)) */
>> + tmp = (lds_size | (1 << 14)); /* TODO */
>
> If I understand this correctly, num_good_pipes can be between 1 and 4.
> Assume the worst case, which is 1. This gives us:
> ceil(NUM_PATCHES * NUM_OUTPUT_CP / 16)
>
> That equals 2 if NUM_OUTPUT_CP > 16 and NUM_PATCHES = 1.

BTW, HS_NUM_WAVES is the number of waves that share the same LDS memory.
1 pipe = 16 threads per wave (GCN always has 4 pipes = 64 threads per
wave). That's where the "16" in the equation comes from. The equation only
ensures that all vertices within a patch are assigned the same LDS
memory. That's why you need at least 2 for 1-pipe chips when
NUM_OUTPUT_CP > 16.
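
To make the arithmetic concrete, here is a rough sketch of that equation
in C. The helper names (div_round_up, hs_num_waves) are made up for
illustration and are not part of the patch, and how the result gets packed
into the LDS register is deliberately left out:

```c
#include <assert.h>

/* Hypothetical helper (not from the patch): integer ceil(a / b). */
static unsigned div_round_up(unsigned a, unsigned b)
{
	return (a + b - 1) / b;
}

/*
 * Sketch of the equation quoted from the docs:
 *   CEIL((NUM_PATCHES * HS_NUM_OUTPUT_CP) / (NUM_GOOD_PIPES * 16))
 * Each pipe contributes 16 threads to a wave, hence the factor of 16.
 * num_good_pipes is assumed to be in the range 1..4.
 */
static unsigned hs_num_waves(unsigned num_patches, unsigned num_output_cp,
			     unsigned num_good_pipes)
{
	assert(num_good_pipes >= 1 && num_good_pipes <= 4);
	return div_round_up(num_patches * num_output_cp, num_good_pipes * 16);
}
```

With num_patches = 1 and the worst case of one pipe, 16 control points
fit in a single wave and anything above 16 needs two. With four pipes, one
patch of up to 32 control points still fits in one wave.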
Marek