[Mesa-dev] [PATCH] i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen<8
Alejandro Piñeiro
apinheiro at igalia.com
Mon Jan 29 07:24:36 UTC 2018
The patch looks good, just some nitpicks below.
On 29/01/18 01:26, Andres Gomez wrote:
> The emission of vertex attributes corresponding to dvec3 and dvec4
> vertex shader input variables was not correct when the <size> passed
> to the VertexAttribL* commands was <= 2.
>
> In 61a8a55f557 ("i965/gen8: Fix vertex attrib upload for dvec3/4
> shader inputs"), for gen8+ we needed to determine if the attrib was
> dual slot to emit 128 or 256-bit, independently of the VAO size.
>
> Similarly, for gen < 8 we also need to determine whether the attrib is
> dual slot to force the emission of 265-bits through 2 uploads.
>
> Additionally, we make use of the ISL_FORMAT_R32_FLOAT format in this
> second upload to fill these unspecified components with zeros, as we
> also do for gen8+.
>
> Fixes the following test on Haswell:
> KHR-GL46.vertex_attrib_binding.basic-inputL-case1
>
> Fixes: 75968a668e4 ("i965/gen7: expose OpenGL 4.2 on Haswell when
> supported")
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103006
> Cc: Alejandro Piñeiro <apinheiro at igalia.com>
> Cc: Juan A. Suarez Romero <jasuarez at igalia.com>
> Cc: Antia Puentes <apuentes at igalia.com>
> Cc: Rafael Antognolli <rafael.antognolli at intel.com>
> Cc: Kenneth Graunke <kenneth at whitecape.org>
> Signed-off-by: Andres Gomez <agomez at igalia.com>
> ---
> src/mesa/drivers/dri/i965/genX_state_upload.c | 19 ++++++++++++++-----
> 1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index aa4d64d08e2..d4c89c05b41 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -364,11 +364,15 @@ is_passthru_format(uint32_t format)
> }
>
> UNUSED static int
> -uploads_needed(uint32_t format)
> +uploads_needed(uint32_t format,
> + bool is_dual_slot)
> {
> if (!is_passthru_format(format))
> return 1;
>
> + if (is_dual_slot)
> + return 2;
> +
> switch (format) {
> case ISL_FORMAT_R64_PASSTHRU:
> case ISL_FORMAT_R64G64_PASSTHRU:
> @@ -399,9 +403,11 @@ downsize_format_if_needed(uint32_t format,
>
> switch (format) {
> case ISL_FORMAT_R64_PASSTHRU:
> - return ISL_FORMAT_R32G32_FLOAT;
> + return !upload ? ISL_FORMAT_R32G32_FLOAT
> + : ISL_FORMAT_R32_FLOAT;
This download is somewhat hard to understand if you don't look at the
commit message. I think that it would be good to add a comment. Although
it is not a big deal.
> case ISL_FORMAT_R64G64_PASSTHRU:
> - return ISL_FORMAT_R32G32B32A32_FLOAT;
> + return !upload ? ISL_FORMAT_R32G32B32A32_FLOAT
> + : ISL_FORMAT_R32_FLOAT;
> case ISL_FORMAT_R64G64B64_PASSTHRU:
> return !upload ? ISL_FORMAT_R32G32B32A32_FLOAT
> : ISL_FORMAT_R32G32_FLOAT;
> @@ -420,6 +426,8 @@ static int
> upload_format_size(uint32_t upload_format)
> {
> switch (upload_format) {
> + case ISL_FORMAT_R32_FLOAT:
> + return 0;
This is even harder to understand, even reading the commit message, as
this mentions what we do in the end but now how. I needed to go to the
full code to understand it. As far as I understand, when this method is
called later, if it founds this format, returns 0 so VFCOMP_STORE_0 flag
is used, so as mentioned in the commit message, it is filled with
zeroes. Right? I would really add a comment explaining it a little. And
if Im wrong with my conclusion, then more reasons to add that.
> case ISL_FORMAT_R32G32_FLOAT:
> return 2;
> case ISL_FORMAT_R32G32B32A32_FLOAT:
> @@ -517,7 +525,7 @@ genX(emit_vertices)(struct brw_context *brw)
> struct brw_vertex_element *input = brw->vb.enabled[i];
> uint32_t format = brw_get_vertex_surface_type(brw, input->glarray);
>
> - if (uploads_needed(format) > 1)
> + if (uploads_needed(format, input->is_dual_slot) > 1)
> nr_elements++;
> }
> #endif
> @@ -613,7 +621,8 @@ genX(emit_vertices)(struct brw_context *brw)
> uint32_t comp1 = VFCOMP_STORE_SRC;
> uint32_t comp2 = VFCOMP_STORE_SRC;
> uint32_t comp3 = VFCOMP_STORE_SRC;
> - const unsigned num_uploads = GEN_GEN < 8 ? uploads_needed(format) : 1;
> + const unsigned num_uploads = GEN_GEN < 8 ?
> + uploads_needed(format, input->is_dual_slot) : 1;
>
> #if GEN_GEN >= 8
> /* From the BDW PRM, Volume 2d, page 588 (VERTEX_ELEMENT_STATE):
With at least the second comment added:
Reviewed-by: Alejandro Piñeiro <apinheiro at igalia.com>
More information about the mesa-dev
mailing list