[Mesa-dev] [PATCH 05/36] i965: old VS: use the VUE map to compute the URB entry size.
Kenneth Graunke
kenneth at whitecape.org
Sat Sep 3 11:38:05 PDT 2011
On 09/02/2011 09:06 AM, Paul Berry wrote:
> Previously, the old VS backend computed the URB entry size by adding
> the number of vertex shader outputs to the size of the URB header.
> This often produced a larger result than necessary, because some
> vertex shader outputs are stored in the header, so they were being
> double counted. This patch changes the old VS backend to compute the
> URB entry size directly from the number of slots in the VUE map.
> ---
> src/mesa/drivers/dri/i965/brw_vs.h | 1 -
> src/mesa/drivers/dri/i965/brw_vs_emit.c | 31 +++++++------------------------
> 2 files changed, 7 insertions(+), 25 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vs.h b/src/mesa/drivers/dri/i965/brw_vs.h
> index a02c06d..c31869c 100644
> --- a/src/mesa/drivers/dri/i965/brw_vs.h
> +++ b/src/mesa/drivers/dri/i965/brw_vs.h
> @@ -65,7 +65,6 @@ struct brw_vs_compile {
>
> struct brw_vue_map vue_map;
> GLuint first_output;
> - GLuint nr_outputs;
> GLuint last_scratch;
>
> GLuint first_tmp;
> diff --git a/src/mesa/drivers/dri/i965/brw_vs_emit.c b/src/mesa/drivers/dri/i965/brw_vs_emit.c
> index e7667c8..7c430ce 100644
> --- a/src/mesa/drivers/dri/i965/brw_vs_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_vs_emit.c
> @@ -402,36 +402,19 @@ static void brw_vs_alloc_regs( struct brw_vs_compile *c )
> /* The VS VUEs are shared by VF (outputting our inputs) and VS, so size
> * them to fit the biggest thing they need to.
> */
> - attributes_in_vue = MAX2(c->nr_outputs, c->nr_inputs);
> + attributes_in_vue = MAX2(c->vue_map.num_slots, c->nr_inputs);
>
> - /* See emit_vertex_write() for where the VUE's overhead on top of the
> - * attributes comes from.
> - */
> - if (intel->gen >= 7) {
> - int header_regs = 2;
> - if (c->key.nr_userclip)
> - header_regs += 2;
> -
> - /* Each attribute is 16 bytes (1 vec4), so dividing by 4 gives us the
> - * number of 64-byte (512-bit) units.
> - */
> - c->prog_data.urb_entry_size = (attributes_in_vue + header_regs + 3) / 4;
> - } else if (intel->gen == 6) {
> - int header_regs = 2;
> - if (c->key.nr_userclip)
> - header_regs += 2;
> -
> - /* Each attribute is 16 bytes (1 vec4), so dividing by 8 gives us the
> + if (intel->gen == 6) {
> + /* Each attribute is 32 bytes (2 vec4s), so dividing by 8 gives us the
> * number of 128-byte (1024-bit) units.
> */
> - c->prog_data.urb_entry_size = (attributes_in_vue + header_regs + 7) / 8;
> - } else if (intel->gen == 5)
> + c->prog_data.urb_entry_size = (attributes_in_vue + 7) / 8;
> + } else {
> /* Each attribute is 16 bytes (1 vec4), so dividing by 4 gives us the
> * number of 64-byte (512-bit) units.
> */
> - c->prog_data.urb_entry_size = (attributes_in_vue + 6 + 3) / 4;
> - else
> - c->prog_data.urb_entry_size = (attributes_in_vue + 2 + 3) / 4;
> + c->prog_data.urb_entry_size = (attributes_in_vue + 3) / 4;
> + }
>
> c->prog_data.total_grf = reg;
Totally non-obvious, but seems correct and is _much_ cleaner.
While we're at it, it might be nicer to do:
if (intel->gen == 6)
c->prog_data.urb_entry_size = ALIGN(attributes_in_vue, 8) / 8;
else
c->prog_data.urb_entry_size = ALIGN(attributes_in_vue, 4) / 4;
Otherwise, I get confused, thinking there are +7 or +3 registers
dedicated to something or other. Most likely because in the original
code, (attributes_in_vue + 6 + 3) / 4, the +6 _is_ adding some header
registers, while the +3 is purely a "round up!" factor.
We can always do that in a follow-up patch if you prefer (or just not,
because this is all going away soon enough.)
More information about the mesa-dev
mailing list