[Mesa-dev] [PATCH] i965/vec4: use a temp register to compute offsets for pull loads

Tue Feb 20 14:38:27 UTC 2018

Iago, this looks like a good candidate to nominate for inclusion in the
17.3 stable queue.

What do you think?

On Wed, 2017-11-29 at 11:49 +0100, Iago Toral Quiroga wrote:
> 64-bit pull loads are implemented by emitting 2 separate
> 32-bit pull load messages, where the second message loads from
> an offset at +16B.
> 
> Adding 16B must not modify the original offset register used as
> source for the pull load instruction, though, since the compiler
> might use that same register in other instructions (for example,
> other pull loads in the shader that take the same offset as
> reference). Copy the offset to a temporary register and do the
> addition there instead.
> 
> If the pull load is 32-bit, we emit only one message and need no
> offset calculation, so the copy to the temporary is redundant, but
> in that case the optimizer should be able to drop the MOV.
> 
> Fixes the following test on Haswell:
> KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components
> ---
>  src/intel/compiler/brw_vec4_nir.cpp | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/intel/compiler/brw_vec4_nir.cpp b/src/intel/compiler/brw_vec4_nir.cpp
> index 0a1caa9fad..84f5b37a9d 100644
> --- a/src/intel/compiler/brw_vec4_nir.cpp
> +++ b/src/intel/compiler/brw_vec4_nir.cpp
> @@ -888,7 +888,9 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
>        if (const_offset) {
>           offset_reg = brw_imm_ud(const_offset->u32[0] & ~15);
>        } else {
> -         offset_reg = get_nir_src(instr->src[1], nir_type_uint32, 1);
> +         offset_reg = src_reg(this, glsl_type::uint_type);
> +         emit(MOV(dst_reg(offset_reg),
> +                  get_nir_src(instr->src[1], nir_type_uint32, 1)));
>        }
>  
>        src_reg packed_consts;
-- 
Br,

Andres