[Mesa-dev] [PATCH v2 12/78] i965/nir/vec4: Implement load_const intrinsic

Mon Jul 27 09:22:58 PDT 2015

Hi! Jason,

On jue, 2015-07-23 at 15:46 -0700, Jason Ekstrand wrote:

> > @@ -332,7 +334,22 @@ vec4_visitor::nir_emit_instr(nir_instr *instr)
> >  void
> >  vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr)
> >  {
> > -   /* @TODO: Not yet implemented */
> > +   dst_reg reg = dst_reg(GRF, alloc.allocate(1));
> > +   reg.type =  BRW_REGISTER_TYPE_F;
> > +
> > +   /* @FIXME: consider emitting vector operations to save some MOVs in
> > +    * cases where the components are representable in 8 bits.
> > +    * By now, we emit a MOV for each component.
> > +    */
> > +   for (unsigned i = 0; i < instr->def.num_components; ++i) {
> > +      reg.writemask = 1 << i;
> > +      emit(MOV(reg, src_reg(instr->value.f[i])));
> 
> In the FS, we use integers here.  I don't know that using floats is a
> problem, but we should probably be consistent.

We are using floats as the default type when loading constants, because
otherwise we have a number of regressions in Piglit that happen because
clamping is not done when it is required. This lack of clamping is
related to the vec4_visitor:opt_register_coalesce optimization relaying
on the constants being loaded with their real type, this is in effect
how the IR-vec4 pass works.

Let me explain it with an example. Imagine that we use integers as the
default type in load_const as you suggest, and we have in our shader's
code the following instruction:

gl_FrontColor = vec4(-2, -1, 0.5, 3);

This is translated into the following vec4 instructions:

/* Load constant */
mov vgrf10.0.x:D, -1073741824D
mov vgrf10.0.y:D, -1082130432D
mov vgrf10.0.z:D, 1056964608D
mov vgrf10.0.w:D, 1077936128D

/* Store the value */
mov.sat m4:F, vgrf10.xyzw:F

The error comes when that bunch of instructions is translated by the
opt_register_coalesce as:
mov.sat m4.x:D, -1073741824D
mov.sat m4.y:D, -1082130432D
mov.sat m4.z:D, 1056964608D
mov.sat m4.w:D, 1077936128D

As you can see, although the instruction saturation is set, nothing will
be done because the operands are integers. I see a number of ways to fix
this:

1) Provide the constant's type information in nir_load_const_instr and
set the real type when loading the constant.

2) Set the type always to float and rely on receiving the correct value
for the instruction->saturation, so the clamping is only done when
needed.

3) Modify the opt_register_coalesce optimization to preserve the types
of the instruction containing the final destination, if the instruction
saturation is true.