[Mesa-dev] [PATCH] i965/vs/gen7: Emit code for GLSL ES 3.00 pack/unpack operations (v2)

Wed Jan 23 19:18:05 PST 2013

Chad Versace <chad.versace at linux.intel.com> writes:
> +void
> +vec4_visitor::emit_unpack_half_2x16(dst_reg dst, src_reg src0)
> +{
> +   if (intel->gen < 7)
> +      assert(!"ir_unop_unpack_half_2x16 should be lowered");
> +
> +   assert(dst.type == BRW_REGISTER_TYPE_F);
> +   assert(src0.type == BRW_REGISTER_TYPE_UD);
> +
> +   /* From the Ivybridge PRM, Vol4, Part3, Section 6.26 f32to16:
> +    *
> +    *   Because this instruction does not have a 16-bit floating-point type,
> +    *   the source data type must be Word (W). The destination type must be
> +    *   F (Float).
> +    *
> +    * To use W as the source data type, we must adjust horizontal strides,
> +    * which is only possible in align1 mode. All my [chadv] attempts at
> +    * emitting align1 instructions for unpackHalf2x16 failed to pass the
> +    * Piglit tests, so I gave up.
> +    *
> +    * I've verified that, on gen7, it is safe to emit f16to32 in align16 mode
> +    * with UD as source data type.
> +    */

Have you tested this on something like:

in uvec4 v;
vec2 result = unpackHalf2x16(v.w);

Restrictions of the form "the type must be X and the stride must be Y"
have sometimes meant that the hardware hardcodes those values and ignores
what you program, so I'm concerned that some of your regioning
(swizzle/abs/neg/uniformness) will just get thrown out by the hardware.

But if it's passing on your tests with uniforms, it's probably OK.

> +   dst_reg tmp_dst(this, glsl_type::uvec2_type);
> +   src_reg tmp_src(tmp_dst);
> +
> +   /* tmp.x = src0 & 0xffffu; */
> +   tmp_dst.writemask = WRITEMASK_X;
> +   emit(new(mem_ctx) vec4_instruction(this, BRW_OPCODE_AND,
> +                                      tmp_dst, src0, src_reg(0xffffu)));

These ought to use the helper functions for simplicity:
"emit(AND(tmp_dst, src0, src_reg(0xffffu)));". Check out the ALU1 macro
to see how to set up a similar helper for F16TO32 if you want to match
the style.
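
To illustrate the shape of what I mean, here is a rough, standalone sketch of
the macro-generated helper pattern. Everything in it (the "reg", "instruction"
and "visitor" types, the OPCODE_* enum) is a hypothetical stand-in I made up
so the snippet compiles on its own; it is not the actual vec4_visitor code,
and the real ALU1 macro in the tree may differ in detail.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the real register/instruction classes.
enum opcode { OPCODE_AND, OPCODE_F16TO32 };
struct reg { unsigned value; };
struct instruction {
   opcode op;
   reg dst;
   reg src0;
   reg src1;
};

struct visitor {
   std::vector<instruction> instructions;

   // Helper declarations; the bodies come from the macros below.
   instruction F16TO32(reg dst, reg src0);
   instruction AND(reg dst, reg src0, reg src1);

   void emit(const instruction &inst) { instructions.push_back(inst); }
};

// One-source helper: builds the instruction, leaves emitting to the caller.
#define ALU1(op)                                            \
   instruction visitor::op(reg dst, reg src0)               \
   {                                                        \
      return instruction{OPCODE_##op, dst, src0, reg{}};    \
   }

// Two-source variant of the same idea.
#define ALU2(op)                                            \
   instruction visitor::op(reg dst, reg src0, reg src1)     \
   {                                                        \
      return instruction{OPCODE_##op, dst, src0, src1};     \
   }

ALU1(F16TO32)
ALU2(AND)

// With the helpers in place, each call site is one short line.
static void emit_examples(visitor &v, reg dst, reg src0)
{
   v.emit(v.AND(dst, src0, reg{0xffffu}));
   v.emit(v.F16TO32(dst, src0));
   assert(v.instructions.size() >= 2);
}
```

The point is only that one ALU1(F16TO32) line next to the existing helpers
would let the last emit in this function read like the others.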

> +
> +   /* tmp.y = src0 >> 16u; */
> +   tmp_dst.writemask = WRITEMASK_Y;
> +   emit(new(mem_ctx) vec4_instruction(this, BRW_OPCODE_SHR,
> +                                      tmp_dst, src0, src_reg(16u)));
> +
> +   /* dst.xy = f16to32(tmp); */
> +   dst.writemask = WRITEMASK_XY;
> +   emit(new(mem_ctx) vec4_instruction(this, BRW_OPCODE_F16TO32,
> +                                      dst, tmp_src));
> +}
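
For checking test results against the three steps above (mask, shift,
convert), a CPU-side reference can be handy. This is a minimal sketch that
assumes IEEE 754 binary16 inputs; the names "half_to_float", "vec2" and
"unpack_half_2x16" are mine for illustration and are not functions in the
tree.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Convert one IEEE 754 binary16 bit pattern to a float.
static float half_to_float(uint16_t h)
{
   const uint32_t sign = (h >> 15) & 0x1u;
   const uint32_t exp  = (h >> 10) & 0x1fu;
   const uint32_t mant = h & 0x3ffu;
   float magnitude;

   if (exp == 0) {
      // Zero or denormal: mant * 2^-24.
      magnitude = std::ldexp((float) mant, -24);
   } else if (exp == 31) {
      // All-ones exponent: infinity or NaN.
      magnitude = mant ? NAN : INFINITY;
   } else {
      // Normal: (1024 + mant) * 2^(exp - 25).
      magnitude = std::ldexp((float) (mant + 1024), (int) exp - 25);
   }
   return sign ? -magnitude : magnitude;
}

struct vec2 { float x, y; };

// Mirrors the emitted sequence: tmp.x = src0 & 0xffff; tmp.y = src0 >> 16;
// dst.xy = f16to32(tmp).
static vec2 unpack_half_2x16(uint32_t src0)
{
   const uint32_t tmp_x = src0 & 0xffffu;
   const uint32_t tmp_y = src0 >> 16;
   return vec2{ half_to_float((uint16_t) tmp_x),
                half_to_float((uint16_t) tmp_y) };
}

// Small sanity check: low half 0x3c00 is 1.0, high half 0xc000 is -2.0.
static void check_reference()
{
   const vec2 r = unpack_half_2x16(0xc0003c00u);
   assert(r.x == 1.0f);
   assert(r.y == -2.0f);
}
```

Feeding it the fourth element of a four-component input gives the expected
value for the unpackHalf2x16(v.w) case I asked about earlier.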