[Mesa-dev] [PATCH] i965/fs/gen7: Emit code for GLSL 3.00 pack/unpack operations (v3)
Eric Anholt
eric at anholt.net
Wed Jan 23 18:45:55 PST 2013
Chad Versace <chad.versace at linux.intel.com> writes:
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> index 324e665..9b54796 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> @@ -923,6 +923,96 @@ fs_generator::generate_set_global_offset(fs_inst *inst,
> }
>
> void
> +fs_generator::generate_pack_half_2x16_split(fs_inst *inst,
> + struct brw_reg dst,
> + struct brw_reg x,
> + struct brw_reg y)
> +{
> + assert(intel->gen >= 7);
> + assert(dst.type == BRW_REGISTER_TYPE_UD);
> + assert(x.type == BRW_REGISTER_TYPE_F);
> + assert(y.type == BRW_REGISTER_TYPE_F);
> +
> + /* From the Ivybridge PRM, Vol4, Part3, Section 6.27 f32to16:
> + *
> + * Because this instruction does not have a 16-bit floating-point type,
> + * the destination data type must be Word (W).
> + *
> + * The destination must be DWord-aligned and specify a horizontal stride
> + * (HorzStride) of 2. The 16-bit result is stored in the lower word of
> + * each destination channel and the upper word is not modified.
> + */
> +
> + /* Give each 32-bit channel of dst the form below, where "." means
> + * unchanged.
> + * 0x....hhhh
> + *
> + * Per the PRM, change the destination data type to W. To compensate for
> + * halving the data type width, double the horizontal stride. (The
> + * BRW_*_STRIDE enums are defined so that incrementing the field doubles
> + * the real stride).
> + */
> + dst.type = BRW_REGISTER_TYPE_W;
> + if (dst.hstride != 0)
> + ++dst.hstride;
> + brw_F32TO16(p, dst, y);
> +
> + /* Now the form:
> + * 0xhhhh0000
> + */
> + dst.type = BRW_REGISTER_TYPE_UD;
> + if (dst.hstride != 0)
> + --dst.hstride;
Perhaps use a temporary named "dst_uw" that's uw-typed, instead of popping
in and out of being word-typed? And add a local static function to make
one from a uint fs_reg, so you don't have to explain the hstride and
vstride increment twice (the vstride increment is apparently dropped from
this function but still present in the next one, which is odd).

Functionally, the code looks good now. The missing vstride increment
happens to be safe because either vstride is 0, or width == execsize so
it's unused.
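
The helper suggested above might look roughly like the sketch below. It is
not code from the patch: `reg`, `reg_type` and `retype_ud_to_uw` are
hypothetical stand-ins for struct brw_reg, the BRW_REGISTER_TYPE_* enums and
the proposed local function. It assumes the stride encoding described in the
patch comment, where incrementing the field doubles the real stride.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real struct brw_reg and BRW_* enums live in
// the i965 driver and carry many more fields and values.
enum reg_type { TYPE_UD, TYPE_UW };

struct reg {
   reg_type type;
   unsigned hstride;  // encoded: 0 is stride 0, n is stride 2^(n-1) (assumed)
   unsigned vstride;  // same encoding (assumed)
};

// Return a UW-typed copy of a UD register. Halving the element width is
// compensated by doubling both strides, so each word still lands in the low
// half of the original dword. A stride of 0 stays 0. The argument is taken
// by value, so the caller's UD register is left untouched.
static reg
retype_ud_to_uw(reg r)
{
   assert(r.type == TYPE_UD);
   r.type = TYPE_UW;
   if (r.hstride != 0)
      ++r.hstride;
   if (r.vstride != 0)
      ++r.vstride;
   return r;
}
```

With something like this, the generator could build dst_uw once and keep dst
as UD for the SHL, and the stride comment would only need to be written in
one place.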
> + brw_SHL(p, dst, dst, brw_imm_ud(16u));
> +
> + /* And, finally the form of packHalf2x16's output:
> + * 0xhhhhllll
> + */
> + dst.type = BRW_REGISTER_TYPE_W;
> + if (dst.hstride != 0)
> + ++dst.hstride;
> + brw_F32TO16(p, dst, x);
> +}
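
For reference, the three steps in the quoted function can be modeled on a
single 32-bit channel as below. This is only an illustration of the bit
layout, not driver code: `write_low_word` and `pack_half_2x16_split` are
hypothetical names, and hx/hy are assumed to be half-float bit patterns that
have already been converted (the sketch does not implement f32to16 itself).

```cpp
#include <cstdint>

// Model of a word-typed write with a horizontal stride of 2: only the low
// 16 bits of the channel change, the upper word is not modified.
static uint32_t
write_low_word(uint32_t channel, uint16_t half)
{
   return (channel & 0xffff0000u) | half;
}

// Mirror the instruction sequence: F32TO16 of y, SHL by 16, F32TO16 of x.
static uint32_t
pack_half_2x16_split(uint32_t dst, uint16_t hx, uint16_t hy)
{
   dst = write_low_word(dst, hy);  // 0x....hhhh
   dst <<= 16;                     // 0xhhhh0000
   dst = write_low_word(dst, hx);  // 0xhhhhllll
   return dst;
}
```

For example, with hx = 0x3c00 (1.0 as a half float) and hy = 0x4000 (2.0),
the channel ends up as 0x40003c00 whatever dst held before.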