[Mesa-dev] [PATCH 34/59] intel/compiler: fix ddy for half-float in gen8
Pohjolainen, Topi
topi.pohjolainen at gmail.com
Fri Dec 7 13:06:17 UTC 2018
On Tue, Dec 04, 2018 at 08:16:58AM +0100, Iago Toral Quiroga wrote:
> We use ALign16 mode for this, since it is more convenient, but the PRM
> for Broadwell states in Volume 3D Media GPGPU, Chapter 'Register region
> restrictions', Section '1. Special Restrictions':
>
> "In Align16 mode, the channel selects and channel enables apply to a
> pair of half-floats, because these parameters are defined for DWord
> elements ONLY. This is applicable when both source and destination
> are half-floats."
>
> This means that we cannot select individual HF elements using swizzles
> like we do with 32-bit floats so we can't implement the required
> regioning for this.
>
> Use the gen11 path for this instead, which uses Align1 mode.
>
> The restriction is not present in gen9 of gen10, where the Align16
or?
> implementation seems to work just fine.
> ---
> src/intel/compiler/brw_fs_generator.cpp | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
> index d8e4bae17e0..ba7ed07e692 100644
> --- a/src/intel/compiler/brw_fs_generator.cpp
> +++ b/src/intel/compiler/brw_fs_generator.cpp
> @@ -1281,8 +1281,14 @@ fs_generator::generate_ddy(const fs_inst *inst,
> const uint32_t type_size = type_sz(src.type);
>
> if (inst->opcode == FS_OPCODE_DDY_FINE) {
> - /* produce accurate derivatives */
> - if (devinfo->gen >= 11) {
> + /* produce accurate derivatives. We can do this easily in Align16
> + * but this is not supported in gen11+ and gen8 Align16 swizzles
> + * for Half-Float operands work in units of 32-bit and always
> + * select pairs of consecutive half-float elements, so we can't use
> + * use it for this.
> + */
> + if (devinfo->gen >= 11 ||
> + (devinfo->gen == 8 && src.type == BRW_REGISTER_TYPE_HF)) {
> src = stride(src, 0, 2, 1);
> struct brw_reg src_0 = byte_offset(src, 0 * type_size);
> struct brw_reg src_2 = byte_offset(src, 2 * type_size);
> --
> 2.17.1
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list