[Mesa-dev] [PATCH v2 12/30] i965/fs: Fix fs_visitor::VARYING_PULL_CONSTANT_LOAD for doubles

Fri May 13 22:43:28 UTC 2016

Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:

> From: Iago Toral Quiroga <itoral at igalia.com>
>
> Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 4827dea..2383d2c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -194,8 +194,15 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder &bld,
>     else
>        op = FS_OPCODE_VARYING_PULL_CONSTANT_LOAD;
>  
> +   /* The pull load message will load a vec4 (16 bytes). If we are loading
> +    * a double this means we are only loading 2 elements worth of data.
> +    * We also want to use a 32-bit data type for the dst of the load operation
> +    * so other parts of the driver don't get confused about the size of the
> +    * result.
> +    */
>     int regs_written = 4 * (bld.dispatch_width() / 8) * scale;
> -   fs_reg vec4_result = fs_reg(VGRF, alloc.allocate(regs_written), dst.type);
> +   fs_reg vec4_result = fs_reg(VGRF, alloc.allocate(regs_written),
> +                               BRW_REGISTER_TYPE_F);
>     fs_inst *inst = bld.emit(op, vec4_result, surf_index, vec4_offset);
>     inst->regs_written = regs_written;
>  
> @@ -208,7 +215,15 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder &bld,
>           inst->mlen = 1 + bld.dispatch_width() / 8;
>     }
>  
> -   bld.MOV(dst, offset(vec4_result, bld, ((const_offset & 0xf) / 4) * scale));
> +   if (type_sz(dst.type) == 8) {

Add 'assert(scale == 1)' here because you're not taking it into account.

> +      shuffle_32bit_load_result_to_64bit_data(
> +         bld, retype(vec4_result, dst.type), vec4_result, 2);
> +   }
> +
> +   vec4_result.type = dst.type;
> +   int type_slots = MAX2(type_sz(dst.type) / 4, 1);

This code is definitely not going to work for type sizes other than 8 or
4, the MAX2(..., 1) will only conceal the problem.  It seems weird that
you convert the type size into an integer number of 32-bit slots
(potentially introducing rounding errors) only to convert it back to
bytes in the next line (because const_offset is expressed in bytes
regardless).  How about:

| bld.MOV(dst, offset(vec4_result, bld,
|                     (const_offset & 0xf) / type_sz(vec4_result.type) * scale));

and remove the type_slots variable.  With that fixed:

Reviewed-by: Francisco Jerez <currojerez at riseup.net>

(It would be nice to do the shuffle_32bit_load_result_to_64bit_data()
directly into the destination and clean up the retyping slightly, but I
guess that can be done as a follow-up).

> +   bld.MOV(dst, offset(vec4_result, bld,
> +                       ((const_offset & 0xf) / (4 * type_slots)) * scale));
>  }
>  
>  /**
> -- 
> 2.5.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160513/70a9dec2/attachment.sig>