[Mesa-dev] [PATCH v2 12/30] i965/fs: Fix fs_visitor::VARYING_PULL_CONSTANT_LOAD for doubles
Francisco Jerez
currojerez at riseup.net
Fri May 13 22:43:28 UTC 2016
Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
> From: Iago Toral Quiroga <itoral at igalia.com>
>
> Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
> src/mesa/drivers/dri/i965/brw_fs.cpp | 19 +++++++++++++++++--
> 1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 4827dea..2383d2c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -194,8 +194,15 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder &bld,
> else
> op = FS_OPCODE_VARYING_PULL_CONSTANT_LOAD;
>
> + /* The pull load message will load a vec4 (16 bytes). If we are loading
> + * a double this means we are only loading 2 elements worth of data.
> + * We also want to use a 32-bit data type for the dst of the load operation
> + * so other parts of the driver don't get confused about the size of the
> + * result.
> + */
> int regs_written = 4 * (bld.dispatch_width() / 8) * scale;
> - fs_reg vec4_result = fs_reg(VGRF, alloc.allocate(regs_written), dst.type);
> + fs_reg vec4_result = fs_reg(VGRF, alloc.allocate(regs_written),
> + BRW_REGISTER_TYPE_F);
> fs_inst *inst = bld.emit(op, vec4_result, surf_index, vec4_offset);
> inst->regs_written = regs_written;
>
> @@ -208,7 +215,15 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder &bld,
> inst->mlen = 1 + bld.dispatch_width() / 8;
> }
>
> - bld.MOV(dst, offset(vec4_result, bld, ((const_offset & 0xf) / 4) * scale));
> + if (type_sz(dst.type) == 8) {
Add 'assert(scale == 1)' here because you're not taking it into account.
> + shuffle_32bit_load_result_to_64bit_data(
> + bld, retype(vec4_result, dst.type), vec4_result, 2);
> + }
> +
> + vec4_result.type = dst.type;
> + int type_slots = MAX2(type_sz(dst.type) / 4, 1);
This code is definitely not going to work for type sizes other than 8 or
4, the MAX2(..., 1) will only conceal the problem. It seems weird that
you convert the type size into an integer number of 32-bit slots
(potentially introducing rounding errors) only to convert it back to
bytes in the next line (because const_offset is expressed in bytes
regardless). How about:
| bld.MOV(dst, offset(vec4_result, bld,
| (const_offset & 0xf) / type_sz(vec4_result.type) * scale));
and remove the type_slots variable. With that fixed:
Reviewed-by: Francisco Jerez <currojerez at riseup.net>
(It would be nice to do the shuffle_32bit_load_result_to_64bit_data()
directly into the destination and clean up the retyping slightly, but I
guess that can be done as a follow-up).
> + bld.MOV(dst, offset(vec4_result, bld,
> + ((const_offset & 0xf) / (4 * type_slots)) * scale));
> }
>
> /**
> --
> 2.5.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160513/70a9dec2/attachment.sig>
More information about the mesa-dev
mailing list