[Mesa-dev] [PATCH] i965/fs/generator: Don't use the address immediate for MOV_INDIRECT
Matt Turner
mattst88 at gmail.com
Fri Oct 28 23:07:07 UTC 2016
On Fri, Oct 28, 2016 at 2:53 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> The address immediate field is only 9 bits and, since the value is in
> bytes, the highest GRF we can point to with it is g15. This makes it
> pretty close to useless for MOV_INDIRECT. There were already piles of
> restrictions preventing us from using it prior to Broadwell, so let's get
> rid of the gen8+ code path entirely.
>
> Signed-off-by: Jason Ekstrand <jason at jlekstrand.net>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97779
> Cc: "12.0 13.0" <mesa-stable at lists.freedesktop.org>
> ---
> src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 55 +++++++++++++-------------
> 1 file changed, 28 insertions(+), 27 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index d25d26a..7130bf5 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -386,33 +386,34 @@ fs_generator::generate_mov_indirect(fs_inst *inst,
> retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
>
> struct brw_reg ind_src;
> - if (devinfo->gen < 8) {
> - /* From the Haswell PRM section "Register Region Restrictions":
> - *
> - * "The lower bits of the AddressImmediate must not overflow to
> - * change the register address. The lower 5 bits of Address
> - * Immediate when added to lower 5 bits of address register gives
> - * the sub-register offset. The upper bits of Address Immediate
> - * when added to upper bits of address register gives the register
> - * address. Any overflow from sub-register offset is dropped."
> - *
> - * This restriction is only listed in the Haswell PRM but emperical
> - * testing indicates that it applies on all older generations and is
> - * lifted on Broadwell.
> - *
> - * Since the indirect may cause us to cross a register boundary, this
> - * makes the base offset almost useless. We could try and do
> - * something clever where we use a actual base offset if
> - * base_offset % 32 == 0 but that would mean we were generating
> - * different code depending on the base offset. Instead, for the
> - * sake of consistency, we'll just do the add ourselves.
> - */
> - brw_ADD(p, addr, indirect_byte_offset, brw_imm_uw(imm_byte_offset));
> - ind_src = brw_VxH_indirect(0, 0);
> - } else {
> - brw_MOV(p, addr, indirect_byte_offset);
> - ind_src = brw_VxH_indirect(0, imm_byte_offset);
> - }
> +
> + /* There are a number of reasons why we don't use the base offset here.
> + * One reason is that the field is only 9 bits which means we can only
> + * use it on the first 16 GRFs. Also, from the Haswell PRM section
s/on/to access/
> + * "Register Region Restrictions":
> + *
> + * "The lower bits of the AddressImmediate must not overflow to
> + * change the register address. The lower 5 bits of Address
> + * Immediate when added to lower 5 bits of address register gives
> + * the sub-register offset. The upper bits of Address Immediate
> + * when added to upper bits of address register gives the register
> + * address. Any overflow from sub-register offset is dropped."
> + *
> + * Since the indirect may cause us to cross a register boundary, this
> + * makes the base offset almost useless. We could try and do something
> + * clever where we use a actual base offset if base_offset % 32 == 0 but
> + * that would mean we were generating different code depending on the
> + * base offset. Instead, for the sake of consistency, we'll just do the
> + * add ourselves. This restriction is only listed in the Haswell PRM
> + * but emperical testing indicates that it applies on all older
empirical
> + * generations and is lifted on Broadwell.
> + *
> + * In the end, while base_offset is nice to look at in the generated
> + * code, using it saves us 0 instructions and would require quite a bit
> + * of case-by-case work. It's just not worth it.
> + */
> + brw_ADD(p, addr, indirect_byte_offset, brw_imm_uw(imm_byte_offset));
> + ind_src = brw_VxH_indirect(0, 0);
I think you can move the declaration of ind_src here as well.
Reviewed-by: Matt Turner <mattst88 at gmail.com>
More information about the mesa-dev
mailing list