[Mesa-dev] [PATCH 08/22] i965/fs: fix lower SIMD width for IVB/VLV's MOVE_INDIRECT
Samuel Iglesias Gonsálvez
siglesias at igalia.com
Fri Jan 13 08:48:40 UTC 2017
On Thu, 2017-01-12 at 16:45 -0800, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
>
> > From: "Juan A. Suarez Romero" <jasuarez at igalia.com>
> >
> > Previous to Broadwell, we have 8 registers for MOV_INDIRECT. But if
> > IVB/VLV deal with DFs, we will duplicate the exec_size from 8 to
> > 16.
> >
> > This patch limits the SIMD width to 4 in this case.
> > ---
> > src/mesa/drivers/dri/i965/brw_fs.cpp | 9 +++++++--
> > 1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index cfce364..45d320d 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -4959,8 +4959,13 @@ get_lowered_simd_width(const struct
> > gen_device_info *devinfo,
> > return MIN2(8, inst->exec_size);
> >
> > case SHADER_OPCODE_MOV_INDIRECT:
> > - /* Prior to Broadwell, we only have 8 address subregisters
> > */
> > - return MIN3(devinfo->gen >= 8 ? 16 : 8,
> > + /* Prior to Broadwell, we only have 8 address subregisters.
> > Special case
> > + * for IVB/VLV and DF types: set to 4 (exec_size will be
> > later
> > + * duplicated).
>
> The comment seems rather misleading, exec size doubling is unlikely
> to
> have anything to do with this problem.
>
> > + */
> > + return MIN3(devinfo->gen >= 8 ? 16 : ((devinfo->gen == 7 &&
> > + !devinfo->is_haswell
> > &&
> > + inst-
> > >exec_data_size() == 8) ? 4 : 8),
> > 2 * REG_SIZE / (inst->dst.stride * type_sz(inst-
> > >dst.type)),
> > inst->exec_size);
>
> I'm amazed that this works at all on HSW, according to the IVB and
> HSW
> PRMs:
>
> "2.When the destination requires two registers and the sources are
> indirect, the sources must use 1x1 regioning mode. In addition, the
> sources must be assembled from GRF registers each accessed by
> adjacent
> index registers in 1x1 regioning modes."
>
> So for DF instructions the execution size is not limited by the
> number
> of address registers you have available, but by the EU decompression
> logic not handling VxH indirect addressing correctly.
>
Right. I will change the comment and the commit log too.
> I think this should be something along the lines of:
>
> > const unsigned max_size = (devinfo->gen >= 8 ? 2 : 1) * REG_SIZE;
> > return MIN3(devinfo->gen >= 8 ? 16 : 8,
> > max_size / (inst->dst.stride * type_sz(inst-
> > >dst.type)),
> > inst->exec_size);
> >
OK, thanks.
Sam
> > --
> > 2.9.3
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list