[Mesa-dev] [PATCH 08/22] i965/fs: fix lower SIMD width for IVB/VLV's MOVE_INDIRECT

Samuel Iglesias Gonsálvez siglesias at igalia.com
Fri Jan 13 08:48:40 UTC 2017


On Thu, 2017-01-12 at 16:45 -0800, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
> 
> > From: "Juan A. Suarez Romero" <jasuarez at igalia.com>
> > 
> > Previous to Broadwell, we have 8 registers for MOV_INDIRECT. But if
> > IVB/VLV deal with DFs, we will duplicate the exec_size from 8 to
> > 16.
> > 
> > This patch limits the SIMD width to 4 in this case.
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs.cpp | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index cfce364..45d320d 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -4959,8 +4959,13 @@ get_lowered_simd_width(const struct
> > gen_device_info *devinfo,
> >        return MIN2(8, inst->exec_size);
> >  
> >     case SHADER_OPCODE_MOV_INDIRECT:
> > -      /* Prior to Broadwell, we only have 8 address subregisters
> > */
> > -      return MIN3(devinfo->gen >= 8 ? 16 : 8,
> > +      /* Prior to Broadwell, we only have 8 address subregisters.
> > Special case
> > +       * for IVB/VLV and DF types: set to 4 (exec_size will be
> > later
> > +       * duplicated).
> 
> The comment seems rather misleading, exec size doubling is unlikely
> to
> have anything to do with this problem.
> 
> > +       */
> > +      return MIN3(devinfo->gen >= 8 ? 16 : ((devinfo->gen == 7 &&
> > +                                             !devinfo->is_haswell
> > &&
> > +                                             inst-
> > >exec_data_size() == 8) ? 4 : 8),
> >                    2 * REG_SIZE / (inst->dst.stride * type_sz(inst-
> > >dst.type)),
> >                    inst->exec_size);
> 
> I'm amazed that this works at all on HSW, according to the IVB and
> HSW
> PRMs:
> 
> "2.When the destination requires two registers and the sources are
>  indirect, the sources must use 1x1 regioning mode. In addition, the
>  sources must be assembled from GRF registers each accessed by
> adjacent
>  index registers in 1x1 regioning modes."
> 
> So for DF instructions the execution size is not limited by the
> number
> of address registers you have available, but by the EU decompression
> logic not handling VxH indirect addressing correctly.
> 

Right. I will change the comment and the commit log too.

> I think this should be something along the lines of:
> 
> >   const unsigned max_size = (devinfo->gen >= 8 ? 2 : 1) * REG_SIZE;
> >   return MIN3(devinfo->gen >= 8 ? 16 : 8,
> >               max_size / (inst->dst.stride * type_sz(inst-
> > >dst.type)),
> >               inst->exec_size);
> >  

OK, thanks.

Sam

> > -- 
> > 2.9.3
> > 
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list