[Mesa-stable] [PATCH 1/3] i965/vec4: fix vertical stride to avoid breaking region parameter rule
Samuel Iglesias Gonsálvez
siglesias at igalia.com
Fri May 5 09:52:59 UTC 2017
On Fri, 2017-05-05 at 12:46 +0300, Andres Gomez wrote:
> Samuel, may it make sense to pick this series for 17.0 too?
>
Yes, please. These fixes also apply to HSW's FP64 support in the vec4
backend, which was added in Mesa 17.0.
Thanks!
Sam
> Br.
>
> On Wed, 2017-04-26 at 13:57 +0200, Samuel Iglesias Gonsálvez wrote:
> > From IVB PRM, vol4, part3, "General Restrictions on Regioning
> > Parameters":
> >
> > "If ExecSize = Width and HorzStride ≠ 0, VertStride must
> > be set to Width * HorzStride."
> >
> > In the next patch, we are going to modify the region parameters for
> > uniforms and VGRFs. Uniforms that are the source of DF align1
> > instructions will have <0, 4, 1> regioning, and the execsize of
> > those instructions will be 4, so they will break the regioning
> > rule. The same applies to VGRF sources where we use the
> > vstride == 0 exploit.
> >
> > As we know we are not going to cross the GRF boundary with that
> > execsize and parameters (not even with the exploit), we just fix
> > the vstride here.
> >
> > Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
> > Cc: "17.1" <mesa-stable at lists.freedesktop.org>
> > ---
> > src/intel/compiler/brw_reg.h | 15 +++++++++++++++
> > src/intel/compiler/brw_vec4.cpp | 19 +++++++++++++++++++
> > 2 files changed, 34 insertions(+)
> >
> > diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
> > index 17a51fbd655..24e09a84fce 100644
> > --- a/src/intel/compiler/brw_reg.h
> > +++ b/src/intel/compiler/brw_reg.h
> > @@ -914,6 +914,21 @@ static inline unsigned cvt(unsigned val)
> > return 0;
> > }
> >
> > +static inline unsigned inv_cvt(unsigned val)
> > +{
> > + switch (val) {
> > + case 0: return 0;
> > + case 1: return 1;
> > + case 2: return 2;
> > + case 3: return 4;
> > + case 4: return 8;
> > + case 5: return 16;
> > + case 6: return 32;
> > + }
> > + return 0;
> > +}
> > +
> > +
> > static inline struct brw_reg
> > stride(struct brw_reg reg, unsigned vstride, unsigned width,
> > unsigned hstride)
> > {
> > diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
> > index f9b805ea5a9..95f96ea69c0 100644
> > --- a/src/intel/compiler/brw_vec4.cpp
> > +++ b/src/intel/compiler/brw_vec4.cpp
> > @@ -38,6 +38,8 @@ using namespace brw;
> >
> > namespace brw {
> >
> > +static bool is_align1_df(vec4_instruction *inst);
> > +
> > void
> > src_reg::init()
> > {
> > @@ -2049,6 +2051,23 @@ vec4_visitor::convert_to_hw_regs()
> >
> > apply_logical_swizzle(&reg, inst, i);
> > src = reg;
> > +
> > + /* From IVB PRM, vol4, part3, "General Restrictions on Regioning
> > + * Parameters":
> > + *
> > + * "If ExecSize = Width and HorzStride ≠ 0, VertStride must be set
> > + * to Width * HorzStride."
> > + *
> > + * We can break this rule with DF sources on DF align1
> > + * instructions, because the exec_size would be 4 and width is 4.
> > + * As we know we are not accessing the next GRF, it is safe to
> > + * set vstride to the formula given by the rule itself.
> > + */
> > + if (is_align1_df(inst) && inst->exec_size == inv_cvt(src.width + 1)) {
> > + const unsigned width = inv_cvt(src.width + 1);
> > + const unsigned hstride = inv_cvt(src.hstride);
> > + src.vstride = cvt(width * hstride);
> > + }
> > }
> >
> > if (inst->is_3src(devinfo)) {
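
For reference, the arithmetic of the vstride fix quoted above can be tried outside the compiler. The following is only a minimal standalone sketch, not the Mesa code: inv_cvt() is copied from the patch, cvt() is assumed to be its inverse (element count to field encoding), and fixed_vstride() is a hypothetical helper name that takes raw field values instead of a struct brw_reg.

```c
#include <assert.h>

/* Assumed behaviour of cvt(): element count -> encoded field value.
 * Counts that have no encoding fall through to 0.
 */
static unsigned cvt(unsigned val)
{
   switch (val) {
   case 0: return 0;
   case 1: return 1;
   case 2: return 2;
   case 4: return 3;
   case 8: return 4;
   case 16: return 5;
   case 32: return 6;
   }
   return 0;
}

/* Copied from the patch: encoded field value -> element count. */
static unsigned inv_cvt(unsigned val)
{
   switch (val) {
   case 0: return 0;
   case 1: return 1;
   case 2: return 2;
   case 3: return 4;
   case 4: return 8;
   case 5: return 16;
   case 6: return 32;
   }
   return 0;
}

/* Hypothetical helper: given the encoded width and hstride fields of a
 * source region, return the encoded vstride that satisfies
 * "VertStride = Width * HorzStride". The width field is stored as
 * cvt(width) - 1, hence the "+ 1" before decoding, as in the patch.
 */
static unsigned fixed_vstride(unsigned width_field, unsigned hstride_field)
{
   const unsigned width = inv_cvt(width_field + 1);
   const unsigned hstride = inv_cvt(hstride_field);

   /* The rule only applies when HorzStride is non-zero. */
   assert(hstride != 0);

   return cvt(width * hstride);
}
```

With a <0, 4, 1> region, the width field is cvt(4) - 1 = 2 and the hstride field is cvt(1) = 1, so fixed_vstride(2, 1) returns the encoding for a vertical stride of 4, i.e. the region becomes <4, 4, 1>. A product with no encoding (for example 16 * 4) would come back as 0 in this sketch; the patch does not hit that case because the width and execsize are 4.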