[Mesa-dev] [PATCH 48/95] i965/vec4: add a force_vstride0 flag to src_reg
Iago Toral
itoral at igalia.com
Tue Sep 13 07:25:12 UTC 2016
On Mon, 2016-09-12 at 14:05 -0700, Francisco Jerez wrote:
> Iago Toral Quiroga <itoral at igalia.com> writes:
>
> >
> > We will use this in cases where we want to force the vstride of a
> > src_reg
> > to 0 to exploit a particular behavior of the hardware. It will come
> > in
> > handy to implement access to components Z/W.
> > ---
> > src/mesa/drivers/dri/i965/brw_ir_vec4.h | 1 +
> > src/mesa/drivers/dri/i965/brw_vec4.cpp | 2 ++
> > 2 files changed, 3 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > index f66c093..f3cce4b 100644
> > --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > @@ -51,6 +51,7 @@ public:
> > explicit src_reg(const dst_reg ®);
> >
> > src_reg *reladdr;
> > + bool force_vstride0;
> I was wondering whether it would make more sense to unify this with
> the
> FS back-end's fs_reg::stride (a numeric stride field is also likely
> more
> convenient to do arithmetic on than a boolean) and promote it to
> backend_reg? It could be defined as the number of components to jump
> over for each logical channel of the register, which is just the
> vstride
> in single-precision SIMD4x2 and the hstride in scalar mode.
We could do that, but I thought it would be a good idea to make it
clear that here we are using the vstride=0 with a very specific
intention and we don't expect the hardware to do what it would be
expected (we are trying to exploit a hardware bug after all). If we
were to use a normal stride field for this I think we would make this
intention much less obvious and other people reading the code would
have a much harder time understanding what is really going on. Since we
are being tricky here I think the extra field to signal that we are
trying to do something "special" might be worth it: people can track
where we read and write that field and see exactly where it is being
used for the purpose of exploiting this particular hardware behavior.
> But thinking about it some more, I wonder if it's really necessary to
> expose vertical strides at the IR level? Aren't you planing to use
> this
> during the conversion to HW registers exclusively? Why don't you set
> the vstride field directly in that case?
Yes, this is used exclusively at that time. The conversion to hardware
registers in convert_to_hw_regs() happens in two stages now:
We call our 'expand_64bit_swizzle_to_32bit()' helper first. This one
takes care of checking the regioning on DF instructions, translate
swizzles and set force_vstrid0 to true when needed (which is also the
only place that would set this to true). Then the rest of the code in
convert_to_hw_regs() just operates as usual, only that it will check
the force_vstride0 setting to decide the vstride to use for DF regions.
I did it like this because it allows us to keep the DF swizzle
translation and regioning checking logic separated from the conversion
to hardware registers, but this separation means that we need to tell
the latter when it has to set the vstride to 0, thus the addition of
the forcE_vstride0 field. I think having these two things separated
makes sense and makes the code easier to read. We can keep both things
separate and still avoid the force_vstride0 field by using a stride
field as you suggest above, but as I said, I think we might be doing a
rather tricky thing a bit less obvious than it should to other people.
> >
> > };
> >
> > static inline src_reg
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > index a20b2fd..bfbbd96 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > @@ -70,6 +70,7 @@ src_reg::src_reg(struct ::brw_reg reg) :
> > {
> > this->reg_offset = 0;
> > this->reladdr = NULL;
> > + this->force_vstride0 = false;
> > }
> >
> > src_reg::src_reg(const dst_reg ®) :
> > @@ -77,6 +78,7 @@ src_reg::src_reg(const dst_reg ®) :
> > {
> > this->reladdr = reg.reladdr;
> > this->swizzle = brw_swizzle_for_mask(reg.writemask);
> > + this->force_vstride0 = false;
> > }
> >
> > void
More information about the mesa-dev
mailing list