[Mesa-dev] [PATCH 06/22] i965/fs: double-precision execution does not use 2 channels per DF in IVB/VLV
Samuel Iglesias Gonsálvez
siglesias at igalia.com
Wed Jan 11 08:14:08 UTC 2017
On Tue, 2017-01-10 at 14:01 -0800, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
>
> > On Mon, 2017-01-09 at 16:18 -0800, Francisco Jerez wrote:
> > > Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
> > >
> > > > From: Iago Toral Quiroga <itoral at igalia.com>
> > > >
> > > > It seems to use 1 channel por DF, just like later hardware. The
> > > > docs say things
> > > > like:
> > > >
> > > > "Each DF operand uses a pair of channels and all masking and
> > > > swizzling
> > > > should be adjusted appropriately."
> > > >
> > > > "In Align16, all regioning parameters must use the syntax of a
> > > > pair
> > > > of packed
> > > > floats, including channel selects and channel enables."
> > > >
> > > > In these cases, masking and channel select/enables seem to
> > > > refere
> > > > only to
> > > > Align16's swizzle and writemasks respectively, not to execution
> > > > masking.
> > > >
> > > > In any case, this means that execution masking controls and
> > > > execsize go
> > > > in different units and we need to adjust the assertion on the
> > > > relation
> > > > between the two accordingly.
> > > > ---
> > > > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 6 +++++-
> > > > 1 file changed, 5 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > > b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > > index 90ee7c1..ac2d8ad 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > > @@ -1646,7 +1646,11 @@ fs_generator::generate_code(const cfg_t
> > > > *cfg, int dispatch_width)
> > > > brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
> > > >
> > > > assert(inst->force_writemask_all || inst->exec_size >=
> > > > 4);
> > > > - assert(inst->force_writemask_all || inst->group % inst-
> > > > > exec_size == 0);
> > > >
> > > > + assert(inst->force_writemask_all ||
> > > > + ((devinfo->gen != 7 || devinfo->is_haswell) &&
> > > > + inst->group % inst->exec_size == 0) ||
> > > > + ((devinfo->gen == 7 && !devinfo->is_haswell) &&
> > > > + (2 * inst->group) % inst->exec_size == 0));
> > >
> > > NAK, exec size doubling happens at the codegen level so the IR
> > > group
> > > and
> > > exec_size controls should still be expressed in the same units.
> > >
> >
> > Actually the patch is right because the exec size doubling already
> > happened at this point, so this is fixing the difference in units.
> >
>
> It may match the expectations of PATCH 3 from this series, but at the
> same time it breaks the expectations of the rest of the back-end IR
> infrastructure. E.g. regs_read() will now return bogus results from
> the
> point you whacked the execution size of DF instructions. I don't
> want
> to have to think which portion of the public IR interface still
> behaves
> sensibly and which portion will give me garbage values when I use it
> From the generator.
>
Right.
> To make the matter scarier this happens from the back-end code
> generator
> which isn't even supposed to modify its input in a semantics-
> preserving
> way (it's supposed to emit machine code matching the IR given as
> input).
> PATCH 3 causes it to destroy its input (try invoking the code
> generator
> a second time on the same shader and you'll get garbage as
> result...).
>
I see. Thanks for the explanation!
> > One alternative to this patch is to move the exec_size doubling
> > code to
> > just after this assert, so we can discard this patch.
> >
> > What do you think?
> >
>
> Please let's not modify the IR from the codegen pass [this goes as
> modification request for PATCH 3 too]. Just have the generator emit
> machine code with twice the IR execution size for double-precision
> instructions on IVB.
>
OK, I'll do this change.
Sam
> > Sam
> >
> > > > assert(inst->base_mrf + inst->mlen <=
> > > > BRW_MAX_MRF(devinfo-
> > > > > gen));
> > > >
> > > > assert(inst->mlen <= BRW_MAX_MSG_LENGTH);
> > > >
> > > > --
> > > > 2.9.3
> > > >
> > > > _______________________________________________
> > > > mesa-dev mailing list
> > > > mesa-dev at lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170111/75b24025/attachment.sig>
More information about the mesa-dev
mailing list