[Mesa-dev] [PATCH 33/59] intel/compiler: do not copy-propagate strided regions to ddx/ddy arguments
Iago Toral
itoral at igalia.com
Mon Dec 10 08:52:44 UTC 2018
On Fri, 2018-12-07 at 13:13 -0600, Jason Ekstrand wrote:
> On Tue, Dec 4, 2018 at 1:18 AM Iago Toral Quiroga <itoral at igalia.com>
> wrote:
> > The implementation of these opcodes in the generator assumes that
> > their
> >
> > arguments are packed, and it generates register regions based on
> > that
> >
> > assumption. While this expectation is reasonable for 32-bit,
>
> Expectation, sure, but if someone does ddx(f2f32(d)) where d is a
> double, it's broken. Maybe we should back-port? Either way
Yes, that's a good point. I'll send this separately to -stable.
> Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
>
> > when we
> >
> > load 16-bit elements from UBOs we get them with a stride of 2 that
> > we
> >
> > then need to pack with a stride of 1. Copy propagation can see
> > through this
> >
> > and rewrite ddx/ddy operands to use the original, strided register,
> > breaking
> >
> > the implementation in the generator.
> >
> > ---
> >
> > .../compiler/brw_fs_copy_propagation.cpp | 21
> > +++++++++++++++++++
> >
> > 1 file changed, 21 insertions(+)
> >
> >
> >
> > diff --git a/src/intel/compiler/brw_fs_copy_propagation.cpp
> > b/src/intel/compiler/brw_fs_copy_propagation.cpp
> >
> > index 58d5080b4e9..c01d4ec4a4f 100644
> >
> > --- a/src/intel/compiler/brw_fs_copy_propagation.cpp
> >
> > +++ b/src/intel/compiler/brw_fs_copy_propagation.cpp
> >
> > @@ -361,6 +361,20 @@ can_take_stride(fs_inst *inst, unsigned arg,
> > unsigned stride,
> >
> > return true;
> >
> > }
> >
> >
> >
> > +static bool
> >
> > +instruction_requires_packed_data(fs_inst *inst)
> >
> > +{
> >
> > + switch (inst->opcode) {
> >
> > + case FS_OPCODE_DDX_FINE:
> >
> > + case FS_OPCODE_DDX_COARSE:
> >
> > + case FS_OPCODE_DDY_FINE:
> >
> > + case FS_OPCODE_DDY_COARSE:
> >
> > + return true;
> >
> > + default:
> >
> > + return false;
> >
> > + }
> >
> > +}
> >
> > +
> >
> > bool
> >
> > fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry
> > *entry)
> >
> > {
> >
> > @@ -407,6 +421,13 @@ fs_visitor::try_copy_propagate(fs_inst *inst,
> > int arg, acp_entry *entry)
> >
> > inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_WRITE)
> >
> > return false;
> >
> >
> >
> > + /* Some instructions implemented in the generator backend, such
> > as
> >
> > + * derivatives, assume that their operands are packed so we
> > can't
> >
> > + * generally propagate strided regions to them.
> >
> > + */
> >
> > + if (instruction_requires_packed_data(inst) && entry->src.stride
> > > 1)
> >
> > + return false;
> >
> > +
> >
> > /* Bail if the result of composing both strides would exceed
> > the
> >
> > * hardware limit.
> >
> > */
> >
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20181210/1d422e15/attachment.html>
More information about the mesa-dev
mailing list