[Mesa-dev] [PATCH 18/22] intel: Use TXQ for image_size when we have a typed surface

Wed Aug 29 14:59:42 UTC 2018

On Wed, Aug 29, 2018 at 9:29 AM Kenneth Graunke <kenneth at whitecape.org>
wrote:

> On Wednesday, August 29, 2018 5:38:59 AM PDT Jason Ekstrand wrote:
> > On Wed, Aug 29, 2018 at 1:36 AM Kenneth Graunke <kenneth at whitecape.org>
> > wrote:
> >
> > > On Friday, August 17, 2018 1:06:24 PM PDT Jason Ekstrand wrote:
> > > > ---
> > > >  src/intel/compiler/brw_eu_defines.h           |  3 +++
> > > >  src/intel/compiler/brw_fs.cpp                 |  8 ++++++
> > > >  src/intel/compiler/brw_fs_generator.cpp       | 26
> ++++++++++++++++---
> > > >  src/intel/compiler/brw_fs_nir.cpp             | 26
> +++++++++++++++++++
> > > >  .../compiler/brw_nir_lower_image_load_store.c | 15 +++++++++++
> > > >  src/intel/compiler/brw_shader.cpp             |  5 ++++
> > > >  6 files changed, 79 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/src/intel/compiler/brw_eu_defines.h
> > > b/src/intel/compiler/brw_eu_defines.h
> > > > index 2289fc9be70..c36699d7676 100644
> > > > --- a/src/intel/compiler/brw_eu_defines.h
> > > > +++ b/src/intel/compiler/brw_eu_defines.h
> > > > @@ -354,6 +354,9 @@ enum opcode {
> > > >     SHADER_OPCODE_SAMPLEINFO,
> > > >     SHADER_OPCODE_SAMPLEINFO_LOGICAL,
> > > >
> > > > +   SHADER_OPCODE_IMAGE_SIZE,
> > > > +   SHADER_OPCODE_IMAGE_SIZE_LOGICAL,
> > > > +
> > > >     /**
> > > >      * Combines multiple sources of size 1 into a larger virtual GRF.
> > > >      * For example, parameters for a send-from-GRF message.  Or,
> updating
> > > > diff --git a/src/intel/compiler/brw_fs.cpp
> > > b/src/intel/compiler/brw_fs.cpp
> > > > index 5b87991652d..1e1954270e8 100644
> > > > --- a/src/intel/compiler/brw_fs.cpp
> > > > +++ b/src/intel/compiler/brw_fs.cpp
> > > > @@ -4946,6 +4946,14 @@ fs_visitor::lower_logical_sends()
> > > >           lower_sampler_logical_send(ibld, inst,
> > > SHADER_OPCODE_SAMPLEINFO);
> > > >           break;
> > > >
> > > > +      case SHADER_OPCODE_IMAGE_SIZE_LOGICAL:
> > > > +         /* Nothing needs to be done except changing the opcode and
> > > setting a
> > > > +          * message length.
> > > > +          */
> > > > +         inst->opcode = SHADER_OPCODE_IMAGE_SIZE;
> > > > +         inst->mlen = inst->exec_size / 8;
> > >
> > > I'm not sure it's even worth having a LOGICAL opcode for this...
> > > maybe just set the message length when creating it and call it a day?
> > >
> >
> > The problem is that it has to go through SIMD width lowering which
> doesn't
> > understand mlen.  We could make SIMD width lowering just shrink the mlen
> > but that's wrong for almost all opcodes.  Thoughts?
>
> Hmm.  SIMD width lowering seems a bit odd, given that the parameter is
> always a single immediate (LOD 0), and the result is the same for all
> channels (the size is the size).  Couldn't you just do it in SIMD8,
> and have MOVs that expand it?
>
> If we do need to go through SIMD-width lowering, then I agree, setting
> mlen at the end is probably best...
>

It looks like just always doing it SIMD8 will work out fine.  I'll do that.

--Jason

> > > > +         break;
> > > > +
> > > >        case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL:
> > > >           lower_surface_logical_send(ibld, inst,
> > > >
> SHADER_OPCODE_UNTYPED_SURFACE_READ,
> > > > diff --git a/src/intel/compiler/brw_fs_generator.cpp
> > > b/src/intel/compiler/brw_fs_generator.cpp
> > > > index e265d59ccbe..13293d7efbd 100644
> > > > --- a/src/intel/compiler/brw_fs_generator.cpp
> > > > +++ b/src/intel/compiler/brw_fs_generator.cpp
> > > > @@ -1017,6 +1017,9 @@ fs_generator::generate_tex(fs_inst *inst,
> struct
> > > brw_reg dst, struct brw_reg src
> > > >        case SHADER_OPCODE_SAMPLEINFO:
> > > >           msg_type = GEN6_SAMPLER_MESSAGE_SAMPLE_SAMPLEINFO;
> > > >           break;
> > > > +      case SHADER_OPCODE_IMAGE_SIZE:
> > > > +      msg_type = GEN5_SAMPLER_MESSAGE_SAMPLE_RESINFO;
> > > > +      break;
> > >
> > > Indentation here seems off.
> > >
> >
> > Wow.  Somehow I got tabs in there.  My .vimrc should make that almost
> > impossible...
> >
> >
> > > Maybe just add this case label above the
> > > existing case SHADER_OPCODE_TXS, which does the exact same thing?
> > >
> >
> > Sure, why not.
> >
> >
> > > >        default:
> > > >        unreachable("not reached");
> > > >        }
> > > > @@ -1126,10 +1129,20 @@ fs_generator::generate_tex(fs_inst *inst,
> struct
> > > brw_reg dst, struct brw_reg src
> > > >        }
> > > >     }
> > > >
> > > > -   uint32_t base_binding_table_index = (inst->opcode ==
> > > SHADER_OPCODE_TG4 ||
> > > > -         inst->opcode == SHADER_OPCODE_TG4_OFFSET)
> > > > -         ? prog_data->binding_table.gather_texture_start
> > > > -         : prog_data->binding_table.texture_start;
> > > > +   uint32_t base_binding_table_index;
> > > > +   switch (inst->opcode) {
> > > > +   case SHADER_OPCODE_TG4:
> > > > +   case SHADER_OPCODE_TG4_OFFSET:
> > > > +      base_binding_table_index =
> > > prog_data->binding_table.gather_texture_start;
> > > > +      break;
> > > > +   case SHADER_OPCODE_IMAGE_SIZE:
> > > > +      /* Image binding table indices are absolute */
> > >
> > > Uh, what?  That doesn't seem right.  Shouldn't this be
> > >
> > >       base_binding_table_index = prog_data->binding_table.image_start;
> > >
> >
> > No... It's dumb.  The binding table index that gets placed in the uniform
> > image param is an absolute binding table index.  If I move this patch
> after
> > the second-to-last patch, we should be able to avoid this mess.  Why
> don't
> > I try to do that.
>
> Oh, I see, I entirely missed how BTI was getting passed in...it is
> indeed coming from the uniform.  So this does seem right.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20180829/5fa38dbf/attachment-0001.html>