[Mesa-dev] [PATCH] i965/fs: Implement SIMD16 dual source blending.

Iago Toral Quiroga itoral at igalia.com
Wed Sep 17 23:08:19 PDT 2014


Hi Jason,

On mié, 2014-09-17 at 11:39 -0700, Jason Ekstrand wrote:
> I haven't tested this yet, just looked it over.  I've got a couple of
> inline comments below.  One general comment though: I'm currently
> working on a bunch of compiler stuff that reworks the way we do FB
> writes.  In particular, it reworks things to use GRF registers instead
> of the MRF.  It probably wouldn't be too bad for me to rebase on top
> of this or to rebase your patch on top of what I'm doing.  I just
> thought I'd warn you about the conflict.

Thanks for the warning, I suppose it should be easy for me to rebase
this patch on top of your work if you happen to land yours sooner.

> On Wed, Sep 17, 2014 at 4:36 AM, Iago Toral Quiroga
> <itoral at igalia.com> wrote:
>         From the SNB PRM, volume 4, part 1, page 193:
>         "The dual source render target messages only have SIMD8 forms
>         due to
>         maximum message length limitations. SIMD16 pixel shaders must
>         send two of
>         these messages to cover all of the pixels. Each message
>         contains two colors
>         (4 channels each) for each pixel in the message payload."
>         
>         Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82831
>         ---
>          src/mesa/drivers/dri/i965/brw_eu.h             |  1 +
>          src/mesa/drivers/dri/i965/brw_eu_emit.c        |  3 +-
>          src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 14 +++++++--
>          src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 41
>         +++++++++++++++++++-------
>          4 files changed, 45 insertions(+), 14 deletions(-)
>         
>         I tested this on SandyBridge and IvyBridge. No piglit
>         regressions in these
>         platforms, but would be nice if someone could test this in
>         later platforms too.
>         
>         I only noticed these two tests for dual source blending in
>         piglit though:
>         tests/spec/ext_framebuffer_multisample/alpha-to-one-dual-src-blend.cpp
>         tests/spec/ext_framebuffer_multisample/alpha-to-coverage-dual-src-blend.cpp
>         
>         The first one fails, in both platforms with and without my
>         patch. The second one
>         passes in both platforms, with and without my patch.
>         
>         I also tested this with a seprate test program to verify that
>         it worked, at
>         least, in a simple case.
>         
>         diff --git a/src/mesa/drivers/dri/i965/brw_eu.h
>         b/src/mesa/drivers/dri/i965/brw_eu.h
>         index e6c26e3..5908ba5 100644
>         --- a/src/mesa/drivers/dri/i965/brw_eu.h
>         +++ b/src/mesa/drivers/dri/i965/brw_eu.h
>         @@ -266,6 +266,7 @@ void brw_fb_WRITE(struct brw_compile *p,
>                            unsigned msg_length,
>                            unsigned response_length,
>                            bool eot,
>         +                  bool last_render_target,
>                            bool header_present);
>         
>          void brw_SAMPLE(struct brw_compile *p,
>         diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
>         b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>         index 39f94e9..ffdbe6d 100644
>         --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
>         +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>         @@ -2251,6 +2251,7 @@ void brw_fb_WRITE(struct brw_compile *p,
>                            unsigned msg_length,
>                            unsigned response_length,
>                            bool eot,
>         +                  bool last_render_target,
>                            bool header_present)
>          {
>             struct brw_context *brw = p->brw;
>         @@ -2290,7 +2291,7 @@ void brw_fb_WRITE(struct brw_compile *p,
>                                     msg_type,
>                                     msg_length,
>                                     header_present,
>         -                           eot, /* last render target write
>         */
>         +                           last_render_target,
>                                     response_length,
>                                     eot,
>                                     0 /* send_commit_msg */);
>         diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>         b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>         index 1bc10f5..a4b84aa 100644
>         --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>         +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>         @@ -121,9 +121,12 @@ fs_generator::fire_fb_write(fs_inst
>         *inst,
>         
>             if (inst->opcode == FS_OPCODE_REP_FB_WRITE)
>                msg_control =
>         BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD16_SINGLE_SOURCE_REPLICATED;
>         -   else if (prog_data->dual_src_blend)
>         -      msg_control =
>         BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD8_DUAL_SOURCE_SUBSPAN01;
>         -   else if (dispatch_width == 16)
>         +   else if (prog_data->dual_src_blend) {
>         +      if (dispatch_width == 8 || !inst->eot)
>         +         msg_control =
>         BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD8_DUAL_SOURCE_SUBSPAN01;
>         +      else
>         +         msg_control =
>         BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD8_DUAL_SOURCE_SUBSPAN23;
>         +   } else if (dispatch_width == 16)
>                msg_control =
>         BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD16_SINGLE_SOURCE;
>             else
>                msg_control =
>         BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD8_SINGLE_SOURCE_SUBSPAN01;
>         @@ -131,6 +134,9 @@ fs_generator::fire_fb_write(fs_inst *inst,
>             uint32_t surf_index =
>                prog_data->binding_table.render_target_start +
>         inst->target;
>         
>         +   bool last_render_target = inst->eot ||
>         +                             (prog_data->dual_src_blend &&
>         dispatch_width == 16);
>         +
>             brw_fb_WRITE(p,
>                          dispatch_width,
>                          base_reg,
>         @@ -140,6 +146,7 @@ fs_generator::fire_fb_write(fs_inst *inst,
>                          nr,
>                          0,
>                          inst->eot,
>         +                last_render_target,
>                          inst->header_present);
>         
>             brw_mark_surface_used(&prog_data->base, surf_index);
>         @@ -254,6 +261,7 @@
>         fs_generator::generate_blorp_fb_write(fs_inst *inst)
>                          inst->mlen,
>                          0,
>                          true,
>         +                true,
>                          inst->header_present);
>          }
>         
>         diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>         b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>         index 5d2e7c8..21dc5bc 100644
>         --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>         +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>         @@ -3064,12 +3064,6 @@ fs_visitor::emit_fb_writes()
>             int reg_width = dispatch_width / 8;
>             bool src0_alpha_to_render_target = false;
>         
>         -   if (do_dual_src) {
>         -      no16("GL_ARB_blend_func_extended not yet supported in
>         SIMD16.");
>         -      if (dispatch_width == 16)
>         -         do_dual_src = false;
>         -   }
>         -
>             /* From the Sandy Bridge PRM, volume 4, page 198:
>              *
>              *     "Dispatched Pixel Enables. One bit per pixel
>         indicating
>         @@ -3109,11 +3103,22 @@ fs_visitor::emit_fb_writes()
>                nr += 1;
>             }
>         
>         -   /* Reserve space for color. It'll be filled in per MRT
>         below. */
>         +   /* Reserve space for color. It'll be filled in per MRT
>         below.
>         +    *
>         +    * From the SNB PRM, volume 4, part 1, page 193:
>         +    * "The dual source render target messages only have SIMD8
>         forms due to
>         +    * maximum message length limitations. SIMD16 pixel
>         shaders must send two of
>         +    * these messages to cover all of the pixels. Each message
>         contains two
>         +    * colors (4 channels each) for each pixel in the message
>         payload."
>         +    *
>         +    * So color data in dual source mode, whether this is a
>         SIMD8 or SIMD16
>         +    * program, always requires 8 MRF registers.
>         +    */
>             int color_mrf = nr;
>         -   nr += 4 * reg_width;
>             if (do_dual_src)
>         -      nr += 4;
>         +      nr += 8;
>         +   else
>         +      nr += 4 * reg_width;
>             if (src0_alpha_to_render_target)
>                nr += reg_width;
>         
>         @@ -3173,13 +3178,29 @@ fs_visitor::emit_fb_writes()
>                inst->target = 0;
>                inst->base_mrf = base_mrf;
>                inst->mlen = nr - base_mrf;
>         -      inst->eot = true;
>         +      inst->eot = dispatch_width == 8;
>                inst->header_present = header_present;
>                if ((brw->gen >= 8 || brw->is_haswell) &&
>         prog_data->uses_kill) {
>                   inst->predicate = BRW_PREDICATE_NORMAL;
>                   inst->flag_subreg = 1;
>                }
>  
> 
> It appears as if you don't copy the second half of the colors into the
> message before doing the second send.  Is there a reason for this?
> It's quite possible that the piglit tests are only using solid colors
> in which case they might not catch this.
> 

Yes, I noticed that after I sent the patch. It is a mistake, I'll send a
second version with this fixed. At least the standalone program I was
using to test this was using solid colors, and I suppose piglit tests do
the same.. maybe we should add one more test or something.

Iago

>         +      /* SIMD16 dual source blending requires to send two
>         SIMD8 dual source
>         +       * messages, so here goes the second one.
>         +       */
>         +      if (dispatch_width == 16) {
>         +         fs_inst *inst = emit(FS_OPCODE_FB_WRITE);
>         +         inst->target = 0;
>         +         inst->base_mrf = base_mrf;
>         +         inst->mlen = nr - base_mrf;
>         +         inst->eot = true;
>         +         inst->header_present = header_present;
>         +         if ((brw->gen >= 8 || brw->is_haswell) &&
>         prog_data->uses_kill) {
>         +            inst->predicate = BRW_PREDICATE_NORMAL;
>         +            inst->flag_subreg = 1;
>         +         }
>         +      }
>         +
>                prog_data->dual_src_blend = true;
>                this->current_annotation = NULL;
>                return;
>         --
>         1.9.1
>         
>         _______________________________________________
>         mesa-dev mailing list
>         mesa-dev at lists.freedesktop.org
>         http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 




More information about the mesa-dev mailing list