[Mesa-dev] [PATCH 2/3] i965: Add runtime checks for line antialiasing in Gen < 6.

Iago Toral itoral at igalia.com
Thu May 29 01:21:57 PDT 2014


Hi Kenneth,

On Wed, 2014-05-28 at 12:42 -0700, Kenneth Graunke wrote:
> On 05/27/2014 03:50 AM, Iago Toral Quiroga wrote:
> > In Gen < 6 the hardware generates a runtime bit that indicates whether AA data
> > has to be sent as part of the framebuffer write SEND message. This affects the
> > specific case where we have setup antialiased line rendering and we render
> > polygons which have one face setup in GL_LINE mode (line antialiasing
> > will be used) and the other one in GL_FILL mode (no line antialiasing needed).
> > 
> > Currently we are not doing this runtime test and instead we always send AA
> > data, which produces incorrect rendering of the GL_FILL face of the polygon in
> > in the aforementioned scenario (verified in ironlake and gm45).
> > 
> > In Gen4 this is, likely, a regression introduced with commit 098acf6c843. In
> > Gen5 this has never worked properly. Gen > 5 are not affected by this.
> > 
> > The patch fixes the problem by adding the appropriate runtime check and
> > adjusting the framebuffer write message accordingly in the conflictive
> > scenario (detected with fs_visitor::runtime_check_aads_emit == TRUE).
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78679
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs.h           |  4 ++
> >  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 86 +++++++++++++++++-----------
> >  2 files changed, 58 insertions(+), 32 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> > index 60a4906..ab8912f 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.h
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> > @@ -452,6 +452,10 @@ public:
> >  
> >     void emit_color_write(int target, int index, int first_color_mrf);
> >     void emit_alpha_test();
> > +   void do_emit_fb_write(int target, int base_mrf, int mlen, bool eot,
> > +                         bool header_present);
> > +   void emit_fb_write(int target, int base_mrf, int mlen, bool eot,
> > +                      bool header_present);
> >     void emit_fb_writes();
> >  
> >     void emit_shader_time_begin();
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > index 171f063..4c3897b 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > @@ -2731,6 +2731,54 @@ fs_visitor::emit_alpha_test()
> >  }
> >  
> >  void
> > +fs_visitor::do_emit_fb_write(int target, int base_mrf, int mlen, bool eot,
> > +                             bool header_present)
> > +{
> > +   fs_inst *inst = emit(FS_OPCODE_FB_WRITE);
> > +   inst->target = target;
> > +   inst->base_mrf = base_mrf;
> > +   inst->mlen = mlen;
> > +   inst->eot = eot;
> > +   inst->header_present = header_present;
> > +   if ((brw->gen >= 8 || brw->is_haswell) && fp->UsesKill) {
> > +      inst->predicate = BRW_PREDICATE_NORMAL;
> > +      inst->flag_subreg = 1;
> > +   }
> > +}
> > +
> > +void
> > +fs_visitor::emit_fb_write(int target, int base_mrf, int mlen, bool eot,
> > +                          bool header_present)
> > +{
> > +   if (!runtime_check_aads_emit) {
> > +      do_emit_fb_write(target, base_mrf, mlen, eot, header_present);
> > +   } else {
> > +      /* This can only happen in Gen < 6
> > +       */
> > +      fs_reg reg_tmp_ud = fs_reg(this, glsl_type::uint_type);
> > +      emit(AND(reg_tmp_ud,
> > +               fs_reg(get_element_ud(brw_vec8_grf(1,0), 6)),
> 
> I think
> 
>     retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD)
> 
> might be a little clearer than:
> 
>     get_element_ud(brw_vec8_grf(1,0), 6))
> 
> since it just refers to r1.6 right away, rather than r1.0 modified to
> have a suboffset of 6.

Sure, that looks better.

> > +               fs_reg(brw_imm_ud(1<<26))));
> > +      emit(CMP(reg_null_ud,
> > +               reg_tmp_ud,
> > +               fs_reg(brw_imm_ud(0)),
> > +               BRW_CONDITIONAL_Z));
> 
> You can actually generate a flag condition directly from the AND
> instruction, and eliminate the CMP:
> 
> fs_inst *inst =
>    emit(AND(reg_null_ud,
>             fs_reg(retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD),
>             fs_reg(0)));
> inst->conditional_mod = BRW_CONDITIONAL_Z;
> 
> (you might have to use vec1(retype(brw_null_reg), BRW_REGISTER_TYPE_UD),
> rather than reg_null_ud.)

Oh, that's much nicer! We also get rid of the temporary register with
this.

I think you messed up the parameters to the AND though. I believe this
is what you meant:

fs_inst *inst =
    emit(AND(vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_UD)),
             fs_reg(retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD)),
             fs_reg(brw_imm_ud(1<<26))));
inst->conditional_mod = BRW_CONDITIONAL_Z;

> > +      emit(IF(BRW_PREDICATE_NORMAL));
> > +      {
> > +         /* Shift message header one register since we are not sending
> > +          * AA data stored in base_mrf+2
> > +          */
> > +         do_emit_fb_write(target, base_mrf + 1, mlen - 1, eot, header_present);
> > +      }
> > +      emit(BRW_OPCODE_ELSE);
> > +      {
> > +         do_emit_fb_write(target, base_mrf, mlen, eot, header_present);
> > +      }
> > +      emit(BRW_OPCODE_ENDIF);
> 
> I suppose this probably works, but I've never seen a program end with an
> ENDIF.  I'm not really comfortable with doing that, since the ENDIF
> should never be executed.  With JMPI or ADD brw_ip_reg(), we'd just jump
> over one instruction and then execute one FB write or the other, which
> seems really straightforward.
> 
> But, looking at the bug, I see you tried both of those suggestions, and
> it didn't work for some reason.  Huh.  It really should...maybe I can
> look into why...

Right, I don't understand why that code did not work for me either. More
over, the code did not work as I expected in other ways, for example I
noticed that reverting the condition and the order of the fb_writes
(which should be completely the same thing) changed the behavior too...
I remember that it produced GPU hangs but I tried so many versions that
I can't tell... it felt as if the two fb_writes were being executed.  I
tried all kinds of things to make it work as intended, even including a
manual JMPI at the end of the IF branch to force the jump over the
second fb_write... but nothing.

I commented about this and included the C code and the generated
assembly for the JMPI version hoping that someone could tell if
something was off here:

http://lists.freedesktop.org/archives/mesa-dev/2014-May/059985.html

Maybe you can have a quick look at that see if there is something
obvious that I am missing...

Also, when I was trying the JMPI version I was developing the patches in
the generator rather than the visitor. I suppose that should not make
any difference though.

One last thing, JMPI was recently made private to the sf module, so if
we want to do go with it we need to revert f3cb2e6ed705.

> > +   }
> > +}
> > +
> > +void
> >  fs_visitor::emit_fb_writes()
> >  {
> >     this->current_annotation = "FB write header";
> > @@ -2848,16 +2896,7 @@ fs_visitor::emit_fb_writes()
> >        if (INTEL_DEBUG & DEBUG_SHADER_TIME)
> >           emit_shader_time_end();
> >  
> > -      fs_inst *inst = emit(FS_OPCODE_FB_WRITE);
> > -      inst->target = 0;
> > -      inst->base_mrf = base_mrf;
> > -      inst->mlen = nr - base_mrf;
> > -      inst->eot = true;
> > -      inst->header_present = header_present;
> > -      if ((brw->gen >= 8 || brw->is_haswell) && fp->UsesKill) {
> > -         inst->predicate = BRW_PREDICATE_NORMAL;
> > -         inst->flag_subreg = 1;
> > -      }
> > +      emit_fb_write(0, base_mrf, nr - base_mrf, true, header_present);
> >  
> >        prog_data->dual_src_blend = true;
> >        this->current_annotation = NULL;
> > @@ -2894,19 +2933,10 @@ fs_visitor::emit_fb_writes()
> >              emit_shader_time_end();
> >        }
> >  
> > -      fs_inst *inst = emit(FS_OPCODE_FB_WRITE);
> > -      inst->target = target;
> > -      inst->base_mrf = base_mrf;
> > -      if (src0_alpha_to_render_target && target == 0)
> > -         inst->mlen = nr - base_mrf - reg_width;
> > -      else
> > -         inst->mlen = nr - base_mrf;
> > -      inst->eot = eot;
> > -      inst->header_present = header_present;
> > -      if ((brw->gen >= 8 || brw->is_haswell) && fp->UsesKill) {
> > -         inst->predicate = BRW_PREDICATE_NORMAL;
> > -         inst->flag_subreg = 1;
> > -      }
> > +      int mlen = (src0_alpha_to_render_target && target == 0) ?
> > +         nr - base_mrf - reg_width : nr - base_mrf;
> 
> Might be a bit easier to read as:
> 
> int mlen = nr - base_mrf;
> if (src0_alpha_to_render_target && target == 0)
>    mlen -= reg_width;
> 
> But either way is fine.

Yeah, I think this looks better too.

> > +
> > +      emit_fb_write(target, base_mrf, mlen, eot, header_present);
> >     }
> >  
> >     if (key->nr_color_regions == 0) {
> > @@ -2919,15 +2949,7 @@ fs_visitor::emit_fb_writes()
> >        if (INTEL_DEBUG & DEBUG_SHADER_TIME)
> >           emit_shader_time_end();
> >  
> > -      fs_inst *inst = emit(FS_OPCODE_FB_WRITE);
> > -      inst->base_mrf = base_mrf;
> > -      inst->mlen = nr - base_mrf;
> > -      inst->eot = true;
> > -      inst->header_present = header_present;
> > -      if ((brw->gen >= 8 || brw->is_haswell) && fp->UsesKill) {
> > -         inst->predicate = BRW_PREDICATE_NORMAL;
> > -         inst->flag_subreg = 1;
> > -      }
> > +      emit_fb_write(0, base_mrf, nr - base_mrf, true, header_present);
> >     }
> >  
> >     this->current_annotation = NULL;
> > 
> 
> 




More information about the mesa-dev mailing list