[Mesa-dev] [PATCH 2/3] i965: Add runtime checks for line antialiasing in Gen < 6.

Kenneth Graunke kenneth at whitecape.org
Thu May 29 08:31:55 PDT 2014


On 05/29/2014 01:21 AM, Iago Toral wrote:
> Hi Kenneth,
> 
> On Wed, 2014-05-28 at 12:42 -0700, Kenneth Graunke wrote:
>> On 05/27/2014 03:50 AM, Iago Toral Quiroga wrote:
>>> In Gen < 6 the hardware generates a runtime bit that indicates whether AA data
>>> has to be sent as part of the framebuffer write SEND message. This affects the
>>> specific case where we have setup antialiased line rendering and we render
>>> polygons which have one face setup in GL_LINE mode (line antialiasing
>>> will be used) and the other one in GL_FILL mode (no line antialiasing needed).
>>>
>>> Currently we are not doing this runtime test and instead we always send AA
>>> data, which produces incorrect rendering of the GL_FILL face of the polygon in
>>> in the aforementioned scenario (verified in ironlake and gm45).
>>>
>>> In Gen4 this is, likely, a regression introduced with commit 098acf6c843. In
>>> Gen5 this has never worked properly. Gen > 5 are not affected by this.
>>>
>>> The patch fixes the problem by adding the appropriate runtime check and
>>> adjusting the framebuffer write message accordingly in the conflictive
>>> scenario (detected with fs_visitor::runtime_check_aads_emit == TRUE).
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78679
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_fs.h           |  4 ++
>>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 86 +++++++++++++++++-----------
>>>  2 files changed, 58 insertions(+), 32 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
>>> index 60a4906..ab8912f 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs.h
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
>>> @@ -452,6 +452,10 @@ public:
>>>  
>>>     void emit_color_write(int target, int index, int first_color_mrf);
>>>     void emit_alpha_test();
>>> +   void do_emit_fb_write(int target, int base_mrf, int mlen, bool eot,
>>> +                         bool header_present);
>>> +   void emit_fb_write(int target, int base_mrf, int mlen, bool eot,
>>> +                      bool header_present);
>>>     void emit_fb_writes();
>>>  
>>>     void emit_shader_time_begin();
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>> index 171f063..4c3897b 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>> @@ -2731,6 +2731,54 @@ fs_visitor::emit_alpha_test()
>>>  }
>>>  
>>>  void
>>> +fs_visitor::do_emit_fb_write(int target, int base_mrf, int mlen, bool eot,
>>> +                             bool header_present)
>>> +{
>>> +   fs_inst *inst = emit(FS_OPCODE_FB_WRITE);
>>> +   inst->target = target;
>>> +   inst->base_mrf = base_mrf;
>>> +   inst->mlen = mlen;
>>> +   inst->eot = eot;
>>> +   inst->header_present = header_present;
>>> +   if ((brw->gen >= 8 || brw->is_haswell) && fp->UsesKill) {
>>> +      inst->predicate = BRW_PREDICATE_NORMAL;
>>> +      inst->flag_subreg = 1;
>>> +   }
>>> +}
>>> +
>>> +void
>>> +fs_visitor::emit_fb_write(int target, int base_mrf, int mlen, bool eot,
>>> +                          bool header_present)
>>> +{
>>> +   if (!runtime_check_aads_emit) {
>>> +      do_emit_fb_write(target, base_mrf, mlen, eot, header_present);
>>> +   } else {
>>> +      /* This can only happen in Gen < 6
>>> +       */
>>> +      fs_reg reg_tmp_ud = fs_reg(this, glsl_type::uint_type);
>>> +      emit(AND(reg_tmp_ud,
>>> +               fs_reg(get_element_ud(brw_vec8_grf(1,0), 6)),
>>
>> I think
>>
>>     retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD)
>>
>> might be a little clearer than:
>>
>>     get_element_ud(brw_vec8_grf(1,0), 6))
>>
>> since it just refers to r1.6 right away, rather than r1.0 modified to
>> have a suboffset of 6.
> 
> Sure, that looks better.
> 
>>> +               fs_reg(brw_imm_ud(1<<26))));
>>> +      emit(CMP(reg_null_ud,
>>> +               reg_tmp_ud,
>>> +               fs_reg(brw_imm_ud(0)),
>>> +               BRW_CONDITIONAL_Z));
>>
>> You can actually generate a flag condition directly from the AND
>> instruction, and eliminate the CMP:
>>
>> fs_inst *inst =
>>    emit(AND(reg_null_ud,
>>             fs_reg(retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD),
>>             fs_reg(0)));
>> inst->conditional_mod = BRW_CONDITIONAL_Z;
>>
>> (you might have to use vec1(retype(brw_null_reg), BRW_REGISTER_TYPE_UD),
>> rather than reg_null_ud.)
> 
> Oh, that's much nicer! We also get rid of the temporary register with
> this.
> 
> I think you messed up the parameters to the AND though. I believe this
> is what you meant:
> 
> fs_inst *inst =
>     emit(AND(vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_UD)),
>              fs_reg(retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD)),
>              fs_reg(brw_imm_ud(1<<26))));
> inst->conditional_mod = BRW_CONDITIONAL_Z;

Whoops.  Yeah, that's what I meant :)

>>> +      emit(IF(BRW_PREDICATE_NORMAL));
>>> +      {
>>> +         /* Shift message header one register since we are not sending
>>> +          * AA data stored in base_mrf+2
>>> +          */
>>> +         do_emit_fb_write(target, base_mrf + 1, mlen - 1, eot, header_present);
>>> +      }
>>> +      emit(BRW_OPCODE_ELSE);
>>> +      {
>>> +         do_emit_fb_write(target, base_mrf, mlen, eot, header_present);
>>> +      }
>>> +      emit(BRW_OPCODE_ENDIF);
>>
>> I suppose this probably works, but I've never seen a program end with an
>> ENDIF.  I'm not really comfortable with doing that, since the ENDIF
>> should never be executed.  With JMPI or ADD brw_ip_reg(), we'd just jump
>> over one instruction and then execute one FB write or the other, which
>> seems really straightforward.
>>
>> But, looking at the bug, I see you tried both of those suggestions, and
>> it didn't work for some reason.  Huh.  It really should...maybe I can
>> look into why...
> 
> Right, I don't understand why that code did not work for me either. More
> over, the code did not work as I expected in other ways, for example I
> noticed that reverting the condition and the order of the fb_writes
> (which should be completely the same thing) changed the behavior too...
> I remember that it produced GPU hangs but I tried so many versions that
> I can't tell... it felt as if the two fb_writes were being executed.  I
> tried all kinds of things to make it work as intended, even including a
> manual JMPI at the end of the IF branch to force the jump over the
> second fb_write... but nothing.
> 
> I commented about this and included the C code and the generated
> assembly for the JMPI version hoping that someone could tell if
> something was off here:
> 
> http://lists.freedesktop.org/archives/mesa-dev/2014-May/059985.html
> 
> Maybe you can have a quick look at that see if there is something
> obvious that I am missing...

Yeah, I'll try and take a look...

> Also, when I was trying the JMPI version I was developing the patches in
> the generator rather than the visitor. I suppose that should not make
> any difference though.
> 
> One last thing, JMPI was recently made private to the sf module, so if
> we want to do go with it we need to revert f3cb2e6ed705.

It wasn't, actually - brw_JMPI is still available in brw_eu_emit.c.
It's only brw_land_fwd_jump that moved to brw_sf_state.c, but all that
does is set the jump distance on the JMPI instruction.  It seems like
you know the jump distance when emitting the JMPI (jump over 1 FB write
and maybe some MOVs) so you could just set it straightaway, and skip
that function.  Or you could revert it - that'd be fine too.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140529/d9b9ca73/attachment-0001.sig>


More information about the mesa-dev mailing list