[Mesa-dev] [PATCH 2/3] i965: Add runtime checks for line antialiasing in Gen < 6.
Kenneth Graunke
kenneth at whitecape.org
Thu May 29 08:31:55 PDT 2014
On 05/29/2014 01:21 AM, Iago Toral wrote:
> Hi Kenneth,
>
> On Wed, 2014-05-28 at 12:42 -0700, Kenneth Graunke wrote:
>> On 05/27/2014 03:50 AM, Iago Toral Quiroga wrote:
>>> In Gen < 6 the hardware generates a runtime bit that indicates whether AA data
>>> has to be sent as part of the framebuffer write SEND message. This affects the
>>> specific case where we have setup antialiased line rendering and we render
>>> polygons which have one face setup in GL_LINE mode (line antialiasing
>>> will be used) and the other one in GL_FILL mode (no line antialiasing needed).
>>>
>>> Currently we are not doing this runtime test and instead we always send AA
>>> data, which produces incorrect rendering of the GL_FILL face of the polygon in
>>> in the aforementioned scenario (verified in ironlake and gm45).
>>>
>>> In Gen4 this is, likely, a regression introduced with commit 098acf6c843. In
>>> Gen5 this has never worked properly. Gen > 5 are not affected by this.
>>>
>>> The patch fixes the problem by adding the appropriate runtime check and
>>> adjusting the framebuffer write message accordingly in the conflictive
>>> scenario (detected with fs_visitor::runtime_check_aads_emit == TRUE).
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78679
>>> ---
>>> src/mesa/drivers/dri/i965/brw_fs.h | 4 ++
>>> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 86 +++++++++++++++++-----------
>>> 2 files changed, 58 insertions(+), 32 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
>>> index 60a4906..ab8912f 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs.h
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
>>> @@ -452,6 +452,10 @@ public:
>>>
>>> void emit_color_write(int target, int index, int first_color_mrf);
>>> void emit_alpha_test();
>>> + void do_emit_fb_write(int target, int base_mrf, int mlen, bool eot,
>>> + bool header_present);
>>> + void emit_fb_write(int target, int base_mrf, int mlen, bool eot,
>>> + bool header_present);
>>> void emit_fb_writes();
>>>
>>> void emit_shader_time_begin();
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>> index 171f063..4c3897b 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>> @@ -2731,6 +2731,54 @@ fs_visitor::emit_alpha_test()
>>> }
>>>
>>> void
>>> +fs_visitor::do_emit_fb_write(int target, int base_mrf, int mlen, bool eot,
>>> + bool header_present)
>>> +{
>>> + fs_inst *inst = emit(FS_OPCODE_FB_WRITE);
>>> + inst->target = target;
>>> + inst->base_mrf = base_mrf;
>>> + inst->mlen = mlen;
>>> + inst->eot = eot;
>>> + inst->header_present = header_present;
>>> + if ((brw->gen >= 8 || brw->is_haswell) && fp->UsesKill) {
>>> + inst->predicate = BRW_PREDICATE_NORMAL;
>>> + inst->flag_subreg = 1;
>>> + }
>>> +}
>>> +
>>> +void
>>> +fs_visitor::emit_fb_write(int target, int base_mrf, int mlen, bool eot,
>>> + bool header_present)
>>> +{
>>> + if (!runtime_check_aads_emit) {
>>> + do_emit_fb_write(target, base_mrf, mlen, eot, header_present);
>>> + } else {
>>> + /* This can only happen in Gen < 6
>>> + */
>>> + fs_reg reg_tmp_ud = fs_reg(this, glsl_type::uint_type);
>>> + emit(AND(reg_tmp_ud,
>>> + fs_reg(get_element_ud(brw_vec8_grf(1,0), 6)),
>>
>> I think
>>
>> retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD)
>>
>> might be a little clearer than:
>>
>> get_element_ud(brw_vec8_grf(1,0), 6))
>>
>> since it just refers to r1.6 right away, rather than r1.0 modified to
>> have a suboffset of 6.
>
> Sure, that looks better.
>
>>> + fs_reg(brw_imm_ud(1<<26))));
>>> + emit(CMP(reg_null_ud,
>>> + reg_tmp_ud,
>>> + fs_reg(brw_imm_ud(0)),
>>> + BRW_CONDITIONAL_Z));
>>
>> You can actually generate a flag condition directly from the AND
>> instruction, and eliminate the CMP:
>>
>> fs_inst *inst =
>> emit(AND(reg_null_ud,
>> fs_reg(retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD),
>> fs_reg(0)));
>> inst->conditional_mod = BRW_CONDITIONAL_Z;
>>
>> (you might have to use vec1(retype(brw_null_reg), BRW_REGISTER_TYPE_UD),
>> rather than reg_null_ud.)
>
> Oh, that's much nicer! We also get rid of the temporary register with
> this.
>
> I think you messed up the parameters to the AND though. I believe this
> is what you meant:
>
> fs_inst *inst =
> emit(AND(vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_UD)),
> fs_reg(retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD)),
> fs_reg(brw_imm_ud(1<<26))));
> inst->conditional_mod = BRW_CONDITIONAL_Z;
Whoops. Yeah, that's what I meant :)
>>> + emit(IF(BRW_PREDICATE_NORMAL));
>>> + {
>>> + /* Shift message header one register since we are not sending
>>> + * AA data stored in base_mrf+2
>>> + */
>>> + do_emit_fb_write(target, base_mrf + 1, mlen - 1, eot, header_present);
>>> + }
>>> + emit(BRW_OPCODE_ELSE);
>>> + {
>>> + do_emit_fb_write(target, base_mrf, mlen, eot, header_present);
>>> + }
>>> + emit(BRW_OPCODE_ENDIF);
>>
>> I suppose this probably works, but I've never seen a program end with an
>> ENDIF. I'm not really comfortable with doing that, since the ENDIF
>> should never be executed. With JMPI or ADD brw_ip_reg(), we'd just jump
>> over one instruction and then execute one FB write or the other, which
>> seems really straightforward.
>>
>> But, looking at the bug, I see you tried both of those suggestions, and
>> it didn't work for some reason. Huh. It really should...maybe I can
>> look into why...
>
> Right, I don't understand why that code did not work for me either. More
> over, the code did not work as I expected in other ways, for example I
> noticed that reverting the condition and the order of the fb_writes
> (which should be completely the same thing) changed the behavior too...
> I remember that it produced GPU hangs but I tried so many versions that
> I can't tell... it felt as if the two fb_writes were being executed. I
> tried all kinds of things to make it work as intended, even including a
> manual JMPI at the end of the IF branch to force the jump over the
> second fb_write... but nothing.
>
> I commented about this and included the C code and the generated
> assembly for the JMPI version hoping that someone could tell if
> something was off here:
>
> http://lists.freedesktop.org/archives/mesa-dev/2014-May/059985.html
>
> Maybe you can have a quick look at that see if there is something
> obvious that I am missing...
Yeah, I'll try and take a look...
> Also, when I was trying the JMPI version I was developing the patches in
> the generator rather than the visitor. I suppose that should not make
> any difference though.
>
> One last thing, JMPI was recently made private to the sf module, so if
> we want to do go with it we need to revert f3cb2e6ed705.
It wasn't, actually - brw_JMPI is still available in brw_eu_emit.c.
It's only brw_land_fwd_jump that moved to brw_sf_state.c, but all that
does is set the jump distance on the JMPI instruction. It seems like
you know the jump distance when emitting the JMPI (jump over 1 FB write
and maybe some MOVs) so you could just set it straightaway, and skip
that function. Or you could revert it - that'd be fine too.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140529/d9b9ca73/attachment-0001.sig>
More information about the mesa-dev
mailing list