[Mesa-dev] [PATCH 1/3] i965/fs: Implement the WaCMPInstFlagDepClearedEarly work-around.

Matt Turner mattst88 at gmail.com
Wed Feb 4 11:06:38 PST 2015


On Wed, Feb 4, 2015 at 2:18 AM, Francisco Jerez <currojerez at riseup.net> wrote:
> Matt Turner <mattst88 at gmail.com> writes:
>
>> Prevents piglit regressions from the next patch.
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 37 +++++++++++++++++++++++++-
>>  1 file changed, 36 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> index 77d4908..8cd36f8 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> @@ -1734,7 +1734,42 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>>           brw_F16TO32(p, dst, src[0]);
>>           break;
>>        case BRW_OPCODE_CMP:
>> -      brw_CMP(p, dst, inst->conditional_mod, src[0], src[1]);
>> +         /* The Ivybridge/BayTrail WaCMPInstFlagDepClearedEarly workaround says
>> +          * that when the destination is a GRF that the dependency-clear bit on
>> +          * the flag register is cleared early.
>> +          *
>> +          * Suggested workarounds are to disable coissuing CMP instructions
>> +          * or to split CMP(16) instructions into two CMP(8) instructions.
>> +          *
>> +          * We choose to split into CMP(8) instructions since disabling
>> +          * coissuing would affect CMP instructions not otherwise affected by
>> +          * the errata.
>> +          */
>> +         if (dispatch_width == 16 && brw->gen == 7 && !brw->is_haswell) {
>> +            if (dst.file == BRW_GENERAL_REGISTER_FILE) {
>> +               brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
>> +               brw_CMP(p, firsthalf(dst), inst->conditional_mod,
>> +                          firsthalf(src[0]), firsthalf(src[1]));
>> +               brw_set_default_compression_control(p, BRW_COMPRESSION_2NDHALF);
>> +               brw_CMP(p, sechalf(dst), inst->conditional_mod,
>> +                          sechalf(src[0]), sechalf(src[1]));
>> +               brw_set_default_compression_control(p, BRW_COMPRESSION_COMPRESSED);
>> +
>> +               multiple_instructions_emitted = true;
>> +            } else if (dst.file == BRW_ARCHITECTURE_REGISTER_FILE) {
>> +               /* For unknown reasons, the aforementioned workaround is not
>> +                * sufficient. Overriding the type when the destination is the
>> +                * null register is necessary but not sufficient by itself.
>> +                */
>> +               assert(dst.nr == BRW_ARF_NULL);
>> +               dst.type = BRW_REGISTER_TYPE_D;
>> +               brw_CMP(p, dst, inst->conditional_mod, src[0], src[1]);
>
> What do you mean by not sufficient?  This is quite a common use-case of
> the CMP instruction...  Any idea what should be done?

Implementing the WaCMPInstFlagDepClearedEarly workaround by splitting
CMP(16) -> 2x CMP(8) and copying src0's type to dst in *_visitor::CMP
leads to some piglit failures (glsl-fs-atan-2.shader_test for
instance).

But overriding the null destination type to D after copying src0's
type to dst in *_visitor::CMP leads to other piglit failures
(fs-bool-less-compare-{true,false}).

So it seems that both of these are necessary but not sufficient, and
frustratingly I cannot find any documentation for why the things fail
when the null destination's type is float.


More information about the mesa-dev mailing list