[Mesa-dev] [PATCH 7/9] i965/vec4: avoid dependency control around Align1 instructions
Connor Abbott
cwabbott0 at gmail.com
Thu Nov 19 07:34:06 PST 2015
On Thu, Nov 19, 2015 at 10:31 AM, Connor Abbott <cwabbott0 at gmail.com> wrote:
> On Thu, Nov 19, 2015 at 6:40 AM, Matt Turner <mattst88 at gmail.com> wrote:
>> On Thu, Nov 19, 2015 at 2:05 AM, Iago Toral Quiroga <itoral at igalia.com> wrote:
>>> From: Connor Abbott <connor.w.abbott at intel.com>
>>>
>>> It appears that not only math instructions, but also MOV_BYTES or
>>> any instruction that uses Align1 mode cannot be in the middle
>>> of a dependency control sequence or the GPU will hang (at least on my
>>> BDW). This fixes GPU hangs in some fp64 tests.
>>
>> I'm pretty surprised by this assessment. Doubtful even.
>>
>>> Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
>>> ---
>>> src/mesa/drivers/dri/i965/brw_vec4.cpp | 17 ++++++++++++-----
>>> 1 file changed, 12 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>>> index 3bcd5cb..bc0a33b 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>>> @@ -838,6 +838,17 @@ vec4_visitor::is_dep_ctrl_unsafe(const vec4_instruction *inst)
>>> }
>>>
>>> /*
>>> + * Instructions that use Align1 mode cause the GPU to hang when inserted
>>> + * between a NoDDClr and NoDDChk in Align16 mode. Discovered empirically.
>>> + */
>>> +
>>> + if (inst->opcode == VEC4_OPCODE_PACK_BYTES ||
>>> + inst->opcode == VEC4_OPCODE_MOV_BYTES ||
>>
>> PACK_BYTES sets depctrl itself in the generator, and at the time I
>> added it I made a test that did
>>
>> vec4 foo = vec4(packUnorm4x8(...),
>> packUnorm4x8(...),
>> packUnorm4x8(...),
>> packUnorm4x8(...))
>>
>> and confirmed that it set depctrl properly on the whole sequence.
>> There could of course be bugs somewhere, but the theory that "hardware
>> doesn't work if you mix align1 and align16 depctrl" seems unlikely.
>>
>> Do you know of a test that this affects?
>
> This only affects FP64 tests, since there we use an align1 mov to do
> double-to-float and float-to-double. However, I tried commenting out
> emit_nir_code() and just doing essentially:
>
> emit(MOV(...))->force_writemask_all = true;
> emit(VEC4_OPCODE_PACK_BYTES, ...);
> emit(MOV(...))->force_writemask_all = true;
Err, I meant:

emit(MOV(...))->no_dd_clear = true;
emit(VEC4_OPCODE_PACK_BYTES, ...);
emit(MOV(...))->no_dd_check = true;

(this also hangs with my DOUBLE_TO_FLOAT and FLOAT_TO_DOUBLE opcodes,
which don't use the data dependency bits at all).
>
> and on my BDW it hung. In case it's not clear: this isn't about
> setting depctrl on the instruction itself; the instruction just can't
> sit inside a depctrl sequence (which we were already disallowing for
> math instructions anyway).
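
To make the constraint concrete, here is a minimal standalone sketch of the
rule described above. It is not the actual Mesa pass: ToyInst,
toy_dep_ctrl_unsafe() and toy_set_dep_ctrl() are hypothetical names, and the
sketch assumes every instruction writes a different channel of the same
register, so any run of "safe" instructions could form one NoDDClr/NoDDChk
sequence. An Align1 or math instruction ends the run and gets no flags.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for vec4_instruction (not Mesa's type).
struct ToyInst {
    bool align1 = false;      // emitted in Align1 mode, e.g. PACK_BYTES/MOV_BYTES
    bool is_math = false;     // math instructions were already excluded
    bool no_dd_clear = false; // NoDDClr: don't clear the dependency afterwards
    bool no_dd_check = false; // NoDDChk: don't check the dependency beforehand
};

// Plays the role of is_dep_ctrl_unsafe() in this toy model only.
bool toy_dep_ctrl_unsafe(const ToyInst &inst)
{
    return inst.align1 || inst.is_math;
}

// For each maximal run of consecutive safe instructions:
//   first -> NoDDClr, middle -> NoDDClr + NoDDChk, last -> NoDDChk.
// Unsafe instructions split runs and never carry either flag.
void toy_set_dep_ctrl(std::vector<ToyInst> &insts)
{
    std::size_t i = 0;
    while (i < insts.size()) {
        if (toy_dep_ctrl_unsafe(insts[i])) {
            insts[i].no_dd_clear = false;
            insts[i].no_dd_check = false;
            ++i;
            continue;
        }

        // Find the last instruction of this run of safe instructions.
        std::size_t end = i;
        while (end + 1 < insts.size() && !toy_dep_ctrl_unsafe(insts[end + 1]))
            ++end;

        // A run of length one gets no flags at all (i == end).
        for (std::size_t k = i; k <= end; ++k) {
            insts[k].no_dd_clear = (k != end);
            insts[k].no_dd_check = (k != i);
        }
        i = end + 1;
    }
}
```

With three plain MOVs the sketch marks the whole run; if the middle
instruction is Align1, the two MOVs around it become separate runs of length
one and nothing is marked, which is the behaviour the patch is after.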