[Mesa-dev] [PATCH 2/2] i965: Make DCE set null destinations on messages with side effects.

Francisco Jerez currojerez at riseup.net
Tue Jan 17 23:01:50 UTC 2017


Matt Turner <mattst88 at gmail.com> writes:

> On Tue, Jan 17, 2017 at 2:38 PM, Francisco Jerez <currojerez at riseup.net> wrote:
>> Matt Turner <mattst88 at gmail.com> writes:
>>
>>> On Tue, Jan 17, 2017 at 11:40 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>>>> Kenneth Graunke <kenneth at whitecape.org> writes:
>>>>
>>>>> (Co-authored by Matt Turner.)
>>>>>
>>>>> Image atomics, for example, return a value - but the shader may not
>>>>> want to use it.  We assigned a useless VGRF destination.  This seemed
>>>>> harmless, but it can actually be quite harmful.  The register allocator
>>>>> has to assign that VGRF to a real register.  It may assign the same
>>>>> actual GRF to the destination of an instruction that follows soon after.
>>>>>
>>>>> This results in a write-after-write (WAW) dependency, and stall.
>>>>>
>>>>> A number of "Deus Ex: Mankind Divided" shaders use image atomics, but
>>>>> don't use the return value.  Several of these were hitting WAW stalls
>>>>> for nearly 14,000 (poorly estimated) cycles a pop.  Making dead code
>>>>> elimination null out the destination avoids this issue.
>>>>>
>>>>> This patch cuts one shader's estimated cycles by -98.39%!  Removing the
>>>>> message response should also help with data cluster bandwidth.
>>>>>
>>>>> On Skylake:
>>>>>
>>>>> total instructions in shared programs: 13366907 -> 13363051 (-0.03%)
>>>>> instructions in affected programs: 49635 -> 45779 (-7.77%)
>>>>> helped: 133
>>>>> HURT: 0
>>>>>
>>>>> total cycles in shared programs: 255433388 -> 248081818 (-2.88%)
>>>>> cycles in affected programs: 12370702 -> 5019132 (-59.43%)
>>>>> helped: 100
>>>>> HURT: 24
>>>>>
>>>>> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
>>>>> ---
>>>>>  .../dri/i965/brw_fs_dead_code_eliminate.cpp        | 56 ++++++++++++++++------
>>>>>  1 file changed, 41 insertions(+), 15 deletions(-)
>>>>>
>>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
>>>>> index 930dc733b45..885ae2638a8 100644
>>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
>>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
>>>>> @@ -34,6 +34,43 @@
>>>>>   * yet in the tail end of this block.
>>>>>   */
>>>>>
>>>>> +static bool
>>>>> +can_eliminate(const fs_inst *inst, BITSET_WORD *flag_live)
>>>>> +{
>>>>> +    return inst->opcode != BRW_OPCODE_IF &&
>>>>> +           inst->opcode != BRW_OPCODE_WHILE &&
>>>>
>>>> Isn't this missing a bunch of other control flow instructions that
>>>> shouldn't be eliminated?  What's going on here, are they also being
>>>> DCE'ed?  Shouldn't this be !inst->is_control_flow()?
>>>
>>> IF/WHILE were checked specifically because on Sandybridge they can
>>> take a conditional-mod and a null destination.
>>>
>>> Other control-flow instructions (or even IF & WHILE without cmod) have
>>> BAD_FILE as a destination. Now that I think about it, I'm not really
>>> sure why if.cmod wouldn't have had a BAD_FILE destination as well...
>>>
>>
>> Heh, that explains why the code above wasn't getting rid of half the
>> control flow instructions from the program.  Thanks.
>>
>>> Since we never emit IF/WHILE with cmod anymore, it doesn't matter what
>>> we do with the code. We should probably just make sure those
>>> instructions have BAD_FILE dest. I'll be happy to clean this up after
>>> these patches go in.
>>
>> I think DCE should just refuse to eliminate any control flow
>> instructions, it's wrong regardless of what the destination register
>> file is -- Or do BAD_FILE registers have any implications regarding the
>> kind of side effects the instructions that carry them may have?  I hope
>> not.
>
> Right, I don't think BAD_FILE implies anything about side-effects.
> Changing it to check is_control_flow would be fine with me.

Sounds good -- Ken, since it's an obvious fix, do you feel like adding
the !inst->is_control_flow() check while you're refactoring that code?
With that and my other comment addressed patch gets my:

Reviewed-by: Francisco Jerez <currojerez at riseup.net>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170117/1205a093/attachment.sig>


More information about the mesa-dev mailing list