[Mesa-dev] [PATCH] i965: Fix virtual_grf_interferes() between calculate_live_intervals() and DCE.
Eric Anholt
eric at anholt.net
Fri Sep 7 08:20:14 PDT 2012
Kenneth Graunke <kenneth at whitecape.org> writes:
> On 09/05/2012 11:16 PM, Eric Anholt wrote:
>> This fixes the blue zombies bug in l4d2.
>>
>> NOTE: This is a candidate for the 9.0 branch.
>> ---
>> .../drivers/dri/i965/brw_fs_live_variables.cpp | 24 ++++++++++++++++++++
>> 1 file changed, 24 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
>> index 16480ad..392691d 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
>> @@ -221,6 +221,30 @@ fs_visitor::calculate_live_intervals()
>> }
>>
>> this->live_intervals_valid = true;
>> +
>> + /* Note in the non-control-flow code above, that we only take def[] as the
>> + * first store, and use[] as the last use. We use this in dead code
>> + * elimination, to determine when a store never gets used. However, we
>> + * also use these arrays to answer the virtual_grf_interferes() question
>> + * (live interval analysis), which is used for register coalescing and
>> + * register allocation.
>> + *
>> + * So, there's a conflict over what the array should mean: if use[]
>> + * considers a def after the last use, then the dead code elimination pass
>> + * never does anything (and it's an important pass!). But if we don't
>> + * include dead code, then virtual_grf_interferes() lies and we'll do
>> + * horrible things like coalesce the register that is dead-code-written
>> + * into another register that was live across the dead write (causing the
>> + * use of the second register to take the dead write's source value instead
>> + * of the coalesced MOV's source value).
>> + *
>> + * To resolve the conflict, immediately after calculating live intervals,
>> + * detect dead code, nuke it, and if we changed anything, calculate again
>> + * before returning to the caller. Now we happen to produce def[] and
>> + * use[] arrays that will work for virtual_grf_interferes().
>> + */
>> + if (dead_code_eliminate())
>> + calculate_live_intervals();
>> }
>>
>> bool
>
> I am worried that this will come back to haunt us later. This is the
> first time calculate_live_intervals() has actually *changed* the IR,
> rather than analyze it.
>
> Specifically, if an optimization pass were to do:
>
> fs_cfg cfg(this);
> calculate_live_intervals();
>
> and the start/end of a basic block happened to be dead code, then when
> we hit
>
> for (fs_inst *inst = block->start;
> inst != block->end->next;
> inst = (fs_inst *) inst->next) {
>
> we'll most assuredly start traversing data and hit a NULL pointer (if we
> don't crash when dereferencing block->start) since you'll never hit
> block->end.
>
> Granted, no code actually uses both CFGs and live intervals today, but
> it sounds plausible.
>
> Here's one idea for safeguarding against this:
>
> 1. Add a new "int using_cfg" variable to fs_visitor.
> 2. In the fs_cfg() constructor, increment v->using_cfg.
> 3. In the fs_cfg() destructor, decrement v->using_cfg.
> 4. In calculate_live_intervals, assert(!using_cfg);
>
> Then it'd least die with a reasonable message.
>
> What do you think? Worth doing? Something else? Am I being too paranoid?
>
> Regardless of the paranoia I think you should push this (unless you
> think of a better solution). After spending nearly a whole week hitting
> my head against the wall, I'm ready to see this fixed.
Other times we've written code like this (e.g. no_batch_wrap), I've
found it irritating to maintain. For no_batch_wrap, it to catch very
likely bugs, while I feel that in this case the bugs are not likely.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20120907/9a99e89b/attachment.pgp>
More information about the mesa-dev
mailing list