[Mesa-dev] [PATCH] i965: Fix virtual_grf_interferes() between calculate_live_intervals() and DCE.
Kenneth Graunke
kenneth at whitecape.org
Thu Sep 6 10:08:04 PDT 2012
On 09/05/2012 11:16 PM, Eric Anholt wrote:
> This fixes the blue zombies bug in l4d2.
>
> NOTE: This is a candidate for the 9.0 branch.
> ---
> .../drivers/dri/i965/brw_fs_live_variables.cpp | 24 ++++++++++++++++++++
> 1 file changed, 24 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> index 16480ad..392691d 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> @@ -221,6 +221,30 @@ fs_visitor::calculate_live_intervals()
> }
>
> this->live_intervals_valid = true;
> +
> + /* Note in the non-control-flow code above, that we only take def[] as the
> + * first store, and use[] as the last use. We use this in dead code
> + * elimination, to determine when a store never gets used. However, we
> + * also use these arrays to answer the virtual_grf_interferes() question
> + * (live interval analysis), which is used for register coalescing and
> + * register allocation.
> + *
> + * So, there's a conflict over what the array should mean: if use[]
> + * considers a def after the last use, then the dead code elimination pass
> + * never does anything (and it's an important pass!). But if we don't
> + * include dead code, then virtual_grf_interferes() lies and we'll do
> + * horrible things like coalesce the register that is dead-code-written
> + * into another register that was live across the dead write (causing the
> + * use of the second register to take the dead write's source value instead
> + * of the coalesced MOV's source value).
> + *
> + * To resolve the conflict, immediately after calculating live intervals,
> + * detect dead code, nuke it, and if we changed anything, calculate again
> + * before returning to the caller. Now we happen to produce def[] and
> + * use[] arrays that will work for virtual_grf_interferes().
> + */
> + if (dead_code_eliminate())
> + calculate_live_intervals();
> }
>
> bool
I am worried that this will come back to haunt us later. This is the
first time calculate_live_intervals() has actually *changed* the IR,
rather than analyze it.
Specifically, if an optimization pass were to do:
fs_cfg cfg(this);
calculate_live_intervals();
and the start/end of a basic block happened to be dead code, then when
we hit
for (fs_inst *inst = block->start;
inst != block->end->next;
inst = (fs_inst *) inst->next) {
we'll most assuredly start traversing data and hit a NULL pointer (if we
don't crash when dereferencing block->start) since you'll never hit
block->end.
Granted, no code actually uses both CFGs and live intervals today, but
it sounds plausible.
Here's one idea for safeguarding against this:
1. Add a new "int using_cfg" variable to fs_visitor.
2. In the fs_cfg() constructor, increment v->using_cfg.
3. In the fs_cfg() destructor, decrement v->using_cfg.
4. In calculate_live_intervals, assert(!using_cfg);
Then it'd least die with a reasonable message.
What do you think? Worth doing? Something else? Am I being too paranoid?
Regardless of the paranoia I think you should push this (unless you
think of a better solution). After spending nearly a whole week hitting
my head against the wall, I'm ready to see this fixed.
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
More information about the mesa-dev
mailing list