[Mesa-dev] [PATCH] i965: Fix virtual_grf_interferes() between calculate_live_intervals() and DCE.

Kenneth Graunke kenneth at whitecape.org
Thu Sep 6 10:08:04 PDT 2012


On 09/05/2012 11:16 PM, Eric Anholt wrote:
> This fixes the blue zombies bug in l4d2.
> 
> NOTE: This is a candidate for the 9.0 branch.
> ---
>  .../drivers/dri/i965/brw_fs_live_variables.cpp     |   24 ++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> index 16480ad..392691d 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> @@ -221,6 +221,30 @@ fs_visitor::calculate_live_intervals()
>     }
>  
>     this->live_intervals_valid = true;
> +
> +   /* Note in the non-control-flow code above, that we only take def[] as the
> +    * first store, and use[] as the last use.  We use this in dead code
> +    * elimination, to determine when a store never gets used.  However, we
> +    * also use these arrays to answer the virtual_grf_interferes() question
> +    * (live interval analysis), which is used for register coalescing and
> +    * register allocation.
> +    *
> +    * So, there's a conflict over what the array should mean: if use[]
> +    * considers a def after the last use, then the dead code elimination pass
> +    * never does anything (and it's an important pass!).  But if we don't
> +    * include dead code, then virtual_grf_interferes() lies and we'll do
> +    * horrible things like coalesce the register that is dead-code-written
> +    * into another register that was live across the dead write (causing the
> +    * use of the second register to take the dead write's source value instead
> +    * of the coalesced MOV's source value).
> +    *
> +    * To resolve the conflict, immediately after calculating live intervals,
> +    * detect dead code, nuke it, and if we changed anything, calculate again
> +    * before returning to the caller.  Now we happen to produce def[] and
> +    * use[] arrays that will work for virtual_grf_interferes().
> +    */
> +   if (dead_code_eliminate())
> +      calculate_live_intervals();
>  }
>  
>  bool

I am worried that this will come back to haunt us later.  This is the
first time calculate_live_intervals() has actually *changed* the IR,
rather than analyze it.

Specifically, if an optimization pass were to do:

fs_cfg cfg(this);
calculate_live_intervals();

and the start/end of a basic block happened to be dead code, then when
we hit

   for (fs_inst *inst = block->start;
        inst != block->end->next;
        inst = (fs_inst *) inst->next) {

we'll most assuredly start traversing data and hit a NULL pointer (if we
don't crash when dereferencing block->start) since you'll never hit
block->end.

Granted, no code actually uses both CFGs and live intervals today, but
it sounds plausible.

Here's one idea for safeguarding against this:

1. Add a new "int using_cfg" variable to fs_visitor.
2. In the fs_cfg() constructor, increment v->using_cfg.
3. In the fs_cfg() destructor, decrement v->using_cfg.
4. In calculate_live_intervals, assert(!using_cfg);

Then it'd least die with a reasonable message.

What do you think?  Worth doing?  Something else?  Am I being too paranoid?

Regardless of the paranoia I think you should push this (unless you
think of a better solution).  After spending nearly a whole week hitting
my head against the wall, I'm ready to see this fixed.

Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>


More information about the mesa-dev mailing list