[Mesa-dev] [PATCH 11/11] i965/vs: Improve live interval calculation.

Eric Anholt eric at anholt.net
Mon Oct 8 09:56:22 PDT 2012


Kenneth Graunke <kenneth at whitecape.org> writes:
> On 10/04/2012 04:07 PM, Eric Anholt wrote:
>> +
>> +   /* Now, extend those intervals using our analysis of control flow.
>> +    *
>> +    * The control flow-aware analysis was done at a channel level, while at
>> +    * this point we're distilling it down to vgrfs.
>> +    */
>
> Here's my understanding of why you did it this way:
>
> You wrote the control-flow-aware analysis on a per-channel basis because 
> that more accurately tracks liveness, and can give smaller intervals. 
> However, existing consumers of this liveness information (optimization 
> passes) currently expect it to be on a VGRF-level.  So, for 
> compatibility, you distilled it down at the very end.
>
> Someday, it might be worthwhile to make the consumers work on a 
> per-channel basis and stop distilling this.
>
> Is that reasonably accurate?

Thanks for bringing up cases where I'm not clear enough about "why".
How about this for improved comments:

/**
 * Sets up the use[] and def[] arrays.
 *
 * The basic-block-level live variable analysis needs to know which
 * variables get used before they're completely defined, and which
 * variables are completely defined before they're used.
 *
 * We independently track each channel of a vec4.  This is because we need to
 * be able to recognize a sequence like:
 *
 * ...
 * DP4 tmp.x a b;
 * DP4 tmp.y c d;
 * MUL result.xy tmp.xy e.xy
 * ...
 *
 * as having tmp live only across that sequence (assuming it's used nowhere
 * else), because it's a common pattern.  A more conservative approach that
 * doesn't get tmp marked a deffed in this block will tend to result in
 * spilling.
 */
void
vec4_live_variables::setup_def_use()

/**
 * Computes a conservative start/end of the live intervals for each virtual GRF.
 *
 * We could expose per-channel live intervals to the consumer based on the
 * information we computed in vec4_live_variables, except that our only
 * current user is virtual_grf_interferes().  So we instead union the
 * per-channel ranges into a per-vgrf range for virtual_grf_def[] and
 * virtual_grf_use[].
 *
 * We could potentially have virtual_grf_interferes() do the test per-channel,
 * which would let some interesting register allocation occur (particularly on
 * code-generated GLSL sequences from the Cg compiler which does register
 * allocation at the GLSL level and thus reuses components of the variable
 * with distinct lifetimes).  But right now the complexity of doing so doesn't
 * seem worth it, since having virtual_grf_interferes() be cheap is important
 * for register allocation performance.
 */
void
vec4_visitor::calculate_live_intervals()


>> +
>> +   /** Which devec4 reach the exit point of the block. */
>> +   bool *liveout;
>
> "Which defs"

lol sed.  I fixed this up once, and then it came back.

>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>> index 1dfdcce..9482d47 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>> @@ -2695,6 +2695,7 @@ vec4_visitor::vec4_visitor(struct brw_vs_compile *c,
>>      this->virtual_grf_def = NULL;
>>      this->virtual_grf_use = NULL;
>>      this->virtual_grf_sizes = NULL;
>> +   this->virtual_grf_chans = NULL;
>>      this->virtual_grf_count = 0;
>>      this->virtual_grf_reg_map = NULL;
>>      this->virtual_grf_reg_count = 0;
>
> Please move this hunk to patch 8 (as I mentioned in my other mail).
>
> This appears to be an accurate translation of brw_fs_live_variables.cpp 
> into the VS world, and handles per-channel information nicely.  I 
> haven't compared it against Muchnik's book, since that's currently 2000 
> miles away.

It's such an accurate translation I was tempted to try to keep the two
shared.

It's not as easy as I would hope.  Once you add virtual methods to the
instruction/reg clases, you have to stop using memset() to initialize
them because c++ doesn't have a good way to get sizeof(class) -
sizeof(vtable).  On the other hand, if all compilers that matter do it
the same way, we could unit test that that's the case and complain
really loud if it fails.

Only about half the file would be shareable, though, which is not much
code.

> With those tiny fixes, this is:
> Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
>
> I'm also curious about shader-db stats, but that's...just curiosity. :)

I've kind of given up on using shader-db for the VS, until we write a
better runner for it that doesn't result in dead-code elimination of
almost-everything-but-vertex-position.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20121008/752514af/attachment-0001.pgp>


More information about the mesa-dev mailing list