[Mesa-dev] [PATCH] squash! glsl: Optimize min/max expression trees

Connor Abbott cwabbott0 at gmail.com
Mon Aug 18 08:38:13 PDT 2014

On Mon, Aug 18, 2014 at 9:26 AM, Petri Latvala <petri.latvala at intel.com> wrote:
> On 08/14/2014 11:00 AM, Connor Abbott wrote:
>> Another thing I'd like to see is to change minmax_range to call things
>> "low" and "high" instead of "range[0]" and "range[1]." This helps
>> readability, and the tricks with indirect addressing that having an
>> array lets you do are things we really shouldn't be doing anyways
>> because it's hard to follow.
> Sure, changing.
>> As I mentioned before, swizzle_if_required() should probably use the
>> ir_builder swizzle helpers.
> I copied swizzle_if_required from opt_algebraic. I'll squeeze in a patch
> that changes that as well. Or actually just refactor the function to live
> somewhere where it's reusable.
>> I'm still not convinced that the algorithm is the best way to go about
>> it. Right now, AFAICT, we do something like:
>> - Pass in a "base range," which is what the min's and max's above us
>> in the tree will clamp the value we return to
>> - Get the ranges for each subexpression (this is a recursive call)
>> - Check and see if each operand is unnecessary (i.e. its range is
>> strictly greater than the base range or strictly greater than the
>> other argument for mins, the other way around for max's)
>> As another thing, the logic for this part could be made a *lot*
>> clearer by rearranging the code and commenting. I'd do something like:
>> bool is_redundant = false /* whether this operand will never affect
>> the final value of the min-max tree */
>> if (is_min) {
>>     /* if this operand will always be greater than the other one, it's
>> redundant */
>>     if (limit[i].low > limit[1 - i].high)
>>        is_redundant = true;
>>     /* if this operand is always greater than baserange, then even if
>> it's smaller than the other one it'll get clamped so it's redundant */
>>     if (limit[i].low > baserange.high)
>>        is_redundant = true;
>> } else {
>>     ... the exact same logic mirrored ...
>> }
>> - Recurse into the subexpressions, computing the new baserange.
>> What I think we should do instead is change prune_expression() to also
>> return the range for the expression (it's now returning two things, so
>> one would have to be passed via a class variable), so it would look
>> like:
>> - Pass in the base range
>> - If this is a constant, return ourself and the range with low == high
>> - Recurse into both subexpressions, setting both the range (limits[i])
>> and the new subexpression
>> - If one of the subexpressions is redundant, return the other
>> subexpression and its range
>> - Otherwise, return ourself and the combination of the ranges
>> This will allow us to do the recursion only once, instead of once in
>> get_range() and once in prune_expression(), which will make things
>> simpler and faster.
> You mean have only prune_expression(), cut out get_range()?
> I tried hard to have this recurse only once and it looks impossible to me.
> Consider this (hopefully this ascii art gets through fine):
>          max
>       /       \
>      max     max
>     /   \   /   \
>    3    a   b    2
> (If ascii art failed, it's    max(max(3, a), max(b, 2)) )
> a and b are variables, 2 and 3 constants. 2 is to be dropped from the right
> subtree of the top max, but for that we need the 3 from the left subtree.
> prune_expression() on the left subtree will get us the 3 as the limit, which
> correctly drops the 2 when recursed to the right subtree. What about if 3
> and 2 are swapped in the tree?

Ah, I see. Can you add a comment somewhere (perhaps before the call to
get_range()) that explains this all so some dummy like me doesn't
later ask why we recurse twice? Something like:

"Recurse to get the ranges for each of the subtrees of this
expression. We need to do this as a separate step because we need to
know the ranges of each of the subtrees before we prune either one.
Consider something like this:

(your ASCII art)

We would like to prune away the max on the bottom-right, but to do so
we need to know the range of the expression on the left beforehand,
and there's no guarantee that we will visit either subtree in a
particular order."

More information about the mesa-dev mailing list