[Mesa-dev] [PATCH 0/6][RFC] glsl: Expand opt_minmax get_range

Thu Oct 30 09:42:42 PDT 2014

On Thu, Oct 30, 2014 at 12:38 PM, Ian Romanick <idr at freedesktop.org> wrote:
> On 10/29/2014 11:59 PM, Matt Turner wrote:
>> On Wed, Oct 29, 2014 at 6:11 PM, Thomas Helland
>> <thomashelland90 at gmail.com> wrote:
>>> This series does some initial work to make expansion of
>>> the get_range function a lot cleaner.
>>> It also adds a couple simple initial ranges.
>>> These patches are by no means perfect, but I hope
>>> they will provide some feedback and ideas.
>>> I'm hoping to expand this to do the following:
>>>   -Add get_range for most opcodes I can think of
>>>   -Add more utility functions to the constant_util file.
>>>   -Repurpose the file to optimize more than just min/max.
>>>   -Eliminate ifs whose result we already know
>>>   -Whatever pops into my head
>>
>> Sounds good.
>>
>>> I have some questions about undefined behaviour regarding this.
>>> Do we have any way of signaling in our IR that
>>> the variable is the result of undefined behaviour?
>>>
>>> In compilers like llvm, if I recall, they have a flag for this
>>> so they can signal undefined behaviour and use whatever value
>>> gives the most efficient code for its uses (used in -ffast-math).
>>>
>>> A hypothetical situation:
>>> We find that we have sqrt(x) where x has upper bound < 0.
>>> The spec says the behavior is undefined for x < 0.
>>> The same applies for inverse sqrt, log, log2 and pow.
>>> How should this be handled?
>>> Should a warning be issued?
>>> Could we simplify this to a constant 0?
>>> That would allow more optimizations to occur.
>>
>> That's probably what I'd try first.
>>
>> I applied your series and ran our internal shader-db through it. The
>> good news is that it helps some programs!
>>
>> The bad news is that it hurts even more programs. I randomly selected
>> two, and the relevant diffs looked like this:
>>
>> -math.sat exp(8) g91<1>F         g86<8,8,1>F     null            { align1 1Q compacted };
>> +math exp(8)     g91<1>F         g85<8,8,1>F     null            { align1 1Q compacted };
>> +sel.l(8)        g92<1>F         g91<8,8,1>F     1F              { align1 1Q };
>>
>> So we're saying we know the result of exp() must be >= 0, so no need
>> to handle the lower bound. Instead just clamp the top. Except saturate
>> is free and just clamping the top is not.
>>
>> Disabling ir_unop_exp/ir_unop_exp2 from patch 6/6 shows some programs
>> actually do benefit from this optimization though. Before, they did
>> things like:
>>
>> math exp(8)     g17<1>F         g14<8,8,1>F     null            { align1 1Q compacted };
>> sel.ge(8)       g124<1>F        g17<8,8,1>F     0F              { align1 1Q compacted };
>>
>> That tells me that there are gains to be had here. We just have to
>> figure out how.
>>
>> I'm not exactly sure what the best way to handle this is, but it seems
>> like we want to trim useless clamps *iff* they cannot be paired with
>> another to form a saturate.
>
> Would it be sufficient to do ir_unop_saturate generation before this
> pass?  Or is the pass breaking the saturate up?  Also... wasn't Eric
> saying his platform didn't have free saturates?

FWIW freedreno doesn't have a saturate instruction (or flag or
whatever). It's implemented with min/max.
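
To make the trade-off discussed above concrete, here is a minimal sketch
in C of one possible policy: drop a redundant clamp bound only when the
remaining half could not have been folded into a free saturate. All
names here (struct range, clamp_kind, choose_clamp, has_free_saturate)
are hypothetical and are not Mesa's actual IR or API; the per-backend
flag is an assumption, not an existing option.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types for illustration only; not Mesa's IR. */
struct range {
    float lo;   /* known lower bound of the expression */
    float hi;   /* known upper bound of the expression */
};

enum clamp_kind {
    CLAMP_NONE,      /* value is already within [0, 1] */
    CLAMP_MIN_ONLY,  /* emit min(x, 1) only */
    CLAMP_MAX_ONLY,  /* emit max(x, 0) only */
    CLAMP_SATURATE   /* emit a full clamp to [0, 1] */
};

/*
 * Decide how to lower clamp(x, 0, 1) given the known range of x.
 * has_free_saturate is an assumed per-backend flag: true where
 * saturate is a free instruction modifier, false where it is
 * lowered to min/max anyway.
 */
static enum clamp_kind
choose_clamp(struct range r, bool has_free_saturate)
{
    assert(r.lo <= r.hi);

    bool need_lower = r.lo < 0.0f;
    bool need_upper = r.hi > 1.0f;

    if (!need_lower && !need_upper)
        return CLAMP_NONE;          /* both bounds are redundant */
    if (need_lower && need_upper)
        return CLAMP_SATURATE;      /* nothing can be trimmed */

    /* Exactly one bound is redundant. A lone min or max costs an
     * instruction, so keep the full saturate where it is free. */
    if (has_free_saturate)
        return CLAMP_SATURATE;

    return need_upper ? CLAMP_MIN_ONLY : CLAMP_MAX_ONLY;
}
```

With a range of [0, large] for exp(), this keeps the saturate when the
flag is set and emits only the upper clamp when saturate is built from
min/max.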