[Mesa-dev] [RFC PATCH] nir: Transform 4*x into x << 2 during late optimizations.
Jason Ekstrand
jason at jlekstrand.net
Fri May 8 12:11:55 PDT 2015
On Fri, May 8, 2015 at 11:11 AM, Ian Romanick <idr at freedesktop.org> wrote:
> On 05/08/2015 03:36 AM, Kenneth Graunke wrote:
>> According to Glenn, shifts on R600 have 5x the throughput of multiplies.
>>
>> Intel GPUs have strange integer multiplication restrictions - on most
>> hardware, MUL actually only does a 32-bit x 16-bit multiply. This
>> means the arguments aren't commutative, which can limit our constant
>> propagation options. SHL has no such restrictions.
>>
>> Shifting is probably reasonable on most people's hardware, so let's just
>> do that.
>>
>> i965 shader-db results (using NIR for VS):
>> total instructions in shared programs: 7432587 -> 7388982 (-0.59%)
>> instructions in affected programs: 1360411 -> 1316806 (-3.21%)
>> helped: 5772
>> HURT: 0
>>
>> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
>> Cc: mattst88 at gmail.com
>> Cc: jason at jlekstrand.net
>> ---
>> src/glsl/nir/nir_opt_algebraic.py | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> So...I found a bizarre issue with this patch.
>>
>> (('imul', 4, a), ('ishl', a, 2)),
>>
>> totally optimizes things. However...
>>
>> (('imul', a, 4), ('ishl', a, 2)),
>>
>> doesn't seem to do anything, even though imul is commutative, and nir_search
>> should totally handle that...
>>
>> WAT?!
>>
>> If you know why, let me know, otherwise I may have to look into it when more
>> awake.
>
> I've noticed a couple other weird things that I have been unable to
> understand. Shaders like the one below end with fmul/ffma instead of
> flrp, for example. I understand why that happens from GLSL IR
> opt_algebraic, but it seems like nir_opt_algebraic should handle it.
Just a guess, but it's quite possibly due to the commutative
operations bug I just sent a patch for.
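
To make the expected behaviour concrete, here is a toy sketch of matching a
commutative operation. It is not the nir_search code, and Var, COMMUTATIVE
and match() are names made up for this illustration. The point is only that
a pattern written as ('imul', 4, a) should match the expression with its
operands in either order.

```python
# Toy illustration only; this is NOT the real nir_search implementation.
class Var:
    """A pattern variable that matches any operand (hypothetical helper)."""
    def __init__(self, name):
        self.name = name

# Opcodes treated as commutative in this toy (an assumption for illustration).
COMMUTATIVE = {'imul', 'iadd', 'fmul', 'fadd'}

def match(pattern, expr, env):
    """Return True if expr matches pattern, recording variable bindings in env."""
    if isinstance(pattern, Var):
        if pattern.name in env:
            return env[pattern.name] == expr
        env[pattern.name] = expr
        return True
    if not isinstance(pattern, tuple):
        return pattern == expr              # a constant must match exactly
    if (not isinstance(expr, tuple) or len(expr) != len(pattern)
            or expr[0] != pattern[0]):
        return False
    orders = [expr[1:]]
    if pattern[0] in COMMUTATIVE and len(expr) == 3:
        orders.append(expr[:0:-1])          # also try the swapped operands
    for operands in orders:
        trial = dict(env)                   # keep failed attempts from leaking bindings
        if all(match(p, e, trial) for p, e in zip(pattern[1:], operands)):
            env.clear()
            env.update(trial)
            return True
    return False

a = Var('a')
pattern = ('imul', 4, a)
env_const_first, env_const_second = {}, {}
assert match(pattern, ('imul', 4, 'x'), env_const_first)
assert match(pattern, ('imul', 'x', 4), env_const_second)
```

If the swapped order is never tried, only the first of those two asserts
would pass, which is the symptom Ken describes.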
--Jason
> [require]
> GLSL >= 1.30
>
> [vertex shader]
> in vec4 v;
> in vec2 tc_in;
>
> out vec2 tc;
>
> void main() {
>     gl_Position = v;
>     tc = tc_in;
> }
>
> [fragment shader]
> in vec2 tc;
>
> out vec4 color;
>
> uniform sampler2D s;
> uniform float a;
> uniform vec3 base_color;
>
> void main() {
>     vec3 tex_color = texture(s, tc).xyz;
>
>     color.xyz = (base_color * a) + (tex_color * (1.0 - a));
>     color.a = 1.0;
> }
>
>
>
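
For reference, the blend in that fragment shader is a linear interpolation
per channel. The sketch below assumes flrp(x, y, t) means x * (1 - t) + y * t,
which is my understanding of the opcode. The helper names are made up for
this check.

```python
import math

def flrp(x, y, t):
    """Linear interpolation (assumed semantics): x when t == 0, y when t == 1."""
    return x * (1.0 - t) + y * t

def shader_blend(base_color, tex_color, a):
    """Per-channel form of the expression in the fragment shader."""
    return (base_color * a) + (tex_color * (1.0 - a))

# The shader expression should equal flrp(tex_color, base_color, a).
for base, tex, a in [(0.25, 0.75, 0.0), (0.25, 0.75, 1.0), (0.1, 0.9, 0.3)]:
    assert math.isclose(shader_blend(base, tex, a), flrp(tex, base, a))
```

So the fmul/ffma sequence could in principle collapse to a single flrp with
tex_color and base_color as the endpoints and a as the weight.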
>> diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py
>> index 400d60e..350471f 100644
>> --- a/src/glsl/nir/nir_opt_algebraic.py
>> +++ b/src/glsl/nir/nir_opt_algebraic.py
>> @@ -247,6 +247,11 @@ late_optimizations = [
>> (('fge', ('fadd', a, b), 0.0), ('fge', a, ('fneg', b))),
>> (('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
>> (('fne', ('fadd', a, b), 0.0), ('fne', a, ('fneg', b))),
>> +
>> + # Multiplication by 4 comes up fairly often in indirect offset calculations.
>> + # Some GPUs have weird integer multiplication limitations, but shifts should work
>> + # equally well everywhere.
>> + (('imul', 4, a), ('ishl', a, 2)),
>
> This should be conditionalized on whether the platform has native integers.
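
On the native-integer point: the rewrite is an identity when integers are
real 32-bit values that wrap on overflow. A quick check, modelling 32-bit
wraparound with a mask (imul32 and ishl32 are hypothetical helpers, not
anything in Mesa):

```python
MASK32 = 0xFFFFFFFF

def imul32(x, y):
    """32-bit integer multiply with wraparound."""
    return (x * y) & MASK32

def ishl32(x, n):
    """32-bit left shift; bits shifted past bit 31 are discarded."""
    return (x << (n & 31)) & MASK32

# Include values whose top bits overflow when multiplied by 4.
for x in (0, 1, 3, 0x3FFFFFFF, 0x40000000, 0x80000001, 0xFFFFFFFF):
    assert imul32(4, x) == ishl32(x, 2)
```

On hardware where integers are emulated with floats there is no shift to
lower to, which is presumably why the rule wants a condition.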
>
>> ]
>>
>> print nir_algebraic.AlgebraicPass("nir_opt_algebraic", optimizations).render()
>>
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev