[Mesa-dev] [PATCH] i965/fs: Remove try_replace_with_sel().

Matt Turner mattst88 at gmail.com
Fri Dec 5 10:18:44 PST 2014


On Thu, Dec 4, 2014 at 11:22 PM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> On Friday, November 21, 2014 10:23:43 AM Matt Turner wrote:
>> On Tue, Nov 11, 2014 at 9:41 AM, Matt Turner <mattst88 at gmail.com> wrote:
>> > The rest of our backend optimizations have replaced the need for this
>> > since it was written.
>> >
>> > instructions in affected programs:     30626 -> 30564 (-0.20%)
>> >
>> > Hurts a small number of CSGO shaders by one instruction, but helps even
>> > more. Hurts two by a larger number because of something I noticed when I
>> > first wrote the SEL peephole: try_replace_with_sel() operates on
>> > instructions before we've demoted uniforms to pull constants. So code
>> > like
>> >
>> >    var.x = ( -abs(r6.w) >= 0.0 ) ? pc[82].x : r9.x;
>> >    var.y = ( -abs(r6.w) >= 0.0 ) ? pc[82].y : r9.y;
>> >    var.z = ( -abs(r6.w) >= 0.0 ) ? pc[82].z : r9.z;
>> >    var.w = ( -abs(r6.w) >= 0.0 ) ? pc[82].w : r9.w;
>> >
>> > where pc[82] gets demoted to a pull constant, we end up emitting a
>> > send(4) instruction to load pc[82] each time. Since they're in
>> > different basic blocks (because we mishandle the ternary operator in
>> > this case), we can't combine them. Once we handle this common ternary
>> > pattern better, the problem will go away.
>> > ---
>>
>> Thoughts?
>
> I don't know...if I'm reading the above text correctly, this doesn't look
> compelling.  Your argument for deleting this is "it's not necessary anymore",
> but you go on to undermine that by saying that it hurts a few shaders, and
> even quadruples the number of pull loads in a few cases...
>
> Sure, it'll get fixed if we handle ternary operations better...but we haven't
> yet...so...
>
> I'm pretty confused.  Maybe I'm misreading your justification...

52 shaders are helped by removing it (16 by one instruction), and only
17 are hurt. Of those 17, 15 have one extra instruction. The remaining
two are the cases I described that this pass (not by design, I think?)
is able to handle because demoting to pull constants hasn't happened
yet.
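
To make those two cases concrete, here is a toy model in Python. It is not
Mesa code: the instruction tuples and the local_cse_loads() helper are made
up for illustration, and it assumes identical loads can only be merged when
they sit in the same basic block.

```python
def local_cse_loads(blocks):
    """Count pull-constant loads left after a block-local CSE.

    Each block is a list of (opcode, operand) tuples. Identical loads are
    merged only when they appear in the same basic block.
    """
    total = 0
    for block in blocks:
        # A set keeps one copy of each distinct load within this block.
        total += len({inst for inst in block if inst[0] == "pull_load"})
    return total


# Four ternaries, each lowered to its own block: one load of pc[82] per block.
branchy = [[("pull_load", "pc[82]"), ("mov", c)] for c in "xyzw"]

# The same four ternaries as SELs in a single block: the load is emitted four
# times, but all four copies are identical and can be merged into one.
straight = [[inst
             for c in "xyzw"
             for inst in (("pull_load", "pc[82]"), ("sel", c))]]

print(local_cse_loads(branchy))   # loads that survive across separate blocks
print(local_cse_loads(straight))  # loads that survive in one block
```

In this model the branchy form keeps four loads and the single-block form
keeps one, which is the 4x difference in pull loads Ken mentioned.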

My claim is that the optimization is now a net loss.

... and that this pass isn't the right place to fix the problem in
those two shaders. For the record, their results are:

HURT:   shaders/closed/steam/counter-strike-global-offensive/2433.shader_test
SIMD8: 668 -> 683 (2.25%)
HURT:   shaders/closed/steam/counter-strike-global-offensive/2508.shader_test
SIMD8: 733 -> 756 (3.14%)
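
The percentages follow directly from the SIMD8 instruction counts. A quick
check in Python (the pct_change() helper name is only for illustration):

```python
def pct_change(before, after):
    """Relative change in instruction count, as a percentage."""
    return (after - before) / before * 100.0


# SIMD8 instruction counts quoted above for the two hurt shaders.
hurt = {
    "2433.shader_test": (668, 683),
    "2508.shader_test": (733, 756),
}

for name, (before, after) in hurt.items():
    print(f"{name}: {before} -> {after} ({pct_change(before, after):.2f}%)")
```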

If you don't think removing the pass is worth it, okay.

