[Mesa-dev] [PATCH 4/5] glsl: Vectorize multiple scalar assignments
Matt Turner
mattst88 at gmail.com
Fri Jan 10 17:57:50 PST 2014
On Fri, Jan 10, 2014 at 5:53 PM, Ian Romanick <idr at freedesktop.org> wrote:
> On 01/10/2014 03:27 PM, Matt Turner wrote:
>> On Thu, Jan 9, 2014 at 11:28 AM, Ian Romanick <idr at freedesktop.org> wrote:
>>> On 01/08/2014 12:43 PM, Matt Turner wrote:
>>>> +/**
>>>> + * \file opt_vectorize.cpp
>>>> + *
>>>> + * Combines scalar assignments of the same expression (modulo swizzle) to
>>>> + * multiple channels of the same variable into a single vectorized expression
>>>> + * and assignment.
>>>> + *
>>>> + * Many generated shaders contain scalarized code. That is, they contain
>>>> + *
>>>> + * r1.x = log2(v0.x);
>>>> + * r1.y = log2(v0.y);
>>>> + * r1.z = log2(v0.z);
>>>> + *
>>>> + * rather than
>>>> + *
>>>> + * r1.xyz = log2(v0.xyz);
>>>> + *
>>>> + * We look for consecutive assignments of the same expression (modulo swizzle)
>>>> + * to each channel of the same variable.
>>>> + *
>>>> + * For instance, we want to convert these three scalar operations
>>>> + *
>>>> + * (assign (x) (var_ref r1) (expression float log2 (swiz x (var_ref v0))))
>>>> + * (assign (y) (var_ref r1) (expression float log2 (swiz y (var_ref v0))))
>>>> + * (assign (z) (var_ref r1) (expression float log2 (swiz z (var_ref v0))))
>>>> + *
>>>> + * into a single vector operation
>>>> + *
>>>> + * (assign (xyz) (var_ref r1) (expression vec3 log2 (swiz xyz (var_ref v0))))
>>>
>>> I think it's worth adding a note that this pass only attempts to combine
>>> assignments that are sequential.
>>
>> That comment block already says that:
>>
>> + * We look for consecutive assignments of the same expression (modulo swizzle)
>> + * to each channel of the same variable.
>
> I guess I overlooked that word. :(
>
>> I'll change the first comment to use the word consecutive.
>>
>>> The above example gets fully
>>> vectorized, but this sequence would not:
>>>
>>> (assign (x) (var_ref r1) (expression float log2 (swiz x (var_ref v0))))
>>> (assign (x) (var_ref r2) (expression float log2 (swiz y (var_ref v0))))
>>> (assign (y) (var_ref r1) (expression float log2 (swiz z (var_ref v0))))
>>> (assign (y) (var_ref r2) (expression float log2 (swiz w (var_ref v0))))
>>>
>>> I think this will also break on code like
>>>
>>> (assign (x) (var_ref r1) (expression float log2 (swiz w (var_ref r1))))
>>> (assign (y) (var_ref r1) (expression float log2 (swiz z (var_ref r1))))
>>> # r1.xy have different values now.
>>> (assign (z) (var_ref r1) (expression float log2 (swiz y (var_ref r1))))
>>> (assign (w) (var_ref r1) (expression float log2 (swiz x (var_ref r1))))
>>>
>>> Maybe just skip assignments where the LHS also appears in the RHS for
>>> now? Or does the check write_mask_matches_swizzle take care of this?
>>
>> It won't break because the code rejects expressions that contain
>> swizzles that don't match the LHS's write mask. See the call to
>> write_mask_matches_swizzle().
>>
>> The good thing about this is that we can combine expressions that use
>> the LHS, like
>>
>> (assign (x) (var_ref r1) (expression float log2 (swiz x (var_ref r1))))
>> (assign (y) (var_ref r1) (expression float log2 (swiz y (var_ref r1))))
>
> Okay... that's what I thought, but I wanted to be sure.
>
> With the slight tweak to the header comment (that you mention above),
> patches 2, 4, and 5 are
>
> Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>

Cool, thanks. I'll fix up patch 3 and send it back out.

> This means we also won't vectorize things like
>
> (assign (x) (var_ref r1) (expression float * (swiz x (var_ref r1)) (swiz x (var_ref r2))))
> (assign (y) (var_ref r1) (expression float * (swiz y (var_ref r1)) (swiz x (var_ref r2))))
> (assign (z) (var_ref r1) (expression float * (swiz z (var_ref r1)) (swiz x (var_ref r2))))
> (assign (w) (var_ref r1) (expression float * (swiz w (var_ref r1)) (swiz x (var_ref r2))))
>
> Right? If there are occurrences of that pattern in shaderdb, that may
> be an opportunity for some follow-on work...

That's right. I think the best way to find out if that exists in the
wild is to implement it in the compiler and see if anything changes. :)
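
For anyone reading the archive later, here is a rough toy model of the rule discussed in this thread: a run of consecutive scalar assignments is merged only when each one writes the same channel it reads, and the destination, operation, and source all match. This is not the code from opt_vectorize.cpp. The types ScalarAssign and VectorAssign and the functions can_extend() and vectorize() are hypothetical names made up for illustration, and the model handles only single-operand expressions, so it says nothing about the multiply case above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one scalar assignment, e.g.
//   (assign (x) (var_ref r1) (expression float log2 (swiz x (var_ref v0))))
struct ScalarAssign {
    std::string dst;   // variable being written (r1)
    char write_chan;   // channel in the write mask (x)
    std::string op;    // operation name (log2)
    std::string src;   // variable being read (v0)
    char read_chan;    // channel selected by the swizzle (x)
};

// Hypothetical stand-in for a (possibly vectorized) assignment.
struct VectorAssign {
    std::string dst;
    std::string write_mask;
    std::string op;
    std::string src;
    std::string swizzle;
};

// A scalar assignment may join the current group only if both the group and
// the new assignment have a swizzle equal to their write mask, everything
// else is identical, and the channel has not been written by the group yet.
static bool can_extend(const VectorAssign &v, const ScalarAssign &a)
{
    return v.write_mask == v.swizzle
        && a.write_chan == a.read_chan
        && v.dst == a.dst && v.op == a.op && v.src == a.src
        && v.write_mask.find(a.write_chan) == std::string::npos;
}

// Walk the list once, merging only consecutive compatible assignments.
std::vector<VectorAssign> vectorize(const std::vector<ScalarAssign> &in)
{
    std::vector<VectorAssign> out;
    for (const ScalarAssign &a : in) {
        if (!out.empty() && can_extend(out.back(), a)) {
            out.back().write_mask += a.write_chan;
            out.back().swizzle += a.read_chan;
        } else {
            out.push_back({a.dst, std::string(1, a.write_chan), a.op, a.src,
                           std::string(1, a.read_chan)});
        }
    }
    return out;
}
```

In this model the r1.xyz = log2(v0.xyz) example collapses to one assignment, the self-referential r1.x/r1.y pair also merges, and both the interleaved r1/r2 sequence and the channel-swapping sequence are left as four separate assignments.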