[Mesa-dev] [PATCH 4/4] radeonsi: add instance divisor support

Alex Deucher alexdeucher at gmail.com
Wed Mar 27 05:43:07 PDT 2013


On Wed, Mar 27, 2013 at 7:25 AM, Michel Dänzer <michel at daenzer.net> wrote:
> On Mit, 2013-03-27 at 12:02 +0100, Christian König wrote:
>> Am 26.03.2013 18:03, schrieb Michel Dänzer:
>> > On Die, 2013-03-26 at 17:37 +0100, Christian König wrote:
>> >> Am 26.03.2013 15:56, schrieb Michel Dänzer:
>> >>> On Die, 2013-03-26 at 14:51 +0100, Christian König wrote:
>> >>>> From: Christian König <christian.koenig at amd.com>
>> >>>>
>> >>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>> >>>> [...]
>> >>>> diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.h b/src/gallium/drivers/radeonsi/radeonsi_shader.h
>> >>>> index 9dae742..e09f297 100644
>> >>>> --- a/src/gallium/drivers/radeonsi/radeonsi_shader.h
>> >>>> +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.h
>> >>>> @@ -111,13 +111,18 @@ struct si_shader {
>> >>>>          unsigned                nr_cbufs;
>> >>>>    };
>> >>>>
>> >>>> -struct si_shader_key {
>> >>>> -        unsigned                export_16bpc:8;
>> >>>> -        unsigned                nr_cbufs:4;
>> >>>> -        unsigned                color_two_side:1;
>> >>>> -        unsigned                alpha_func:3;
>> >>>> -        unsigned                flatshade:1;
>> >>>> -        float                   alpha_ref;
>> >>>> +union si_shader_key {
>> >>>> +        struct {
>> >>>> +                unsigned        export_16bpc:8;
>> >>>> +                unsigned        nr_cbufs:4;
>> >>>> +                unsigned        color_two_side:1;
>> >>>> +                unsigned        alpha_func:3;
>> >>>> +                unsigned        flatshade:1;
>> >>>> +                float           alpha_ref;
>> >>>> +        } ps;
>> >>>> +        struct {
>> >>>> +                unsigned        instance_divisors[PIPE_MAX_ATTRIBS];
>> >>>> +        } vs;
>> >>>>    };
>> >>> This grows the shader key from 8 to 128 bytes. I don't suppose the
>> >>> instance divisors could be encoded in a more compact way? E.g. loading
>> >>> the divisor values from constants and only tracking which elements use a
>> >>> divisor in a bitmask in the key.
>> >> Considered that also, and I have two problems with that approach:
>> >> 1. While immediates are converted to shifts & muls, dividing even by a
>> >> constant in the shader isn't cheap.
>> > Is that really significant? How much work would it be to come up with a
>> > worst case test and measure the difference?
>>
>> Well no idea how to measure that on SI,
>
> I'd guess something like: With otherwise trivial vertex and pixel
> shaders, draw huge numbers of triangles generating one pixel each, and
> measure how long it takes.
>
>
>> but when I implemented the same feature on R600 the difference between
>> using reciprocal and mul compared to mulhi where quite significant.
>
> I don't see anything about this in the r600g shader key though. What's
> the difference?
>
>
>> >> How about storing only a byte for the instance_divisor? That limit's the
>> >> divisor to a modulo of 256, but I don't think that would be so extremly bad.
>> > I have no idea what the impact of that would be. What happens if an app
>> > tries to use a divisor >= 256?
>>
>> It probably would select the wrong shader :(
>>
>> >> That would reduce the key to 32 bytes instead.
>> > Still seems kind of big.
>>
>> Ok how about the following compromise: First we use a short for the
>> instance divisor, that makes the key 32 bytes in size and should leave
>> enough room for larger instance divisors, and second we don't copy the
>> key around so much any more.
>
> This still won't work correctly for some legal divisor values, right? So
> I'm afraid this is a flawed compromise, as it doesn't really address the
> key size issues (potentially huge number of shader variants, cycles
> spent in memcmp, memory usage) either. If those can't be addressed, it
> should at least handle all legal values correctly.

We use the large keys now and look into runtime shader patching for
certain values as an improvement later on.

Alex


More information about the mesa-dev mailing list