[Mesa-dev] [PATCH 4/4] radeonsi: add instance divisor support

Christian König deathsimple at vodafone.de
Wed Mar 27 04:17:05 PDT 2013


On 27.03.2013 12:02, Christian König wrote:
> On 26.03.2013 18:03, Michel Dänzer wrote:
>> On Tue, 2013-03-26 at 17:37 +0100, Christian König wrote:
>>> On 26.03.2013 15:56, Michel Dänzer wrote:
>>>> On Tue, 2013-03-26 at 14:51 +0100, Christian König wrote:
>>>>> From: Christian König <christian.koenig at amd.com>
>>>>>
>>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>> [...]
>>>>> diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.h b/src/gallium/drivers/radeonsi/radeonsi_shader.h
>>>>> index 9dae742..e09f297 100644
>>>>> --- a/src/gallium/drivers/radeonsi/radeonsi_shader.h
>>>>> +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.h
>>>>> @@ -111,13 +111,18 @@ struct si_shader {
>>>>>        unsigned        nr_cbufs;
>>>>>    };
>>>>> -struct si_shader_key {
>>>>> -    unsigned        export_16bpc:8;
>>>>> -    unsigned        nr_cbufs:4;
>>>>> -    unsigned        color_two_side:1;
>>>>> -    unsigned        alpha_func:3;
>>>>> -    unsigned        flatshade:1;
>>>>> -    float            alpha_ref;
>>>>> +union si_shader_key {
>>>>> +    struct {
>>>>> +        unsigned    export_16bpc:8;
>>>>> +        unsigned    nr_cbufs:4;
>>>>> +        unsigned    color_two_side:1;
>>>>> +        unsigned    alpha_func:3;
>>>>> +        unsigned    flatshade:1;
>>>>> +        float        alpha_ref;
>>>>> +    } ps;
>>>>> +    struct {
>>>>> +        unsigned    instance_divisors[PIPE_MAX_ATTRIBS];
>>>>> +    } vs;
>>>>>    };
>>>> This grows the shader key from 8 to 128 bytes. I don't suppose the
>>>> instance divisors could be encoded in a more compact way? E.g. loading
>>>> the divisor values from constants and only tracking which elements
>>>> use a divisor in a bitmask in the key.
>>> Considered that also, and I have two problems with that approach:
>>> 1. While immediates are converted to shifts & muls, dividing even by a
>>> constant in the shader isn't cheap.
>> Is that really significant? How much work would it be to come up with a
>> worst case test and measure the difference?
>
> Well, I have no idea how to measure that on SI, but when I implemented
> the same feature on R600, the difference between using reciprocal and
> mul compared to mulhi was quite significant.
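For readers unfamiliar with the mulhi approach mentioned above, here is a
minimal sketch of dividing by a known constant with one multiply-high. It is
not the R600 or radeonsi code; the function names are hypothetical, and the
sketch assumes both the dividend and the divisor fit in 16 bits (divisor at
least 2), which is what makes the single multiply exact.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: precompute the "magic" multiplier for a divisor d.
 * Assumption: 2 <= d <= 65535, so floor(2^32 / d) + 1 fits in 32 bits. */
static inline uint32_t sketch_div_magic(uint32_t d)
{
    return (uint32_t)((UINT64_C(1) << 32) / d) + 1;
}

/* Hypothetical helper: n / d computed as the high 32 bits of n * magic.
 * Assumption: n <= 65535, which keeps the rounding error below one unit
 * for every divisor allowed above. */
static inline uint32_t sketch_div_mulhi(uint32_t n, uint32_t magic)
{
    return (uint32_t)(((uint64_t)n * magic) >> 32);
}
```

The magic value depends only on the divisor, so it can be computed once
when the divisor is known rather than per instance.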
>
>>
>>> How about storing only a byte for the instance_divisor? That limits
>>> the divisor to a modulo of 256, but I don't think that would be so
>>> extremely bad.
>> I have no idea what the impact of that would be. What happens if an app
>> tries to use a divisor >= 256?
>
> It probably would select the wrong shader :(
>
>>> That would reduce the key to 32 bytes instead.
>> Still seems kind of big.
>
> OK, how about the following compromise: first, we use a short for the
> instance divisor, which makes the key 32 bytes in size and should leave
> enough room for larger instance divisors, and second, we don't copy the
> key around so much any more.

Oops, I meant to write 64 bytes in size, sorry.
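
To make the sizes discussed in this thread concrete, here is a minimal
sketch, not part of the patch. The struct names and SKETCH_MAX_ATTRIBS are
hypothetical; the value 32 is inferred from the 128-byte figure quoted
above, assuming a 4-byte unsigned.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 32 vertex attributes (128 bytes / 4 bytes per divisor). */
#define SKETCH_MAX_ATTRIBS 32

/* Hypothetical stand-ins for the vs half of the key, one per proposal. */
struct sketch_vs_key_u32 { uint32_t instance_divisors[SKETCH_MAX_ATTRIBS]; };
struct sketch_vs_key_u8  { uint8_t  instance_divisors[SKETCH_MAX_ATTRIBS]; };
struct sketch_vs_key_u16 { uint16_t instance_divisors[SKETCH_MAX_ATTRIBS]; };

/* Compile-time checks of the sizes mentioned in the discussion. */
_Static_assert(sizeof(struct sketch_vs_key_u32) == 128, "full divisors");
_Static_assert(sizeof(struct sketch_vs_key_u8)  == 32,  "byte divisors");
_Static_assert(sizeof(struct sketch_vs_key_u16) == 64,  "short divisors");

/* A 16-bit field only distinguishes divisors modulo 65536, so two
 * divisors that differ by a multiple of 65536 would share a key. */
static inline uint16_t sketch_pack_divisor(unsigned divisor)
{
    return (uint16_t)divisor;
}
```

With a short, the aliasing problem discussed for the byte variant only
starts at divisors of 65536 and above.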

Christian.

>
> Regards,
> Christian.