[Mesa-dev] [PATCH] r600g: Fix access to constants > 31.

Jerome Glisse j.glisse at gmail.com
Sun Nov 21 13:32:22 PST 2010


On Sun, Nov 21, 2010 at 4:11 PM, Tilman Sauerbeck <tilman at code-monkey.de> wrote:
> Jerome Glisse [2010-11-21 11:07]:
>> On Sun, Nov 21, 2010 at 10:30 AM, Tilman Sauerbeck
>> <tilman at code-monkey.de> wrote:
>> > Signed-off-by: Tilman Sauerbeck <tilman at code-monkey.de>
>> > ---
>> >
>> > This is pretty ugly, but hopefully it will make a proper fix easier
>> > to implement :)
>> >
>> >  src/gallium/drivers/r600/r600_asm.c    |   30 +++++++++++++++++++++++++++++-
>> >  src/gallium/drivers/r600/r600_asm.h    |    1 +
>> >  src/gallium/drivers/r600/r600_shader.c |    7 ++++++-
>> >  3 files changed, 36 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/src/gallium/drivers/r600/r600_asm.c b/src/gallium/drivers/r600/r600_asm.c
>> > index ba1471e..fc85432 100644
>> > --- a/src/gallium/drivers/r600/r600_asm.c
>> > +++ b/src/gallium/drivers/r600/r600_asm.c
>> > @@ -410,6 +410,7 @@ int r600_bc_add_alu_type(struct r600_bc *bc, const struct r600_bc_alu *alu, int
>> >  {
>> >        struct r600_bc_alu *nalu = r600_bc_alu();
>> >        struct r600_bc_alu *lalu;
>> > +       unsigned kcache1_addr = 0;
>> >        int i, r;
>> >
>> >        if (nalu == NULL)
>> > @@ -440,13 +441,37 @@ int r600_bc_add_alu_type(struct r600_bc *bc, const struct r600_bc_alu *alu, int
>> >        }
>> >        /* number of gpr == the last gpr used in any alu */
>> >        for (i = 0; i < 3; i++) {
>> > +               unsigned const_index;
>> > +
>> >                if (alu->src[i].sel >= bc->ngpr && alu->src[i].sel < 128) {
>> >                        bc->ngpr = alu->src[i].sel + 1;
>> >                }
>> > +
>> > +               if (alu->src[i].is_const) {
>> > +                       const_index = alu->src[i].sel - 128;
>> > +                       unsigned want_addr;
>> > +
>> > +                       /* Constants 0-31 will be read from the first
>> > +                        * set of constants (kcache0_*).
>> > +                        *
>> > +                        * Constants > 31 will be read from the second set
>> > +                        * of constants (kcache1_*).
>> > +                        */
>> > +                       if (const_index > 31) {
>> > +                               want_addr = (const_index / 32) * 2;
>> > +
>> > +                               if (!kcache1_addr || kcache1_addr == want_addr) {
>> > +                                       kcache1_addr = want_addr;
>> > +                                       nalu->src[i].sel = 160 + (const_index % 32);
>> > +                               } else
>> > +                                       fprintf(stderr, "kcache1_addr already in use (want_addr %i, is %i)\n", want_addr, kcache1_addr);
>>
>> Here instead of printing and error you should add a new clause
>> r600_add_cf iirc and this should work. Also you might want to check if
>> want_addr is in [kache1_addr ; kcache1_addr+32] range in which case
>> you don't need a new cf.
>
> My code doesn't even find the most simple kcache address clashes.
> With TGSI code as this:
>> 0: MUL TEMP[0], CONST[32], CONST[0]
>
> we'd have _one_ CF clause with one MOV instruction copying one constant
> to a temporary, and then the MUL for the temporary and the second const.
>
> The first instruction might set bc->cf_last->kcache1_addr = 2,
> but the second instruction would overwrite kcache1_addr with zero again.
>
> So at the very least we'd need to remember if we already stored
> kcache1_addr across calls to r600_bc_add_alu().
>
> Regards,
> Tilman
>

This why the asm needs a rewrite it's too fragile. Idea would be to
convert tgsi -> asm and do pass on the asm, a pass for splitting
constant, a pass for literal, a pass for constant cache addr, a pass
for cf block size, a pass to compute the stack size ... as each pass
would only do limited things it would be more robust and also very
likely simpler code. I plan to start on this soon dunno when, likely
tomorrow.

I don't intend this asm part to be like a true compiler idea is more :
glsl->scalar optimizer compiler->asm pass->bytecode building the asm
pass is mostly to hide some of the details of the hw from the
compiler, the asm representation should be very close to the hw so
that bytecode is only a matter of bit shifting.

Cheers,
Jerome


More information about the mesa-dev mailing list