[Mesa-dev] [PATCH] [rfc] radv: offset images by a differing amount.

Marek Olšák maraeo at gmail.com
Fri Jul 7 22:27:30 UTC 2017


On Fri, Jul 7, 2017 at 9:37 PM, Dave Airlie <airlied at gmail.com> wrote:
> On 8 July 2017 at 04:07, Christian König <deathsimple at vodafone.de> wrote:
>> On 07.07.2017 at 18:51, Marek Olšák wrote:
>>>
>>> On Fri, Jul 7, 2017 at 11:18 AM, Christian König
>>> <deathsimple at vodafone.de> wrote:
>>>>
>>>> What tiling format do the destination textures have?
>>>>
>>>> Sounds like the offset is just added so that we distribute memory
>>>> accesses more equally over memory channels.
>>>
>>> You can't set an offset that is not aligned. The hardware ignores the
>>> low unaligned bits, so they have a different meaning. They specify
>>> pipe and bank rotation for macro tiling. It's like a state. It
>>> basically rotates the tile pattern.
>>
>>
>> Yeah, I know. That's what I meant by distributing memory accesses more
>> equally over all channels. The lower bits select a memory bank swizzle IIRC.
>>
>> Years ago I tried on R600 to see whether shuffling them randomly could
>> improve performance, but MRT wasn't widely used and/or supported at that time.
>
> I'd known this and forgotten it. The public CIK docs say bits 0..7 must be
> zero, but I have older docs which had more info. It would be nice if we could
> get proper docs released for the bottom bits, considering AMD is using them
> in their drivers.

The low 8 bits of the address are unused and can't be set, because
CB_COLOR0_BASE is shifted by 8 bits. We are really talking about bits
starting from 8 going higher. E.g. 8K alignment gives you 5 bits that
can be used to express the rotation.
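
To make the arithmetic concrete, here is a minimal sketch in C. It assumes
only what is stated above: the base register holds the address shifted right
by 8, so an alignment of 2^n bytes leaves n - 8 low register bits free. The
helper names are hypothetical, not Mesa or addrlib functions, and the actual
pipe/bank encoding of the swizzle value is not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* The base register stores the address shifted right by 8 bits. */
#define BASE_SHIFT 8u

/* Hypothetical helper: how many low bits of the base register are left
 * free by a power-of-two alignment (in bytes) of at least 256. */
static unsigned swizzle_bit_count(uint64_t alignment)
{
    assert(alignment >= (1ull << BASE_SHIFT));
    assert((alignment & (alignment - 1)) == 0); /* power of two */

    unsigned log2_align = 0;
    while ((alignment >> log2_align) > 1)
        log2_align++;

    return log2_align - BASE_SHIFT;
}

/* Hypothetical helper: place a swizzle value into the free low bits of
 * the shifted base address. The swizzle is treated as an opaque value. */
static uint32_t base_with_swizzle(uint64_t va, uint64_t alignment,
                                  uint32_t swizzle)
{
    unsigned bits = swizzle_bit_count(alignment);

    assert((va & (alignment - 1)) == 0);   /* address must be aligned */
    assert(swizzle < (1ull << bits));      /* must fit in the free bits */

    return (uint32_t)(va >> BASE_SHIFT) | swizzle;
}
```

With 8K alignment, swizzle_bit_count(8192) is 13 - 8 = 5, matching the
5 bits mentioned above. With the minimum 256-byte alignment no bits are
free.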

>
> It would be good to know what registers have the bits that matter (i.e. BASE,
> FMASK, CMASK, DCC, and resource descriptors.)
>
> Then I suppose we'd need to know the algorithm for programming them, and
> if we need to make any allocations bigger in order to do so.
>
> I expect this only starts to matter when we hit memory bandwidth limits,
> the deferred demo does 3 MRT, one depth at 2kx2k then samples from those
> down to 1280x720 displayed. This combined with a 3 instanced 57k vertex
> draw seemed to be enough to see the pain. (Maybe a GL example doing something
> similar might show the problem for radeonsi.)

Addrlib contains the encoding code for the base address pipe/bank bits.

>
> The other open question I have is whether this only matters for MRT or
> whether texture sampling also gets some boost from it. My hack patch does
> it only for surfaces which will end up attached to the CB.

Yes, it should be done for read-only textures too.

>
> I'll update the patch to not call it an offset but name them the tile
> rotation bits.

The proper name is "tile swizzle" or "pipe/bank swizzle". On gfx9,
it's called "pipe/bank xor".

Marek

