[Mesa-dev] [PATCH 1/3] cso: don't release sampler states that are bound

Ilia Mirkin imirkin at alum.mit.edu
Wed Dec 7 20:52:30 UTC 2016


On Wed, Dec 7, 2016 at 3:46 PM, Marek Olšák <maraeo at gmail.com> wrote:
> On Wed, Dec 7, 2016 at 6:00 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>> On 07.12.2016 at 17:26, Marek Olšák wrote:
>>> Optimizing the CSO cache isn't exactly at the top of my list, so I
>>> can't really do that right now.
>>>
>>> I think that varying the LOD bias is starting to be common. It's used
>>> for smooth LOD transitions when loading textures during rendering.
>>> Games with lots of content typically do that. This particular Batman
>>> game uses UE3.
>> The question, of course, is whether they do it via sampler state or
>> via shader lod bias.
>
> The sampler state. I saw those states. lod_bias was the only changing variable.
>
>> I suppose that when these objects were designed, no one thought it
>> would be useful to create sampler states with lots of different bias
>> values. d3d10 of course would have the same problem (it limits how
>> many such objects you can create to 4096 per context), but there the
>> problem shifts to the app, since it has to create the objects
>> explicitly - I would suspect the app would either quantize the value
>> itself or use shader lod bias instead.
>
> The shader lod bias isn't a better solution, though. Any LOD bias is
> a modifier of the varying LOD value. For texture streaming, you want
> to clamp the LOD, not shift it, so min_lod is better. However,
> min_lod is an integer on ATI DX9 GPUs (not sure about the API), so
> DX9 games can't use it for smooth transitions. That may explain why
> we are seeing lod_bias used with Wine. I'd guess DX10+ and GL3+ games
> use min_lod instead of lod_bias, which means we probably get a lot of
> sampler states there too.
>
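
To illustrate that point: conceptually, per-pixel LOD selection looks
roughly like the sketch below, so any bias shifts the whole curve,
while min_lod/max_lod only clamp its ends. (Names here are made up
for illustration; this isn't actual Gallium or hardware code.)

    #include <math.h>

    /* Conceptual LOD selection; "lambda" is the level of detail
     * derived from the texture coordinate derivatives. */
    static float
    compute_lod(float lambda, float shader_bias, float sampler_bias,
                float min_lod, float max_lod)
    {
       /* Both bias values shift the entire LOD curve... */
       float lod = lambda + shader_bias + sampler_bias;

       /* ...while min_lod/max_lod merely clamp it, which is what
        * texture streaming wants: mip levels that aren't resident
        * yet are clamped away without disturbing LOD selection
        * everywhere else. */
       return fminf(fmaxf(lod, min_lod), max_lod);
    }
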
> We could reduce the size of pipe_sampler_state a little. AMD GCN can
> represent only 14 bits of lod_bias and 12 bits of min_lod and max_lod.
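
A hypothetical packed layout along those lines, sized to the GCN
limits just quoted (a sketch only, not the actual pipe_sampler_state
definition, and assuming 8 fractional bits for each field):

    /* 38 bits total, so the three 4-byte floats would fit in two
     * 32-bit words. Sketch only, not the real Gallium struct. */
    struct packed_lod_state {
       signed   lod_bias:14; /* s5.8 fixed point: [-32.0, +32.0) */
       unsigned min_lod:12;  /* u4.8 fixed point: [0.0, +16.0) */
       unsigned max_lod:12;  /* u4.8 fixed point: [0.0, +16.0) */
    };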

Looks like 13 bits of lod bias and 12 for min/max lod on G80+ GPUs:

https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nv50/nv50_state.c#n559
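
The conversion in these drivers boils down to scaling the float by
2^(fractional bits) and masking to the field width - something like
this hypothetical helper (8 fractional bits assumed, which appears to
match the nv50 code above; not copied from nv50_state.c):

    #include <stdint.h>

    /* Hypothetical helper: float LOD -> two's-complement fixed point
     * with 8 fractional bits, truncated to a field "bits" wide. */
    static inline uint32_t
    lod_to_fixed(float lod, unsigned bits)
    {
       return (uint32_t)(int32_t)(lod * 256.0f) & ((1u << bits) - 1);
    }

    /* e.g. lod_to_fixed(cso->lod_bias, 13) for the G80 bias field,
     * lod_to_fixed(cso->min_lod, 12) for min/max lod. */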

Even fewer on Adreno A3xx (11 bits of lod bias, 10 for min/max lod):

https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/freedreno/a3xx/a3xx.xml.h#n3102

but back up to the G80 levels for Adreno A4xx:

https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/freedreno/a4xx/a4xx.xml.h#n3923

(I'm a bit surprised that GCN has 14 bits of lod bias rather than 13.
With 8 fractional bits, a signed 13-bit field tops out just below +16,
so I guess being able to represent +16.0 exactly was important to
them?)

  -ilia

