[Mesa-dev] [PATCH] RFC: Workaround for gen9 hw astc5x5 sampler bug

Rogovin, Kevin kevin.rogovin at intel.com
Tue Dec 5 10:26:33 UTC 2017


Hi,


>> Here are my comments on the patch posted:
>> 
>>  1.  it is essentially a replication and rearrangement of the code from the patch series posted earlier, but missing various
>>       important bits: preventing the sampler from using the auxiliary buffer (this requires modifying the surface state
>>       sent in brw_wm_surfaces.c). It also does not cover blorp sufficiently (blorp might read from an ASTC5x5,
>>       and there are more paths in blorp than blorp_surf_for_miptree() that sample from surfaces).
>> 

>Can you explain both more in detail? Resolves done in
>brw_predraw_resolve_inputs() mean that there is nothing interesting in the aux buffers and surface setup won't therefore enable auxiliary for texture surfaces.

Whether the auxiliary buffer holds anything interesting is irrelevant to the sampler if the SURFACE_STATE it is fed includes the auxiliary buffer. Thus, when one sets up the SURFACE_STATE for the sampler, the auxiliary buffer cannot be mentioned in the GPU command; the sampler will always try to read the auxiliary buffer if it is given the opportunity to do so. Indeed, it is quite feasible that less bandwidth is consumed if the sampler is given the chance to read an auxiliary buffer in place of the main buffer; as such, even if the surface is resolved, one may wish to feed the sampler the auxiliary buffer. For HiZ, i965 does exactly this: it programs the sampler to use the HiZ auxiliary buffer even if the depth buffer is fully resolved (see intel_miptree_sample_with_hiz() in intel_mipmap_tree.c).
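
To make the point concrete, here is a minimal sketch of what "not mentioning the auxiliary buffer" means when filling the sampler's surface state. All types and names below (struct miptree, struct sampler_surface_state, emit_sampler_surface()) are hypothetical stand-ins for illustration, not the real i965 structures or the code in brw_wm_surfaces.c. Note that the resolve status is deliberately not consulted: only what is written into the state matters to the sampler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the real i965 types. */
enum aux_usage { AUX_NONE, AUX_CCS_E, AUX_HIZ };

struct miptree {
   uint64_t main_addr;
   uint64_t aux_addr;         /* 0 when no auxiliary buffer exists */
   enum aux_usage aux_usage;
   bool fully_resolved;       /* deliberately ignored below */
};

struct sampler_surface_state {
   uint64_t main_addr;
   uint64_t aux_addr;
   enum aux_usage aux_usage;
};

/* Fill the state the sampler will see. If the draw samples from an
 * ASTC5x5 texture, the auxiliary buffer is left out entirely, whether
 * or not the surface happens to be resolved.
 */
static struct sampler_surface_state
emit_sampler_surface(const struct miptree *mt, bool draw_samples_astc5x5)
{
   assert(mt != NULL);

   struct sampler_surface_state ss = {
      .main_addr = mt->main_addr,
      .aux_addr = 0,
      .aux_usage = AUX_NONE,
   };

   if (!draw_samples_astc5x5 && mt->aux_addr != 0) {
      ss.aux_addr = mt->aux_addr;
      ss.aux_usage = mt->aux_usage;
   }

   return ss;
}
```

With this shape, a fully resolved surface still gets its auxiliary buffer programmed on an ordinary draw, and never on a draw that samples ASTC5x5.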

> In case of blorp, as far as I know all operations sampling something should go thru blorp_surf_for_miptree(). Can you point out cases that don't?

Blorp samples through more paths than blorp_surf_for_miptree(); implementing GetTexImage() is one example. Indeed, it is possible for blorp to sample from an ASTC5x5 (you can see this handled in the patch series I posted). I chose the hammer: the default is to assume blorp is going to access auxiliary buffers, unless the caller sets a flag because it knows that blorp is going to sample from an ASTC5x5 (again, see my patch series).

>Right. In the case of sampling both aux and astc5x5 in the same draw cycle the only thing to do is to disable aux. With my question of direction I meant the texture 
> cache flush between two cycles. Do we need to flush in both cases
> 1) ASTC5x5 in first cycle and AUX in the following
> 2) AUX in first cycle and ASTC5x5 in the following

YES, we need to flush in both cases. What is happening is that the sampler hardware is bugged. Let us suppose it were bugged in only one direction, and take case 1: if the sampler first samples from an ASTC5x5 and then from an AUX it would not hang, but the other way around it would. However, if there are multiple draws in flight where one samples from an ASTC5x5 and the other does not, the command buffer order gives ZERO guarantee that the sampler will sample in that order, because fragments get executed out of order, even across draw calls and even within a subslice (this is why sendc is needed at the end of a shader on GEN).
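
A rough sketch of the symmetric rule, under the assumption that the driver remembers what kind of sampling the texture cache may still hold: any change between the two kinds, in either direction, triggers a flush. The names here (enum sampler_mode, struct sampler_tracker, begin_draw()) are made up for illustration and are not from the patch series; the flush itself is only counted, not emitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum sampler_mode { MODE_NONE, MODE_ASTC5X5, MODE_AUX };

struct sampler_tracker {
   enum sampler_mode last;   /* what earlier draws may have sampled */
   unsigned flushes;         /* stands in for emitting the flush */
};

/* Call before each draw. Returns true if a texture cache flush is
 * needed first. The check is symmetric: ASTC5x5 -> AUX and
 * AUX -> ASTC5x5 are treated identically.
 */
static bool
begin_draw(struct sampler_tracker *t, enum sampler_mode next)
{
   assert(t != NULL);

   /* A draw that samples neither kind leaves the tracked state alone. */
   if (next == MODE_NONE)
      return false;

   bool need_flush = t->last != MODE_NONE && t->last != next;
   if (need_flush)
      t->flushes++;

   t->last = next;
   return need_flush;
}
```

Because the comparison is just "last != next", there is no direction in which a transition can slip through without a flush.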

>>  4. With 3 in mind, using the bit-masks is not a good idea as we want to then enforce at the code level
>>       that only one of the two is possible without texture invalidates.
> Can you elaborate this a little more? It tells if aux is/was used and it tells if astc5x5 is/was used. That is all we need, right?

WRONG. We must enforce that a given draw call can have neither or only one. With bitmasks it is possible to represent a state having both.
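
As an illustration of enforcing this at the code level, here is a small sketch using an enum instead of a bitmask, so that "both" is simply not a representable value. The names (enum draw_sampler_mode, struct texture, classify_draw(), texture_may_use_aux()) are hypothetical and not taken from either patch. The assumption is that a draw sampling any ASTC5x5 texture gets auxiliary buffers disabled for every texture it samples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Exactly one value per draw; there is no way to encode "both". */
enum draw_sampler_mode { DRAW_NEITHER, DRAW_ASTC5X5, DRAW_AUX };

/* Hypothetical per-texture summary. */
struct texture {
   bool is_astc5x5;
   bool has_aux;
};

/* Decide the mode for one draw. ASTC5x5 wins: if any sampled texture
 * is ASTC5x5, the whole draw is classified DRAW_ASTC5X5.
 */
static enum draw_sampler_mode
classify_draw(const struct texture *tex, size_t count)
{
   assert(tex != NULL || count == 0);

   bool any_aux = false;
   for (size_t i = 0; i < count; i++) {
      if (tex[i].is_astc5x5)
         return DRAW_ASTC5X5;
      any_aux |= tex[i].has_aux;
   }

   return any_aux ? DRAW_AUX : DRAW_NEITHER;
}

/* Auxiliary buffers are only programmed when the draw is DRAW_AUX. */
static bool
texture_may_use_aux(const struct texture *t, enum draw_sampler_mode mode)
{
   assert(t != NULL);
   return mode == DRAW_AUX && t->has_aux;
}
```

With two independent bits, the code would need a runtime check for the "both set" case everywhere the state is read; with the enum, that case cannot be constructed in the first place.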

At any rate, please review the patch series I have posted, which I have tested; I am happy to take suggestions to improve it.

-Kevin
 

