[Mesa-dev] [PATCH] RFC: Workaround for gen9 hw astc5x5 sampler bug

Pohjolainen, Topi topi.pohjolainen at gmail.com
Tue Dec 5 18:17:12 UTC 2017


On Tue, Dec 05, 2017 at 10:26:33AM +0000, Rogovin, Kevin wrote:
> Hi,
> 
> 
> >> Here are my comments of the patch posted:
> >> 
> >>  1.  it is essentially replication and moving around of the code of the patch series posted earlier but missing various
> >>       important bits: preventing the sampler from using the auxiliary buffer (this requires to modify surface state
> >>       sent in brw_wm_surfaces.c). It also does not cover blorp sufficiently (blorp might read from an ASTC5x5
> >>       and there are more paths in blorp than blorp_surf_for_miptree() that sample from surfaces.
> >> 
> 
> >Can you explain both more in detail? Resolves done in
> >brw_predraw_resolve_inputs() mean that there is nothing interesting in the aux buffers and surface setup won't therefore enable auxiliary for texture surfaces.
> 
> That there is nothing interesting is irrelevant to the sampler if the SURFACE_STATE fed includes the auxiliary buffer, thus when one sets up the SURFACE_STATE for sampler, the auxiliary buffer cannot be mentioned in the GPU command; The sampler will always try to read the auxiliary buffer if it is given the opportunity to do so. Indeed, it is quite feasible that less bandwidth is consumed if the sampler is given the chance to read an auxiliary buffer in place of the buffer; as such even if the surface is resolved one may wish to feed the sampler the auxiliary buffer. Indeed, for HiZ, i965 programs to use the HiZ auxiliary buffer even if the depth buffer is fully resolved (see inte_mipmap_tree_sample_with_hiz() in intel_mipmap_tree.c).
> 
> > In case of blorp, as far as I know all operations sampling something should go thru blorp_surf_for_miptree(). Can you point out cases that don't?
> 
> Blorp is used in more than blorp_surf_for_miptree(), for example implementing GetTexImage(). Indeed, it is possible for blorp to sample from an ASTC5x5 (you can see this handled in the patch series I posted). I chose the hammer that the default is to just assume blorp is going to access auxiliary buffers unless a flag is set when the caller knows that blorp is going to sample from an astc5x5 (against see my patch series).
> 
> >Right. In the case of sampling both aux and astc5x5 in the same draw cycle the only thing to do is to disable aux. With my question of direction I meant the texture 
> > cache flush between two cycles. Do we need to flush in both cases
> > 1) ASTC5x5 in first cycle and AUX in the following
> > 2) AUX in first cycle and ASTC5x5 in the following
> 
> YES we need to flush in both cases. What is happening is that the sampler hardware is bugged. Let us suppose it was bugged in only 1 direction, take 1. Then if the sampler first samples from an ASTC5x5 then an AUX it would not hang, but the other way it would. However, if there are multiple draws in flight where one samples from an ASTC5x5 and the other does not, the command buffer order gives ZERO guarantee that the sampler will sample in that order because fragments get executed out-of-order even across draw calls even within a subslice (this is why sendc is needed at end of shader in GEN).
> 

I need to clarify this bit a little more. In the current setup we have only
one draw cycle in flight per context that uses sample engine. There may be
blitter calls in parallel (although I'm not sure) but they don't use sampling
engine anyway.

The draw cycle itself may have multiple draws programmed into it but as they
all are based on the same surface setup there is naturally no flushing problem.
Surfaces with auxiliary would be resolved before the draw and programmed
without auxiliary buffer.

Are you saying that this bug extends over hardware context?


More information about the mesa-dev mailing list