[Mesa-dev] [PATCH] r600g: cache shader variants instead of rebuilding v2

Vadim Girlin vadimgirlin at gmail.com
Mon Jun 11 10:45:51 CEST 2012


On Sun, 2012-06-10 at 21:45 +0400, Vadim Girlin wrote:
> On Sun, 2012-06-10 at 10:27 +0200, Christian König wrote:
> > On 10.06.2012 04:07, Vadim Girlin wrote:
> > > Shader variants are stored in a list; the lookup key is based on the
> > > states that require different hw shaders - currently rctx->two_side (all
> > > gpus) and rctx->nr_cbufs (evergreen/cayman, when the writes_all property
> > > is set).
> > >
> > > v2:
> > >   - use simple list instead of keymap as suggested by Marek on irc
> > >   - call r600_adjust_gprs from r600_bind_vs_shader for r6xx/r7xx
> > >     (r600_shader_select isn't used for vertex shaders currently)
> > >
> > > Improves performance for some apps, e.g. FlightGear -
> > > see https://bugs.freedesktop.org/show_bug.cgi?id=50360
> > >
> > > Signed-off-by: Vadim Girlin<vadimgirlin at gmail.com>
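
For readers following the thread, here is a minimal sketch of the kind of
list-based variant lookup the commit message describes. It is not the code
from the patch: the struct and function names (variant_key, variant,
shader_selector, select_variant) are hypothetical, the "compiled shader" is
reduced to an integer id, and the only state taken from the message is that
the key covers two_side and, when writes_all is set, nr_cbufs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical key: the two states named in the commit message. */
struct variant_key {
    bool two_side;      /* models rctx->two_side */
    unsigned nr_cbufs;  /* models rctx->nr_cbufs, only used with writes_all */
};

struct variant {
    struct variant_key key;
    int hw_shader_id;   /* stand-in for the compiled hw shader */
    struct variant *next;
};

struct shader_selector {
    struct variant *variants; /* simple singly linked list of variants */
    bool writes_all;          /* models the writes_all shader property */
    int builds;               /* number of variants "compiled" so far */
};

/* Return the cached variant for the current state, building it on a miss. */
static struct variant *select_variant(struct shader_selector *sel,
                                      bool two_side, unsigned nr_cbufs)
{
    struct variant_key key;
    struct variant *v;

    assert(sel != NULL);

    key.two_side = two_side;
    /* nr_cbufs only changes the hw shader when writes_all is set. */
    key.nr_cbufs = sel->writes_all ? nr_cbufs : 0;

    for (v = sel->variants; v != NULL; v = v->next) {
        if (v->key.two_side == key.two_side &&
            v->key.nr_cbufs == key.nr_cbufs)
            return v; /* cache hit: no rebuild */
    }

    v = malloc(sizeof(*v));
    if (v == NULL)
        return NULL;
    v->key = key;
    v->hw_shader_id = sel->builds++; /* a real driver would compile here */
    v->next = sel->variants;
    sel->variants = v;
    return v;
}
```

With only a handful of variants per shader, a linear walk over the list is
cheap, which is presumably why a plain list was preferred over a keymap.
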
> > Hmm, I'm really starting to wonder whether it might be easier to avoid
> > having different shader variants by using CF_COND_BOOL/CF_COND_NOT_BOOL
> > for those two special cases, i.e. build the shader so that it can
> > handle both variants and then select the one we currently want with the
> > CF bool constants.
> > 
> > If the shader overhead for that is too much, we might also try using
> > this implementation only if the application really starts using the
> > features in question.
> > 
> 
> I agree that we might want to use common shader code for those cases. I
> just don't want to use control flow for that. According to the docs, a
> single CF instruction costs ~40x as much as an ALU instruction. And it
> seems we'll need to add 3 CF instructions to guard color selection for
> the two_side case. I'm not sure how we could use it for the writes_all
> case, where we need a varying number of exports.
> 
> There are other possible solutions, e.g. for the first case I think we
> can pass a bool value (0.0/1.0) to the PS through the SPI by using
> SPI_PS_INPUT_CNTL_x:DEFAULT_VAL and a non-existent semantic index, or put
> it into the constant buffer - we're already using a special const buffer
> to pass clip planes for clipvertex, so we can just add a constant for
> that. Then we can MUL that value with the front_face to get the selector
> value for the colors. The additional MUL instruction per shader could be
> merged into some alu group, so I guess it might have lower overhead than
> using control flow.
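
To make the arithmetic above concrete, here is a CPU-side model of the
selector idea, not shader code. It assumes (for illustration only, I have
not checked this against the hardware) that front_face is a signed value,
positive for front-facing and negative for back-facing primitives, and that
the back color is picked only when the product is negative. The function
names are hypothetical; the ~40x figure is the rough CF vs. ALU cost quoted
above.

```c
#include <assert.h>

/* Rough cost model from the thread: one CF instruction costs about as
 * much as 40 ALU instructions. */
enum { ALU_COST = 1, CF_COST = 40 * ALU_COST };

/* two_side_flag is the 0.0/1.0 value passed in through the SPI default
 * value or a constant buffer. front_face is ASSUMED to be > 0 for front
 * faces and < 0 for back faces. */
static float color_selector(float two_side_flag, float front_face)
{
    assert(two_side_flag == 0.0f || two_side_flag == 1.0f);
    return two_side_flag * front_face; /* the single extra MUL */
}

/* Pick the back color only for a negative selector; with two_side
 * disabled the selector is zero, so the front color is always used. */
static const float *select_color(float selector,
                                 const float *front, const float *back)
{
    return selector < 0.0f ? back : front;
}

/* Cost of guarding the color selection with control flow instead. */
static int cf_guard_cost(int cf_count)
{
    return cf_count * CF_COST;
}
```

Under this model the three CF instructions cost about 120 ALU-equivalents
per shader, against at most one for the MUL (and possibly nothing extra if
it fits into an existing alu group).
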
> 
> Regarding the writes_all case, I guess we simply need to try playing
> with CB_SHADER_MASK, CB_TARGET_MASK, and some other bits to avoid a
> performance regression when the shader exports to all possible CBs,
> as Alex implemented it initially. IIRC there were some changes related
> to those masks after that, so maybe the problem is solved already.

Though it seems there are no magic bits - Catalyst also uses different
shaders in that case.

Vadim

> 
> Anyway, those solutions will require additional time for implementation
> and testing, and I'm not sure they will result in better performance
> than caching. After all, it's not a high priority for me, I just wanted
> to provide a quick fix for the performance problem with FlightGear - I
> don't know of any other apps that are affected by rebuilding. I think we
> can improve it later if we need to.
> 
> Vadim
> 
> > Cheers,
> > Christian.
> > 
> 
> 
> 




