[Mesa-dev] [PATCH] r600g: cache shader variants instead of rebuilding v2
Vadim Girlin
vadimgirlin at gmail.com
Mon Jun 11 10:45:51 CEST 2012
On Sun, 2012-06-10 at 21:45 +0400, Vadim Girlin wrote:
> On Sun, 2012-06-10 at 10:27 +0200, Christian König wrote:
> > On 10.06.2012 04:07, Vadim Girlin wrote:
> > > Shader variants are stored in a list; the lookup key is based on the
> > > states that require different hw shaders - currently rctx->two_side (all
> > > gpus) and rctx->nr_cbufs (evergreen/cayman, when the writes_all property is set).
> > >
> > > v2:
> > > - use simple list instead of keymap as suggested by Marek on irc
> > > - call r600_adjust_gprs from r600_bind_vs_shader for r6xx/r7xx
> > > (r600_shader_select isn't used for vertex shaders currently)
> > >
> > > Improves performance for some apps, e.g. FlightGear -
> > > see https://bugs.freedesktop.org/show_bug.cgi?id=50360
> > >
> > > Signed-off-by: Vadim Girlin <vadimgirlin at gmail.com>
> > Mhm, I'm really starting to wonder if it might not be easier to avoid having
> > different shader variants by using CF_COND_BOOL/CF_COND_NOT_BOOL for
> > those two special cases, e.g. build the shader in a way that it can
> > handle both variants and then select the one we currently want with the
> > CF bool constants.
> >
> > If the shader overhead for it is too much we might also try using this
> > implementation only if the application really starts using those
> > features in question.
> >
>
> I agree that we might want to use common shader code for those cases. I
> just don't want to use control flow for that. According to the docs, the
> cost of a single CF instruction is ~40x the cost of an ALU
> instruction. And it seems we'll need to add 3 CF instructions to
> guard color selection for the two_side case. I'm not sure how we could
> use it for the writes_all case, where we need a varying number of
> exports.
>
> There are other possible solutions, e.g. for the first case I think we
> can pass a bool value (0.0/1.0) to the PS through the SPI by using
> SPI_PS_INPUT_CNTL_x:DEFAULT_VAL and a non-existent semantic index, or put
> it into the constant buffer - we're already using a special const buffer
> to pass clip planes for clipvertex, so we can just add a constant for
> that. Then we can MUL that value with the front_face to get the selector
> value for the colors. The additional MUL instruction per shader could be
> merged into some alu group, so I guess it might have lower overhead than
> using control flow.
>
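To make the MUL idea above concrete, here is a minimal CPU-side sketch in C of the arithmetic only. It is not r600g code: select_color, the is_back encoding (1.0 meaning back-facing), and the 0.5 threshold are assumptions made for illustration, and the final comparison merely stands in for whatever conditional-move ALU op the real shader would use. By the figure quoted above, three CF instructions at ~40x would cost roughly 120 ALU instructions' worth, against one extra MUL here.

```c
/* Hypothetical helper, not part of r600g.
 * two_side: 0.0 or 1.0, passed as a constant (state flag).
 * is_back:  assumed 1.0 for back-facing primitives, 0.0 otherwise;
 *           the real front_face encoding on hardware may differ. */
static float select_color(float front, float back,
                          float two_side, float is_back)
{
    /* The single extra MUL: the selector is non-zero only when
     * two-sided lighting is enabled AND the primitive is back-facing. */
    float sel = two_side * is_back;

    /* Stand-in for a conditional-move style ALU instruction. */
    return sel > 0.5f ? back : front;
}
```

With two_side set to 0.0 the selector is always zero, so the same shader code returns the front color regardless of facing.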
> Regarding the writes_all case, I guess we simply need to try playing
> with CB_SHADER_MASK, CB_TARGET_MASK, and some other bits to avoid a
> performance regression when the shader does export to all possible CBs,
> as Alex implemented it initially. IIRC there were some changes related
> to those masks after that, so maybe the problem is solved already.
Though it seems there are no magic bits - Catalyst also uses different
shaders in that case.
Vadim
>
> Anyway, those solutions will require additional time for implementation
> and testing, and I'm not sure if they will result in better
> performance than caching. After all, it's not a high priority for me, I
> just wanted to provide a quick fix for the performance problem with
> FlightGear - I don't know any other apps that are affected by
> rebuilding. I think we can improve it later if we need.
>
> Vadim
>
> > Cheers,
> > Christian.
> >
>
>
>
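For readers following the thread, the list-based variant cache described in the patch summary can be sketched roughly as below. This is an illustrative C sketch under stated assumptions, not the code from the patch: struct variant_key, struct variant, struct shader_selector and select_variant are hypothetical names, and build_id merely stands in for the compiled hw shader. Only the two key fields (two_side, nr_cbufs) come from the patch description.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical key: the states that require a different hw shader. */
struct variant_key {
    bool     two_side;  /* mirrors rctx->two_side */
    unsigned nr_cbufs;  /* mirrors rctx->nr_cbufs */
};

/* One cached variant; build_id stands in for the compiled bytecode. */
struct variant {
    struct variant_key key;
    unsigned           build_id;
    struct variant    *next;
};

/* Per-shader list of variants plus a counter of actual builds. */
struct shader_selector {
    struct variant *variants;
    unsigned        builds;
};

/* Return the variant matching key, building it only on a cache miss. */
static struct variant *select_variant(struct shader_selector *sel,
                                      struct variant_key key)
{
    struct variant *v;

    /* Linear search: only a handful of variants are expected. */
    for (v = sel->variants; v; v = v->next)
        if (v->key.two_side == key.two_side &&
            v->key.nr_cbufs == key.nr_cbufs)
            return v;

    /* Miss: "build" a new variant and push it onto the list head. */
    v = calloc(1, sizeof(*v));
    if (!v)
        return NULL;
    v->key = key;
    v->build_id = ++sel->builds;
    v->next = sel->variants;
    sel->variants = v;
    return v;
}
```

A plain list keeps the lookup trivial; with only a few possible key values per shader, a linear scan is unlikely to matter next to the cost of a rebuild.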