[Mesa-dev] [PATCH 11/15] i965/fs: When >64 input components, order them to match prev pipeline stage.
Paul Berry
stereotype441 at gmail.com
Mon Sep 16 10:40:03 PDT 2013
On 15 September 2013 21:16, Kenneth Graunke <kenneth at whitecape.org> wrote:
> On 09/03/2013 04:18 PM, Paul Berry wrote:
> > Since the SF/SBE stage is only capable of performing arbitrary
> > reorderings of 16 varying slots, we can't arrange the fragment shader
> > inputs in an arbitrary order if there are more than 16 input varying
> > slots in use. We need to make sure that slots 16-31 match the
> > corresponding outputs of the previous pipeline stage.
> >
> > The easiest way to accomplish this is to just make all varying slots
> > match up with the previous pipeline stage.
> > ---
> > src/mesa/drivers/dri/i965/brw_fs.cpp | 42
> ++++++++++++++++++++++++++++++------
> > src/mesa/drivers/dri/i965/brw_wm.c | 3 ++-
> > 2 files changed, 38 insertions(+), 7 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index 7950d5f6..8d73a0f 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -1237,11 +1237,40 @@ fs_visitor::calculate_urb_setup()
> > int urb_next = 0;
> > /* Figure out where each of the incoming setup attributes lands. */
> > if (brw->gen >= 6) {
> > - for (unsigned int i = 0; i < VARYING_SLOT_MAX; i++) {
> > - if (fp->Base.InputsRead & BRW_FS_VARYING_INPUT_MASK &
> > - BITFIELD64_BIT(i)) {
> > - c->prog_data.urb_setup[i] = urb_next++;
> > - }
> > + if (_mesa_bitcount_64(fp->Base.InputsRead &
> > + BRW_FS_VARYING_INPUT_MASK) <= 16) {
> > + /* The SF/SBE pipeline stage can do arbitrary rearrangement of
> the
> > + * first 16 varying inputs, so we can put them wherever we
> want.
> > + * Just put them in order.
> > + */
>
> It might be nice to have a comment saying why this is useful (as opposed
> to always following the previous stage's ordering). As I understand it,
> this is useful when the VS outputs a bunch of varyings but the FS only
> uses a couple of them---we can pack them into the first few slots,
> saving space.
>
Also, it allows us to avoid recompiling the fragment shader when the vertex
(or geometry) shader changes. I've added this comment to clarify:
* This is useful because it means that (a) inputs not used by the
* fragment shader won't take up valuable register space, and (b)
we
* won't have to recompile the fragment shader if it gets paired
with
* a different vertex (or geometry) shader.
>
> > + for (unsigned int i = 0; i < VARYING_SLOT_MAX; i++) {
> > + if (fp->Base.InputsRead & BRW_FS_VARYING_INPUT_MASK &
> > + BITFIELD64_BIT(i)) {
> > + c->prog_data.urb_setup[i] = urb_next++;
> > + }
> > + }
> > + } else {
> > + /* We have enough input varyings that the SF/SBE pipeline
> stage can't
> > + * arbitrarily rearrange them to suit our whim; we have to put
> them
> > + * in an order that matches the output of the previous
> pipeline stage
> > + * (geometry or vertex shader).
> > + */
> > + struct brw_vue_map prev_stage_vue_map;
> > + brw_compute_vue_map(brw, &prev_stage_vue_map,
> > + c->key.input_slots_valid);
> > + int first_slot = 2 * BRW_SF_URB_ENTRY_READ_OFFSET;
> > + assert(prev_stage_vue_map.num_slots <= first_slot + 32);
> > + for (int slot = first_slot; slot <
> prev_stage_vue_map.num_slots;
> > + slot++) {
> > + int varying = prev_stage_vue_map.slot_to_varying[slot];
> > + if (varying != BRW_VARYING_SLOT_COUNT &&
>
> It wasn't immediately obvious to me why varying would be
> BRW_VARYING_SLOT_COUNT. But, on further inspection, I see this is the
> value that gets assigned to empty slots.
>
> Up to you whether you want to add a small comment.
>
Good point. I've added this comment:
/* Note that varying == BRW_VARYING_SLOT_COUNT when a slot is
* unused.
*/
>
> > + (fp->Base.InputsRead & BRW_FS_VARYING_INPUT_MASK &
> > + BITFIELD64_BIT(varying))) {
> > + c->prog_data.urb_setup[varying] = slot - first_slot;
> > + urb_next = MAX2(urb_next, slot + 1);
> > + }
> > + }
> > + urb_next = prev_stage_vue_map.num_slots - first_slot;
>
> Huh? It looks like you're setting urb_next in the loop, then clobbering
> it immediately after the loop. This should probably be fixed.
>
Oops, I meant to delete the first reference to urb_next entirely, and I
guess I forgot. Fixed, thanks.
>
> > }
> > } else {
> > /* FINISHME: The sf doesn't map VS->FS inputs for us very well. */
> > @@ -3149,7 +3178,8 @@ brw_fs_precompile(struct gl_context *ctx, struct
> gl_shader_program *prog)
> > key.iz_lookup |= IZ_DEPTH_WRITE_ENABLE_BIT;
> > }
> >
> > - if (brw->gen < 6)
> > + if (brw->gen < 6 || _mesa_bitcount_64(fp->Base.InputsRead &
> > + BRW_FS_VARYING_INPUT_MASK) >
> 16)
>
> Could this be simplified to:
>
> if (brw->gen < 6 || c->prog_data.num_varying_inputs > 16)
>
> Or are these values different?
>
Unfortunately not. This logic is happening during setup of the program
key, which happens before compilation, so c->prog_data hasn't been
calculated yet. The same applies to your question below.
>
> > key.input_slots_valid = fp->Base.InputsRead | VARYING_BIT_POS;
> >
> > key.clamp_fragment_color = ctx->API == API_OPENGL_COMPAT;
> > diff --git a/src/mesa/drivers/dri/i965/brw_wm.c
> b/src/mesa/drivers/dri/i965/brw_wm.c
> > index 3df2b7d..3e59880 100644
> > --- a/src/mesa/drivers/dri/i965/brw_wm.c
> > +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> > @@ -466,7 +466,8 @@ static void brw_wm_populate_key( struct brw_context
> *brw,
> > (ctx->Multisample.SampleAlphaToCoverage ||
> ctx->Color.AlphaEnabled);
> >
> > /* BRW_NEW_VUE_MAP_GEOM_OUT */
> > - if (brw->gen < 6)
> > + if (brw->gen < 6 || _mesa_bitcount_64(fp->program.Base.InputsRead &
> > + BRW_FS_VARYING_INPUT_MASK) >
> 16)
>
> Same question here.
>
> > key->input_slots_valid = brw->vue_map_geom_out.slots_valid;
> >
> > /* The unique fragment program ID */
> >
>
> Other than those few comments, this is still:
> Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20130916/4b983117/attachment-0001.html>
More information about the mesa-dev
mailing list