[Mesa-dev] [PATCH 6/8] i965/gen6: Enable the features required for GL_ARB_sample_shading

Paul Berry stereotype441 at gmail.com
Sun Oct 20 16:25:31 CEST 2013


On 14 October 2013 10:12, Anuj Phogat <anuj.phogat at gmail.com> wrote:

> - Enable GEN6_WM_MSDISPMODE_PERSAMPLE, GEN6_WM_POSOFFSET_SAMPLE,
>   GEN6_WM_OMASK_TO_RENDER_TARGET as per extension's specification.
> - Don't enable GEN6_WM_16_DISPATCH_ENABLE when GEN6_WM_MSDISPMODE_PERSAMPLE
>   is enabled. Refer SNB PRM Vol. 2, Part 1, Page 279 for details.
>
> Signed-off-by: Anuj Phogat <anuj.phogat at gmail.com>
> ---
>  src/mesa/drivers/dri/i965/gen6_wm_state.c | 67
> +++++++++++++++++++++++++++++--
>  1 file changed, 64 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen6_wm_state.c
> b/src/mesa/drivers/dri/i965/gen6_wm_state.c
> index c96a107..4bc25d6 100644
> --- a/src/mesa/drivers/dri/i965/gen6_wm_state.c
> +++ b/src/mesa/drivers/dri/i965/gen6_wm_state.c
> @@ -103,6 +103,7 @@ upload_wm_state(struct brw_context *brw)
>
>     /* _NEW_BUFFERS */
>     bool multisampled_fbo = ctx->DrawBuffer->Visual.samples > 1;
> +   bool msdispmode_persample = false;
>
>      /* CACHE_NEW_WM_PROG */
>     if (brw->wm.prog_data->nr_params == 0) {
> @@ -156,7 +157,17 @@ upload_wm_state(struct brw_context *brw)
>
>     /* CACHE_NEW_WM_PROG */
>     dw5 |= GEN6_WM_8_DISPATCH_ENABLE;
> -   if (brw->wm.prog_data->prog_offset_16)
> +   msdispmode_persample =
> +      ctx->Multisample.Enabled &&
> +      (ctx->Multisample.SampleShading ||
> +       brw->fragment_program->Base.InputsRead & VARYING_BIT_SAMPLE_ID ||
> +       brw->fragment_program->Base.InputsRead & VARYING_BIT_SAMPLE_POS);
>

The dependency on ctx->Multisample.SampleShading isn't quite right.  You
also need to account for the value of
ctx->Multisample.MinSampleShadingValue.  I believe the correct expression
is something like:

(ctx->Multisample.SampleShading &&
ceil(ctx->Multisample.MinSampleShadingValue *
ctx->DrawBuffer->Visual.samples) > 1)

This is necessary to ensure that if, for example, the buffer is 4xMSAA and
the client sets MinSampleShadingARB to 0.25 or less, we still get
per-fragment shading.


> +
> +   /* In case of non 1x (i.e 4x, 8x) multisampling with
> MDISPMODE_PERSAMPLE,
> +    * only one of SIMD8 and SIMD16 should be enabled.
> +    */
> +   if (brw->wm.prog_data->prog_offset_16 &&
> +       !(multisampled_fbo && msdispmode_persample))
>        dw5 |= GEN6_WM_16_DISPATCH_ENABLE;
>
>     /* CACHE_NEW_WM_PROG | _NEW_COLOR */
> @@ -185,7 +196,10 @@ upload_wm_state(struct brw_context *brw)
>
>     /* _NEW_COLOR, _NEW_MULTISAMPLE */
>     if (fp->program.UsesKill || ctx->Color.AlphaEnabled ||
> -       ctx->Multisample.SampleAlphaToCoverage)
> +       ctx->Multisample.SampleAlphaToCoverage ||
> +       (ctx->Multisample.SampleShading &&
> +        (fp->program.Base.OutputsWritten &
> +         BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK))))
>        dw5 |= GEN6_WM_KILL_ENABLE;
>

There's a problem here.  The bspec says "This bit is required to be ENABLED
in the following situations: ... The pixel shader kernel generates and
outputs oMask.".  That means you need to use the same boolean expression
here to determine whether to turn on GEN6_WM_KILL_ENABLE as you do in patch
5 (in fs_visitor::emit_fb_writes()) to decide whether to output oMask.

But here you're using (ctx->Multisample.SampleShading &&
(fp->program.Base.OutputsWritten & BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK)))

Whereas in patch 5 you just use (fp->program.Base.OutputsWritten &
BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK))

I believe the expression in patch 5 is the correct one, because
gl_SampleMask should still be able to kill samples even if we're not using
per-sample shading.

To avoid bugs like this, I recommend adding a boolean to brw_wm_prog_data
to indicate whether the fragment shader outputs an oMask (let's say we call
it "uses_omask").  fs_visitor::emit_fb_writes() should set it to true if
it's outputting oMask.  Then, here in gen6_wm_state.c, rather than having
to duplicate the boolean expression (and risking bugs if you don't get it
right), you can just refer brw->wm.prog_data->uses_omask.  We already use a
similar mechanism for keeping track of whether dual source blending is
enabled (see the dual_src_blend boolean in brw_wm_prog_data).


>
>     if (brw_color_buffer_write_enabled(brw) ||
> @@ -193,6 +207,19 @@ upload_wm_state(struct brw_context *brw)
>        dw5 |= GEN6_WM_DISPATCH_ENABLE;
>     }
>
> +   /* From the SNB PRM, volume 2 part 1, page 278:
> +    * "This bit is inserted in the PS payload header and made available to
> +    * the DataPort (either via the message header or via header bypass) to
> +    * indicate that oMask data (one or two phases) is included in Render
> +    * Target Write messages. If present, the oMask data is used to mask
> off
> +    * samples."
> +    * TODO: [DevSNB:A0] This bit must be disabled in A Step.
> +    */
> +   if (ctx->Extensions.ARB_sample_shading &&
> +       (brw->fragment_program->Base.OutputsWritten &
> +        BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK)))
> +      dw5 |= GEN6_WM_OMASK_TO_RENDER_TARGET;
> +
>

The dependency on ctx->Extensions.ARB_sample_shading is unnecessary and
confusing.  The front end won't allow the shader to write to the sample
mask if the extension isn't supported.  And besides, since the purpose of
this series is to add support for the extension,
ctx->Extensions.ARB_sample_shading will always be true once patch 8 lands.

Also, my earlier comments about adding a boolean to brw_wm_prog_data apply
here too.  This really ought to change to just:

if (brw->wm.prog_data->uses_omask)
   dw5 |= GEN6_WM_OMASK_TO_RENDER_TARGET;


>     /* CACHE_NEW_WM_PROG */
>     dw6 |= brw->wm.prog_data->num_varying_inputs <<
>        GEN6_WM_NUM_SF_OUTPUTS_SHIFT;
> @@ -202,12 +229,46 @@ upload_wm_state(struct brw_context *brw)
>           dw6 |= GEN6_WM_MSRAST_ON_PATTERN;
>        else
>           dw6 |= GEN6_WM_MSRAST_OFF_PIXEL;
> -      dw6 |= GEN6_WM_MSDISPMODE_PERPIXEL;
> +
> +      /* From arb_sample_shading specification:
> +       * "If MULTISAMPLE or SAMPLE_SHADING_ARB is disabled, sample shading
> +       *  has no effect."
> +       *
> +       * "Using gl_SampleID in a fragment shader causes the entire shader
> +       *  to be evaluated per-sample."
> +       * "Using gl_SamplePosition in a fragment shader causes the entire
> +       *  shader to be evaluated per-sample."
> +       *
> +       *  I interprate the above four lines as enable the sample shading
> +       *  if fragment shader uses gl_SampleID or gl_SamplePosition.
> +       */
> +      if (msdispmode_persample)
> +         dw6 |= GEN6_WM_MSDISPMODE_PERSAMPLE;
> +      else
> +         dw6 |= GEN6_WM_MSDISPMODE_PERPIXEL;
>     } else {
>        dw6 |= GEN6_WM_MSRAST_OFF_PIXEL;
>        dw6 |= GEN6_WM_MSDISPMODE_PERSAMPLE;
>     }
>
> +   /* _NEW_MULTISAMPLE */
>

Drop this comment.  The code you're adding below doesn't depend on
_NEW_MULTISAMPLE.


> +   /* From the SNB PRM, volume 2 part 1, page 281:
> +    * "If the PS kernel does not need the Position XY Offsets
> +    * to compute a Position XY value, then this field should be
> +    * programmed to POSOFFSET_NONE."
> +    *
> +    * "SW Recommendation: If the PS kernel needs the Position Offsets
> +    * to compute a Position XY value, this field should match Position
> +    * ZW Interpolation Mode to ensure a consistent position.xyzw
> +    * computation."
> +    * We only require XY sample offsets. So, this recommendation doesn't
> +    * look useful at the moment. We might need this in future.
> +    */
> +   if (brw->fragment_program->Base.InputsRead & VARYING_BIT_SAMPLE_POS)
> +      dw6 |= GEN6_WM_POSOFFSET_SAMPLE;
> +   else
> +      dw6 |= GEN6_WM_POSOFFSET_NONE;
> +
>

This is correct, however it is risky for the same reason that the sample
mask code above was risky: correct execution depends on this code using the
same boolean expression to decide whether to set GEN6_WM_POSOFFSET_SAMPLE
as fs_visitor::setup_payload_gen6() uses to decide whether to expect the
sample position to be present in the payload.

I would feel much more comfortable if we did a similar thing to my
"uses_omask" suggestion above.  That is, add a boolean to brw_wm_prog_data
(let's call it "uses_pos_offset").  In fs_visitor::setup_payload_gen6(),
set the bit if MSAA position offsets need to be included in the payload.
Then, here in gen6_wm_state.c, just consult the value of
brw->wm.prog_data->uses_pos_offset rather than duplicating the boolean
expression.


>     BEGIN_BATCH(9);
>     OUT_BATCH(_3DSTATE_WM << 16 | (9 - 2));
>     OUT_BATCH(brw->wm.base.prog_offset);
> --
> 1.8.1.4
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20131020/8d9574fb/attachment-0001.html>


More information about the mesa-dev mailing list