[Mesa-dev] [PATCH 6/8] i965/gen6: Enable the features required for GL_ARB_sample_shading

Anuj Phogat anuj.phogat at gmail.com
Wed Oct 23 20:16:49 CEST 2013


On Sun, Oct 20, 2013 at 7:25 AM, Paul Berry <stereotype441 at gmail.com> wrote:
> On 14 October 2013 10:12, Anuj Phogat <anuj.phogat at gmail.com> wrote:
>>
>> - Enable GEN6_WM_MSDISPMODE_PERSAMPLE, GEN6_WM_POSOFFSET_SAMPLE,
>>   GEN6_WM_OMASK_TO_RENDER_TARGET as per extension's specification.
>> - Don't enable GEN6_WM_16_DISPATCH_ENABLE when
>> GEN6_WM_MSDISPMODE_PERSAMPLE
>>   is enabled. Refer SNB PRM Vol. 2, Part 1, Page 279 for details.
>>
>> Signed-off-by: Anuj Phogat <anuj.phogat at gmail.com>
>> ---
>>  src/mesa/drivers/dri/i965/gen6_wm_state.c | 67
>> +++++++++++++++++++++++++++++--
>>  1 file changed, 64 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/gen6_wm_state.c
>> b/src/mesa/drivers/dri/i965/gen6_wm_state.c
>> index c96a107..4bc25d6 100644
>> --- a/src/mesa/drivers/dri/i965/gen6_wm_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen6_wm_state.c
>> @@ -103,6 +103,7 @@ upload_wm_state(struct brw_context *brw)
>>
>>     /* _NEW_BUFFERS */
>>     bool multisampled_fbo = ctx->DrawBuffer->Visual.samples > 1;
>> +   bool msdispmode_persample = false;
>>
>>      /* CACHE_NEW_WM_PROG */
>>     if (brw->wm.prog_data->nr_params == 0) {
>> @@ -156,7 +157,17 @@ upload_wm_state(struct brw_context *brw)
>>
>>     /* CACHE_NEW_WM_PROG */
>>     dw5 |= GEN6_WM_8_DISPATCH_ENABLE;
>> -   if (brw->wm.prog_data->prog_offset_16)
>> +   msdispmode_persample =
>> +      ctx->Multisample.Enabled &&
>> +      (ctx->Multisample.SampleShading ||
>> +       brw->fragment_program->Base.InputsRead & VARYING_BIT_SAMPLE_ID ||
>> +       brw->fragment_program->Base.InputsRead & VARYING_BIT_SAMPLE_POS);
>
>
> The dependency on ctx->Multisample.SampleShading isn't quite right.  You
> also need to account for the value of
> ctx->Multisample.MinSampleShadingValue.  I believe the correct expression is
> something like:
>
> (ctx->Multisample.SampleShading &&
> ceil(ctx->Multisample.MinSampleShadingValue *
> ctx->DrawBuffer->Visual.samples) > 1)
>
> This is necessary to ensure that if, for example, the buffer is 4xMSAA and
> the client sets MinSampleShadingARB to 0.25 or less, we still get
> per-fragment shading.
>
>>
>> +
>> +   /* In case of non 1x (i.e 4x, 8x) multisampling with
>> MDISPMODE_PERSAMPLE,
>> +    * only one of SIMD8 and SIMD16 should be enabled.
>> +    */
>> +   if (brw->wm.prog_data->prog_offset_16 &&
>> +       !(multisampled_fbo && msdispmode_persample))
>>        dw5 |= GEN6_WM_16_DISPATCH_ENABLE;
>>
>>     /* CACHE_NEW_WM_PROG | _NEW_COLOR */
>> @@ -185,7 +196,10 @@ upload_wm_state(struct brw_context *brw)
>>
>>     /* _NEW_COLOR, _NEW_MULTISAMPLE */
>>     if (fp->program.UsesKill || ctx->Color.AlphaEnabled ||
>> -       ctx->Multisample.SampleAlphaToCoverage)
>> +       ctx->Multisample.SampleAlphaToCoverage ||
>> +       (ctx->Multisample.SampleShading &&
>> +        (fp->program.Base.OutputsWritten &
>> +         BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK))))
>>        dw5 |= GEN6_WM_KILL_ENABLE;
>
>
> There's a problem here.  The bspec says "This bit is required to be ENABLED
> in the following situations: ... The pixel shader kernel generates and
> outputs oMask.".  That means you need to use the same boolean expression
> here to determine whether to turn on GEN6_WM_KILL_ENABLE as you do in patch
> 5 (in fs_visitor::emit_fb_writes()) to decide whether to output oMask.
>
> But here you're using (ctx->Multisample.SampleShading &&
> (fp->program.Base.OutputsWritten & BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK)))
>
> Whereas in patch 5 you just use (fp->program.Base.OutputsWritten &
> BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK))
>
> I believe the expression in patch 5 is the correct one, because
> gl_SampleMask should still be able to kill samples even if we're not using
> per-sample shading.
>
> To avoid bugs like this, I recommend adding a boolean to brw_wm_prog_data to
> indicate whether the fragment shader outputs an oMask (let's say we call it
> "uses_omask").  fs_visitor::emit_fb_writes() should set it to true if it's
> outputting oMask.  Then, here in gen6_wm_state.c, rather than having to
> duplicate the boolean expression (and risking bugs if you don't get it
> right), you can just refer brw->wm.prog_data->uses_omask.  We already use a
> similar mechanism for keeping track of whether dual source blending is
> enabled (see the dual_src_blend boolean in brw_wm_prog_data).
>
>>
>>
>>     if (brw_color_buffer_write_enabled(brw) ||
>> @@ -193,6 +207,19 @@ upload_wm_state(struct brw_context *brw)
>>        dw5 |= GEN6_WM_DISPATCH_ENABLE;
>>     }
>>
>> +   /* From the SNB PRM, volume 2 part 1, page 278:
>> +    * "This bit is inserted in the PS payload header and made available
>> to
>> +    * the DataPort (either via the message header or via header bypass)
>> to
>> +    * indicate that oMask data (one or two phases) is included in Render
>> +    * Target Write messages. If present, the oMask data is used to mask
>> off
>> +    * samples."
>> +    * TODO: [DevSNB:A0] This bit must be disabled in A Step.
>> +    */
>> +   if (ctx->Extensions.ARB_sample_shading &&
>> +       (brw->fragment_program->Base.OutputsWritten &
>> +        BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK)))
>> +      dw5 |= GEN6_WM_OMASK_TO_RENDER_TARGET;
>> +
>
>
> The dependency on ctx->Extensions.ARB_sample_shading is unnecessary and
> confusing.  The front end won't allow the shader to write to the sample mask
> if the extension isn't supported.  And besides, since the purpose of this
> series is to add support for the extension,
> ctx->Extensions.ARB_sample_shading will always be true once patch 8 lands.
>
> Also, my earlier comments about adding a boolean to brw_wm_prog_data apply
> here too.  This really ought to change to just:
>
> if (brw->wm.prog_data->uses_omask)
>    dw5 |= GEN6_WM_OMASK_TO_RENDER_TARGET;
>
>>
>>     /* CACHE_NEW_WM_PROG */
>>     dw6 |= brw->wm.prog_data->num_varying_inputs <<
>>        GEN6_WM_NUM_SF_OUTPUTS_SHIFT;
>> @@ -202,12 +229,46 @@ upload_wm_state(struct brw_context *brw)
>>           dw6 |= GEN6_WM_MSRAST_ON_PATTERN;
>>        else
>>           dw6 |= GEN6_WM_MSRAST_OFF_PIXEL;
>> -      dw6 |= GEN6_WM_MSDISPMODE_PERPIXEL;
>> +
>> +      /* From arb_sample_shading specification:
>> +       * "If MULTISAMPLE or SAMPLE_SHADING_ARB is disabled, sample
>> shading
>> +       *  has no effect."
>> +       *
>> +       * "Using gl_SampleID in a fragment shader causes the entire shader
>> +       *  to be evaluated per-sample."
>> +       * "Using gl_SamplePosition in a fragment shader causes the entire
>> +       *  shader to be evaluated per-sample."
>> +       *
>> +       *  I interprate the above four lines as enable the sample shading
>> +       *  if fragment shader uses gl_SampleID or gl_SamplePosition.
>> +       */
>> +      if (msdispmode_persample)
>> +         dw6 |= GEN6_WM_MSDISPMODE_PERSAMPLE;
>> +      else
>> +         dw6 |= GEN6_WM_MSDISPMODE_PERPIXEL;
>>     } else {
>>        dw6 |= GEN6_WM_MSRAST_OFF_PIXEL;
>>        dw6 |= GEN6_WM_MSDISPMODE_PERSAMPLE;
>>     }
>>
>> +   /* _NEW_MULTISAMPLE */
>
>
> Drop this comment.  The code you're adding below doesn't depend on
> _NEW_MULTISAMPLE.
>
>>
>> +   /* From the SNB PRM, volume 2 part 1, page 281:
>> +    * "If the PS kernel does not need the Position XY Offsets
>> +    * to compute a Position XY value, then this field should be
>> +    * programmed to POSOFFSET_NONE."
>> +    *
>> +    * "SW Recommendation: If the PS kernel needs the Position Offsets
>> +    * to compute a Position XY value, this field should match Position
>> +    * ZW Interpolation Mode to ensure a consistent position.xyzw
>> +    * computation."
>> +    * We only require XY sample offsets. So, this recommendation doesn't
>> +    * look useful at the moment. We might need this in future.
>> +    */
>> +   if (brw->fragment_program->Base.InputsRead & VARYING_BIT_SAMPLE_POS)
>> +      dw6 |= GEN6_WM_POSOFFSET_SAMPLE;
>> +   else
>> +      dw6 |= GEN6_WM_POSOFFSET_NONE;
>> +
>
>
> This is correct, however it is risky for the same reason that the sample
> mask code above was risky: correct execution depends on this code using the
> same boolean expression to decide whether to set GEN6_WM_POSOFFSET_SAMPLE as
> fs_visitor::setup_payload_gen6() uses to decide whether to expect the sample
> position to be present in the payload.
>
> I would feel much more comfortable if we did a similar thing to my
> "uses_omask" suggestion above.  That is, add a boolean to brw_wm_prog_data
> (let's call it "uses_pos_offset").  In fs_visitor::setup_payload_gen6(), set
> the bit if MSAA position offsets need to be included in the payload.  Then,
> here in gen6_wm_state.c, just consult the value of
> brw->wm.prog_data->uses_pos_offset rather than duplicating the boolean
> expression.
>
>>
>>     BEGIN_BATCH(9);
>>     OUT_BATCH(_3DSTATE_WM << 16 | (9 - 2));
>>     OUT_BATCH(brw->wm.base.prog_offset);
>> --
>> 1.8.1.4
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
Thanks for the detailed feedback Paul. I'm working on resolving these issues.


More information about the mesa-dev mailing list