[Mesa-dev] [PATCH 6/8] i965/gen6: Enable the features required for GL_ARB_sample_shading
Anuj Phogat
anuj.phogat at gmail.com
Wed Oct 23 20:16:49 CEST 2013
On Sun, Oct 20, 2013 at 7:25 AM, Paul Berry <stereotype441 at gmail.com> wrote:
> On 14 October 2013 10:12, Anuj Phogat <anuj.phogat at gmail.com> wrote:
>>
>> - Enable GEN6_WM_MSDISPMODE_PERSAMPLE, GEN6_WM_POSOFFSET_SAMPLE,
>> GEN6_WM_OMASK_TO_RENDER_TARGET as per extension's specification.
>> - Don't enable GEN6_WM_16_DISPATCH_ENABLE when
>> GEN6_WM_MSDISPMODE_PERSAMPLE
>> is enabled. Refer SNB PRM Vol. 2, Part 1, Page 279 for details.
>>
>> Signed-off-by: Anuj Phogat <anuj.phogat at gmail.com>
>> ---
>> src/mesa/drivers/dri/i965/gen6_wm_state.c | 67
>> +++++++++++++++++++++++++++++--
>> 1 file changed, 64 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/gen6_wm_state.c
>> b/src/mesa/drivers/dri/i965/gen6_wm_state.c
>> index c96a107..4bc25d6 100644
>> --- a/src/mesa/drivers/dri/i965/gen6_wm_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen6_wm_state.c
>> @@ -103,6 +103,7 @@ upload_wm_state(struct brw_context *brw)
>>
>> /* _NEW_BUFFERS */
>> bool multisampled_fbo = ctx->DrawBuffer->Visual.samples > 1;
>> + bool msdispmode_persample = false;
>>
>> /* CACHE_NEW_WM_PROG */
>> if (brw->wm.prog_data->nr_params == 0) {
>> @@ -156,7 +157,17 @@ upload_wm_state(struct brw_context *brw)
>>
>> /* CACHE_NEW_WM_PROG */
>> dw5 |= GEN6_WM_8_DISPATCH_ENABLE;
>> - if (brw->wm.prog_data->prog_offset_16)
>> + msdispmode_persample =
>> + ctx->Multisample.Enabled &&
>> + (ctx->Multisample.SampleShading ||
>> + brw->fragment_program->Base.InputsRead & VARYING_BIT_SAMPLE_ID ||
>> + brw->fragment_program->Base.InputsRead & VARYING_BIT_SAMPLE_POS);
>
>
> The dependency on ctx->Multisample.SampleShading isn't quite right. You
> also need to account for the value of
> ctx->Multisample.MinSampleShadingValue. I believe the correct expression is
> something like:
>
> (ctx->Multisample.SampleShading &&
> ceil(ctx->Multisample.MinSampleShadingValue *
> ctx->DrawBuffer->Visual.samples) > 1)
>
> This is necessary to ensure that if, for example, the buffer is 4xMSAA and
> the client sets MinSampleShadingARB to 0.25 or less, we still get
> per-fragment shading.
>
>>
>> +
>> + /* In case of non 1x (i.e 4x, 8x) multisampling with
>> MDISPMODE_PERSAMPLE,
>> + * only one of SIMD8 and SIMD16 should be enabled.
>> + */
>> + if (brw->wm.prog_data->prog_offset_16 &&
>> + !(multisampled_fbo && msdispmode_persample))
>> dw5 |= GEN6_WM_16_DISPATCH_ENABLE;
>>
>> /* CACHE_NEW_WM_PROG | _NEW_COLOR */
>> @@ -185,7 +196,10 @@ upload_wm_state(struct brw_context *brw)
>>
>> /* _NEW_COLOR, _NEW_MULTISAMPLE */
>> if (fp->program.UsesKill || ctx->Color.AlphaEnabled ||
>> - ctx->Multisample.SampleAlphaToCoverage)
>> + ctx->Multisample.SampleAlphaToCoverage ||
>> + (ctx->Multisample.SampleShading &&
>> + (fp->program.Base.OutputsWritten &
>> + BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK))))
>> dw5 |= GEN6_WM_KILL_ENABLE;
>
>
> There's a problem here. The bspec says "This bit is required to be ENABLED
> in the following situations: ... The pixel shader kernel generates and
> outputs oMask.". That means you need to use the same boolean expression
> here to determine whether to turn on GEN6_WM_KILL_ENABLE as you do in patch
> 5 (in fs_visitor::emit_fb_writes()) to decide whether to output oMask.
>
> But here you're using (ctx->Multisample.SampleShading &&
> (fp->program.Base.OutputsWritten & BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK)))
>
> Whereas in patch 5 you just use (fp->program.Base.OutputsWritten &
> BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK))
>
> I believe the expression in patch 5 is the correct one, because
> gl_SampleMask should still be able to kill samples even if we're not using
> per-sample shading.
>
> To avoid bugs like this, I recommend adding a boolean to brw_wm_prog_data to
> indicate whether the fragment shader outputs an oMask (let's say we call it
> "uses_omask"). fs_visitor::emit_fb_writes() should set it to true if it's
> outputting oMask. Then, here in gen6_wm_state.c, rather than having to
> duplicate the boolean expression (and risking bugs if you don't get it
> right), you can just refer brw->wm.prog_data->uses_omask. We already use a
> similar mechanism for keeping track of whether dual source blending is
> enabled (see the dual_src_blend boolean in brw_wm_prog_data).
>
>>
>>
>> if (brw_color_buffer_write_enabled(brw) ||
>> @@ -193,6 +207,19 @@ upload_wm_state(struct brw_context *brw)
>> dw5 |= GEN6_WM_DISPATCH_ENABLE;
>> }
>>
>> + /* From the SNB PRM, volume 2 part 1, page 278:
>> + * "This bit is inserted in the PS payload header and made available
>> to
>> + * the DataPort (either via the message header or via header bypass)
>> to
>> + * indicate that oMask data (one or two phases) is included in Render
>> + * Target Write messages. If present, the oMask data is used to mask
>> off
>> + * samples."
>> + * TODO: [DevSNB:A0] This bit must be disabled in A Step.
>> + */
>> + if (ctx->Extensions.ARB_sample_shading &&
>> + (brw->fragment_program->Base.OutputsWritten &
>> + BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK)))
>> + dw5 |= GEN6_WM_OMASK_TO_RENDER_TARGET;
>> +
>
>
> The dependency on ctx->Extensions.ARB_sample_shading is unnecessary and
> confusing. The front end won't allow the shader to write to the sample mask
> if the extension isn't supported. And besides, since the purpose of this
> series is to add support for the extension,
> ctx->Extensions.ARB_sample_shading will always be true once patch 8 lands.
>
> Also, my earlier comments about adding a boolean to brw_wm_prog_data apply
> here too. This really ought to change to just:
>
> if (brw->wm.prog_data->uses_omask)
> dw5 |= GEN6_WM_OMASK_TO_RENDER_TARGET;
>
>>
>> /* CACHE_NEW_WM_PROG */
>> dw6 |= brw->wm.prog_data->num_varying_inputs <<
>> GEN6_WM_NUM_SF_OUTPUTS_SHIFT;
>> @@ -202,12 +229,46 @@ upload_wm_state(struct brw_context *brw)
>> dw6 |= GEN6_WM_MSRAST_ON_PATTERN;
>> else
>> dw6 |= GEN6_WM_MSRAST_OFF_PIXEL;
>> - dw6 |= GEN6_WM_MSDISPMODE_PERPIXEL;
>> +
>> + /* From arb_sample_shading specification:
>> + * "If MULTISAMPLE or SAMPLE_SHADING_ARB is disabled, sample
>> shading
>> + * has no effect."
>> + *
>> + * "Using gl_SampleID in a fragment shader causes the entire shader
>> + * to be evaluated per-sample."
>> + * "Using gl_SamplePosition in a fragment shader causes the entire
>> + * shader to be evaluated per-sample."
>> + *
>> + * I interprate the above four lines as enable the sample shading
>> + * if fragment shader uses gl_SampleID or gl_SamplePosition.
>> + */
>> + if (msdispmode_persample)
>> + dw6 |= GEN6_WM_MSDISPMODE_PERSAMPLE;
>> + else
>> + dw6 |= GEN6_WM_MSDISPMODE_PERPIXEL;
>> } else {
>> dw6 |= GEN6_WM_MSRAST_OFF_PIXEL;
>> dw6 |= GEN6_WM_MSDISPMODE_PERSAMPLE;
>> }
>>
>> + /* _NEW_MULTISAMPLE */
>
>
> Drop this comment. The code you're adding below doesn't depend on
> _NEW_MULTISAMPLE.
>
>>
>> + /* From the SNB PRM, volume 2 part 1, page 281:
>> + * "If the PS kernel does not need the Position XY Offsets
>> + * to compute a Position XY value, then this field should be
>> + * programmed to POSOFFSET_NONE."
>> + *
>> + * "SW Recommendation: If the PS kernel needs the Position Offsets
>> + * to compute a Position XY value, this field should match Position
>> + * ZW Interpolation Mode to ensure a consistent position.xyzw
>> + * computation."
>> + * We only require XY sample offsets. So, this recommendation doesn't
>> + * look useful at the moment. We might need this in future.
>> + */
>> + if (brw->fragment_program->Base.InputsRead & VARYING_BIT_SAMPLE_POS)
>> + dw6 |= GEN6_WM_POSOFFSET_SAMPLE;
>> + else
>> + dw6 |= GEN6_WM_POSOFFSET_NONE;
>> +
>
>
> This is correct, however it is risky for the same reason that the sample
> mask code above was risky: correct execution depends on this code using the
> same boolean expression to decide whether to set GEN6_WM_POSOFFSET_SAMPLE as
> fs_visitor::setup_payload_gen6() uses to decide whether to expect the sample
> position to be present in the payload.
>
> I would feel much more comfortable if we did a similar thing to my
> "uses_omask" suggestion above. That is, add a boolean to brw_wm_prog_data
> (let's call it "uses_pos_offset"). In fs_visitor::setup_payload_gen6(), set
> the bit if MSAA position offsets need to be included in the payload. Then,
> here in gen6_wm_state.c, just consult the value of
> brw->wm.prog_data->uses_pos_offset rather than duplicating the boolean
> expression.
>
>>
>> BEGIN_BATCH(9);
>> OUT_BATCH(_3DSTATE_WM << 16 | (9 - 2));
>> OUT_BATCH(brw->wm.base.prog_offset);
>> --
>> 1.8.1.4
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
Thanks for the detailed feedback Paul. I'm working on resolving these issues.
More information about the mesa-dev
mailing list