[Mesa-dev] [PATCH] i965: Whack UAV bit when FS discards and there are no color writes.

Francisco Jerez currojerez at riseup.net
Fri Mar 25 02:10:03 UTC 2016


Jason Ekstrand <jason at jlekstrand.net> writes:

> On Thu, Mar 24, 2016 at 5:50 PM, Kenneth Graunke <kenneth at whitecape.org>
> wrote:
>
>> dEQP-GLES31.functional.fbo.no_attachments.* draws a quad with no
>> framebuffer attachments, using a shader that discards based on
>> gl_FragCoord.  It uses occlusion queries to inspect whether pixels
>> are rendered or not.
>>
>> Unfortunately, the hardware is not dispatching any pixel shaders,
>> so discards never happen, and the full quad of pixels increments
>> PS_DEPTH_COUNT, making the occlusion query results bogus.
>>
>> To understand why, we have to delve into the WM_INT internal
>> signalling mechanism's formulas.
>>
>> The "WM_INT::Pixel Shader Kill Pixel" signal is defined as:
>>
>>     3DSTATE_WM::ForceKillPixel == ON ||
>>     (3DSTATE_WM::ForceKillPixel != Off &&
>>      !WM_INT::WM_HZ_OP &&
>>      3DSTATE_WM::EDSC_Mode != PREPS &&
>>      (WM_INT::Depth Write Enable || WM_INT::Stencil Write Enable) &&
>>      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>      (3DSTATE_PS_EXTRA::PixelShaderKillsPixels ||
>>       3DSTATE_PS_EXTRA:: oMask Present to RenderTarget ||
>>       3DSTATE_PS_BLEND::AlphaToCoverageEnable ||
>>       3DSTATE_PS_BLEND::AlphaTestEnable ||
>>       3DSTATE_WM_CHROMAKEY::ChromaKeyKillEnable))
>>
>> Because there is no depth or stencil buffer, writes to those buffers
>> are disabled.  So the highlighted condition is false, making the whole
>> "Kill Pixel" condition false.  This then feeds into the following
>> "WM_INT::ThreadDispatchEnable" condition:
>>
>>     3DSTATE_WM::ForceThreadDispatch != OFF &&
>>     !WM_INT::WM_HZ_OP &&
>>     3DSTATE_PS_EXTRA::PixelShaderValid &&
>>     (3DSTATE_PS_EXTRA::PixelShaderHasUAV ||
>>      WM_INT::Pixel Shader Kill Pixel ||
>>      WM_INT::RTIndependentRasterizationEnable ||
>>      (!3DSTATE_PS_EXTRA::PixelShaderDoesNotWriteRT &&
>>       3DSTATE_PS_BLEND::HasWriteableRT) ||
>>      (WM_INT::Pixel Shader Computed Depth Mode != PSCDEPTH_OFF &&
>>       (WM_INT::Depth Test Enable || WM_INT::Depth Write Enable)) ||
>>      (3DSTATE_PS_EXTRA::Computed Stencil && WM_INT::Stencil Test Enable) ||
>>      (3DSTATE_WM::EDSC_Mode == 1 && (WM_INT::Depth Test Enable ||
>>                                      WM_INT::Depth Write Enable ||
>>                                      WM_INT::Stencil Test Enable)))
>>
>> Given that there's no depth/stencil testing, no writeable render target,
>> and the hardware thinks kill pixel doesn't happen, all of these
>> conditions are false.  We have to whack some bit to make PS invocations
>> happen.  There are many options.
>>
>> Curro suggested using the UAV bit.  There's some precedence in doing
>> that - we set it for fragment shaders that do SSBO/image/atomic writes
>> when no color buffer writes are enabled.  We can simply include discard
>> here too.
>>
>> Fixes 64 dEQP-GLES31.functional.fbo.no_attachments.* tests.
>>
>> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
>> ---
>>  src/mesa/drivers/dri/i965/gen8_ps_state.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c
>> b/src/mesa/drivers/dri/i965/gen8_ps_state.c
>> index b9a06e7..cda8159 100644
>> --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
>> @@ -93,8 +93,8 @@ gen8_upload_ps_extra(struct brw_context *brw,
>>      *
>>      * BRW_NEW_FS_PROG_DATA | BRW_NEW_FRAGMENT_PROGRAM | _NEW_BUFFERS |
>> _NEW_COLOR
>>      */
>> -   if (_mesa_active_fragment_shader_has_side_effects(&brw->ctx) &&
>> -       !brw_color_buffer_write_enabled(brw))
>> +   if ((_mesa_active_fragment_shader_has_side_effects(ctx) ||
>> +        prog_data->uses_kill) && !brw_color_buffer_write_enabled(brw))
>>
>
> Whould you mind adding a short comment as to why uses_kill is in here?  You
> don't need the full explination above, bot something along the lines of
>
> "Gen8 hardware tries to compute ThreadDispatchEnable for us but doesn't
> take into account KillPixels when no depth or stencil writes are enabled.
> In order for occlusion queries to work correctly with no attachments, we
> need to force-enable here."
>
That sounds good to me.

> You can come up with your own text if you'd like.
>
> One other thought: Do we know if HAS_UAV has any perf impact other than
> actually running the shader?  I would guess not, but if it does we may want
> to make the condition a bit more specifi.
>

It does, it causes an implicit DC flush at the bottom of the pipeline,
see the comment right on top of this code for a more detailed
explanation.  It would definitely be possible to make the check more
specific by e.g. setting the bit only if both stencil and depth writes
are disabled (in addition to color buffer writes), but Ken had some
concerns about the additional state dependencies that would introduce
when I brought up the same question earlier today in the office.  Anyway
this only gets turned on when no color buffer writes are enabled so it
doesn't seem like a huge deal, but it would of course be nice to do the
additional checks.

> Other than that, looks good to me.
> --Jason
>
>
>>        dw1 |= GEN8_PSX_SHADER_HAS_UAV;
>>
>>     if (prog_data->computed_stencil) {
>> --
>> 2.7.4
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160324/55512e7d/attachment-0001.sig>


More information about the mesa-dev mailing list