<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Feb 10, 2017 at 4:47 PM, Nanley Chery <span dir="ltr"><<a href="mailto:nanleychery@gmail.com" target="_blank">nanleychery@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Fri, Feb 10, 2017 at 03:21:38PM -0800, Jason Ekstrand wrote:<br>
> On Fri, Feb 10, 2017 at 2:55 PM, Nanley Chery <<a href="mailto:nanleychery@gmail.com">nanleychery@gmail.com</a>> wrote:<br>
><br>
> > On Fri, Feb 10, 2017 at 11:02:19AM -0800, Jason Ekstrand wrote:<br>
> > > It's a bit hard to measure because it almost gets lost in the noise,<br>
> > > but this seemed to help Dota 2 by a percent or two on my Broadwell<br>
> > > GT3e desktop.<br>
> > ><br>
> > > Reviewed-by: Lionel Landwerlin <<a href="mailto:lionel.g.landwerlin@intel.com">lionel.g.landwerlin@intel.com</a><wbr>><br>
> > > ---<br>
> > >Â src/intel/vulkan/genX_<wbr>pipeline.c | 171 ++++++++++++++++++++++++++++++<br>
> > +--------<br>
> > >Â 1 file changed, 137 insertions(+), 34 deletions(-)<br>
> > ><br>
> > > diff --git a/src/intel/vulkan/genX_<wbr>pipeline.c b/src/intel/vulkan/genX_<br>
> > pipeline.c<br>
> > > index 71b9e30..d2af8b9 100644<br>
> > > --- a/src/intel/vulkan/genX_<wbr>pipeline.c<br>
> > > +++ b/src/intel/vulkan/genX_<wbr>pipeline.c<br>
> > > @@ -638,6 +638,133 @@ static const uint32_t vk_to_gen_stencil_op[] = {<br>
> > >Â Â Â [VK_STENCIL_OP_DECREMENT_AND_<wbr>WRAP]Â Â Â Â Â Â = STENCILOP_DECR,<br>
> > >Â };<br>
> > ><br>
> > > +static bool<br>
> > > +may_write_stencil_face(const VkStencilOpState *face,<br>
> > > +Â Â Â Â Â Â Â Â Â Â Â Â VkCompareOp depthCompareOp)<br>
> > > +{<br>
> ><br>
> > This function triggers some false negatives. One example is<br>
> ><br>
> > face->compareOp == VK_COMPARE_OP_GREATER &&<br>
> > face->depthFailOp == VK_STENCIL_OP_KEEP &&<br>
> > face->passOp == VK_STENCIL_OP_KEEP &&<br>
> > face->failOp == VK_STENCIL_OP_INVERT &&<br>
> > depthCompareOp == don't care;<br>
> ><br>
> > This function returns false even though a stencil write occurs if the<br>
> > stencil test results in a fail.<br>
> ><br>
> > The possible input combinations that affect writing to stencil can be<br>
> > divided up into 10 sets.<br>
> ><br>
> > s = stencil compare op<br>
> > d = depth compare op<br>
> > df = depth fail op<br>
> > sf = stencil fail op<br>
> > sp = stencil pass op<br>
> ><br>
> >           s.Always       |     !s.Always<br>
> >        d.Never |   !d.Never  |  s.Never  |<br>
> >Â Â Â Â Â Â Â Â Â Â Â Â | d.Always |Â Â Â |Â Â Â Â Â Â Â |<br>
> > !keep         |     |   |       |<br>
> >Â Â Â Â Â Â Â Â Â Â Â Â |Â Â Â Â Â |Â Â Â |Â Â Â Â Â Â Â |<br>
> >Â Â Â Â Â Â ------------|----------|------<wbr>|--------------|--------------<br>
> >Â Â Â Â Â Â Â Â Â Â Â Â |Â Â Â Â Â |Â Â Â |Â Â Â Â Â Â Â |<br>
> > keep     df.Keep  | sp.Keep |   |  sf.Keep  |<br>
> >Â Â Â Â Â Â Â Â Â Â Â Â |Â Â Â Â Â |Â Â Â |Â Â Â Â Â Â Â |<br>
> >Â Â Â Â Â Â Â Â Â Â Â Â |Â Â Â Â Â |Â Â Â |Â Â Â Â Â Â Â |<br>
> ><br>
> > We know that stencil won't be written if the function inputs land us in<br>
> > 3 of these sets (denoted by the *.Keep's above). I think this function<br>
> > would be easier to understand if it determined if we fell in one of the<br>
> > 3 instead of the 7. How about something like this:<br>
> ><br>
> > static bool<br>
> > wont_write_stencil_face(const VkStencilOpState *face,<br>
> >Â Â Â Â Â Â Â Â Â Â Â Â Â VkCompareOp depthCompareOp)<br>
> > {<br>
> >Â Â if (face->compareOp == VK_COMPARE_OP_NEVER &&<br>
> >Â Â Â Â face->failOp == VK_STENCIL_OP_KEEP)<br>
> >Â Â Â Â return true;<br>
> ><br>
> >Â Â if (face->compareOp == VK_COMPARE_OP_ALWAYS) {<br>
> >Â Â Â Â Â if (depthCompareOp == VK_COMPARE_OP_NEVER &&<br>
> >Â Â Â Â Â Â Â face->depthFailOp == VK_STENCIL_OP_KEEP)<br>
> >Â Â Â Â Â Â Â return true;<br>
> >Â Â Â Â Â if (depthCompareOp == VK_COMPARE_OP_ALWAYS &&<br>
> >Â Â Â Â Â Â Â face->passOp == VK_STENCIL_OP_KEEP)<br>
> >Â Â Â Â Â Â Â return true;<br>
> >Â Â }<br>
> ><br>
><br>
> This covers a lot fewer cases. For instance, if all of the stencil ops are<br>
> VK_STENCIL_OP_KEEP but compareOp == VK_COMPARE_OP_GREATER, your function<br>
> will claim that it writes stencil even though it clearly doesn't.<br>
><br>
<br>
</div></div>You're right, there are more than 3/10 cases in which we know we won't<br>
be writing to the stencil face. Here's another attempt at an<br>
easier-to-understand function:<br></blockquote><br></div><div class="gmail_quote">This function gets it wrong in the following case:<br><br></div><div class="gmail_quote">compareOp == VK_COMPARE_OP_GREATER<br></div><div class="gmail_quote">failOp == VK_STENCIL_OP_KEEP<br></div><div class="gmail_quote">passOp == VK_STENCIL_OP_REPLACE<br></div><div class="gmail_quote">depthFailOp == VK_STENCIL_OP_REPLACE<br><br></div><div class="gmail_quote">In this case, st_may_fail == true but sfo_not_keep == false so it returns false. However, it can clearly still write stencil if the stencil test passes.<br><br></div><div class="gmail_quote">--Jason<br></div><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
static bool<br>
<span class="">may_write_stencil_face(const VkStencilOpState *face,<br>
            VkCompareOp depthCompareOp)<br>
{<br>
</span>Â Â const bool spo_not_keep = face->passOp != VK_STENCIL_OP_KEEP;<br>
  const bool sfo_not_keep = face->failOp != VK_STENCIL_OP_KEEP;<br>
  const bool dfo_not_keep = face->depthFailOp != VK_STENCIL_OP_KEEP;<br>
<br>
  const bool dt_may_fail = depthCompareOp != VK_COMPARE_OP_ALWAYS;<br>
  const bool st_may_fail = face->compareOp != VK_COMPARE_OP_ALWAYS;<br>
<br>
  if (st_may_fail) {<br>
   /* If the stencil test may fail, but the fail op is not keep, we might be<br>
    * writing to the stencil face.<br>
    */<br>
   if (sfo_not_keep)<br>
     return true;<br>
  } else {<br>
   /* If the stencil test will pass, but the pass op is not keep, we might<br>
    * be writing to the stencil face.<br>
    */<br>
   if (spo_not_keep)<br>
     return true;<br>
<br>
   /* If the stencil test will pass, and the depth test may fail, but the<br>
    * depth fail op is not keep, we might be writing to the stencil face.<br>
    */<br>
   if (dt_may_fail && dfo_not_keep)<br>
<span class="im HOEnZb">Â Â Â Â Â return true;<br>
  }<br>
<br>
  return false;<br>
}<br>
<br>
-Nanley<br>
<br>
><br>
</span><div class="HOEnZb"><div class="h5">> >Â Â return false;<br>
> > }<br>
> ><br>
> > -Nanley<br>
> ><br>
> > > +Â Â /* If compareOp is ALWAYS then the stencil test will never fail and<br>
> > we can<br>
> > > +  * ignore failOp. If it's not ALWAYS and failOp is not KEEP, we may<br>
> > write<br>
> > > +Â Â * stencil.<br>
> > > +Â Â */<br>
> > > +Â Â if (face->compareOp != VK_COMPARE_OP_ALWAYS &&<br>
> > > +Â Â Â Â face->failOp != VK_STENCIL_OP_KEEP)<br>
> > > +Â Â Â return true;<br>
> > > +<br>
> > > +<br>
> > > +Â Â /* If the compareOp is NEVER then the stencil test will never pass<br>
> > and the<br>
> > > +  * passOp and depthFailOp don't matter. If compareOp isn't NEVER,<br>
> > then we<br>
> > > +Â Â * need to take them into account.<br>
> > > +Â Â */<br>
> > > +Â Â if (face->compareOp != VK_COMPARE_OP_NEVER) {<br>
> > > +Â Â Â /* If depthOp is NEVER then passOp doesn't matter. */<br>
> > > +Â Â Â if (depthCompareOp != VK_COMPARE_OP_NEVER &&<br>
> > > +Â Â Â Â Â face->passOp != VK_STENCIL_OP_KEEP)<br>
> > > +Â Â Â Â Â return true;<br>
> > > +<br>
> > > +Â Â Â /* If depthOp is ALWAYS then depthFailOp doesn't matter. */<br>
> > > +Â Â Â if (depthCompareOp != VK_COMPARE_OP_ALWAYS &&<br>
> > > +Â Â Â Â Â face->depthFailOp != VK_STENCIL_OP_KEEP)<br>
> > > +Â Â Â Â Â return true;<br>
> > > +Â Â }<br>
> > > +<br>
> > > +Â Â return false;<br>
> > > +}<br>
> > > +<br>
> > > +/* Intel hardware is fairly sensitive to whether or not depth/stencil<br>
> > writes<br>
> > > + * are enabled. In the presence of discards, it's fairly easy to get<br>
> > into the<br>
> > > + * non-promoted case which means a fairly big performance hit. From<br>
> > the Iron<br>
> > > + * Lake PRM, Vol 2, pt. 1, section 8.4.3.2, "Early Depth Test Cases":<br>
> > > + *<br>
> > > + *Â Â "Non-promoted depth (N) is active whenever the depth test can be<br>
> > done<br>
> > > + *Â Â early but it cannot determine whether or not to write source<br>
> > depth to<br>
> > > + *Â Â the depth buffer, therefore the depth write must be performed<br>
> > post pixel<br>
> > > + *Â Â shader. This includes cases where the pixel shader can kill<br>
> > pixels,<br>
> > > + *Â Â including via sampler chroma key, as well as cases where the<br>
> > alpha test<br>
> > > + *Â Â function is enabled, which kills pixels based on a programmable<br>
> > alpha<br>
> > > + *Â Â test. In this case, even if the depth test fails, the pixel<br>
> > cannot be<br>
> > > + *Â Â killed if a stencil write is indicated. Whether or not the<br>
> > stencil write<br>
> > > + *Â Â happens depends on whether or not the pixel is killed later. In<br>
> > these<br>
> > > + *Â Â cases if stencil test fails and stencil writes are off, the<br>
> > pixels can<br>
> > > + *Â Â also be killed early. If stencil writes are enabled, the pixels<br>
> > must be<br>
> > > + *Â Â treated as Computed depth (described above)."<br>
> > > + *<br>
> > > + * The same thing as mentioned in the stencil case can happen in the<br>
> > depth<br>
> > > + * case as well if it thinks it writes depth but, thanks to the depth<br>
> > test<br>
> > > + * being GL_EQUAL, the write doesn't actually matter. A little extra<br>
> > work<br>
> > > + * up-front to try and disable depth and stencil writes can make a big<br>
> > > + * difference.<br>
> > > + *<br>
> > > + * Unfortunately, the way depth and stencil testing is specified, there<br>
> > are<br>
> > > + * many case where, regardless of depth/stencil writes being enabled,<br>
> > nothing<br>
> > > + * actually gets written due to some other bit of state being set. This<br>
> > > + * function attempts to "sanitize" the depth stencil state and disable<br>
> > writes<br>
> > > + * and sometimes even testing whenever possible.<br>
> > > + */<br>
> > > +static void<br>
> > > +sanitize_ds_state(<wbr>VkPipelineDepthStencilStateCre<wbr>ateInfo *state,<br>
> > > +Â Â Â Â Â Â Â Â Â bool *stencilWriteEnable,<br>
> > > +Â Â Â Â Â Â Â Â Â VkImageAspectFlags ds_aspects)<br>
> > > +{<br>
> > > +Â Â *stencilWriteEnable = state->stencilTestEnable;<br>
> > > +<br>
> > > +Â Â /* If the depth test is disabled, we won't be writing anything. */<br>
> > > +Â Â if (!state->depthTestEnable)<br>
> > > +Â Â Â state->depthWriteEnable = false;<br>
> > > +<br>
> > > +Â Â /* The Vulkan spec requires that if either depth or stencil is not<br>
> > present,<br>
> > > +Â Â * the pipeline is to act as if the test silently passes.<br>
> > > +Â Â */<br>
> > > +Â Â if (!(ds_aspects & VK_IMAGE_ASPECT_DEPTH_BIT)) {<br>
> > > +Â Â Â state->depthWriteEnable = false;<br>
> > > +Â Â Â state->depthCompareOp = VK_COMPARE_OP_ALWAYS;<br>
> > > +Â Â }<br>
> > > +<br>
> > > +Â Â if (!(ds_aspects & VK_IMAGE_ASPECT_STENCIL_BIT)) {<br>
> > > +Â Â Â *stencilWriteEnable = false;<br>
> > > +Â Â Â state->front.compareOp = VK_COMPARE_OP_ALWAYS;<br>
> > > +Â Â Â state->back.compareOp = VK_COMPARE_OP_ALWAYS;<br>
> > > +Â Â }<br>
> > > +<br>
> > > +Â Â /* If the stencil test is enabled and always fails, then we will<br>
> > never get<br>
> > > +Â Â * to the depth test so we can just disable the depth test entirely.<br>
> > > +Â Â */<br>
> > > +Â Â if (state->stencilTestEnable &&<br>
> > > +Â Â Â Â state->front.compareOp == VK_COMPARE_OP_NEVER &&<br>
> > > +Â Â Â Â state->back.compareOp == VK_COMPARE_OP_NEVER) {<br>
> > > +Â Â Â state->depthTestEnable = false;<br>
> > > +Â Â Â state->depthWriteEnable = false;<br>
> > > +Â Â }<br>
> > > +<br>
> > > +Â Â /* If depthCompareOp is EQUAL then the value we would be writing to<br>
> > the<br>
> > > +Â Â * depth buffer is the same as the value that's already there so<br>
> > there's no<br>
> > > +Â Â * point in writing it.<br>
> > > +Â Â */<br>
> > > +Â Â if (state->depthCompareOp == VK_COMPARE_OP_EQUAL)<br>
> > > +Â Â Â state->depthWriteEnable = false;<br>
> > > +<br>
> > > +Â Â /* If the stencil ops are such that we don't actually ever modify the<br>
> > > +Â Â * stencil buffer, we should disable writes.<br>
> > > +Â Â */<br>
> > > +Â Â if (!may_write_stencil_face(&<wbr>state->front, state->depthCompareOp) &&<br>
> > > +Â Â Â Â !may_write_stencil_face(&<wbr>state->back, state->depthCompareOp))<br>
> > > +Â Â Â *stencilWriteEnable = false;<br>
> > > +<br>
> > > +Â Â /* If the depth test always passes and we never write out depth,<br>
> > that's the<br>
> > > +Â Â * same as if the depth test is disabled entirely.<br>
> > > +Â Â */<br>
> > > +Â Â if (state->depthCompareOp == VK_COMPARE_OP_ALWAYS &&<br>
> > > +Â Â Â Â !state->depthWriteEnable)<br>
> > > +Â Â Â state->depthTestEnable = false;<br>
> > > +<br>
> > > +Â Â /* If the stencil test always passes and we never write out stencil,<br>
> > that's<br>
> > > +Â Â * the same as if the stencil test is disabled entirely.<br>
> > > +Â Â */<br>
> > > +Â Â if (state->front.compareOp == VK_COMPARE_OP_ALWAYS &&<br>
> > > +Â Â Â Â state->back.compareOp == VK_COMPARE_OP_ALWAYS &&<br>
> > > +Â Â Â Â !*stencilWriteEnable)<br>
> > > +Â Â Â state->stencilTestEnable = false;<br>
> > > +}<br>
> > > +<br>
> > >Â static void<br>
> > >Â emit_ds_state(struct anv_pipeline *pipeline,<br>
> > >Â Â Â Â Â Â Â Â const VkPipelineDepthStencilStateCre<wbr>ateInfo *pCreateInfo,<br>
> > > @@ -663,12 +790,20 @@ emit_ds_state(struct anv_pipeline *pipeline,<br>
> > >Â Â Â Â return;<br>
> > >Â Â Â }<br>
> > ><br>
> > > +Â Â VkImageAspectFlags ds_aspects = 0;<br>
> > > +Â Â if (subpass->depth_stencil_<wbr>attachment != VK_ATTACHMENT_UNUSED) {<br>
> > > +Â Â Â VkFormat depth_stencil_format =<br>
> > > +Â Â Â Â Â pass->attachments[subpass-><wbr>depth_stencil_attachment].<wbr>format;<br>
> > > +Â Â Â ds_aspects = vk_format_aspects(depth_<wbr>stencil_format);<br>
> > > +Â Â }<br>
> > > +<br>
> > >Â Â Â VkPipelineDepthStencilStateCre<wbr>ateInfo info = *pCreateInfo;<br>
> > > +Â Â sanitize_ds_state(&info, &pipeline->writes_stencil, ds_aspects);<br>
> > > +Â Â pipeline->writes_depth = info.depthWriteEnable;<br>
> > > +Â Â pipeline->depth_test_enable = info.depthTestEnable;<br>
> > ><br>
> > >Â Â Â /* VkBool32 depthBoundsTestEnable; // optional (depth_bounds_test) */<br>
> > ><br>
> > > -Â Â pipeline->writes_stencil = info.stencilTestEnable;<br>
> > > -<br>
> > >Â #if GEN_GEN <= 7<br>
> > >Â Â Â struct GENX(DEPTH_STENCIL_STATE) depth_stencil = {<br>
> > >Â #else<br>
> > > @@ -690,38 +825,6 @@ emit_ds_state(struct anv_pipeline *pipeline,<br>
> > >Â Â Â Â .BackfaceStencilTestFunction = vk_to_gen_compare_op[info.<br>
> > back.compareOp],<br>
> > >Â Â Â };<br>
> > ><br>
> > > -Â Â VkImageAspectFlags aspects = 0;<br>
> > > -Â Â if (subpass->depth_stencil_<wbr>attachment != VK_ATTACHMENT_UNUSED) {<br>
> > > -Â Â Â VkFormat depth_stencil_format =<br>
> > > -Â Â Â Â Â pass->attachments[subpass-><wbr>depth_stencil_attachment].<wbr>format;<br>
> > > -Â Â Â aspects = vk_format_aspects(depth_<wbr>stencil_format);<br>
> > > -Â Â }<br>
> > > -<br>
> > > -Â Â /* The Vulkan spec requires that if either depth or stencil is not<br>
> > present,<br>
> > > -Â Â * the pipeline is to act as if the test silently passes.<br>
> > > -Â Â */<br>
> > > -Â Â if (!(aspects & VK_IMAGE_ASPECT_DEPTH_BIT)) {<br>
> > > -Â Â Â depth_stencil.<wbr>DepthBufferWriteEnable = false;<br>
> > > -Â Â Â depth_stencil.<wbr>DepthTestFunction = PREFILTEROPALWAYS;<br>
> > > -Â Â }<br>
> > > -<br>
> > > -Â Â if (!(aspects & VK_IMAGE_ASPECT_STENCIL_BIT)) {<br>
> > > -Â Â Â pipeline->writes_stencil = false;<br>
> > > -Â Â Â depth_stencil.<wbr>StencilTestFunction = PREFILTEROPALWAYS;<br>
> > > -Â Â Â depth_stencil.<wbr>BackfaceStencilTestFunction = PREFILTEROPALWAYS;<br>
> > > -Â Â }<br>
> > > -<br>
> > > -Â Â /* From the Broadwell PRM:<br>
> > > -Â Â *<br>
> > > -Â Â *Â Â "If Depth_Test_Enable = 1 AND Depth_Test_func = EQUAL, the<br>
> > > -Â Â *Â Â Depth_Write_Enable must be set to 0."<br>
> > > -Â Â */<br>
> > > -Â Â if (info.depthTestEnable && info.depthCompareOp ==<br>
> > VK_COMPARE_OP_EQUAL)<br>
> > > -Â Â Â depth_stencil.<wbr>DepthBufferWriteEnable = false;<br>
> > > -<br>
> > > -Â Â pipeline->writes_depth = depth_stencil.<wbr>DepthBufferWriteEnable;<br>
> > > -Â Â pipeline->depth_test_enable = depth_stencil.DepthTestEnable;<br>
> > > -<br>
> > >Â #if GEN_GEN <= 7<br>
> > >Â Â Â GENX(DEPTH_STENCIL_STATE_pack)<wbr>(NULL, depth_stencil_dw,<br>
> > &depth_stencil);<br>
> > >Â #else<br>
> > > --<br>
> > > 2.5.0.400.gff86faf<br>
> > ><br>
> > > ______________________________<wbr>_________________<br>
> > > mesa-dev mailing list<br>
> > > <a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a><br>
> > > <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/<wbr>mailman/listinfo/mesa-dev</a><br>
> ><br>
</div></div></blockquote></div><br></div></div>