<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Feb 13, 2017 at 11:14 AM, Nanley Chery <span dir="ltr"><<a href="mailto:nanleychery@gmail.com" target="_blank">nanleychery@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Fri, Feb 10, 2017 at 09:26:52PM -0800, Jason Ekstrand wrote:<br>
> On Fri, Feb 10, 2017 at 5:29 PM, Nanley Chery <<a href="mailto:nanleychery@gmail.com">nanleychery@gmail.com</a>> wrote:<br>
><br>
> > On Fri, Feb 10, 2017 at 03:31:32PM -0800, Jason Ekstrand wrote:<br>
> > > On Fri, Feb 10, 2017 at 3:18 PM, Nanley Chery <<a href="mailto:nanleychery@gmail.com">nanleychery@gmail.com</a>><br>
> > wrote:<br>
> > ><br>
> > > > On Fri, Feb 10, 2017 at 02:55:42PM -0800, Nanley Chery wrote:<br>
> > > > > On Fri, Feb 10, 2017 at 11:02:19AM -0800, Jason Ekstrand wrote:<br>
> > > > > > It's a bit hard to measure because it almost gets lost in the<br>
> > noise,<br>
> > > > > > but this seemed to help Dota 2 by a percent or two on my Broadwell<br>
> > > > > > GT3e desktop.<br>
> > > > > ><br>
> > > > > > Reviewed-by: Lionel Landwerlin <<a href="mailto:lionel.g.landwerlin@intel.com">lionel.g.landwerlin@intel.com</a><wbr>><br>
> > > > > > ---<br>
> > > > > > src/intel/vulkan/genX_<wbr>pipeline.c | 171<br>
> > ++++++++++++++++++++++++++++++<br>
> > > > +--------<br>
> > > > > > 1 file changed, 137 insertions(+), 34 deletions(-)<br>
> > > > > ><br>
> > > > > > diff --git a/src/intel/vulkan/genX_<wbr>pipeline.c<br>
> > b/src/intel/vulkan/genX_<br>
> > > > pipeline.c<br>
> > > > > > index 71b9e30..d2af8b9 100644<br>
> > > > > > --- a/src/intel/vulkan/genX_<wbr>pipeline.c<br>
> > > > > > +++ b/src/intel/vulkan/genX_<wbr>pipeline.c<br>
> > > > > > @@ -638,6 +638,133 @@ static const uint32_t vk_to_gen_stencil_op[]<br>
> > = {<br>
> > > > > > [VK_STENCIL_OP_DECREMENT_AND_<wbr>WRAP] = STENCILOP_DECR,<br>
> > > > > > };<br>
> > > > > ><br>
> > > > > > +static bool<br>
> > > > > > +may_write_stencil_face(const VkStencilOpState *face,<br>
> > > > > > + VkCompareOp depthCompareOp)<br>
> > > > > > +{<br>
> > > > ><br>
> > > > > This function triggers some false negatives. One example is<br>
> > > > ><br>
> > > > > face->compareOp == VK_COMPARE_OP_GREATER &&<br>
> > > > > face->depthFailOp == VK_STENCIL_OP_KEEP &&<br>
> > > > > face->passOp == VK_STENCIL_OP_KEEP &&<br>
> > > > > face->failOp == VK_STENCIL_OP_INVERT &&<br>
> > > > > depthCompareOp == don't care;<br>
> > > > ><br>
> > > > > This function returns false even though a stencil write occurs if the<br>
> > > > > stencil test results in a fail.<br>
> > > ><br>
> > > > Thanks for pointing out in the office that this actually isn't a false<br>
> > > > negative. This function is correct, but I still think it could be<br>
> > > > simplified. I really needed a visual to evaluate it as it currently is<br>
> > > > written.<br>
> > > ><br>
> > ><br>
> > > There is a way I could write it that may be clearer. How about:<br>
> > ><br>
> > > may_write_stencil_face(const VkStencilOpState *face,<br>
> > > VkCompareOp depthCompareOp)<br>
> > > {<br>
> > > /* If compareOp is ALWAYS, then failOp doesn't matter */<br>
> > > if (face->compareOp == VK_COMPARE_OP_ALWAYS)<br>
> > > face->failOp = VK_STENCIL_OP_KEEP;<br>
> > ><br>
> > > if (face->compareOp == VK_COMPARE_OP_NEVER ||<br>
> ><br>
> > Why does the depth compare op independently impact the stencil op?<br>
> > Did you mean && instead of ||?<br>
> ><br>
><br>
> Perhaps it would help if I explained the way that I'm reasoning about<br>
> things a bit better.<br>
><br>
> In the first case (above), if compareOp is ALWAYS then the stencil test<br>
> will never fail so the failOp doesn't matter. We can set it to KEEP or<br>
> whatever else we like but KEEP is the most convenient.<br>
><br>
> The second case is a bit more complicated. Here we're looking at the<br>
> stencil+depth pass op. In order for this to trigger, both the depth and<br>
> the stencil test have to pass. If either compare function is NEVER then at<br>
> least one of them will always fail and we'll never trigger the passOp.<br>
><br>
> In the third case, depthFailOp gets taken if stencil passes but depth<br>
> fails. Again, if compareOp == NEVER or depthCompareOp == ALWAYS, then this<br>
> case cannot happen and depthFailOp doesn't matter.<br>
><br>
> Does that make sense? It's not so much about figuring out which of the<br>
> stencil ops get taken as it is about figuring out which ones don't matter<br>
> because the corresponding depth and stencil test results cannot happen.<br>
> Then, you take the ones that do matter and look at the stencil ops. If any<br>
> of the stencil ops that *can* happen aren't KEEP, then we write stencil.<br>
><br>
> --Jason<br>
><br>
><br>
> > > face->depthCompareOp == VK_COMPARE_OP_NEVER)<br>
> > > face->passOp = VK_STENCIL_OP_KEEP;<br>
> > ><br>
> > > if (face->compareOp == VK_COMPARE_OP_NEVER ||<br>
> > > face->depthCompareOp == VK_COMPARE_OP_ALWAYS)<br>
> > > face->depthFailOp = VK_STENCIL_OP_KEEP;<br>
> > ><br>
> > > return face->failOp != VK_STENCIL_OP_KEEP ||<br>
> > > face->depthFailOp != VK_STENCIL_OP_KEEP ||<br>
> > > face->passOp != VK_STENCIL_OP_KEEP;<br>
> ><br>
> > Why does this work?<br>
> ><br>
> > > }<br>
> > ><br>
> > > I thought about writing it that way at one point but I decided sanitizing<br>
> > > the stencil ops wasn't needed. I can switch if you'd rather.<br>
> > ><br>
> ><br>
> > I sent another function to the list before I saw this, which I think is<br>
> > more straight-forward. What do you think?<br>
> ><br>
<br>
</div></div>It looks like you missed this question. The second function I sent isn't<br>
100% correct, but I think the general structure of the code and comments<br>
is easier to grasp. I do find your explanation helpful to understanding<br>
how your function works. If you disagree with using my proposed function<br>
as a starting point, could you massage your explanation into code<br>
comments?<span class="HOEnZb"><font color="#888888"><br></font></span></blockquote><div><br></div><div>Yeah, I would probably say that your function is the wrong mental model. I can definitely add more code comments and send another version.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="HOEnZb"><font color="#888888">
-Nanley<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
> > -Nanley<br>
> ><br>
> > > >-Nanley<br>
> > > ><br>
> > > > ><br>
> > > > > The possible input combinations that affect writing to stencil can be<br>
> > > > > divided up into 10 sets.<br>
> > > > ><br>
> > > > > s = stencil compare op<br>
> > > > > d = depth compare op<br>
> > > > > df = depth fail op<br>
> > > > > sf = stencil fail op<br>
> > > > > sp = stencil pass op<br>
> > > > ><br>
> > > > > s.Always | !s.Always<br>
> > > > > d.Never | !d.Never | s.Never |<br>
> > > > > | d.Always | | |<br>
> > > > > !keep | | | |<br>
> > > > > | | | |<br>
> > > > > ------------|----------|------<br>
> > |--------------|--------------<br>
> > > > > | | | |<br>
> > > > > keep df.Keep | sp.Keep | | sf.Keep |<br>
> > > > > | | | |<br>
> > > > > | | | |<br>
> > > > ><br>
> > > > > We know that stencil won't be written if the function inputs land us<br>
> > in<br>
> > > > > 3 of these sets (denoted by the *.Keep's above). I think this<br>
> > function<br>
> > > > > would be easier to understand if it determined if we fell in one of<br>
> > the<br>
> > > > > 3 instead of the 7. How about something like this:<br>
> > > > ><br>
> > > > > static bool<br>
> > > > > wont_write_stencil_face(const VkStencilOpState *face,<br>
> > > > > VkCompareOp depthCompareOp)<br>
> > > > > {<br>
> > > > > if (face->compareOp == VK_COMPARE_OP_NEVER &&<br>
> > > > > face->failOp == VK_STENCIL_OP_KEEP)<br>
> > > > > return true;<br>
> > > > ><br>
> > > > > if (face->compareOp == VK_COMPARE_OP_ALWAYS) {<br>
> > > > > if (depthCompareOp == VK_COMPARE_OP_NEVER &&<br>
> > > > > face->depthFailOp == VK_STENCIL_OP_KEEP)<br>
> > > > > return true;<br>
> > > > > if (depthCompareOp == VK_COMPARE_OP_ALWAYS &&<br>
> > > > > face->passOp == VK_STENCIL_OP_KEEP)<br>
> > > > > return true;<br>
> > > > > }<br>
> > > > > return false;<br>
> > > > > }<br>
> > > > ><br>
> > > > > -Nanley<br>
> > > > ><br>
> > > > > > + /* If compareOp is ALWAYS then the stencil test will never fail<br>
> > > > and we can<br>
> > > > > > + * ignore failOp. If it's not ALWAYS and failOp is not KEEP,<br>
> > we<br>
> > > > may write<br>
> > > > > > + * stencil.<br>
> > > > > > + */<br>
> > > > > > + if (face->compareOp != VK_COMPARE_OP_ALWAYS &&<br>
> > > > > > + face->failOp != VK_STENCIL_OP_KEEP)<br>
> > > > > > + return true;<br>
> > > > > > +<br>
> > > > > > +<br>
> > > > > > + /* If the compareOp is NEVER then the stencil test will never<br>
> > pass<br>
> > > > and the<br>
> > > > > > + * passOp and depthFailOp don't matter. If compareOp isn't<br>
> > NEVER,<br>
> > > > then we<br>
> > > > > > + * need to take them into account.<br>
> > > > > > + */<br>
> > > > > > + if (face->compareOp != VK_COMPARE_OP_NEVER) {<br>
> > > > > > + /* If depthOp is NEVER then passOp doesn't matter. */<br>
> > > > > > + if (depthCompareOp != VK_COMPARE_OP_NEVER &&<br>
> > > > > > + face->passOp != VK_STENCIL_OP_KEEP)<br>
> > > > > > + return true;<br>
> > > > > > +<br>
> > > > > > + /* If depthOp is ALWAYS then depthFailOp doesn't matter. */<br>
> > > > > > + if (depthCompareOp != VK_COMPARE_OP_ALWAYS &&<br>
> > > > > > + face->depthFailOp != VK_STENCIL_OP_KEEP)<br>
> > > > > > + return true;<br>
> > > > > > + }<br>
> > > > > > +<br>
> > > > > > + return false;<br>
> > > > > > +}<br>
> > > > > > +<br>
> > > > > > +/* Intel hardware is fairly sensitive to whether or not<br>
> > depth/stencil<br>
> > > > writes<br>
> > > > > > + * are enabled. In the presence of discards, it's fairly easy to<br>
> > get<br>
> > > > into the<br>
> > > > > > + * non-promoted case which means a fairly big performance hit.<br>
> > From<br>
> > > > the Iron<br>
> > > > > > + * Lake PRM, Vol 2, pt. 1, section 8.4.3.2, "Early Depth Test<br>
> > Cases":<br>
> > > > > > + *<br>
> > > > > > + * "Non-promoted depth (N) is active whenever the depth test<br>
> > can<br>
> > > > be done<br>
> > > > > > + * early but it cannot determine whether or not to write source<br>
> > > > depth to<br>
> > > > > > + * the depth buffer, therefore the depth write must be<br>
> > performed<br>
> > > > post pixel<br>
> > > > > > + * shader. This includes cases where the pixel shader can kill<br>
> > > > pixels,<br>
> > > > > > + * including via sampler chroma key, as well as cases where the<br>
> > > > alpha test<br>
> > > > > > + * function is enabled, which kills pixels based on a<br>
> > programmable<br>
> > > > alpha<br>
> > > > > > + * test. In this case, even if the depth test fails, the pixel<br>
> > > > cannot be<br>
> > > > > > + * killed if a stencil write is indicated. Whether or not the<br>
> > > > stencil write<br>
> > > > > > + * happens depends on whether or not the pixel is killed<br>
> > later. In<br>
> > > > these<br>
> > > > > > + * cases if stencil test fails and stencil writes are off, the<br>
> > > > pixels can<br>
> > > > > > + * also be killed early. If stencil writes are enabled, the<br>
> > pixels<br>
> > > > must be<br>
> > > > > > + * treated as Computed depth (described above)."<br>
> > > > > > + *<br>
> > > > > > + * The same thing as mentioned in the stencil case can happen in<br>
> > the<br>
> > > > depth<br>
> > > > > > + * case as well if it thinks it writes depth but, thanks to the<br>
> > depth<br>
> > > > test<br>
> > > > > > + * being GL_EQUAL, the write doesn't actually matter. A little<br>
> > extra<br>
> > > > work<br>
> > > > > > + * up-front to try and disable depth and stencil writes can make<br>
> > a big<br>
> > > > > > + * difference.<br>
> > > > > > + *<br>
> > > > > > + * Unfortunately, the way depth and stencil testing is specified,<br>
> > > > there are<br>
> > > > > > + * many case where, regardless of depth/stencil writes being<br>
> > enabled,<br>
> > > > nothing<br>
> > > > > > + * actually gets written due to some other bit of state being set.<br>
> > > > This<br>
> > > > > > + * function attempts to "sanitize" the depth stencil state and<br>
> > > > disable writes<br>
> > > > > > + * and sometimes even testing whenever possible.<br>
> > > > > > + */<br>
> > > > > > +static void<br>
> > > > > > +sanitize_ds_state(<wbr>VkPipelineDepthStencilStateCre<wbr>ateInfo *state,<br>
> > > > > > + bool *stencilWriteEnable,<br>
> > > > > > + VkImageAspectFlags ds_aspects)<br>
> > > > > > +{<br>
> > > > > > + *stencilWriteEnable = state->stencilTestEnable;<br>
> > > > > > +<br>
> > > > > > + /* If the depth test is disabled, we won't be writing<br>
> > anything. */<br>
> > > > > > + if (!state->depthTestEnable)<br>
> > > > > > + state->depthWriteEnable = false;<br>
> > > > > > +<br>
> > > > > > + /* The Vulkan spec requires that if either depth or stencil is<br>
> > not<br>
> > > > present,<br>
> > > > > > + * the pipeline is to act as if the test silently passes.<br>
> > > > > > + */<br>
> > > > > > + if (!(ds_aspects & VK_IMAGE_ASPECT_DEPTH_BIT)) {<br>
> > > > > > + state->depthWriteEnable = false;<br>
> > > > > > + state->depthCompareOp = VK_COMPARE_OP_ALWAYS;<br>
> > > > > > + }<br>
> > > > > > +<br>
> > > > > > + if (!(ds_aspects & VK_IMAGE_ASPECT_STENCIL_BIT)) {<br>
> > > > > > + *stencilWriteEnable = false;<br>
> > > > > > + state->front.compareOp = VK_COMPARE_OP_ALWAYS;<br>
> > > > > > + state->back.compareOp = VK_COMPARE_OP_ALWAYS;<br>
> > > > > > + }<br>
> > > > > > +<br>
> > > > > > + /* If the stencil test is enabled and always fails, then we<br>
> > will<br>
> > > > never get<br>
> > > > > > + * to the depth test so we can just disable the depth test<br>
> > > > entirely.<br>
> > > > > > + */<br>
> > > > > > + if (state->stencilTestEnable &&<br>
> > > > > > + state->front.compareOp == VK_COMPARE_OP_NEVER &&<br>
> > > > > > + state->back.compareOp == VK_COMPARE_OP_NEVER) {<br>
> > > > > > + state->depthTestEnable = false;<br>
> > > > > > + state->depthWriteEnable = false;<br>
> > > > > > + }<br>
> > > > > > +<br>
> > > > > > + /* If depthCompareOp is EQUAL then the value we would be<br>
> > writing<br>
> > > > to the<br>
> > > > > > + * depth buffer is the same as the value that's already there<br>
> > so<br>
> > > > there's no<br>
> > > > > > + * point in writing it.<br>
> > > > > > + */<br>
> > > > > > + if (state->depthCompareOp == VK_COMPARE_OP_EQUAL)<br>
> > > > > > + state->depthWriteEnable = false;<br>
> > > > > > +<br>
> > > > > > + /* If the stencil ops are such that we don't actually ever<br>
> > modify<br>
> > > > the<br>
> > > > > > + * stencil buffer, we should disable writes.<br>
> > > > > > + */<br>
> > > > > > + if (!may_write_stencil_face(&<wbr>state->front,<br>
> > state->depthCompareOp)<br>
> > > > &&<br>
> > > > > > + !may_write_stencil_face(&<wbr>state->back,<br>
> > state->depthCompareOp))<br>
> > > > > > + *stencilWriteEnable = false;<br>
> > > > > > +<br>
> > > > > > + /* If the depth test always passes and we never write out<br>
> > depth,<br>
> > > > that's the<br>
> > > > > > + * same as if the depth test is disabled entirely.<br>
> > > > > > + */<br>
> > > > > > + if (state->depthCompareOp == VK_COMPARE_OP_ALWAYS &&<br>
> > > > > > + !state->depthWriteEnable)<br>
> > > > > > + state->depthTestEnable = false;<br>
> > > > > > +<br>
> > > > > > + /* If the stencil test always passes and we never write out<br>
> > > > stencil, that's<br>
> > > > > > + * the same as if the stencil test is disabled entirely.<br>
> > > > > > + */<br>
> > > > > > + if (state->front.compareOp == VK_COMPARE_OP_ALWAYS &&<br>
> > > > > > + state->back.compareOp == VK_COMPARE_OP_ALWAYS &&<br>
> > > > > > + !*stencilWriteEnable)<br>
> > > > > > + state->stencilTestEnable = false;<br>
> > > > > > +}<br>
> > > > > > +<br>
> > > > > > static void<br>
> > > > > > emit_ds_state(struct anv_pipeline *pipeline,<br>
> > > > > > const VkPipelineDepthStencilStateCre<wbr>ateInfo<br>
> > > > *pCreateInfo,<br>
> > > > > > @@ -663,12 +790,20 @@ emit_ds_state(struct anv_pipeline *pipeline,<br>
> > > > > > return;<br>
> > > > > > }<br>
> > > > > ><br>
> > > > > > + VkImageAspectFlags ds_aspects = 0;<br>
> > > > > > + if (subpass->depth_stencil_<wbr>attachment !=<br>
> > VK_ATTACHMENT_UNUSED) {<br>
> > > > > > + VkFormat depth_stencil_format =<br>
> > > > > > + pass->attachments[subpass-><wbr>depth_stencil_attachment].<br>
> > format;<br>
> > > > > > + ds_aspects = vk_format_aspects(depth_<wbr>stencil_format);<br>
> > > > > > + }<br>
> > > > > > +<br>
> > > > > > VkPipelineDepthStencilStateCre<wbr>ateInfo info = *pCreateInfo;<br>
> > > > > > + sanitize_ds_state(&info, &pipeline->writes_stencil,<br>
> > ds_aspects);<br>
> > > > > > + pipeline->writes_depth = info.depthWriteEnable;<br>
> > > > > > + pipeline->depth_test_enable = info.depthTestEnable;<br>
> > > > > ><br>
> > > > > > /* VkBool32 depthBoundsTestEnable; // optional<br>
> > (depth_bounds_test)<br>
> > > > */<br>
> > > > > ><br>
> > > > > > - pipeline->writes_stencil = info.stencilTestEnable;<br>
> > > > > > -<br>
> > > > > > #if GEN_GEN <= 7<br>
> > > > > > struct GENX(DEPTH_STENCIL_STATE) depth_stencil = {<br>
> > > > > > #else<br>
> > > > > > @@ -690,38 +825,6 @@ emit_ds_state(struct anv_pipeline *pipeline,<br>
> > > > > > .BackfaceStencilTestFunction = vk_to_gen_compare_op[info.<br>
> > > > back.compareOp],<br>
> > > > > > };<br>
> > > > > ><br>
> > > > > > - VkImageAspectFlags aspects = 0;<br>
> > > > > > - if (subpass->depth_stencil_<wbr>attachment !=<br>
> > VK_ATTACHMENT_UNUSED) {<br>
> > > > > > - VkFormat depth_stencil_format =<br>
> > > > > > - pass->attachments[subpass-><wbr>depth_stencil_attachment].<br>
> > format;<br>
> > > > > > - aspects = vk_format_aspects(depth_<wbr>stencil_format);<br>
> > > > > > - }<br>
> > > > > > -<br>
> > > > > > - /* The Vulkan spec requires that if either depth or stencil is<br>
> > not<br>
> > > > present,<br>
> > > > > > - * the pipeline is to act as if the test silently passes.<br>
> > > > > > - */<br>
> > > > > > - if (!(aspects & VK_IMAGE_ASPECT_DEPTH_BIT)) {<br>
> > > > > > - depth_stencil.<wbr>DepthBufferWriteEnable = false;<br>
> > > > > > - depth_stencil.<wbr>DepthTestFunction = PREFILTEROPALWAYS;<br>
> > > > > > - }<br>
> > > > > > -<br>
> > > > > > - if (!(aspects & VK_IMAGE_ASPECT_STENCIL_BIT)) {<br>
> > > > > > - pipeline->writes_stencil = false;<br>
> > > > > > - depth_stencil.<wbr>StencilTestFunction = PREFILTEROPALWAYS;<br>
> > > > > > - depth_stencil.<wbr>BackfaceStencilTestFunction =<br>
> > PREFILTEROPALWAYS;<br>
> > > > > > - }<br>
> > > > > > -<br>
> > > > > > - /* From the Broadwell PRM:<br>
> > > > > > - *<br>
> > > > > > - * "If Depth_Test_Enable = 1 AND Depth_Test_func = EQUAL,<br>
> > the<br>
> > > > > > - * Depth_Write_Enable must be set to 0."<br>
> > > > > > - */<br>
> > > > > > - if (info.depthTestEnable && info.depthCompareOp ==<br>
> > > > VK_COMPARE_OP_EQUAL)<br>
> > > > > > - depth_stencil.<wbr>DepthBufferWriteEnable = false;<br>
> > > > > > -<br>
> > > > > > - pipeline->writes_depth = depth_stencil.<wbr>DepthBufferWriteEnable;<br>
> > > > > > - pipeline->depth_test_enable = depth_stencil.DepthTestEnable;<br>
> > > > > > -<br>
> > > > > > #if GEN_GEN <= 7<br>
> > > > > > GENX(DEPTH_STENCIL_STATE_pack)<wbr>(NULL, depth_stencil_dw,<br>
> > > > &depth_stencil);<br>
> > > > > > #else<br>
> > > > > > --<br>
> > > > > > 2.5.0.400.gff86faf<br>
> > > > > ><br>
> > > > > > ______________________________<wbr>_________________<br>
> > > > > > mesa-dev mailing list<br>
> > > > > > <a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a><br>
> > > > > > <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/<wbr>mailman/listinfo/mesa-dev</a><br>
> > > ><br>
> ><br>
</div></div></blockquote></div><br></div></div>