<div dir="ltr"><div class="gmail_quote"><div dir="ltr">On Tue, Nov 6, 2018 at 9:43 PM Roland Scheidegger <<a href="mailto:sroland@vmware.com">sroland@vmware.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 06.11.18 at 22:48, Jason Ekstrand wrote:<br>
> This came to the top of my list recently due to a difference between<br>
> OpenGL and Vulkan discard operations and D3D's discard operation.  The<br>
> OpenGL and Vulkan discard is defined to be control flow and derivatives<br>
> are undefined after discard.  With D3D, derivatives are considered<br>
> well-defined after discard.<br>
> <br>
> In order to work around this, DXVK (and I would assume VKD3D though I'm<br>
> not sure), simply sets a global boolean instead of doing the discard and<br>
> then emits `if (do_discard) discard;` at the end of the shader.  For<br>
> complex shaders, this leads to the shader doing way more work than<br>
> needed and poor performance.<br>
Is that really all they do? This will not work in presence of UAVs<br>
(shader images / ssbos), since after a discard writes must have no effect.<br></blockquote><div><br></div><div>I'm not sure off-hand how UAVs are handled in the presence of discards.  I would guess something like if (!discard) writeUAV();<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
> If, on the other hand, they knew that<br>
> derivative groups are just subgroup quads, they could do something<br>
> better:<br>
> <br>
>     bool want_discard;<br>
> <br>
>     void d3d_discard()<br>
>     {<br>
>         want_discard = true;<br>
>         if (subgroupClusteredAnd(want_discard, 4))<br>
>             discard;<br>
>     }<br>
> <br>
>     void main()<br>
>     {<br>
>         want_discard = false;<br>
> <br>
>         // stuff<br>
> <br>
>         if (some_condition)<br>
>             d3d_discard();<br>
> <br>
>         // Exepensive stuff<br>
Expensive<br>
<br>
> <br>
>         if (want_discard)<br>
>             discard;<br>
If the expensive stuff includes buffer / image writes, that still<br>
wouldn't work as far as I can tell (although it could be fixed, just<br>
like without the extension, by wrapping the writes with if (!want_discard)).<br></blockquote><div><br></div><div>Correct.  However, the common case is UBO pulls and textures.</div><div><br></div><div>--Jason<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Roland<br>
<br>
<br>
>     }<br>
> ---<br>
>  chapters/features.txt        | 21 +++++++++++++++++++++<br>
>  chapters/shaders.txt         | 12 ++++++++++++<br>
>  chapters/textures.txt        |  8 ++++++++<br>
>  include/vulkan/vulkan_core.h | 15 +++++++++++++--<br>
>  xml/vk.xml                   | 13 ++++++++++---<br>
>  5 files changed, 64 insertions(+), 5 deletions(-)<br>
> <br>
> diff --git a/chapters/features.txt b/chapters/features.txt<br>
> index 08c8d8420..3d22972ea 100644<br>
> --- a/chapters/features.txt<br>
> +++ b/chapters/features.txt<br>
> @@ -2969,6 +2969,27 @@ more slink:VkSubgroupFeatureFlagBits.<br>
>  <br>
>  endif::VK_VERSION_1_1[]<br>
>  <br>
> +ifdef::VK_EXT_derivative_group_quad[]<br>
> +<br>
> +[open,refpage='VkPhysicalDeviceDerivativeGroupQuadPropertiesEXT',desc='Structure describing the relationship between derivative groups and subgroup quads for an implementation',type='structs']<br>
> +--<br>
> +<br>
> +The sname:VkPhysicalDeviceDerivativeGroupQuadPropertiesEXT structure is<br>
> +defined as:<br>
> +<br>
> +include::../api/structs/VkPhysicalDeviceDerivativeGroupQuadPropertiesEXT.txt[]<br>
> +<br>
> +The members of the sname:VkPhysicalDeviceDerivativeGroupQuadPropertiesEXT<br>
> +structure describe the following implementation-dependent limits:<br>
> +<br>
> +  * pname:derivativeGroupsAreSubgroupQuads is a boolean that specifies that<br>
> +    derivative groups in fragment shaders correspond to subgroup quads.<br>
> +--<br>
> +<br>
> +include::../validity/structs/VkPhysicalDeviceDerivativeGroupQuadPropertiesEXT.txt[]<br>
> +<br>
> +endif::VK_EXT_derivative_group_quad[]<br>
> +<br>
>  ifdef::VK_EXT_blend_operation_advanced[]<br>
>  <br>
>  [open,refpage='VkPhysicalDeviceBlendOperationAdvancedPropertiesEXT',desc='Structure describing advanced blending limits that can be supported by an implementation',type='structs']<br>
> diff --git a/chapters/shaders.txt b/chapters/shaders.txt<br>
> index 5cb3edb35..11d3ea9db 100644<br>
> --- a/chapters/shaders.txt<br>
> +++ b/chapters/shaders.txt<br>
> @@ -808,6 +808,11 @@ A _derivative group_ (see the subsection "`Control Flow`" of section 2 of<br>
>  the SPIR-V 1.00 Revision 4 specification) for a fragment shader is the set<br>
>  of invocations generated by a single primitive (point, line, or triangle),<br>
>  including any helper invocations generated by that primitive.<br>
> +ifdef::VK_EXT_derivative_group_quad[]<br>
> +If the pname:derivativeGroupsAreSubgroupQuads field of<br>
> +slink:VkPhysicalDeviceDerivativeGroupQuadPropertiesEXT is ename:VK_TRUE, a<br>
> +derivative group for a fragment shader is a single subgroup quad.<br>
> +endif::VK_EXT_derivative_group_quad[]<br>
>  ifdef::VK_NV_compute_shader_derivatives[]<br>
>  A derivative group for a compute shader is a single local workgroup.<br>
>  endif::VK_NV_compute_shader_derivatives[]<br>
> @@ -920,6 +925,13 @@ The operations supported are add, mul, min, max, and, or, xor.<br>
>  <br>
>  The quad subgroup operations allow clusters of 4 invocations (a quad), to<br>
>  share data efficiently with each other.<br>
> +ifdef::VK_EXT_derivative_group_quad[]<br>
> +For fragment shaders, if the pname:derivativeGroupsAreSubgroupQuads field of<br>
> +slink:VkPhysicalDeviceDerivativeGroupQuadPropertiesEXT is ename:VK_TRUE,<br>
> +each quad corresponds to one of the groups of four shader<br>
> +invocations used for<br>
> +<<texture-derivatives-compute,derivatives>>.<br>
> +endif::VK_EXT_derivative_group_quad[]<br>
>  ifdef::VK_NV_compute_shader_derivatives[]<br>
>  For compute shaders using the code:DerivativeGroupQuadsNV or<br>
>  code:DerivativeGroupLinearNV execution modes, each quad corresponds to one<br>
> diff --git a/chapters/textures.txt b/chapters/textures.txt<br>
> index a7bfaedc3..46a292381 100644<br>
> --- a/chapters/textures.txt<br>
> +++ b/chapters/textures.txt<br>
> @@ -1387,6 +1387,14 @@ Implementations must: make the same choice of either coarse or fine for both<br>
>  code:OpDPdx and code:OpDPdy, and implementations should: make the choice<br>
>  that is more efficient to compute.<br>
>  <br>
> +ifdef::VK_EXT_derivative_group_quad[]<br>
> +If the pname:derivativeGroupsAreSubgroupQuads field of<br>
> +slink:VkPhysicalDeviceDerivativeGroupQuadPropertiesEXT is ename:VK_TRUE,<br>
> +the 2x2 neighborhood of fragments corresponds exactly to a subgroup quad, and<br>
> +the invocations in each quad are ordered to have attribute values of<br>
> +P~i0,j0~, P~i1,j0~, P~i0,j1~, and P~i1,j1~, respectively.<br>
> +endif::VK_EXT_derivative_group_quad[]<br>
> +<br>
>  ifdef::VK_NV_compute_shader_derivatives[]<br>
>  [[texture-derivatives-compute]]<br>
>  === Compute Shader Derivatives<br>
> diff --git a/include/vulkan/vulkan_core.h b/include/vulkan/vulkan_core.h<br>
> index 4cd8ed51d..7b304eafe 100644<br>
> --- a/include/vulkan/vulkan_core.h<br>
> +++ b/include/vulkan/vulkan_core.h<br>
> @@ -451,6 +451,7 @@ typedef enum VkStructureType {<br>
>      VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXCLUSIVE_SCISSOR_FEATURES_NV = 1000205002,<br>
>      VK_STRUCTURE_TYPE_CHECKPOINT_DATA_NV = 1000206000,<br>
>      VK_STRUCTURE_TYPE_QUEUE_FAMILY_CHECKPOINT_PROPERTIES_NV = 1000206001,<br>
> +    VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DERIVATIVE_GROUP_QUAD_PROPERTIES_EXT = 1000209000,<br>
>      VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_MEMORY_MODEL_FEATURES_KHR = 1000211000,<br>
>      VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PCI_BUS_INFO_PROPERTIES_EXT = 1000212000,<br>
>      VK_STRUCTURE_TYPE_IMAGEPIPE_SURFACE_CREATE_INFO_FUCHSIA = 1000214000,<br>
> @@ -7791,8 +7792,6 @@ typedef struct VkPipelineCoverageModulationStateCreateInfoNV {<br>
>  <br>
>  <br>
>  #define VK_EXT_image_drm_format_modifier 1<br>
> -#define VK_EXT_EXTENSION_159_SPEC_VERSION 0<br>
> -#define VK_EXT_EXTENSION_159_EXTENSION_NAME "VK_EXT_extension_159"<br>
>  #define VK_EXT_IMAGE_DRM_FORMAT_MODIFIER_SPEC_VERSION 1<br>
>  #define VK_EXT_IMAGE_DRM_FORMAT_MODIFIER_EXTENSION_NAME "VK_EXT_image_drm_format_modifier"<br>
>  <br>
> @@ -8791,6 +8790,18 @@ VKAPI_ATTR void VKAPI_CALL vkGetQueueCheckpointDataNV(<br>
>      VkCheckpointDataNV*                         pCheckpointData);<br>
>  #endif<br>
>  <br>
> +#define VK_EXT_derivative_group_quad 1<br>
> +#define VK_EXT_DERIVATIVE_GROUP_QUAD_SPEC_VERSION 1<br>
> +#define VK_EXT_DERIVATIVE_GROUP_QUAD_EXTENSION_NAME "VK_EXT_derivative_group_quad"<br>
> +<br>
> +typedef struct VkPhysicalDeviceDerivativeGroupQuadPropertiesEXT {<br>
> +    VkStructureType    sType;<br>
> +    const void*        pNext;<br>
> +    VkBool32           derivativeGroupsAreSubgroupQuads;<br>
> +} VkPhysicalDeviceDerivativeGroupQuadPropertiesEXT;<br>
> +<br>
> +<br>
> +<br>
>  #define VK_EXT_pci_bus_info 1<br>
>  #define VK_EXT_PCI_BUS_INFO_SPEC_VERSION  1<br>
>  #define VK_EXT_PCI_BUS_INFO_EXTENSION_NAME "VK_EXT_pci_bus_info"<br>
> diff --git a/xml/vk.xml b/xml/vk.xml<br>
> index 24cc3ce78..93dc66159 100644<br>
> --- a/xml/vk.xml<br>
> +++ b/xml/vk.xml<br>
> @@ -3590,6 +3590,11 @@ server.<br>
>              <member>const <type>void</type>*                      <name>pNext</name></member><br>
>              <member><type>VkMemoryOverallocationBehaviorAMD</type> <name>overallocationBehavior</name></member><br>
>          </type><br>
> +        <type category="struct" name="VkPhysicalDeviceDerivativeGroupQuadPropertiesEXT"  structextends="VkPhysicalDeviceProperties2"><br>
> +            <member values="VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DERIVATIVE_GROUP_QUAD_PROPERTIES_EXT"><type>VkStructureType</type> <name>sType</name></member><br>
> +            <member>const <type>void</type>*                      <name>pNext</name></member><br>
> +            <member><type>VkBool32</type>                         <name>derivativeGroupsAreSubgroupQuads</name></member><br>
> +        </type><br>
>      </types><br>
>  <br>
>      <comment>Vulkan enumerant (token) definitions</comment><br>
> @@ -9893,10 +9898,12 @@ server.<br>
>                  <enum value="&quot;VK_KHR_extension_209&quot;"          name="VK_KHR_EXTENSION_209_EXTENSION_NAME"/><br>
>              </require><br>
>          </extension><br>
> -        <extension name="VK_INTEL_extension_210" number="210" type="device" author="INTEL" contact="Jason Ekstrand @jekstrand" supported="disabled"><br>
> +        <extension name="VK_EXT_derivative_group_quad" number="210" type="device" requiresCore="1.1" author="EXT" contact="Jason Ekstrand @jekstrand" supported="vulkan"><br>
>              <require><br>
> -                <enum value="0"                                         name="VK_KHR_EXTENSION_210_SPEC_VERSION"/><br>
> -                <enum value="&quot;VK_KHR_extension_210&quot;"          name="VK_KHR_EXTENSION_210_EXTENSION_NAME"/><br>
> +                <enum value="1"                                         name="VK_EXT_DERIVATIVE_GROUP_QUAD_SPEC_VERSION"/><br>
> +                <enum value="&quot;VK_EXT_derivative_group_quad&quot;"  name="VK_EXT_DERIVATIVE_GROUP_QUAD_EXTENSION_NAME"/><br>
> +                <enum offset="0" extends="VkStructureType"              name="VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DERIVATIVE_GROUP_QUAD_PROPERTIES_EXT"/><br>
> +                <type name="VkPhysicalDeviceDerivativeGroupQuadPropertiesEXT"/><br>
>              </require><br>
>          </extension><br>
>          <extension name="VK_INTEL_extension_211" number="211" type="device" author="INTEL" contact="Jason Ekstrand @jekstrand" supported="disabled"><br>
> <br>
<br>
</blockquote></div></div>