[Mesa-dev] [PATCH 5/5] anv: setup appropriate border color structures on gen7/gen75
Kenneth Graunke
kenneth at whitecape.org
Mon Oct 17 19:43:46 UTC 2016
On Monday, October 17, 2016 11:09:32 AM PDT Jason Ekstrand wrote:
> On Mon, Oct 17, 2016 at 8:46 AM, Lionel Landwerlin <llandwerlin at gmail.com>
> wrote:
>
> > Up to this point we were using the gen8+ structures. Although this commit
> > doesn't fix the border color CTS tests, it is a step in the right
> > direction towards fixing the following tests:
> >
>
> It's not entirely clear where you're headed but I'm afraid the approach may
> ultimately be doomed to failure. The fundamental problem is that Vulkan
> allows complete mix-and-match of images and samplers in the shader. We
> cannot know, until the shader *executes* which sampler will be associated
> with a particular texture. This leaves us with two possible solutions:
>
> 1) Emit a sampler for every possible sampler+image combination.
> 2) Piles of shader hacks.
>
> Both approaches are highly annoying. While the first option is probably
> the less annoying of the two, there is one format (R32_UINT maybe? I don't
> remember for sure) for which integer border color just doesn't work at all
> on Haswell.
You're thinking of
https://bugs.freedesktop.org/show_bug.cgi?id=94196
In this case, Haswell behaves exactly as Ivybridge, despite appearing to
work in all other cases. It's really sad.
> On Ivy Bridge, integer border color is so broken we probably
> shouldn't be using it at all. This leaves us with shader hacks. Ken wrote
> a bunch of code for this so that we could get the ES 3.1 CTS passing on
> Haswell and Ivy Bridge. Maybe that could be ported? Maybe we can find
> some magic Ivy Bridge "make border color work" bit?
CLAMP_TO_BORDER with integer formats is documented to not work at all.
It seems horribly buggy. I've tried to figure out how to make it work,
and I couldn't. Feel free to try, but I'm not sure it's possible.
It looks like the Windows driver emulates all integer texturing with
1,000 lines of jump tables and texelFetch/ld messages (at least on IVB).
My plan for IVB/BYT was to always use CLAMP_TO_EDGE and manually
bounds-check the texture coordinate in the shader, and supply the border
color via a uniform. The ivbborder branch of ~kwg/mesa has the last version I
worked on, but I believe Jordan took it over and has improved it a bunch
since then. He might have an updated copy.
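
To make that plan concrete, here is a minimal CPU-side sketch in C of the
emulation logic. It is not the code from the ivbborder branch, and every
type and function name in it (texel_u32, tex2d_u32, fetch_clamp_to_edge,
sample_clamp_to_border) is hypothetical. It assumes nearest filtering and
normalized coordinates; the real thing would be emitted as shader
instructions, with the border color coming from a uniform.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical integer RGBA texel; not a Mesa type. */
struct texel_u32 { uint32_t r, g, b, a; };

/* Hypothetical 2D integer texture; not a Mesa type. */
struct tex2d_u32 {
   int width;
   int height;
   const struct texel_u32 *texels; /* row-major, width * height entries */
};

/* Floor without libm: the cast truncates toward zero, so step down by one
 * when truncation rounded a negative value up. */
static int
floor_to_int(float v)
{
   int i = (int)v;
   return ((float)i > v) ? i - 1 : i;
}

static int
clamp_int(int v, int lo, int hi)
{
   return v < lo ? lo : (v > hi ? hi : v);
}

/* Stand-in for what the sampler does when programmed with CLAMP_TO_EDGE. */
static struct texel_u32
fetch_clamp_to_edge(const struct tex2d_u32 *tex, int x, int y)
{
   x = clamp_int(x, 0, tex->width - 1);
   y = clamp_int(y, 0, tex->height - 1);
   return tex->texels[y * tex->width + x];
}

/* Emulated CLAMP_TO_BORDER: always sample with CLAMP_TO_EDGE, bounds-check
 * the coordinate manually, and substitute the border color (the "uniform")
 * when the coordinate falls outside the texture. */
static struct texel_u32
sample_clamp_to_border(const struct tex2d_u32 *tex, float u, float v,
                       struct texel_u32 border)
{
   int x = floor_to_int(u * (float)tex->width);
   int y = floor_to_int(v * (float)tex->height);
   bool outside = x < 0 || x >= tex->width || y < 0 || y >= tex->height;
   struct texel_u32 edge = fetch_clamp_to_edge(tex, x, y);

   return outside ? border : edge;
}
```

The point of doing it this way is that the sampler never sees an integer
border color at all, so the broken hardware path is simply not exercised.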
> > dEQP-VK.pipeline.sampler.view_type.2d.format.*.address_
> > modes.all_mode_clamp_to_border_*
> >
> > Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
> > ---
> > src/intel/vulkan/anv_cmd_buffer.c | 10 ++
> > src/intel/vulkan/anv_device.c | 42 +-------
> > src/intel/vulkan/anv_genX.h | 3 +-
> > src/intel/vulkan/anv_private.h | 7 ++
> > src/intel/vulkan/genX_state.c | 220 ++++++++++++++++++++++++++++++++------
> > 5 files changed, 208 insertions(+), 74 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c
> > index b051489..a63f3d9 100644
> > --- a/src/intel/vulkan/anv_cmd_buffer.c
> > +++ b/src/intel/vulkan/anv_cmd_buffer.c
> > @@ -931,6 +931,16 @@ anv_cmd_buffer_emit_samplers(struct anv_cmd_buffer
> > *cmd_buffer,
> > if (sampler == NULL)
> > continue;
> >
> > + /* On Haswell, although the border color structures are 20 dwords long
> > + * and must be aligned at 512 bytes, the position of the 8/16/32bits
> > + * colors overlap, meaning we can't have a single color structure
> > + * configured for all formats. We therefore need to reemit the sampler
> > + * structure for the used format. */
> > + if (cmd_buffer->device->info.is_haswell) {
> > + gen75_pack_sampler_state(cmd_buffer->device, sampler,
> > + desc->image_view->vk_format);
> > + }
> > +
> > memcpy(state->map + (s * 16),
> > sampler->state, sizeof(sampler->state));
> > }
> > diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> > index ce1b9c1..4e69307 100644
> > --- a/src/intel/vulkan/anv_device.c
> > +++ b/src/intel/vulkan/anv_device.c
> > @@ -724,46 +724,6 @@ anv_queue_finish(struct anv_queue *queue)
> > {
> > }
> >
> > -static struct anv_state
> > -anv_state_pool_emit_data(struct anv_state_pool *pool, size_t size,
> > size_t align, const void *p)
> > -{
> > - struct anv_state state;
> > -
> > - state = anv_state_pool_alloc(pool, size, align);
> > - memcpy(state.map, p, size);
> > -
> > - if (!pool->block_pool->device->info.has_llc)
> > - anv_state_clflush(state);
> > -
> > - return state;
> > -}
> > -
> > -struct gen8_border_color {
> > - union {
> > - float float32[4];
> > - uint32_t uint32[4];
> > - };
> > - /* Pad out to 64 bytes */
> > - uint32_t _pad[12];
> > -};
> > -
> > -static void
> > -anv_device_init_border_colors(struct anv_device *device)
> > -{
> > - static const struct gen8_border_color border_colors[] = {
> > - [VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK] = { .float32 = { 0.0,
> > 0.0, 0.0, 0.0 } },
> > - [VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK] = { .float32 = { 0.0,
> > 0.0, 0.0, 1.0 } },
> > - [VK_BORDER_COLOR_FLOAT_OPAQUE_WHITE] = { .float32 = { 1.0,
> > 1.0, 1.0, 1.0 } },
> > - [VK_BORDER_COLOR_INT_TRANSPARENT_BLACK] = { .uint32 = { 0, 0,
> > 0, 0 } },
> > - [VK_BORDER_COLOR_INT_OPAQUE_BLACK] = { .uint32 = { 0, 0,
> > 0, 1 } },
> > - [VK_BORDER_COLOR_INT_OPAQUE_WHITE] = { .uint32 = { 1, 1,
> > 1, 1 } },
> > - };
> > -
> > - device->border_colors = anv_state_pool_emit_data(&
> > device->dynamic_state_pool,
> > -
> > sizeof(border_colors), 64,
> > - border_colors);
> > -}
> > -
> > VkResult
> > anv_device_submit_simple_batch(struct anv_device *device,
> > struct anv_batch *batch)
> > @@ -926,7 +886,7 @@ VkResult anv_CreateDevice(
> >
> > anv_device_init_blorp(device);
> >
> > - anv_device_init_border_colors(device);
> > + ANV_GEN_DISPATCH(device, border_colors_setup, device);
> >
> > *pDevice = anv_device_to_handle(device);
> >
> > diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
> > index 27c55b9..a4a39e1 100644
> > --- a/src/intel/vulkan/anv_genX.h
> > +++ b/src/intel/vulkan/anv_genX.h
> > @@ -28,7 +28,7 @@
> > /*
> > * Gen-specific function declarations. This header must *not* be included
> > * directly. Instead, it is included multiple times by anv_private.h.
> > - *
> > + *
> > * In this header file, the usual genx() macro is available.
> > */
> >
> > @@ -37,6 +37,7 @@
> > #endif
> >
> > VkResult genX(init_device_state)(struct anv_device *device);
> > +void genX(border_colors_setup)(struct anv_device *device);
> >
> > void genX(cmd_buffer_emit_state_base_address)(struct anv_cmd_buffer
> > *cmd_buffer);
> >
> > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> > index 69e6aac..faebbb2 100644
> > --- a/src/intel/vulkan/anv_private.h
> > +++ b/src/intel/vulkan/anv_private.h
> > @@ -643,6 +643,7 @@ struct anv_device {
> > struct blorp_context blorp;
> >
> > struct anv_state border_colors;
> > + uint32_t border_color_align;
> >
> > struct anv_queue queue;
> >
> > @@ -1732,6 +1733,8 @@ void anv_buffer_view_fill_image_param(struct
> > anv_device *device,
> >
> > struct anv_sampler {
> > uint32_t state[4];
> > +
> > + VkSamplerCreateInfo info;
> > };
> >
> > struct anv_framebuffer {
> > @@ -1783,6 +1786,10 @@ struct anv_query_pool {
> > struct anv_bo bo;
> > };
> >
> > +void gen75_pack_sampler_state(struct anv_device *device,
> > + struct anv_sampler *sampler,
> > + VkFormat format);
> > +
> > void *anv_lookup_entrypoint(const struct gen_device_info *devinfo,
> > const char *name);
> >
> > diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
> > index a6d405d..cbf7df2 100644
> > --- a/src/intel/vulkan/genX_state.c
> > +++ b/src/intel/vulkan/genX_state.c
> > @@ -28,11 +28,144 @@
> > #include <fcntl.h>
> >
> > #include "anv_private.h"
> > +#include "vk_format_info.h"
> >
> > #include "common/gen_sample_positions.h"
> > #include "genxml/gen_macros.h"
> > #include "genxml/genX_pack.h"
> >
> > +static uint32_t
> > +border_color_index(VkBorderColor border_color, VkFormat format)
> > +{
> > +#if GEN_IS_HASWELL
> > + if (!vk_format_is_integer(format))
> > + return border_color;
> > +
> > + uint32_t max_bpc = vk_format_max_bpc(format);
> > + uint32_t index = 0;
> > +
> > + if (max_bpc <= 8)
> > + return border_color;
> > +
> > + if (max_bpc <= 16)
> > + index = VK_BORDER_COLOR_END_RANGE + 1;
> > + else
> > + index = VK_BORDER_COLOR_END_RANGE + 4;
> > +
> > + switch (border_color) {
> > + case VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK:
> > + case VK_BORDER_COLOR_INT_TRANSPARENT_BLACK:
> > + index += 0;
> > + break;
> > +
> > + case VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK:
> > + case VK_BORDER_COLOR_INT_OPAQUE_BLACK:
> > + index += 1;
> > + break;
> > +
> > + case VK_BORDER_COLOR_FLOAT_OPAQUE_WHITE:
> > + case VK_BORDER_COLOR_INT_OPAQUE_WHITE:
> > + index += 2;
> > + break;
> > +
> > + default:
> > + unreachable("invalid border color");
> > + }
> > +
> > + return index;
> > +#else
> > + return border_color;
> > +#endif
> > +}
> > +
> > +#define BORDER_COLOR(name, r, g, b, a) { \
> > + .BorderColor##name##Red = r, \
> > + .BorderColor##name##Green = g, \
> > + .BorderColor##name##Blue = b, \
> > + .BorderColor##name##Alpha = a, \
> > + }
> > +
> > +void
> > +genX(border_colors_setup)(struct anv_device *device)
> > +{
> > +#if GEN_IS_HASWELL
> > + static const struct GENX(SAMPLER_BORDER_COLOR_STATE) border_colors[]
> > = {
> > + [VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK] =
> > + BORDER_COLOR(Float, 0.0, 0.0, 0.0, 0.0),
> > + [VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK] =
> > + BORDER_COLOR(Float, 0.0, 0.0, 0.0, 1.0),
> > + [VK_BORDER_COLOR_FLOAT_OPAQUE_WHITE] =
> > + BORDER_COLOR(Float, 1.0, 1.0, 1.0, 1.0),
> > + [VK_BORDER_COLOR_INT_TRANSPARENT_BLACK] =
> > + BORDER_COLOR(8bit, 0, 0, 0, 0),
> > + [VK_BORDER_COLOR_INT_OPAQUE_BLACK] =
> > + BORDER_COLOR(8bit, 0, 0, 0, 1),
> > + [VK_BORDER_COLOR_INT_OPAQUE_WHITE] =
> > + BORDER_COLOR(8bit, 1, 1, 1, 1),
> > + [VK_BORDER_COLOR_END_RANGE + 1] =
> > + BORDER_COLOR(16bit, 0, 0, 0, 0),
> > + [VK_BORDER_COLOR_END_RANGE + 2] =
> > + BORDER_COLOR(16bit, 0, 0, 0, 1),
> > + [VK_BORDER_COLOR_END_RANGE + 3] =
> > + BORDER_COLOR(16bit, 1, 1, 1, 1),
> > + [VK_BORDER_COLOR_END_RANGE + 4] =
> > + BORDER_COLOR(32bit, 0, 0, 0, 0),
> > + [VK_BORDER_COLOR_END_RANGE + 5] =
> > + BORDER_COLOR(32bit, 0, 0, 0, 1),
> > + [VK_BORDER_COLOR_END_RANGE + 6] =
> > + BORDER_COLOR(32bit, 1, 1, 1, 1)
> > + };
> > + device->border_color_align = 512;
> > +#elif GEN_GEN == 7
> > + static const struct GENX(SAMPLER_BORDER_COLOR_STATE) border_colors[]
> > = {
> > + [VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK] =
> > + BORDER_COLOR(Float, 0.0, 0.0, 0.0, 0.0),
> > + [VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK] =
> > + BORDER_COLOR(Float, 0.0, 0.0, 0.0, 1.0),
> > + [VK_BORDER_COLOR_FLOAT_OPAQUE_WHITE] =
> > + BORDER_COLOR(Float, 1.0, 1.0, 1.0, 1.0),
> > + [VK_BORDER_COLOR_INT_TRANSPARENT_BLACK] =
> > + BORDER_COLOR(Float, 0.0, 0.0, 0.0, 0.0),
> > + [VK_BORDER_COLOR_INT_OPAQUE_BLACK] =
> > + BORDER_COLOR(Float, 0.0, 0.0, 0.0, 1.0),
> > + [VK_BORDER_COLOR_INT_OPAQUE_WHITE] =
> > + BORDER_COLOR(Float, 1.0, 1.0, 1.0, 1.0)
> > + };
> > + device->border_color_align = 64;
> > +#else /* GEN_GEN >= 8 */
> > + static const struct GENX(SAMPLER_BORDER_COLOR_STATE) border_colors[]
> > = {
> > + [VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK] =
> > + BORDER_COLOR(Float, 0.0, 0.0, 0.0, 0.0),
> > + [VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK] =
> > + BORDER_COLOR(Float, 0.0, 0.0, 0.0, 1.0),
> > + [VK_BORDER_COLOR_FLOAT_OPAQUE_WHITE] =
> > + BORDER_COLOR(Float, 1.0, 1.0, 1.0, 1.0),
> > + [VK_BORDER_COLOR_INT_TRANSPARENT_BLACK] =
> > + BORDER_COLOR(32bit, 0, 0, 0, 0),
> > + [VK_BORDER_COLOR_INT_OPAQUE_BLACK] =
> > + BORDER_COLOR(32bit, 0, 0, 0, 1),
> > + [VK_BORDER_COLOR_INT_OPAQUE_WHITE] =
> > + BORDER_COLOR(32bit, 1, 1, 1, 1)
> > + };
> > + device->border_color_align = 64;
> > +#endif
> > +
> > + device->border_colors =
> > + anv_state_pool_alloc(&device->dynamic_state_pool,
> > + ARRAY_SIZE(border_colors) *
> > device->border_color_align,
> > + device->border_color_align);
> > +
> > + for (uint32_t i = 0; i < ARRAY_SIZE(border_colors); i++) {
> > + GENX(SAMPLER_BORDER_COLOR_STATE_pack)(
> > + NULL,
> > + device->border_colors.map + i * device->border_color_align,
> > + &border_colors[i]);
> > + }
> > +
> > + if (!device->info.has_llc)
> > + anv_state_clflush(device->border_colors);
> > +}
> > +
> > VkResult
> > genX(init_device_state)(struct anv_device *device)
> > {
> > @@ -148,24 +281,21 @@ static const uint32_t vk_to_gen_shadow_compare_op[]
> > = {
> > [VK_COMPARE_OP_ALWAYS] = PREFILTEROPNEVER,
> > };
> >
> > -VkResult genX(CreateSampler)(
> > - VkDevice _device,
> > - const VkSamplerCreateInfo* pCreateInfo,
> > - const VkAllocationCallbacks* pAllocator,
> > - VkSampler* pSampler)
> > +#if GEN_IS_HASWELL
> > +void
> > +#else
> > +static void
> > +#endif
> > +genX(pack_sampler_state)(
> > + struct anv_device * device,
> > + struct anv_sampler * sampler,
> > + VkFormat format)
> > {
> > - ANV_FROM_HANDLE(anv_device, device, _device);
> > - struct anv_sampler *sampler;
> > -
> > - assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO);
> > -
> > - sampler = anv_alloc2(&device->alloc, pAllocator, sizeof(*sampler), 8,
> > - VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
> > - if (!sampler)
> > - return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
> > -
> > - uint32_t border_color_offset = device->border_colors.offset +
> > - pCreateInfo->borderColor * 64;
> > + uint32_t color_index =
> > + border_color_index(sampler->info.borderColor, format);
> > + uint32_t color_offset =
> > + device->border_colors.offset +
> > + color_index * device->border_color_align;
> >
> > struct GENX(SAMPLER_STATE) sampler_state = {
> > .SamplerDisable = false,
> > @@ -180,28 +310,28 @@ VkResult genX(CreateSampler)(
> > #if GEN_GEN == 8
> > .BaseMipLevel = 0.0,
> > #endif
> > - .MipModeFilter = vk_to_gen_mipmap_mode[pCreateInfo->mipmapMode],
> > - .MagModeFilter = vk_to_gen_tex_filter(pCreateInfo->magFilter,
> > - pCreateInfo->anisotropyEnable)
> > ,
> > - .MinModeFilter = vk_to_gen_tex_filter(pCreateInfo->minFilter,
> > - pCreateInfo->anisotropyEnable)
> > ,
> > - .TextureLODBias = anv_clamp_f(pCreateInfo->mipLodBias, -16,
> > 15.996),
> > + .MipModeFilter = vk_to_gen_mipmap_mode[sampler->info.mipmapMode],
> > + .MagModeFilter = vk_to_gen_tex_filter(sampler->info.magFilter,
> > + sampler->info.
> > anisotropyEnable),
> > + .MinModeFilter = vk_to_gen_tex_filter(sampler->info.minFilter,
> > + sampler->info.
> > anisotropyEnable),
> > + .TextureLODBias = anv_clamp_f(sampler->info.mipLodBias, -16,
> > 15.996),
> > .AnisotropicAlgorithm = EWAApproximation,
> > - .MinLOD = anv_clamp_f(pCreateInfo->minLod, 0, 14),
> > - .MaxLOD = anv_clamp_f(pCreateInfo->maxLod, 0, 14),
> > + .MinLOD = anv_clamp_f(sampler->info.minLod, 0, 14),
> > + .MaxLOD = anv_clamp_f(sampler->info.maxLod, 0, 14),
> > .ChromaKeyEnable = 0,
> > .ChromaKeyIndex = 0,
> > .ChromaKeyMode = 0,
> > - .ShadowFunction = vk_to_gen_shadow_compare_op[
> > pCreateInfo->compareOp],
> > + .ShadowFunction = vk_to_gen_shadow_compare_op[
> > sampler->info.compareOp],
> > .CubeSurfaceControlMode = OVERRIDE,
> >
> > - .BorderColorPointer = border_color_offset,
> > + .BorderColorPointer = color_offset,
> >
> > #if GEN_GEN >= 8
> > .LODClampMagnificationMode = MIPNONE,
> > #endif
> >
> > - .MaximumAnisotropy = vk_to_gen_max_anisotropy(
> > pCreateInfo->maxAnisotropy),
> > + .MaximumAnisotropy = vk_to_gen_max_anisotropy(
> > sampler->info.maxAnisotropy),
> > .RAddressMinFilterRoundingEnable = 0,
> > .RAddressMagFilterRoundingEnable = 0,
> > .VAddressMinFilterRoundingEnable = 0,
> > @@ -209,13 +339,39 @@ VkResult genX(CreateSampler)(
> > .UAddressMinFilterRoundingEnable = 0,
> > .UAddressMagFilterRoundingEnable = 0,
> > .TrilinearFilterQuality = 0,
> > - .NonnormalizedCoordinateEnable = pCreateInfo->
> > unnormalizedCoordinates,
> > - .TCXAddressControlMode = vk_to_gen_tex_address[
> > pCreateInfo->addressModeU],
> > - .TCYAddressControlMode = vk_to_gen_tex_address[
> > pCreateInfo->addressModeV],
> > - .TCZAddressControlMode = vk_to_gen_tex_address[
> > pCreateInfo->addressModeW],
> > + .NonnormalizedCoordinateEnable = sampler->info.
> > unnormalizedCoordinates,
> > + .TCXAddressControlMode = vk_to_gen_tex_address[sampler-
> > >info.addressModeU],
> > + .TCYAddressControlMode = vk_to_gen_tex_address[sampler-
> > >info.addressModeV],
> > + .TCZAddressControlMode = vk_to_gen_tex_address[sampler-
> > >info.addressModeW],
> > };
> >
> > GENX(SAMPLER_STATE_pack)(NULL, sampler->state, &sampler_state);
> > +}
> > +
> > +
> > +VkResult genX(CreateSampler)(
> > + VkDevice _device,
> > + const VkSamplerCreateInfo* pCreateInfo,
> > + const VkAllocationCallbacks* pAllocator,
> > + VkSampler* pSampler)
> > +{
> > + ANV_FROM_HANDLE(anv_device, device, _device);
> > + struct anv_sampler *sampler;
> > +
> > + assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO);
> > +
> > + sampler = anv_alloc2(&device->alloc, pAllocator, sizeof(*sampler), 8,
> > + VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
> > + if (!sampler)
> > + return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
> > +
> > + sampler->info = *pCreateInfo;
> > +
> > + /* No need to pack the sampler state on HSW, as the packing will depend on
> > + * the format of the associated texture. */
> > +#if ! GEN_IS_HASWELL
> > + genX(pack_sampler_state)(device, sampler, VK_FORMAT_UNDEFINED);
> > +#endif
> >
> > *pSampler = anv_sampler_to_handle(sampler);
> >
> > --
> > 2.9.3
> >