[Mesa-dev] [PATCH 1/7] anv/spirv: SSBO/UBO buffers needs padding size is not multiple of 32-bits
Jason Ekstrand
jason at jlekstrand.net
Fri Feb 23 21:31:57 UTC 2018
On Fri, Feb 23, 2018 at 12:28 PM, Chema Casanova <jmcasanova at igalia.com>
wrote:
>
>
> El 23/02/18 a las 17:26, Jason Ekstrand escribió:
> > On Fri, Feb 23, 2018 at 1:26 AM, Jose Maria Casanova Crespo
> > <jmcasanova at igalia.com <mailto:jmcasanova at igalia.com>> wrote:
> >
> > The surfaces that backup the GPU buffers have a boundary check that
> > considers that access to partial dwords are considered out-of-bounds.
> > For example is basic 16-bit cases of buffers with size 2 or 6 where
> the
> > last two bytes will always be read as 0 or its writting ignored.
> >
> > The introduction of 16-bit types implies that we need to align the
> size
> > to 4-bytes multiples so that partial dwords could be read/written.
> > Adding an inconditional +2 size to buffers not being multiple of 2
> > solves this issue for the general cases of UBO or SSBO.
> >
> > But, when unsized_arrays of 16-bit elements are used it is not
> possible
> > to know if the size was padded or not. To solve this issue the
> > implementation of SSBO calculates the needed size of the surface,
> > as suggested by Jason:
> >
> > surface_size = 2 * aling_u64(buffer_size, 4) - buffer_size
> >
> > So when we calculate backwards the SpvOpArrayLenght with a nir
> expresion
> > when the array stride is not multiple of 4.
> >
> > array_size = (surface_size & ~3) - (surface_size & 3)
> >
> > It is also exposed this buffer requirements when robust buffer access
> > is enabled so these buffer sizes recommend being multiple of 4.
> > ---
> >
> > I have some doubts if vtn_variables.c is the best place to include
> > this specific to calculate the real buffer size as this is new
> > calculus seems to be quite HW dependent and maybe other drivers
> > different
> > to ANV don't need this kind of solution.
> >
> > src/compiler/spirv/vtn_variables.c | 14 ++++++++++++++
> > src/intel/vulkan/anv_descriptor_set.c | 16 ++++++++++++++++
> > src/intel/vulkan/anv_device.c | 11 +++++++++++
> > 3 files changed, 41 insertions(+)
> >
> > diff --git a/src/compiler/spirv/vtn_variables.c
> > b/src/compiler/spirv/vtn_variables.c
> > index 9eb85c24e9..78adab3ed2 100644
> > --- a/src/compiler/spirv/vtn_variables.c
> > +++ b/src/compiler/spirv/vtn_variables.c
> > @@ -2113,6 +2113,20 @@ vtn_handle_variables(struct vtn_builder *b,
> > SpvOp opcode,
> > nir_builder_instr_insert(&b->nb, &instr->instr);
> > nir_ssa_def *buf_size = &instr->dest.ssa;
> >
> > + /* Calculate real length if padding was done to align the
> buffer
> > + * to 32-bits. This only could happen is stride is not
> multiple
> > + * of 4. Introduced to support 16-bit type unsized arrays in
> anv.
> > + */
> > + if (stride % 4) {
> > + buf_size = nir_isub(&b->nb,
> > + nir_iand(&b->nb,
> > + buf_size,
> > + nir_imm_int(&b->nb, ~3)),
> > + nir_iand (&b->nb,
> > + buf_size,
> > + nir_imm_int(&b->nb, 3)));
> >
> >
> > We can't do this in spirv_to_nir as it's also used by radv and they may
> > not have the same issue. Instead, we need to handle it either in
> > anv_nir_apply_pipeline_layout or in the back-end compiler. Doing it
> > here has the advantage that we can only do it in the "stride % 4 != 0"
> > case but I don't think the three instructions are all that big of a deal
> > given that we just did a send and are about to do an integer divide. My
> > preference would be to put most of it in ISL and the back-end compiler
> > if we can.
>
> I've already had my doubts in my commit comment. So I'll implement it
> properly in the backend implementation of nir_intrinsic_get_buffer_size.
> I should have a look to that code before.
>
> >
> > + }
> > +
> > /* array_length = max(buffer_size - offset, 0) / stride */
> > nir_ssa_def *array_length =
> > nir_idiv(&b->nb,
> > diff --git a/src/intel/vulkan/anv_descriptor_set.c
> > b/src/intel/vulkan/anv_descriptor_set.c
> > index edb829601e..a97f2f37dc 100644
> > --- a/src/intel/vulkan/anv_descriptor_set.c
> > +++ b/src/intel/vulkan/anv_descriptor_set.c
> > @@ -704,6 +704,22 @@ anv_descriptor_set_write_buffer(struct
> > anv_descriptor_set *set,
> > bview->offset = buffer->offset + offset;
> > bview->range = anv_buffer_get_range(buffer, offset, range);
> >
> > + /* Uniform and Storage buffers need to have surface size
> > + * not less that the aligned 32-bit size of the buffer.
> > + * To calculate the array lenght on unsized arrays
> > + * in StorageBuffer the last 2 bits store the padding size
> > + * added to the surface, so we can calculate latter the
> original
> > + * buffer size to know the number of elements.
> > + *
> > + * surface_size = 2 * aling_u64(buffer_size, 4) -
> buffer_size
> > + *
> > + * array_size = (surface_size & ~3) - (surface_size & 3)
> > + */
> > + if (type == VK_DESCRIPTOR_TYPE_STORAGE_BUFFER)
> > + bview->range = 2 * align_u64(bview->range, 4) -
> bview->range;
> > + else if (type == VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER)
> > + bview->range = align_u64(bview->range, 4);
>
> > We chatted on another e-mail about doing this in ISL instead of here but
> > you mentioned issues with failing GL tests. Did you ever get to the
> > bottom of those?
>
> Yes I've already implemented that way but it was originally creating 10
> regressions in Jenkins on DEQP in tests like
> dEQP-GLES31.functional.texture.texture_buffer.modify.
> bufferdata.buffer_size_131071
>
>
> Although it was not possible to reproduce it locally (it was a kind of
> poltergeist)
I hate it when that happens.
> it was consistent as I was messing with the texture size of
> the buffer.
>
> So the solution was add extra restrictions to the size alignment padding
> to detect UniformBuffers only when next 3 conditions matched:
> - isl_format was ISL_FORMAT_R32G32B32A32_FLOAT
> - stride = 1
> - size % 4 != 0
>
I think that mechanism is more-or-less fine. You could also use a
condition like this:
if (format == ISL_FORMAT_RAW ||
stride < isl_format_get_layout(format)->bpb / 8) {
assert(stride == 1);
/* Do the math */
}
That would enable it only for UBOs SSBOs and storage texel buffers where we
don't have a matching typed format. Also, it would prevent us from
changing num_elements in any case other than stride == 1.
> But when working testing this solution I realized that for Storage
> Buffers there was a another entry point where I couldn't differentiate
> from an Storage Buffer at anv_image.c for buffer usage of
> VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT. at is uses stride = 1 and
> ISL_RAW format.
>
> So in order to avoid messing with the sizes in that case I though it was
> better to move the padding decision to the place I could differentiate
> it correctly. Maybe it was an unfounded doubt and it wouldn't mind to
> modify the size in that VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT case.
>
> If so, what was the conclusion?
>
> So after re-thinking aloud, and if my doubts with
> VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT is unfounded I think I'll
> recover ISL approach patch, as this would be needed any case for mediump
> support.
>
I wouldn't worry about those. The only time we will use SURFTYPE_RAW for
storage texel buffers is when we don't have a matching typed format which
only happens for formats which are at least 32bit and so the size will be a
multiple of 4 anyway.
--Jason
> > +
> > /* If we're writing descriptors through a push command, we
> > need to
> > * allocate the surface state from the command buffer.
> > Otherwise it will
> > * be allocated by the descriptor pool when calling
> > diff --git a/src/intel/vulkan/anv_device.c
> > b/src/intel/vulkan/anv_device.c
> > index a83b7a39f6..cedeed5621 100644
> > --- a/src/intel/vulkan/anv_device.c
> > +++ b/src/intel/vulkan/anv_device.c
> > @@ -2103,6 +2103,17 @@ void anv_GetBufferMemoryRequirements(
> >
> > pMemoryRequirements->size = buffer->size;
> > pMemoryRequirements->alignment = alignment;
> > +
> > + /* Storage and Uniform buffers should have their size aligned to
> > + * 32-bits to avoid boundary checks when last DWord is not
> complete.
> > + * This would ensure that not internal padding would be needed
> for
> > + * 16-bit types.
> > + */
> > + if (device->robust_buffer_access &&
> > + (buffer->usage & VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT ||
> > + buffer->usage & VK_BUFFER_USAGE_STORAGE_BUFFER_BIT))
> > + pMemoryRequirements->size = align_u64(buffer->size, 4);
> > +
> > pMemoryRequirements->memoryTypeBits = memory_types;
> > }
> >
> > --
> > 2.14.3
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org <mailto:mesa-dev at lists.
> freedesktop.org>
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > <https://lists.freedesktop.org/mailman/listinfo/mesa-dev>
> >
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20180223/83732b04/attachment-0001.html>
More information about the mesa-dev
mailing list