[Mesa-dev] [PATCH v2 1/8] isl/i965/fs: SSBO/UBO buffers need size padding if not multiple of 32-bit
Chema Casanova
jmcasanova at igalia.com
Wed Feb 28 10:24:32 UTC 2018
On 27/02/18 19:53, Jason Ekstrand wrote:
> On Tue, Feb 27, 2018 at 5:27 AM, Jose Maria Casanova Crespo
> <jmcasanova at igalia.com <mailto:jmcasanova at igalia.com>> wrote:
>
> The surfaces that backup the GPU buffers have a boundary check that
> considers that access to partial dwords are considered out-of-bounds.
> For example, buffers with 1,3 16-bit elements has size 2 or 6 and the
> last two bytes would always be read as 0 or its writting ignored.
>
> The introduction of 16-bit types implies that we need to align the size
> to 4-bytew multiples so that partial dwords could be read/written.
> Adding an inconditional +2 size to buffers not being multiple of 2
> solves this issue for the general cases of UBO or SSBO.
>
> But, when unsized arrays of 16-bit elements are used it is not possible
> to know if the size was padded or not. To solve this issue the
> implementation calculates the needed size of the buffer surfaces,
> as suggested by Jason:
>
> surface_size = isl_align(buffer_size, 4) +
> (isl_align(buffer_size, 4) - buffer_size)
>
> So when we calculate backwards the buffer_size in the backend we
> update the resinfo return value with:
>
> buffer_size = (surface_size & ~3) - (surface_size & 3)
>
> It is also exposed this buffer requirements when robust buffer access
> is enabled so these buffer sizes recommend being multiple of 4.
>
> v2: (Jason Ekstrand)
> Move padding logic fron anv to isl_surface_state
> Move calculus of original size from spirv to driver backend
> v3: (Jason Ekstrand)
> Rename some variables and use a similar expresion when calculating
> padding than when obtaining the original buffer size.
> Avoid use of unnecesary component call at brw_fs_nir.
>
> Reviewed-by: Jason Ekstrand <jason at jlekstrand.net
> <mailto:jason at jlekstrand.net>>
> ---
> src/intel/compiler/brw_fs_nir.cpp | 27 ++++++++++++++++++++++++++-
> src/intel/isl/isl_surface_state.c | 22 +++++++++++++++++++++-
> src/intel/vulkan/anv_device.c | 11 +++++++++++
> 3 files changed, 58 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index 8efec34cc9d..4aa411d149f 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -4290,7 +4290,32 @@ fs_visitor::nir_emit_intrinsic(const
> fs_builder &bld, nir_intrinsic_instr *instr
> inst->mlen = 1;
> inst->size_written = 4 * REG_SIZE;
>
> - bld.MOV(retype(dest, ret_payload.type),
> component(ret_payload, 0));
> + /* SKL PRM, vol07, 3D Media GPGPU Engine, Bounds Checking and
> Faulting:
> + *
> + * "Out-of-bounds checking is always performed at a DWord
> granularity. If
> + * any part of the DWord is out-of-bounds then the whole DWord is
> + * considered out-of-bounds."
> + *
> + * This implies that types with size smaller than 4-bytes
> need to be
> + * padded if they don't complete the last dword of the
> buffer. But as we
> + * need to maintain the original size we need to reverse the
> padding
> + * calculation to return the correct size to know the number
> of elements
> + * of an unsized array. As we stored in the last two bits of
> the size of
> + * the buffer the needed padding we calculate here:
> + *
> + * buffer_size = resinfo_size & ~3 - resinfo_size & 3
>
>
> Mind putting both calculations in this comment like you do in the one below?
Locally changed as:
* .... As we stored in the last two bits of the surface
* size the needed padding for the buffer, we calculate here the
* original buffer_size reversing the surface_size calculation:
*
* surface_size = isl_align(buffer_size, 4) +
* (isl_align(buffer_size) - buffer_size)
*
* buffer_size = surface_size & ~3 - surface_size & 3
I used the same names as in isl comment, so surface_size instead of
resinfo_size.
> rb still applies
Thanks.
> + */
> +
> + fs_reg size_aligned4 = ubld.vgrf(BRW_REGISTER_TYPE_UD);
> + fs_reg size_padding = ubld.vgrf(BRW_REGISTER_TYPE_UD);
> + fs_reg buffer_size = ubld.vgrf(BRW_REGISTER_TYPE_UD);
> +
> + ubld.AND(size_padding, ret_payload, brw_imm_ud(3));
> + ubld.AND(size_aligned4, ret_payload, brw_imm_ud(~3));
> + ubld.ADD(buffer_size, size_aligned4, negate(size_padding));
> +
> + bld.MOV(retype(dest, ret_payload.type),
> component(buffer_size, 0));
> +
> brw_mark_surface_used(prog_data, index);
> break;
> }
> diff --git a/src/intel/isl/isl_surface_state.c
> b/src/intel/isl/isl_surface_state.c
> index bfb27fa4a44..c205b3d2c0b 100644
> --- a/src/intel/isl/isl_surface_state.c
> +++ b/src/intel/isl/isl_surface_state.c
> @@ -673,7 +673,27 @@ void
> isl_genX(buffer_fill_state_s)(void *state,
> const struct
> isl_buffer_fill_state_info *restrict info)
> {
> - uint32_t num_elements = info->size / info->stride;
> + uint64_t buffer_size = info->size;
> +
> + /* Uniform and Storage buffers need to have surface size not
> less that the
> + * aligned 32-bit size of the buffer. To calculate the array
> lenght on
> + * unsized arrays in StorageBuffer the last 2 bits store the
> padding size
> + * added to the surface, so we can calculate latter the original
> buffer
> + * size to know the number of elements.
> + *
> + * surface_size = isl_align(buffer_size, 4) +
> + * (isl_align(buffer_size) - buffer_size)
> + *
> + * buffer_size = (surface_size & ~3) - (surface_size & 3)
> + */
> + if (info->format == ISL_FORMAT_RAW ||
> + info->stride < isl_format_get_layout(info->format)->bpb / 8) {
> + assert(info->stride == 1);
> + uint64_t aligned_size = isl_align(buffer_size, 4);
> + buffer_size = aligned_size + (aligned_size - buffer_size);
> + }
> +
> + uint32_t num_elements = buffer_size / info->stride;
>
> if (GEN_GEN >= 7) {
> /* From the IVB PRM, SURFACE_STATE::Height,
> diff --git a/src/intel/vulkan/anv_device.c
> b/src/intel/vulkan/anv_device.c
> index a83b7a39f6a..cedeed56219 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -2103,6 +2103,17 @@ void anv_GetBufferMemoryRequirements(
>
> pMemoryRequirements->size = buffer->size;
> pMemoryRequirements->alignment = alignment;
> +
> + /* Storage and Uniform buffers should have their size aligned to
> + * 32-bits to avoid boundary checks when last DWord is not complete.
> + * This would ensure that not internal padding would be needed for
> + * 16-bit types.
> + */
> + if (device->robust_buffer_access &&
> + (buffer->usage & VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT ||
> + buffer->usage & VK_BUFFER_USAGE_STORAGE_BUFFER_BIT))
> + pMemoryRequirements->size = align_u64(buffer->size, 4);
> +
> pMemoryRequirements->memoryTypeBits = memory_types;
> }
>
> --
> 2.14.3
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org <mailto:mesa-dev at lists.freedesktop.org>
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> <https://lists.freedesktop.org/mailman/listinfo/mesa-dev>
>
>
More information about the mesa-dev
mailing list