<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Feb 27, 2018 at 5:27 AM, Jose Maria Casanova Crespo <span dir="ltr"><<a href="mailto:jmcasanova@igalia.com" target="_blank">jmcasanova@igalia.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The surfaces that backup the GPU buffers have a boundary check that<br>
considers that access to partial dwords are considered out-of-bounds.<br>
For example, buffers with 1,3 16-bit elements has size 2 or 6 and the<br>
last two bytes would always be read as 0 or its writting ignored.<br>
<br>
The introduction of 16-bit types implies that we need to align the size<br>
to 4-bytew multiples so that partial dwords could be read/written.<br>
Adding an inconditional +2 size to buffers not being multiple of 2<br>
solves this issue for the general cases of UBO or SSBO.<br>
<br>
But, when unsized arrays of 16-bit elements are used it is not possible<br>
to know if the size was padded or not. To solve this issue the<br>
implementation calculates the needed size of the buffer surfaces,<br>
as suggested by Jason:<br>
<br>
surface_size = isl_align(buffer_size, 4) +<br>
(isl_align(buffer_size, 4) - buffer_size)<br>
<br>
So when we calculate backwards the buffer_size in the backend we<br>
update the resinfo return value with:<br>
<br>
buffer_size = (surface_size & ~3) - (surface_size & 3)<br>
<br>
It is also exposed this buffer requirements when robust buffer access<br>
is enabled so these buffer sizes recommend being multiple of 4.<br>
<br>
v2: (Jason Ekstrand)<br>
Move padding logic fron anv to isl_surface_state<br>
Move calculus of original size from spirv to driver backend<br>
v3: (Jason Ekstrand)<br>
Rename some variables and use a similar expresion when calculating<br>
padding than when obtaining the original buffer size.<br>
Avoid use of unnecesary component call at brw_fs_nir.<br>
<br>
Reviewed-by: Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>><br>
---<br>
src/intel/compiler/brw_fs_nir.<wbr>cpp | 27 ++++++++++++++++++++++++++-<br>
src/intel/isl/isl_surface_<wbr>state.c | 22 +++++++++++++++++++++-<br>
src/intel/vulkan/anv_device.c | 11 +++++++++++<br>
3 files changed, 58 insertions(+), 2 deletions(-)<br>
<br>
diff --git a/src/intel/compiler/brw_fs_<wbr>nir.cpp b/src/intel/compiler/brw_fs_<wbr>nir.cpp<br>
index 8efec34cc9d..4aa411d149f 100644<br>
--- a/src/intel/compiler/brw_fs_<wbr>nir.cpp<br>
+++ b/src/intel/compiler/brw_fs_<wbr>nir.cpp<br>
@@ -4290,7 +4290,32 @@ fs_visitor::nir_emit_<wbr>intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr<br>
inst->mlen = 1;<br>
inst->size_written = 4 * REG_SIZE;<br>
<br>
- bld.MOV(retype(dest, ret_payload.type), component(ret_payload, 0));<br>
+ /* SKL PRM, vol07, 3D Media GPGPU Engine, Bounds Checking and Faulting:<br>
+ *<br>
+ * "Out-of-bounds checking is always performed at a DWord granularity. If<br>
+ * any part of the DWord is out-of-bounds then the whole DWord is<br>
+ * considered out-of-bounds."<br>
+ *<br>
+ * This implies that types with size smaller than 4-bytes need to be<br>
+ * padded if they don't complete the last dword of the buffer. But as we<br>
+ * need to maintain the original size we need to reverse the padding<br>
+ * calculation to return the correct size to know the number of elements<br>
+ * of an unsized array. As we stored in the last two bits of the size of<br>
+ * the buffer the needed padding we calculate here:<br>
+ *<br>
+ * buffer_size = resinfo_size & ~3 - resinfo_size & 3<br></blockquote><div><br></div><div>Mind putting both calculations in this comment like you do in the one below?<br><br></div><div>rb still applies<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+ */<br>
+<br>
+ fs_reg size_aligned4 = ubld.vgrf(BRW_REGISTER_TYPE_<wbr>UD);<br>
+ fs_reg size_padding = ubld.vgrf(BRW_REGISTER_TYPE_<wbr>UD);<br>
+ fs_reg buffer_size = ubld.vgrf(BRW_REGISTER_TYPE_<wbr>UD);<br>
+<br>
+ ubld.AND(size_padding, ret_payload, brw_imm_ud(3));<br>
+ ubld.AND(size_aligned4, ret_payload, brw_imm_ud(~3));<br>
+ ubld.ADD(buffer_size, size_aligned4, negate(size_padding));<br>
+<br>
+ bld.MOV(retype(dest, ret_payload.type), component(buffer_size, 0));<br>
+<br>
brw_mark_surface_used(prog_<wbr>data, index);<br>
break;<br>
}<br>
diff --git a/src/intel/isl/isl_surface_<wbr>state.c b/src/intel/isl/isl_surface_<wbr>state.c<br>
index bfb27fa4a44..c205b3d2c0b 100644<br>
--- a/src/intel/isl/isl_surface_<wbr>state.c<br>
+++ b/src/intel/isl/isl_surface_<wbr>state.c<br>
@@ -673,7 +673,27 @@ void<br>
isl_genX(buffer_fill_state_s)(<wbr>void *state,<br>
const struct isl_buffer_fill_state_info *restrict info)<br>
{<br>
- uint32_t num_elements = info->size / info->stride;<br>
+ uint64_t buffer_size = info->size;<br>
+<br>
+ /* Uniform and Storage buffers need to have surface size not less that the<br>
+ * aligned 32-bit size of the buffer. To calculate the array lenght on<br>
+ * unsized arrays in StorageBuffer the last 2 bits store the padding size<br>
+ * added to the surface, so we can calculate latter the original buffer<br>
+ * size to know the number of elements.<br>
+ *<br>
+ * surface_size = isl_align(buffer_size, 4) +<br>
+ * (isl_align(buffer_size) - buffer_size)<br>
+ *<br>
+ * buffer_size = (surface_size & ~3) - (surface_size & 3)<br>
+ */<br>
+ if (info->format == ISL_FORMAT_RAW ||<br>
+ info->stride < isl_format_get_layout(info-><wbr>format)->bpb / 8) {<br>
+ assert(info->stride == 1);<br>
+ uint64_t aligned_size = isl_align(buffer_size, 4);<br>
+ buffer_size = aligned_size + (aligned_size - buffer_size);<br>
+ }<br>
+<br>
+ uint32_t num_elements = buffer_size / info->stride;<br>
<br>
if (GEN_GEN >= 7) {<br>
/* From the IVB PRM, SURFACE_STATE::Height,<br>
diff --git a/src/intel/vulkan/anv_device.<wbr>c b/src/intel/vulkan/anv_device.<wbr>c<br>
index a83b7a39f6a..cedeed56219 100644<br>
--- a/src/intel/vulkan/anv_device.<wbr>c<br>
+++ b/src/intel/vulkan/anv_device.<wbr>c<br>
@@ -2103,6 +2103,17 @@ void anv_<wbr>GetBufferMemoryRequirements(<br>
<br>
pMemoryRequirements->size = buffer->size;<br>
pMemoryRequirements->alignment = alignment;<br>
+<br>
+ /* Storage and Uniform buffers should have their size aligned to<br>
+ * 32-bits to avoid boundary checks when last DWord is not complete.<br>
+ * This would ensure that not internal padding would be needed for<br>
+ * 16-bit types.<br>
+ */<br>
+ if (device->robust_buffer_access &&<br>
+ (buffer->usage & VK_BUFFER_USAGE_UNIFORM_<wbr>BUFFER_BIT ||<br>
+ buffer->usage & VK_BUFFER_USAGE_STORAGE_<wbr>BUFFER_BIT))<br>
+ pMemoryRequirements->size = align_u64(buffer->size, 4);<br>
+<br>
pMemoryRequirements-><wbr>memoryTypeBits = memory_types;<br>
}<br>
<span class="HOEnZb"><font color="#888888"><br>
--<br>
2.14.3<br>
<br>
______________________________<wbr>_________________<br>
mesa-dev mailing list<br>
<a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a><br>
<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/<wbr>mailman/listinfo/mesa-dev</a><br>
</font></span></blockquote></div><br></div></div>