<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Feb 26, 2018 at 6:10 AM, Chema Casanova <span dir="ltr"><<a href="mailto:jmcasanova@igalia.com" target="_blank">jmcasanova@igalia.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">El 23/02/18 a las 22:31, Jason Ekstrand escribió:<br>
<span>> On Fri, Feb 23, 2018 at 12:28 PM, Chema Casanova <<a href="mailto:jmcasanova@igalia.com" target="_blank">jmcasanova@igalia.com</a><br>
</span><span>> <mailto:<a href="mailto:jmcasanova@igalia.com" target="_blank">jmcasanova@igalia.com</a>><wbr>> wrote:<br>
><br>
><br>
><br>
> El 23/02/18 a las 17:26, Jason Ekstrand escribió:<br>
> > On Fri, Feb 23, 2018 at 1:26 AM, Jose Maria Casanova Crespo<br>
> > <<a href="mailto:jmcasanova@igalia.com" target="_blank">jmcasanova@igalia.com</a> <mailto:<a href="mailto:jmcasanova@igalia.com" target="_blank">jmcasanova@igalia.com</a>><br>
</span><div><div class="m_-8339202472629852189m_-2955302428743802290h5">> <mailto:<a href="mailto:jmcasanova@igalia.com" target="_blank">jmcasanova@igalia.com</a> <mailto:<a href="mailto:jmcasanova@igalia.com" target="_blank">jmcasanova@igalia.com</a>><wbr>>> wrote:<br>
> ><br>
> > The surfaces that backup the GPU buffers have a boundary check<br>
> that<br>
> > considers that access to partial dwords are considered<br>
> out-of-bounds.<br>
> > For example is basic 16-bit cases of buffers with size 2 or 6<br>
> where the<br>
> > last two bytes will always be read as 0 or its writting ignored.<br>
> ><br>
> > The introduction of 16-bit types implies that we need to align<br>
> the size<br>
> > to 4-bytes multiples so that partial dwords could be read/written.<br>
> > Adding an inconditional +2 size to buffers not being multiple of 2<br>
> > solves this issue for the general cases of UBO or SSBO.<br>
> ><br>
> > But, when unsized_arrays of 16-bit elements are used it is not<br>
> possible<br>
> > to know if the size was padded or not. To solve this issue the<br>
> > implementation of SSBO calculates the needed size of the surface,<br>
> > as suggested by Jason:<br>
> ><br>
> > surface_size = 2 * aling_u64(buffer_size, 4) - buffer_size<br>
> ><br>
> > So when we calculate backwards the SpvOpArrayLenght with a nir<br>
> expresion<br>
> > when the array stride is not multiple of 4.<br>
> ><br>
> > array_size = (surface_size & ~3) - (surface_size & 3)<br>
> ><br>
> > It is also exposed this buffer requirements when robust buffer<br>
> access<br>
> > is enabled so these buffer sizes recommend being multiple of 4.<br>
> > ---<br>
> ><br>
> > I have some doubts if vtn_variables.c is the best place to include<br>
> > this specific to calculate the real buffer size as this is new<br>
> > calculus seems to be quite HW dependent and maybe other drivers<br>
> > different<br>
> > to ANV don't need this kind of solution.<br>
> ><br>
> > src/compiler/spirv/vtn_varia<wbr>bles.c | 14 ++++++++++++++<br>
> > src/intel/vulkan/anv_descrip<wbr>tor_set.c | 16 ++++++++++++++++<br>
> > src/intel/vulkan/anv_device.<wbr>c | 11 +++++++++++<br>
> > 3 files changed, 41 insertions(+)<br>
> ><br>
> > diff --git a/src/compiler/spirv/vtn_varia<wbr>bles.c<br>
> > b/src/compiler/spirv/vtn_vari<wbr>ables.c<br>
> > index 9eb85c24e9..78adab3ed2 100644<br>
> > --- a/src/compiler/spirv/vtn_varia<wbr>bles.c<br>
> > +++ b/src/compiler/spirv/vtn_varia<wbr>bles.c<br>
> > @@ -2113,6 +2113,20 @@ vtn_handle_variables(struct vtn_builder *b,<br>
> > SpvOp opcode,<br>
> > nir_builder_instr_insert(&b-><wbr>nb, &instr->instr);<br>
> > nir_ssa_def *buf_size = &instr->dest.ssa;<br>
> ><br>
> > + /* Calculate real length if padding was done to align<br>
> the buffer<br>
> > + * to 32-bits. This only could happen is stride is not<br>
> multiple<br>
> > + * of 4. Introduced to support 16-bit type unsized<br>
> arrays in anv.<br>
> > + */<br>
> > + if (stride % 4) {<br>
> > + buf_size = nir_isub(&b->nb,<br>
> > + nir_iand(&b->nb,<br>
> > + buf_size,<br>
> > + nir_imm_int(&b->nb, ~3)),<br>
> > + nir_iand (&b->nb,<br>
> > + buf_size,<br>
> > + nir_imm_int(&b->nb, 3)));<br>
> ><br>
> ><br>
> > We can't do this in spirv_to_nir as it's also used by radv and<br>
> they may<br>
> > not have the same issue. Instead, we need to handle it either in<br>
> > anv_nir_apply_pipeline_layout or in the back-end compiler. Doing it<br>
> > here has the advantage that we can only do it in the "stride % 4 != 0"<br>
> > case but I don't think the three instructions are all that big of<br>
> a deal<br>
> > given that we just did a send and are about to do an integer<br>
> divide. My<br>
> > preference would be to put most of it in ISL and the back-end compiler<br>
> > if we can.<br>
><br>
> I've already had my doubts in my commit comment. So I'll implement it<br>
> properly in the backend implementation of nir_intrinsic_get_buffer_size.<br>
> I should have a look to that code before.<br></div></div></blockquote><div><br></div><div>Vulkan does not allow unsized arrays in UBOs. However, if we have to pad it, I'd rather us just do the same thing for both.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="m_-8339202472629852189m_-2955302428743802290h5">
> ><br>
> > + }<br>
> > +<br>
> > /* array_length = max(buffer_size - offset, 0) / stride */<br>
> > nir_ssa_def *array_length =<br>
> > nir_idiv(&b->nb,<br>
> > diff --git a/src/intel/vulkan/anv_descrip<wbr>tor_set.c<br>
> > b/src/intel/vulkan/anv_descri<wbr>ptor_set.c<br>
> > index edb829601e..a97f2f37dc 100644<br>
> > --- a/src/intel/vulkan/anv_descrip<wbr>tor_set.c<br>
> > +++ b/src/intel/vulkan/anv_descrip<wbr>tor_set.c<br>
> > @@ -704,6 +704,22 @@ anv_descriptor_set_write_buffe<wbr>r(struct<br>
> > anv_descriptor_set *set,<br>
> > bview->offset = buffer->offset + offset;<br>
> > bview->range = anv_buffer_get_range(buffer, offset, range);<br>
> ><br>
> > + /* Uniform and Storage buffers need to have surface size<br>
> > + * not less that the aligned 32-bit size of the buffer.<br>
> > + * To calculate the array lenght on unsized arrays<br>
> > + * in StorageBuffer the last 2 bits store the padding size<br>
> > + * added to the surface, so we can calculate latter the<br>
> original<br>
> > + * buffer size to know the number of elements.<br>
> > + *<br>
> > + * surface_size = 2 * aling_u64(buffer_size, 4) -<br>
> buffer_size<br>
> > + *<br>
> > + * array_size = (surface_size & ~3) - (surface_size & 3)<br>
> > + */<br>
> > + if (type == VK_DESCRIPTOR_TYPE_STORAGE_BUF<wbr>FER)<br>
> > + bview->range = 2 * align_u64(bview->range, 4) -<br>
> bview->range;<br>
> > + else if (type == VK_DESCRIPTOR_TYPE_UNIFORM_BUF<wbr>FER)<br>
> > + bview->range = align_u64(bview->range, 4);<br>
><br>
> > We chatted on another e-mail about doing this in ISL instead of<br>
> here but<br>
> > you mentioned issues with failing GL tests. Did you ever get to the<br>
> > bottom of those? <br>
><br>
> Yes I've already implemented that way but it was originally creating 10<br>
> regressions in Jenkins on DEQP in tests like<br>
> dEQP-GLES31.functional.textur<wbr>e.texture_buffer.modify.buffer<wbr>data.buffer_size_131071<br>
><br>
><br>
> Although it was not possible to reproduce it locally (it was a kind of<br>
> poltergeist)<br>
><br>
><br>
> I hate it when that happens.<br>
> <br>
><br>
> it was consistent as I was messing with the texture size of<br>
> the buffer.<br>
><br>
> So the solution was add extra restrictions to the size alignment padding<br>
> to detect UniformBuffers only when next 3 conditions matched:<br>
> - isl_format was ISL_FORMAT_R32G32B32A32_FLOAT<br>
> - stride = 1<br>
> - size % 4 != 0<br>
><br>
><br>
> I think that mechanism is more-or-less fine. You could also use a<br>
> condition like this:<br>
><br>
> if (format == ISL_FORMAT_RAW ||<br>
> stride < isl_format_get_layout(format)-<wbr>>bpb / 8) {<br>
> assert(stride == 1);<br>
> /* Do the math */<br>
> }<br>
<br>
> That would enable it only for UBOs SSBOs and storage texel buffers where<br>
> we don't have a matching typed format. Also, it would prevent us from<br>
> changing num_elements in any case other than stride == 1.<br>
<br>
<br>
</div></div>I'm sending an v2 now but I had another doubt with your code proposal. I<br>
initially differentiated the UBO and SSBO case so that for SSBO we do<br>
the math to solve the issue with unsized arrays, but for UBO i was just<br>
aligning the the size to 32-bits, just to avoid getting to the next dword.<br>
<br>
Would it be better to have do generally the SSBO math?<br>
<br>
That way the buffer_size intrinsic will be always return the right value.<br>
<span><br>
<br>
> But when working testing this solution I realized that for Storage<br>
> Buffers there was a another entry point where I couldn't differentiate<br>
> from an Storage Buffer at anv_image.c for buffer usage of<br>
> VK_BUFFER_USAGE_STORAGE_TEXEL<wbr>_BUFFER_BIT. at is uses stride = 1 and<br>
> ISL_RAW format.<br>
><br>
> So in order to avoid messing with the sizes in that case I though it was<br>
> better to move the padding decision to the place I could differentiate<br>
> it correctly. Maybe it was an unfounded doubt and it wouldn't mind to<br>
> modify the size in that VK_BUFFER_USAGE_STORAGE_TEXEL_<wbr>BUFFER_BIT case.<br>
><br>
> If so, what was the conclusion?<br>
><br>
> So after re-thinking aloud, and if my doubts with<br>
> VK_BUFFER_USAGE_STORAGE_TEXEL<wbr>_BUFFER_BIT is unfounded I think I'll<br>
> recover ISL approach patch, as this would be needed any case for mediump<br>
> support.<br>
><br>
><br>
> I wouldn't worry about those. The only time we will use SURFTYPE_RAW<br>
> for storage texel buffers is when we don't have a matching typed format<br>
> which only happens for formats which are at least 32bit and so the size<br>
> will be a multiple of 4 anyway.<br>
<br>
</span>Something less to worry about :)<br>
<br>
Chema<br>
<div><div class="m_-8339202472629852189m_-2955302428743802290h5"><br>
> <br>
><br>
> > +<br>
> > /* If we're writing descriptors through a push command, we<br>
> > need to<br>
> > * allocate the surface state from the command buffer.<br>
> > Otherwise it will<br>
> > * be allocated by the descriptor pool when calling<br>
> > diff --git a/src/intel/vulkan/anv_device.<wbr>c<br>
> > b/src/intel/vulkan/anv_device<wbr>.c<br>
> > index a83b7a39f6..cedeed5621 100644<br>
> > --- a/src/intel/vulkan/anv_device.<wbr>c<br>
> > +++ b/src/intel/vulkan/anv_device.<wbr>c<br>
> > @@ -2103,6 +2103,17 @@ void anv_GetBufferMemoryRequirement<wbr>s(<br>
> ><br>
> > pMemoryRequirements->size = buffer->size;<br>
> > pMemoryRequirements->alignment = alignment;<br>
> > +<br>
> > + /* Storage and Uniform buffers should have their size<br>
> aligned to<br>
> > + * 32-bits to avoid boundary checks when last DWord is not<br>
> complete.<br>
> > + * This would ensure that not internal padding would be<br>
> needed for<br>
> > + * 16-bit types.<br>
> > + */<br>
> > + if (device->robust_buffer_access &&<br>
> > + (buffer->usage & VK_BUFFER_USAGE_UNIFORM_BUFFER<wbr>_BIT ||<br>
> > + buffer->usage & VK_BUFFER_USAGE_STORAGE_BUFFER<wbr>_BIT))<br>
> > + pMemoryRequirements->size = align_u64(buffer->size, 4);<br>
> > +<br>
> > pMemoryRequirements->memoryTyp<wbr>eBits = memory_types;<br>
> > }<br>
> ><br>
> > --<br>
> > 2.14.3<br>
> ><br>
> > _____________________________<wbr>__________________<br>
> > mesa-dev mailing list<br>
> > <a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freedesktop.or<wbr>g</a><br>
> <mailto:<a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freede<wbr>sktop.org</a>><br>
</div></div>> <mailto:<a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freede<wbr>sktop.org</a><br>
<span>> <mailto:<a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freede<wbr>sktop.org</a>>><br>
> > <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org<wbr>/mailman/listinfo/mesa-dev</a><br>
> <<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.or<wbr>g/mailman/listinfo/mesa-dev</a>><br>
</span>> > <<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.or<wbr>g/mailman/listinfo/mesa-dev</a><br>
> <<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.or<wbr>g/mailman/listinfo/mesa-dev</a>>><br>
<span>> ><br>
> ><br>
><br>
><br>
><br>
><br>
> ______________________________<wbr>_________________<br>
> mesa-dev mailing list<br>
> <a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freedesktop.org</a><br>
</span>> <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/<wbr>mailman/listinfo/mesa-dev</a><br>
><br>
</blockquote></div><br></div></div>