<div dir="auto"><div><br><div class="gmail_extra"><br><div class="gmail_quote">On Feb 14, 2017 9:06 AM, "Nicolai Hähnle" <<a href="mailto:nhaehnle@gmail.com">nhaehnle@gmail.com</a>> wrote:<br type="attribution"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="elided-text">On 13.02.2017 18:01, Marek Olšák wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
From: Marek Olšák <<a href="mailto:marek.olsak@amd.com" target="_blank">marek.olsak@amd.com</a>><br>
<br>
So that we can disable u_vbuf for GL core profiles.<br>
<br>
This is a v2 of the previous VI-only patch.<br>
It requires SH_MEM_CONFIG.ALIGNMENT_MODE = UNALIGNED on CIK-VI.<br>
---<br>
src/gallium/drivers/radeonsi/<wbr>si_descriptors.c | 1 -<br>
src/gallium/drivers/radeonsi/<wbr>si_pipe.c | 12 +++++++++---<br>
2 files changed, 9 insertions(+), 4 deletions(-)<br>
<br>
diff --git a/src/gallium/drivers/radeonsi<wbr>/si_descriptors.c b/src/gallium/drivers/radeonsi<wbr>/si_descriptors.c<br>
index 3c98176..8abbf10 100644<br>
--- a/src/gallium/drivers/radeonsi<wbr>/si_descriptors.c<br>
+++ b/src/gallium/drivers/radeonsi<wbr>/si_descriptors.c<br>
@@ -1014,21 +1014,20 @@ bool si_upload_vertex_buffer_descri<wbr>ptors(struct si_context *sctx)<br>
* up so that the hardware sees four components as<br>
* being inside the buffer if and only if the first<br>
* three components are in the buffer.<br>
*<br>
* Since the offset and stride are guaranteed to be<br>
* 4-byte aligned, this alignment will never cross the<br>
* winsys buffer boundary.<br>
*/<br>
size3 = (fix_size3 >> (2 * i)) & 3;<br>
if (vb->stride && size3) {<br>
- assert(offset % 4 == 0 && vb->stride % 4 == 0);<br>
assert(size3 <= 2);<br>
desc[2] = align(desc[2], size3 * 2);<br>
</blockquote>
<br></div>
I think there's still a bug with this. Consider the following setup:<br>
<br>
3-element GL_UNSIGNED_BYTE vertex attribute, stride = 3, pointing at an offset e.g. 12 bytes before the end of a buffer for a total of 4 vertices.<br>
<br>
We use unaligned 8_8_8_8 to read this vertex attribute, but when the last vertex is read, its last byte (of the extended 8_8_8_8) is out-of-bounds of the buffer. The hardware does bounds checks all-or-nothing (i.e., not per-channel) and drops the entire load as if the vertex data were out-of-bounds.<br>
<br>
This is admittedly a bit of a ridiculous case that should not prevent us from dropping u_vbuf. It'll probably need a shader work-around of loading each channel individually, like with your recent GL_DOUBLE patch. At least I can't think of anything else right now.<br>
<br>
In any case, the comment above that code snippet is no longer really correct.<br></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">We can also say that such cases don't exist in the real world and call it a day. Or we can increase the buffer size and potentially risk a VM fault.</div><div dir="auto"><br></div><div dir="auto">The small problem with fix_fetch is that we have only one unused value left in the 4 bits, but we need one fix for 888 and another for 161616.</div><div dir="auto"><br></div><div dir="auto">Marek</div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Cheers,<br>
Nicolai<div class="elided-text"><br>
<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
}<br>
}<br>
<br>
desc[3] = velems->rsrc_word3[i];<br>
<br>
if (first_vb_use_mask & (1 << i)) {<br>
radeon_add_to_buffer_list(&sct<wbr>x->b, &sctx->b.gfx,<br>
(struct r600_resource*)vb->buffer,<br>
diff --git a/src/gallium/drivers/radeonsi<wbr>/si_pipe.c b/src/gallium/drivers/radeonsi<wbr>/si_pipe.c<br>
index 8806027..ec324b8 100644<br>
--- a/src/gallium/drivers/radeonsi<wbr>/si_pipe.c<br>
+++ b/src/gallium/drivers/radeonsi<wbr>/si_pipe.c<br>
@@ -353,23 +353,20 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)<br>
case PIPE_CAP_TGSI_FS_COORD_PIXEL_C<wbr>ENTER_INTEGER:<br>
case PIPE_CAP_SM3:<br>
case PIPE_CAP_SEAMLESS_CUBE_MAP:<br>
case PIPE_CAP_PRIMITIVE_RESTART:<br>
case PIPE_CAP_CONDITIONAL_RENDER:<br>
case PIPE_CAP_TEXTURE_BARRIER:<br>
case PIPE_CAP_INDEP_BLEND_ENABLE:<br>
case PIPE_CAP_INDEP_BLEND_FUNC:<br>
case PIPE_CAP_SEAMLESS_CUBE_MAP_PER<wbr>_TEXTURE:<br>
case PIPE_CAP_VERTEX_COLOR_UNCLAMPE<wbr>D:<br>
- case PIPE_CAP_VERTEX_BUFFER_OFFSET_<wbr>4BYTE_ALIGNED_ONLY:<br>
- case PIPE_CAP_VERTEX_BUFFER_STRIDE_<wbr>4BYTE_ALIGNED_ONLY:<br>
- case PIPE_CAP_VERTEX_ELEMENT_SRC_OF<wbr>FSET_4BYTE_ALIGNED_ONLY:<br>
case PIPE_CAP_USER_INDEX_BUFFERS:<br>
case PIPE_CAP_USER_CONSTANT_BUFFERS<wbr>:<br>
case PIPE_CAP_START_INSTANCE:<br>
case PIPE_CAP_NPOT_TEXTURES:<br>
case PIPE_CAP_MIXED_FRAMEBUFFER_SIZ<wbr>ES:<br>
case PIPE_CAP_MIXED_COLOR_DEPTH_BIT<wbr>S:<br>
case PIPE_CAP_VERTEX_COLOR_CLAMPED:<br>
case PIPE_CAP_FRAGMENT_COLOR_CLAMPE<wbr>D:<br>
case PIPE_CAP_PREFER_BLIT_BASED_TEX<wbr>TURE_TRANSFER:<br>
case PIPE_CAP_TGSI_INSTANCEID:<br>
@@ -455,20 +452,29 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)<br>
<br>
case PIPE_CAP_GLSL_FEATURE_LEVEL:<br>
if (si_have_tgsi_compute(sscreen)<wbr>)<br>
return 450;<br>
return HAVE_LLVM >= 0x0309 ? 420 :<br>
HAVE_LLVM >= 0x0307 ? 410 : 330;<br>
<br>
case PIPE_CAP_MAX_TEXTURE_BUFFER_SI<wbr>ZE:<br>
return MIN2(sscreen->b.info.max_alloc<wbr>_size, INT_MAX);<br>
<br>
+ case PIPE_CAP_VERTEX_BUFFER_OFFSET_<wbr>4BYTE_ALIGNED_ONLY:<br>
+ case PIPE_CAP_VERTEX_BUFFER_STRIDE_<wbr>4BYTE_ALIGNED_ONLY:<br>
+ case PIPE_CAP_VERTEX_ELEMENT_SRC_OF<wbr>FSET_4BYTE_ALIGNED_ONLY:<br>
+ /* SI doesn't support unaligned loads.<br>
+ * CIK needs DRM 2.50.0 on radeon. */<br>
+ return sscreen->b.chip_class == SI ||<br>
+ (sscreen->b.info.drm_major == 2 &&<br>
+ sscreen->b.info.drm_minor < 50);<br>
+<br>
/* Unsupported features. */<br>
case PIPE_CAP_BUFFER_SAMPLER_VIEW_R<wbr>GBA_ONLY:<br>
case PIPE_CAP_TGSI_FS_COORD_ORIGIN_<wbr>LOWER_LEFT:<br>
case PIPE_CAP_TGSI_CAN_COMPACT_CONS<wbr>TANTS:<br>
case PIPE_CAP_USER_VERTEX_BUFFERS:<br>
case PIPE_CAP_FAKE_SW_MSAA:<br>
case PIPE_CAP_TEXTURE_GATHER_OFFSET<wbr>S:<br>
case PIPE_CAP_VERTEXID_NOBASE:<br>
case PIPE_CAP_PRIMITIVE_RESTART_FOR<wbr>_PATCHES:<br>
case PIPE_CAP_TGSI_VOTE:<br>
<br>
</blockquote>
<br>
</div></blockquote></div><br></div></div></div>