[Mesa-dev] [PATCH] radeonsi: allow unaligned vertex buffer offsets and strides on CIK-VI
Nicolai Hähnle
nhaehnle at gmail.com
Tue Feb 14 12:48:57 UTC 2017
On 14.02.2017 11:58, Marek Olšák wrote:
>
>
> On Feb 14, 2017 9:06 AM, "Nicolai Hähnle" <nhaehnle at gmail.com
> <mailto:nhaehnle at gmail.com>> wrote:
>
> On 13.02.2017 18:01, Marek Olšák wrote:
>
> From: Marek Olšák <marek.olsak at amd.com <mailto:marek.olsak at amd.com>>
>
> So that we can disable u_vbuf for GL core profiles.
>
> This is a v2 of the previous VI-only patch.
> It requires SH_MEM_CONFIG.ALIGNMENT_MODE = UNALIGNED on CIK-VI.
> ---
> src/gallium/drivers/radeonsi/si_descriptors.c | 1 -
> src/gallium/drivers/radeonsi/si_pipe.c | 12 +++++++++---
> 2 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c
> b/src/gallium/drivers/radeonsi/si_descriptors.c
> index 3c98176..8abbf10 100644
> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
> @@ -1014,21 +1014,20 @@ bool
> si_upload_vertex_buffer_descriptors(struct si_context *sctx)
> * up so that the hardware sees four
> components as
> * being inside the buffer if and only
> if the first
> * three components are in the buffer.
> *
> * Since the offset and stride are
> guaranteed to be
> * 4-byte aligned, this alignment will
> never cross the
> * winsys buffer boundary.
> */
> size3 = (fix_size3 >> (2 * i)) & 3;
> if (vb->stride && size3) {
> - assert(offset % 4 == 0 &&
> vb->stride % 4 == 0);
> assert(size3 <= 2);
> desc[2] = align(desc[2], size3 * 2);
>
>
> I think there's still a bug with this. Consider the following setup:
>
> 3-element GL_UNSIGNED_BYTE vertex attribute, stride = 3, pointing at
> an offset e.g. 12 bytes before the end of a buffer for a total of 4
> vertices.
>
> We use unaligned 8_8_8_8 to read this vertex attribute, but when the
> last vertex is read, its last byte (of the extended 8_8_8_8) is
> out-of-bounds of the buffer. The hardware does bounds checks
> all-or-nothing (i.e., not per-channel) and drops the entire load as
> if the vertex data were out-of-bounds.
>
> This is admittedly a bit of a ridiculous case that should not
> prevent us from dropping u_vbuf. It'll probably need a shader
> work-around of loading each channel individually, like with your
> recent GL_DOUBLE patch. At least I can't think of anything else
> right now.
>
> In any case, the comment above that code snippet is no longer really
> correct.
>
>
> We can also say that such cases don't exist in the real world and call
> it a day.
I think we should strive for better than that.
> Or we can increase the buffer size and potentially risk a VM
> fault.
>
> The small problem with fix_fetch is that we have only one unused value
> left in the 4 bits, but we need one fix for 888 and another for 161616.
I see the problem. To be honest, I'd still prefer a proper fix, even if
it requires an additional bit there, but I guess we can increase the
buffer size and print a warning when it crosses a page boundary.
Nicolai
>
> Marek
>
>
> Cheers,
> Nicolai
>
>
>
> }
> }
>
> desc[3] = velems->rsrc_word3[i];
>
> if (first_vb_use_mask & (1 << i)) {
> radeon_add_to_buffer_list(&sctx->b,
> &sctx->b.gfx,
> (struct
> r600_resource*)vb->buffer,
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index 8806027..ec324b8 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -353,23 +353,20 @@ static int si_get_param(struct
> pipe_screen* pscreen, enum pipe_cap param)
> case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
> case PIPE_CAP_SM3:
> case PIPE_CAP_SEAMLESS_CUBE_MAP:
> case PIPE_CAP_PRIMITIVE_RESTART:
> case PIPE_CAP_CONDITIONAL_RENDER:
> case PIPE_CAP_TEXTURE_BARRIER:
> case PIPE_CAP_INDEP_BLEND_ENABLE:
> case PIPE_CAP_INDEP_BLEND_FUNC:
> case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
> case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
> - case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
> - case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
> - case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
> case PIPE_CAP_USER_INDEX_BUFFERS:
> case PIPE_CAP_USER_CONSTANT_BUFFERS:
> case PIPE_CAP_START_INSTANCE:
> case PIPE_CAP_NPOT_TEXTURES:
> case PIPE_CAP_MIXED_FRAMEBUFFER_SIZES:
> case PIPE_CAP_MIXED_COLOR_DEPTH_BITS:
> case PIPE_CAP_VERTEX_COLOR_CLAMPED:
> case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
> case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
> case PIPE_CAP_TGSI_INSTANCEID:
> @@ -455,20 +452,29 @@ static int si_get_param(struct
> pipe_screen* pscreen, enum pipe_cap param)
>
> case PIPE_CAP_GLSL_FEATURE_LEVEL:
> if (si_have_tgsi_compute(sscreen))
> return 450;
> return HAVE_LLVM >= 0x0309 ? 420 :
> HAVE_LLVM >= 0x0307 ? 410 : 330;
>
> case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
> return MIN2(sscreen->b.info.max_alloc_size,
> INT_MAX);
>
> + case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
> + case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
> + case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
> + /* SI doesn't support unaligned loads.
> + * CIK needs DRM 2.50.0 on radeon. */
> + return sscreen->b.chip_class == SI ||
> + (sscreen->b.info.drm_major == 2 &&
> + sscreen->b.info.drm_minor < 50);
> +
> /* Unsupported features. */
> case PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY:
> case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
> case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
> case PIPE_CAP_USER_VERTEX_BUFFERS:
> case PIPE_CAP_FAKE_SW_MSAA:
> case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
> case PIPE_CAP_VERTEXID_NOBASE:
> case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
> case PIPE_CAP_TGSI_VOTE:
>
>
>
More information about the mesa-dev
mailing list