[Mesa-dev] [PATCH] radeonsi: allow unaligned vertex buffer offsets and strides on CIK-VI
Nicolai Hähnle
nhaehnle at gmail.com
Tue Feb 14 08:06:00 UTC 2017
On 13.02.2017 18:01, Marek Olšák wrote:
> From: Marek Olšák <marek.olsak at amd.com>
>
> So that we can disable u_vbuf for GL core profiles.
>
> This is a v2 of the previous VI-only patch.
> It requires SH_MEM_CONFIG.ALIGNMENT_MODE = UNALIGNED on CIK-VI.
> ---
> src/gallium/drivers/radeonsi/si_descriptors.c | 1 -
> src/gallium/drivers/radeonsi/si_pipe.c | 12 +++++++++---
> 2 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
> index 3c98176..8abbf10 100644
> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
> @@ -1014,21 +1014,20 @@ bool si_upload_vertex_buffer_descriptors(struct si_context *sctx)
> * up so that the hardware sees four components as
> * being inside the buffer if and only if the first
> * three components are in the buffer.
> *
> * Since the offset and stride are guaranteed to be
> * 4-byte aligned, this alignment will never cross the
> * winsys buffer boundary.
> */
> size3 = (fix_size3 >> (2 * i)) & 3;
> if (vb->stride && size3) {
> - assert(offset % 4 == 0 && vb->stride % 4 == 0);
> assert(size3 <= 2);
> desc[2] = align(desc[2], size3 * 2);
I think there's still a bug with this. Consider the following setup:
a 3-element GL_UNSIGNED_BYTE vertex attribute with stride = 3, starting
at an offset that is, e.g., 12 bytes before the end of a buffer, for a
total of 4 vertices.
We use unaligned 8_8_8_8 to read this vertex attribute. The last vertex
starts 9 bytes into that 12-byte range, so the fourth byte of the
extended 8_8_8_8 read lies just past the end of the buffer. The hardware
does bounds checks all-or-nothing (i.e., not per-channel) and drops the
entire load as if the whole vertex were out-of-bounds.
This is admittedly a bit of a ridiculous case that should not prevent us
from dropping u_vbuf. It'll probably need a shader work-around of
loading each channel individually, like with your recent GL_DOUBLE
patch. At least I can't think of anything else right now.
In any case, the comment above that code snippet is no longer correct:
it still says the offset and stride are guaranteed to be 4-byte aligned.
Cheers,
Nicolai
> }
> }
>
> desc[3] = velems->rsrc_word3[i];
>
> if (first_vb_use_mask & (1 << i)) {
> radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
> (struct r600_resource*)vb->buffer,
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
> index 8806027..ec324b8 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -353,23 +353,20 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
> case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
> case PIPE_CAP_SM3:
> case PIPE_CAP_SEAMLESS_CUBE_MAP:
> case PIPE_CAP_PRIMITIVE_RESTART:
> case PIPE_CAP_CONDITIONAL_RENDER:
> case PIPE_CAP_TEXTURE_BARRIER:
> case PIPE_CAP_INDEP_BLEND_ENABLE:
> case PIPE_CAP_INDEP_BLEND_FUNC:
> case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
> case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
> - case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
> - case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
> - case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
> case PIPE_CAP_USER_INDEX_BUFFERS:
> case PIPE_CAP_USER_CONSTANT_BUFFERS:
> case PIPE_CAP_START_INSTANCE:
> case PIPE_CAP_NPOT_TEXTURES:
> case PIPE_CAP_MIXED_FRAMEBUFFER_SIZES:
> case PIPE_CAP_MIXED_COLOR_DEPTH_BITS:
> case PIPE_CAP_VERTEX_COLOR_CLAMPED:
> case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
> case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
> case PIPE_CAP_TGSI_INSTANCEID:
> @@ -455,20 +452,29 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
>
> case PIPE_CAP_GLSL_FEATURE_LEVEL:
> if (si_have_tgsi_compute(sscreen))
> return 450;
> return HAVE_LLVM >= 0x0309 ? 420 :
> HAVE_LLVM >= 0x0307 ? 410 : 330;
>
> case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
> return MIN2(sscreen->b.info.max_alloc_size, INT_MAX);
>
> + case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
> + case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
> + case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
> + /* SI doesn't support unaligned loads.
> + * CIK needs DRM 2.50.0 on radeon. */
> + return sscreen->b.chip_class == SI ||
> + (sscreen->b.info.drm_major == 2 &&
> + sscreen->b.info.drm_minor < 50);
> +
> /* Unsupported features. */
> case PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY:
> case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
> case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
> case PIPE_CAP_USER_VERTEX_BUFFERS:
> case PIPE_CAP_FAKE_SW_MSAA:
> case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
> case PIPE_CAP_VERTEXID_NOBASE:
> case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
> case PIPE_CAP_TGSI_VOTE:
>