[Mesa-dev] [PATCH] radeonsi: allow unaligned vertex buffer offsets and strides on CIK-VI
Marek Olšák
maraeo at gmail.com
Tue Feb 14 10:58:13 UTC 2017
On Feb 14, 2017 9:06 AM, "Nicolai Hähnle" <nhaehnle at gmail.com> wrote:
On 13.02.2017 18:01, Marek Olšák wrote:
> From: Marek Olšák <marek.olsak at amd.com>
>
> So that we can disable u_vbuf for GL core profiles.
>
> This is a v2 of the previous VI-only patch.
> It requires SH_MEM_CONFIG.ALIGNMENT_MODE = UNALIGNED on CIK-VI.
> ---
> src/gallium/drivers/radeonsi/si_descriptors.c | 1 -
> src/gallium/drivers/radeonsi/si_pipe.c | 12 +++++++++---
> 2 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c
> b/src/gallium/drivers/radeonsi/si_descriptors.c
> index 3c98176..8abbf10 100644
> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
> @@ -1014,21 +1014,20 @@ bool si_upload_vertex_buffer_descriptors(struct
> si_context *sctx)
> * up so that the hardware sees four components as
> * being inside the buffer if and only if the first
> * three components are in the buffer.
> *
> * Since the offset and stride are guaranteed to be
> * 4-byte aligned, this alignment will never cross
> the
> * winsys buffer boundary.
> */
> size3 = (fix_size3 >> (2 * i)) & 3;
> if (vb->stride && size3) {
> - assert(offset % 4 == 0 && vb->stride % 4
> == 0);
> assert(size3 <= 2);
> desc[2] = align(desc[2], size3 * 2);
>
I think there's still a bug with this. Consider the following setup:
3-element GL_UNSIGNED_BYTE vertex attribute, stride = 3, pointing at an
offset e.g. 12 bytes before the end of a buffer for a total of 4 vertices.
We use unaligned 8_8_8_8 to read this vertex attribute, but when the last
vertex is read, its last byte (of the extended 8_8_8_8) is out-of-bounds of
the buffer. The hardware does bounds checks all-or-nothing (i.e., not
per-channel) and drops the entire load as if the vertex data were
out-of-bounds.
This is admittedly a bit of a ridiculous case that should not prevent us
from dropping u_vbuf. It'll probably need a shader work-around of loading
each channel individually, like with your recent GL_DOUBLE patch. At least
I can't think of anything else right now.
In any case, the comment above that code snippet is no longer really
correct.
We can also say that such cases don't exist in the real world and call it a
day. Or we can increase the buffer size and potentially risk a VM fault.
The small problem with fix_fetch is that we have only one unused value left
in the 4 bits, but we need one fix for 888 and another for 161616.
Marek
Cheers,
Nicolai
}
> }
>
> desc[3] = velems->rsrc_word3[i];
>
> if (first_vb_use_mask & (1 << i)) {
> radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
> (struct
> r600_resource*)vb->buffer,
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index 8806027..ec324b8 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -353,23 +353,20 @@ static int si_get_param(struct pipe_screen* pscreen,
> enum pipe_cap param)
> case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
> case PIPE_CAP_SM3:
> case PIPE_CAP_SEAMLESS_CUBE_MAP:
> case PIPE_CAP_PRIMITIVE_RESTART:
> case PIPE_CAP_CONDITIONAL_RENDER:
> case PIPE_CAP_TEXTURE_BARRIER:
> case PIPE_CAP_INDEP_BLEND_ENABLE:
> case PIPE_CAP_INDEP_BLEND_FUNC:
> case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
> case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
> - case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
> - case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
> - case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
> case PIPE_CAP_USER_INDEX_BUFFERS:
> case PIPE_CAP_USER_CONSTANT_BUFFERS:
> case PIPE_CAP_START_INSTANCE:
> case PIPE_CAP_NPOT_TEXTURES:
> case PIPE_CAP_MIXED_FRAMEBUFFER_SIZES:
> case PIPE_CAP_MIXED_COLOR_DEPTH_BITS:
> case PIPE_CAP_VERTEX_COLOR_CLAMPED:
> case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
> case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
> case PIPE_CAP_TGSI_INSTANCEID:
> @@ -455,20 +452,29 @@ static int si_get_param(struct pipe_screen* pscreen,
> enum pipe_cap param)
>
> case PIPE_CAP_GLSL_FEATURE_LEVEL:
> if (si_have_tgsi_compute(sscreen))
> return 450;
> return HAVE_LLVM >= 0x0309 ? 420 :
> HAVE_LLVM >= 0x0307 ? 410 : 330;
>
> case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
> return MIN2(sscreen->b.info.max_alloc_size, INT_MAX);
>
> + case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
> + case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
> + case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
> + /* SI doesn't support unaligned loads.
> + * CIK needs DRM 2.50.0 on radeon. */
> + return sscreen->b.chip_class == SI ||
> + (sscreen->b.info.drm_major == 2 &&
> + sscreen->b.info.drm_minor < 50);
> +
> /* Unsupported features. */
> case PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY:
> case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
> case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
> case PIPE_CAP_USER_VERTEX_BUFFERS:
> case PIPE_CAP_FAKE_SW_MSAA:
> case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
> case PIPE_CAP_VERTEXID_NOBASE:
> case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
> case PIPE_CAP_TGSI_VOTE:
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170214/1720e710/attachment.html>
More information about the mesa-dev
mailing list