[Mesa-dev] [PATCH] radeonsi: allow unaligned vertex buffer offsets and strides on CIK-VI

Nicolai Hähnle nhaehnle at gmail.com
Tue Feb 14 08:06:00 UTC 2017


On 13.02.2017 18:01, Marek Olšák wrote:
> From: Marek Olšák <marek.olsak at amd.com>
>
> So that we can disable u_vbuf for GL core profiles.
>
> This is a v2 of the previous VI-only patch.
> It requires SH_MEM_CONFIG.ALIGNMENT_MODE = UNALIGNED on CIK-VI.
> ---
>  src/gallium/drivers/radeonsi/si_descriptors.c |  1 -
>  src/gallium/drivers/radeonsi/si_pipe.c        | 12 +++++++++---
>  2 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
> index 3c98176..8abbf10 100644
> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
> @@ -1014,21 +1014,20 @@ bool si_upload_vertex_buffer_descriptors(struct si_context *sctx)
>  			 * up so that the hardware sees four components as
>  			 * being inside the buffer if and only if the first
>  			 * three components are in the buffer.
>  			 *
>  			 * Since the offset and stride are guaranteed to be
>  			 * 4-byte aligned, this alignment will never cross the
>  			 * winsys buffer boundary.
>  			 */
>  			size3 = (fix_size3 >> (2 * i)) & 3;
>  			if (vb->stride && size3) {
> -				assert(offset % 4 == 0 && vb->stride % 4 == 0);
>  				assert(size3 <= 2);
>  				desc[2] = align(desc[2], size3 * 2);

I think there's still a bug with this. Consider the following setup:

A 3-component GL_UNSIGNED_BYTE vertex attribute with stride = 3, starting 
at an offset exactly 12 bytes before the end of the buffer, so that 4 
vertices fill the buffer up to its last byte.

We use an unaligned 8_8_8_8 fetch to read this vertex attribute, but when 
the last vertex is read, the fourth byte of the extended 8_8_8_8 lies one 
byte past the end of the buffer. The hardware does bounds checks 
all-or-nothing (i.e., not per-channel), so it drops the entire load as if 
all of the vertex data were out-of-bounds.
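
To make the arithmetic concrete, here is a minimal sketch of the scenario. 
It is not driver code: fetch_in_bounds() and count_fetchable() are 
hypothetical helpers, and the byte-range test is a simplified model of the 
all-or-nothing check described above, not the actual hardware logic. The 
64-byte buffer size in the usage note is likewise just an assumed value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an all-or-nothing range check: the fetch passes
 * only if every byte of the element lies inside the buffer. */
static bool fetch_in_bounds(uint32_t buf_size, uint32_t offset,
			    uint32_t stride, uint32_t elem_size,
			    uint32_t index)
{
	assert(elem_size > 0);

	/* 64-bit math so offset + stride * index cannot wrap around. */
	uint64_t first = (uint64_t)offset + (uint64_t)stride * index;

	return first + elem_size <= buf_size;
}

/* Count how many of the first num_vertices fetches pass the check. */
static unsigned count_fetchable(uint32_t buf_size, uint32_t offset,
				uint32_t stride, uint32_t elem_size,
				unsigned num_vertices)
{
	unsigned count = 0;

	for (unsigned i = 0; i < num_vertices; i++) {
		if (fetch_in_bounds(buf_size, offset, stride, elem_size, i))
			count++;
	}
	return count;
}
```

With an assumed 64-byte buffer and offset 52 (12 bytes before the end), 
count_fetchable(64, 52, 3, 4, 4) gives 3: the fourth vertex starts at 
byte 61 and the 4-byte fetch would need byte 64, which is out of bounds. 
With elem_size = 3 all four vertices pass.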

This is admittedly a bit of a ridiculous case that should not prevent us 
from dropping u_vbuf. It'll probably need a shader work-around of 
loading each channel individually, like with your recent GL_DOUBLE 
patch. At least I can't think of anything else right now.
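
As a rough illustration of what "loading each channel individually" 
means, here is a CPU-side sketch. It is not shader code and not taken from 
the GL_DOUBLE patch; load_u8x3_per_channel() is a hypothetical name, and 
filling the fourth component with 1 is an assumption that mirrors the GL 
default for a missing W component.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-channel load: each of the three bytes is bounds-checked
 * on its own, so bytes that are inside the buffer are still returned even
 * when a 4-byte fetch at the same offset would run past the end. */
static void load_u8x3_per_channel(const uint8_t *buf, size_t buf_size,
				  size_t offset, uint8_t out[4])
{
	assert(buf != NULL && out != NULL);

	for (size_t c = 0; c < 3; c++) {
		size_t pos = offset + c;

		/* Out-of-bounds channels read as 0 (assumed behaviour). */
		out[c] = pos < buf_size ? buf[pos] : 0;
	}

	/* The attribute has no fourth component in memory; use the
	 * assumed default of 1 instead of reading past the element. */
	out[3] = 1;
}
```

For a 12-byte buffer holding four 3-byte vertices, a load at offset 9 
returns all three bytes of the last vertex, because no byte beyond 
offset 11 is ever touched.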

In any case, the comment above that code snippet is no longer really 
correct: with this patch, the offset and stride are no longer guaranteed 
to be 4-byte aligned.

Cheers,
Nicolai


>  			}
>  		}
>
>  		desc[3] = velems->rsrc_word3[i];
>
>  		if (first_vb_use_mask & (1 << i)) {
>  			radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
>  					      (struct r600_resource*)vb->buffer,
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
> index 8806027..ec324b8 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -353,23 +353,20 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
>  	case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
>  	case PIPE_CAP_SM3:
>  	case PIPE_CAP_SEAMLESS_CUBE_MAP:
>  	case PIPE_CAP_PRIMITIVE_RESTART:
>  	case PIPE_CAP_CONDITIONAL_RENDER:
>  	case PIPE_CAP_TEXTURE_BARRIER:
>  	case PIPE_CAP_INDEP_BLEND_ENABLE:
>  	case PIPE_CAP_INDEP_BLEND_FUNC:
>  	case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
>  	case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
> -	case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
> -	case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
> -	case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
>  	case PIPE_CAP_USER_INDEX_BUFFERS:
>  	case PIPE_CAP_USER_CONSTANT_BUFFERS:
>  	case PIPE_CAP_START_INSTANCE:
>  	case PIPE_CAP_NPOT_TEXTURES:
>  	case PIPE_CAP_MIXED_FRAMEBUFFER_SIZES:
>  	case PIPE_CAP_MIXED_COLOR_DEPTH_BITS:
>  	case PIPE_CAP_VERTEX_COLOR_CLAMPED:
>  	case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
>          case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
>  	case PIPE_CAP_TGSI_INSTANCEID:
> @@ -455,20 +452,29 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
>
>  	case PIPE_CAP_GLSL_FEATURE_LEVEL:
>  		if (si_have_tgsi_compute(sscreen))
>  			return 450;
>  		return HAVE_LLVM >= 0x0309 ? 420 :
>  		       HAVE_LLVM >= 0x0307 ? 410 : 330;
>
>  	case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
>  		return MIN2(sscreen->b.info.max_alloc_size, INT_MAX);
>
> +	case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
> +	case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
> +	case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
> +		/* SI doesn't support unaligned loads.
> +		 * CIK needs DRM 2.50.0 on radeon. */
> +		return sscreen->b.chip_class == SI ||
> +		       (sscreen->b.info.drm_major == 2 &&
> +			sscreen->b.info.drm_minor < 50);
> +
>  	/* Unsupported features. */
>  	case PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY:
>  	case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
>  	case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
>  	case PIPE_CAP_USER_VERTEX_BUFFERS:
>  	case PIPE_CAP_FAKE_SW_MSAA:
>  	case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
>  	case PIPE_CAP_VERTEXID_NOBASE:
>  	case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
>  	case PIPE_CAP_TGSI_VOTE:
>


