[Mesa-dev] [PATCH 3/8] radeonsi: set a better NUM_PATCHES hard limit
Marek Olšák
maraeo at gmail.com
Sat Jun 9 03:16:50 UTC 2018
From: Marek Olšák <marek.olsak at amd.com>
AMDVLK limits NUM_PATCHES to 64 with distributed tessellation and to 16
without it. radeonsi will use 63 and 16, because the radeonsi shader
constant is limited to 6 bits.
* This might improve tessellation performance on Hawaii, Bonaire, Tahiti,
  and Pitcairn, which will use 16.
* I'm not sure whether this matters for 1 SE configs.
---
src/gallium/drivers/radeonsi/si_state_draw.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index d61374e95ca..b29135a1e68 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -158,24 +158,31 @@ static bool si_emit_derived_tess_state(struct si_context *sctx,
*/
hardware_lds_size = 32768;
*num_patches = MIN2(*num_patches, hardware_lds_size / (input_patch_size +
output_patch_size));
/* Make sure the output data fits in the offchip buffer */
*num_patches = MIN2(*num_patches,
(sctx->screen->tess_offchip_block_dw_size * 4) /
output_patch_size);
- /* Not necessary for correctness, but improves performance. The
- * specific value is taken from the proprietary driver.
+ /* Not necessary for correctness, but improves performance.
+ * The hardware can do more, but the radeonsi shader constant is
+ * limited to 6 bits.
*/
- *num_patches = MIN2(*num_patches, 40);
+ *num_patches = MIN2(*num_patches, 63); /* triangles: 3 full waves except 3 lanes */
+
+ /* When distributed tessellation is unsupported, switch between SEs
+ * at a higher frequency to compensate for it.
+ */
+ if (!sctx->screen->has_distributed_tess && sctx->screen->info.max_se > 1)
+ *num_patches = MIN2(*num_patches, 16); /* recommended */
/* Make sure that vector lanes are reasonably occupied. It probably
* doesn't matter much because this is LS-HS, and TES is likely to
* occupy significantly more CUs.
*/
unsigned temp_verts_per_tg = *num_patches * max_verts_per_patch;
if (temp_verts_per_tg > 64 && temp_verts_per_tg % 64 < 48)
*num_patches = (temp_verts_per_tg & ~63) / max_verts_per_patch;
if (sctx->chip_class == SI) {
--
2.17.1
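
For readers who want to try the new limits outside the driver, here is a minimal standalone sketch of the clamping sequence from the hunk above. The helper name clamp_num_patches and its plain parameters are hypothetical; in radeonsi the values come from sctx->screen, and the LDS and offchip-buffer limits that precede this code are left out here.

```c
#include <assert.h>
#include <stdbool.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper, not part of radeonsi: applies the same three
 * steps as the patched code to a starting patch count. */
static unsigned clamp_num_patches(unsigned num_patches,
                                  bool has_distributed_tess,
                                  unsigned max_se,
                                  unsigned max_verts_per_patch)
{
    assert(max_verts_per_patch > 0); /* guards the division below */

    /* The shader constant is limited to 6 bits, so 63 is the maximum. */
    num_patches = MIN2(num_patches, 63);

    /* Without distributed tessellation on multi-SE chips, use smaller
     * threadgroups so that work switches between SEs more often. */
    if (!has_distributed_tess && max_se > 1)
        num_patches = MIN2(num_patches, 16);

    /* If the last 64-lane wave would be less than 3/4 full, round the
     * vertex count down to a multiple of 64. */
    unsigned temp_verts_per_tg = num_patches * max_verts_per_patch;
    if (temp_verts_per_tg > 64 && temp_verts_per_tg % 64 < 48)
        num_patches = (temp_verts_per_tg & ~63u) / max_verts_per_patch;

    return num_patches;
}
```

With triangles (3 vertices per patch), a request for 100 patches on a chip with distributed tessellation stays at 63, since 63 * 3 = 189 vertices leaves the third wave only 3 lanes short. The same request on a multi-SE chip without distributed tessellation is clamped to 16.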