Mesa (staging/21.3): nir: fix lower_memcpy
GitLab Mirror
gitlab-mirror at kemper.freedesktop.org
Sun Feb 20 17:40:41 UTC 2022
Module: Mesa
Branch: staging/21.3
Commit: fa191f93db3a5afd56c3b1dd87488c29456f3122
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=fa191f93db3a5afd56c3b1dd87488c29456f3122
Author: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
Date: Wed Feb 16 23:14:15 2022 +0200
nir: fix lower_memcpy
memcpy is split into chunks of at most vec4 size (16 bytes). The
problem shows up with a 24-byte structure:
struct {
float3 a;
float3 b;
}
If you memcpy that struct, the lowering emits 2 load/store pairs, the
first of size 8 and the next of size 16. But both end up located at
offset 0, so we effectively drop 2 floats.
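
The arithmetic behind this can be checked outside of NIR. The following is
a minimal standalone sketch, not code from the Mesa tree. It assumes that
the element index handed to memcpy_(load|store)_deref_elem_imm is
offset / copy_size, so each chunk lands at byte (offset / copy_size) *
copy_size. The helpers first_bit64() and last_bit64() are hypothetical
stand-ins for ffsll() and util_last_bit64(), and plan_chunks() and
bytes_covered() are names made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CHUNKS 64

/* Hypothetical stand-in for ffsll(): 1-based index of the lowest set bit,
 * or 0 when v == 0. */
static int first_bit64(uint64_t v)
{
   int n = 1;
   if (v == 0)
      return 0;
   while (!(v & 1)) {
      v >>= 1;
      n++;
   }
   return n;
}

/* Hypothetical stand-in for util_last_bit64(): 1-based index of the
 * highest set bit, or 0 when v == 0. */
static int last_bit64(uint64_t v)
{
   int n = 0;
   while (v) {
      v >>= 1;
      n++;
   }
   return n;
}

static int min_int(int a, int b) { return a < b ? a : b; }

struct chunk {
   uint64_t byte_start; /* where the chunk actually lands */
   unsigned size;       /* chunk size in bytes */
};

/* Replays the chunking loop. fixed == 0 models the pre-patch arithmetic,
 * fixed != 0 the patched one. Returns the number of chunks written. */
static int plan_chunks(uint64_t size, int fixed, struct chunk *out)
{
   int n = 0;
   uint64_t offset = 0;
   while (offset < size) {
      assert(n < MAX_CHUNKS);
      /* The pre-patch subtraction wraps around, as it does for uint64_t. */
      uint64_t remaining = fixed ? size - offset : offset - size;
      int bit = fixed ? last_bit64(remaining) : first_bit64(remaining);
      unsigned copy_size = 1u << min_int(bit - 1, 4);
      /* Assumption: the element index is offset / copy_size. */
      uint64_t index = offset / copy_size;
      out[n].byte_start = index * copy_size;
      out[n].size = copy_size;
      n++;
      offset += copy_size;
   }
   return n;
}

/* Counts the distinct bytes touched by the planned chunks (size <= 64). */
static unsigned bytes_covered(uint64_t size, int fixed)
{
   struct chunk chunks[MAX_CHUNKS];
   uint64_t mask = 0;
   unsigned count = 0;
   assert(size <= 64);
   int n = plan_chunks(size, fixed, chunks);
   for (int i = 0; i < n; i++)
      mask |= ((UINT64_C(1) << chunks[i].size) - 1) << chunks[i].byte_start;
   for (; mask; mask >>= 1)
      count += (unsigned)(mask & 1);
   return count;
}
```

Under that assumption, bytes_covered(24, 0) reports 16 bytes for the
pre-patch arithmetic (an 8B chunk and a 16B chunk that both start at byte
0), while bytes_covered(24, 1) reports all 24 bytes (a 16B chunk at byte 0
followed by an 8B chunk at byte 16).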
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
Fixes: a3177cca996145 ("nir: Add a lowering pass to lower memcpy")
Reviewed-by: Jason Ekstrand <jason.ekstrand at collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15049>
(cherry picked from commit 768930a73a43e48172df00b6c934de582bd9422b)
---
.pick_status.json | 2 +-
src/compiler/nir/nir_lower_memcpy.c | 11 +++++++----
2 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/.pick_status.json b/.pick_status.json
index df6040d400b..b7ce30669d9 100644
--- a/.pick_status.json
+++ b/.pick_status.json
@@ -787,7 +787,7 @@
"description": "nir: fix lower_memcpy",
"nominated": true,
"nomination_type": 1,
- "resolution": 0,
+ "resolution": 1,
"main_sha": null,
"because_sha": "a3177cca9961452b436b12fd0790c6ffaa8f0eee"
},
diff --git a/src/compiler/nir/nir_lower_memcpy.c b/src/compiler/nir/nir_lower_memcpy.c
index b7a3f1752cb..768537a3478 100644
--- a/src/compiler/nir/nir_lower_memcpy.c
+++ b/src/compiler/nir/nir_lower_memcpy.c
@@ -111,11 +111,14 @@ lower_memcpy_impl(nir_function_impl *impl)
uint64_t size = nir_src_as_uint(cpy->src[2]);
uint64_t offset = 0;
while (offset < size) {
- uint64_t remaining = offset - size;
- /* For our chunk size, we choose the largest power-of-two that
- * divides size with a maximum of 16B (a vec4).
+ uint64_t remaining = size - offset;
+ /* Find the largest chunk size power-of-two (MSB in remaining)
+ * and limit our chunk to 16B (a vec4). It's important to do as
+ * many 16B chunks as possible first so that the index
+ * computation is correct for
+ * memcpy_(load|store)_deref_elem_imm.
*/
- unsigned copy_size = 1u << MIN2(ffsll(remaining) - 1, 4);
+ unsigned copy_size = 1u << MIN2(util_last_bit64(remaining) - 1, 4);
const struct glsl_type *copy_type =
copy_type_for_byte_size(copy_size);