[Mesa-dev] [PATCH] main: memcpy larger chunks in _mesa_propagate_uniforms_to_driver_storage
Nils Wallménius
nils.wallmenius at gmail.com
Wed Jul 13 11:53:26 UTC 2016
When the source and destination vector strides match, do the memcpy
on larger blocks instead of one vector at a time. This reduces cycles
spent in _mesa_propagate_uniforms_to_driver_storage from 1.51% to
0.62% according to perf during the Unigine Heaven benchmark. It did
not affect the framerate of the benchmark. The system used for
testing was an i5 6600K with a Radeon R9 380.
Piglit hangs randomly on this system both with and without the patch,
so I could not compare piglit results.
Signed-off-by: Nils Wallménius <nils.wallmenius at gmail.com>
---
src/mesa/main/uniform_query.cpp | 29 +++++++++++++++++++++++------
1 file changed, 23 insertions(+), 6 deletions(-)
diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index ab22a0e..5e82aec 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -578,14 +578,31 @@ _mesa_propagate_uniforms_to_driver_storage(struct gl_uniform_storage *uni,
unsigned j;
unsigned v;
- for (j = 0; j < count; j++) {
- for (v = 0; v < vectors; v++) {
- memcpy(dst, src, src_vector_byte_stride);
- src += src_vector_byte_stride;
- dst += store->vector_stride;
+ if (src_vector_byte_stride == store->vector_stride) {
+ if (extra_stride) {
+ for (j = 0; j < count; j++) {
+ memcpy(dst, src, src_vector_byte_stride * vectors);
+ src += src_vector_byte_stride * vectors;
+ dst += store->vector_stride * vectors;
+
+ dst += extra_stride;
+ }
+ } else {
+ /* Unigine Heaven benchmark gets here */
+ memcpy(dst, src, src_vector_byte_stride * vectors * count);
+ src += src_vector_byte_stride * vectors * count;
+ dst += store->vector_stride * vectors * count;
}
+ } else {
+ for (j = 0; j < count; j++) {
+ for (v = 0; v < vectors; v++) {
+ memcpy(dst, src, src_vector_byte_stride);
+ src += src_vector_byte_stride;
+ dst += store->vector_stride;
+ }
- dst += extra_stride;
+ dst += extra_stride;
+ }
}
break;
}
--
2.8.1
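For readers who want to check the copy logic outside of Mesa, here is a minimal standalone sketch of the three paths in the hunk above. It is an illustration under stated assumptions, not the Mesa code: copy_per_vector and copy_coalesced are hypothetical names, and the plain src_stride, dst_stride and extra_stride parameters stand in for src_vector_byte_stride, store->vector_stride and extra_stride.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reference path: one memcpy per vector, like the loop the patch replaces.
 * All names here are hypothetical stand-ins for the Mesa variables. */
static void copy_per_vector(uint8_t *dst, const uint8_t *src,
                            unsigned count, unsigned vectors,
                            size_t src_stride, size_t dst_stride,
                            size_t extra_stride)
{
   for (unsigned j = 0; j < count; j++) {
      for (unsigned v = 0; v < vectors; v++) {
         memcpy(dst, src, src_stride);
         src += src_stride;
         dst += dst_stride;
      }
      /* Skip the padding between array elements in the destination. */
      dst += extra_stride;
   }
}

/* Coalesced path: copy larger blocks whenever the layouts allow it. */
static void copy_coalesced(uint8_t *dst, const uint8_t *src,
                           unsigned count, unsigned vectors,
                           size_t src_stride, size_t dst_stride,
                           size_t extra_stride)
{
   if (src_stride == dst_stride) {
      if (extra_stride) {
         /* Vectors are contiguous within an element, but elements are
          * padded in the destination: one memcpy per element. */
         for (unsigned j = 0; j < count; j++) {
            memcpy(dst, src, src_stride * vectors);
            src += src_stride * vectors;
            dst += dst_stride * vectors + extra_stride;
         }
      } else {
         /* Fully contiguous on both sides: a single memcpy. */
         memcpy(dst, src, src_stride * vectors * count);
      }
   } else {
      /* Strides differ, so fall back to the per-vector copy. */
      copy_per_vector(dst, src, count, vectors,
                      src_stride, dst_stride, extra_stride);
   }
}
```

Both functions should leave identical bytes in the destination for any combination of strides. The coalesced version only reduces the number of memcpy calls when src_stride equals dst_stride.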