Mesa (gallium-0.2): tnl: Optimize SSE load[23]f_1 since they don't need the identity swizzle.

Alan Hourihane alanh at kemper.freedesktop.org
Fri Dec 12 23:02:25 UTC 2008


Module: Mesa
Branch: gallium-0.2
Commit: b66495a0d915f5d5cc5ab50c843c9c1b296a5851
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=b66495a0d915f5d5cc5ab50c843c9c1b296a5851

Author: Guillaume Melquiond <guillaume.melquiond at gmail.com>
Date:   Tue Dec  9 13:10:56 2008 -0800

tnl: Optimize SSE load[23]f_1 since they don't need the identity swizzle.

SSE movss from memory zeroes out everything above the destination dword, so
we get the (a, 0) or (a, 0, 0) result that these functions needed.

Bug #16520.
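
The zeroing behaviour the commit message relies on can be checked outside the
code generator. The following is a minimal sketch using the SSE intrinsic
_mm_load_ss, which loads one float from memory into the low lane and zeroes
the other three. The helper name load1_zero_extended is hypothetical and is
not part of Mesa; it only illustrates the instruction semantics, not how
t_vertex_sse.c emits code.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>   /* SSE intrinsics; x86/x86-64 only */

/* Hypothetical helper for illustration, not a Mesa function.
 * Loads one float from memory with a scalar SSE load and writes all four
 * lanes of the resulting register to out[]. */
static void load1_zero_extended(const float *src, float out[4])
{
    assert(src != NULL && out != NULL);

    /* Scalar load from memory: lane 0 = *src, lanes 1..3 = 0.0f. */
    __m128 v = _mm_load_ss(src);

    /* Unaligned store so out[] needs no special alignment. */
    _mm_storeu_ps(out, v);
}
```

Calling this on a source array {1.5, 2.5, 3.5, 4.5} should leave out[] as
{1.5, 0, 0, 0}: only the first element is read and the upper lanes come back
as zero. Note that this holds for the memory-source form only; a
register-to-register movss keeps the upper lanes of the destination, which
is why the comment in the patch says "Loading from memory".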

---

 src/mesa/tnl/t_vertex_sse.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/tnl/t_vertex_sse.c b/src/mesa/tnl/t_vertex_sse.c
index d8021a3..07adc1e 100644
--- a/src/mesa/tnl/t_vertex_sse.c
+++ b/src/mesa/tnl/t_vertex_sse.c
@@ -146,7 +146,8 @@ static void emit_load3f_1( struct x86_program *p,
 			   struct x86_reg dest,
 			   struct x86_reg arg0 )
 {
-   emit_load4f_1(p, dest, arg0);
+   /* Loading from memory erases the upper bits. */
+   sse_movss(&p->func, dest, arg0);
 }
 
 static void emit_load2f_2( struct x86_program *p, 
@@ -160,7 +161,8 @@ static void emit_load2f_1( struct x86_program *p,
 			   struct x86_reg dest,
 			   struct x86_reg arg0 )
 {
-   emit_load4f_1(p, dest, arg0);
+   /* Loading from memory erases the upper bits. */
+   sse_movss(&p->func, dest, arg0);
 }
 
 static void emit_load1f_1( struct x86_program *p, 



