[Mesa-dev] [PATCH] gallium: reflect likely outcome in execution flow
Timothy Arceri
t_arceri at yahoo.com.au
Wed May 21 23:56:01 PDT 2014
On Wed, 2014-05-21 at 18:05 -0400, Ilia Mirkin wrote:
> On Wed, May 21, 2014 at 6:02 PM, Timothy Arceri <t_arceri at yahoo.com.au> wrote:
> > Unless we run out of memory the old if statement would always fail so reflect the more likely outcome. Should be be faster most of the time and slightly cleaner looking code.
> >
> > Signed-off-by: Timothy Arceri <t_arceri at yahoo.com.au>
> > ---
> > src/gallium/auxiliary/util/u_upload_mgr.c | 7 +++----
> > 1 file changed, 3 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/gallium/auxiliary/util/u_upload_mgr.c b/src/gallium/auxiliary/util/u_upload_mgr.c
> > index 744ea2e..99f9a08 100644
> > --- a/src/gallium/auxiliary/util/u_upload_mgr.c
> > +++ b/src/gallium/auxiliary/util/u_upload_mgr.c
> > @@ -247,11 +247,10 @@ enum pipe_error u_upload_data( struct u_upload_mgr *upload,
> > enum pipe_error ret = u_upload_alloc(upload, min_out_offset, size,
> > out_offset, outbuf,
> > (void**)&ptr);
> > - if (ret != PIPE_OK)
> > - return ret;
> > + if (ret == PIPE_OK)
> > + memcpy(ptr, data, size);
> >
> > - memcpy(ptr, data, size);
> > - return PIPE_OK;
> > + return ret;
> > }
>
> Have you actually compared the generated code? I find it can be fairly
> instructive to do so... btw, there's a likely() (and unlikely() )
> primitive (in p_compiler.h) which will indicate to gcc whether a
> particular condition is likely or unlikely, so that it can adjust its
> code generation accordingly.
>
> -ilia
Hi Ilia,
I've taken a look at the results (see below) and read up on
likely/unlikely (I was aware of these but wasn't confident in when they
should be used).
The result is my patch avoids a jump which I assume should help with
branch prediction in the cpu? Something like this is what I was
expecting with the change.
The interesting thing is that in this case using likely seems to have
added an extra 'test' instruction for no added value (please correct me
if I'm wrong). However there is probably other places in
u_upload_alloc(), etc where likely might be more helpful.
Tim
Common code:
444fce: 55 push %rbp
444fcf: 48 89 e5 mov %rsp,%rbp
444fd2: 48 83 ec 40 sub $0x40,%rsp
444fd6: 48 89 7d e8 mov %rdi,-0x18(%rbp)
444fda: 89 75 e4 mov %esi,-0x1c(%rbp)
444fdd: 89 55 e0 mov %edx,-0x20(%rbp)
444fe0: 48 89 4d d8 mov %rcx,-0x28(%rbp)
444fe4: 4c 89 45 d0 mov %r8,-0x30(%rbp)
444fe8: 4c 89 4d c8 mov %r9,-0x38(%rbp)
444fec: 4c 8d 45 f0 lea -0x10(%rbp),%r8
444ff0: 48 8b 7d c8 mov -0x38(%rbp),%rdi
444ff4: 48 8b 4d d0 mov -0x30(%rbp),%rcx
444ff8: 8b 55 e0 mov -0x20(%rbp),%edx
444ffb: 8b 75 e4 mov -0x1c(%rbp),%esi
444ffe: 48 8b 45 e8 mov -0x18(%rbp),%rax
445002: 4d 89 c1 mov %r8,%r9
445005: 49 89 f8 mov %rdi,%r8
445008: 48 89 c7 mov %rax,%rdi
44500b: e8 78 fd ff ff callq 444d88 <u_upload_alloc>
445010: 89 45 fc mov %eax,-0x4(%rbp)
445013: 83 7d fc 00 cmpl $0x0,-0x4(%rbp)
Pre change:
0000000000444fce <u_upload_data>:
445017: 74 05 je 44501e <u_upload_data+0x50>
445019: 8b 45 fc mov -0x4(%rbp),%eax
44501c: eb 1b jmp 445039 <u_upload_data+0x6b>
44501e: 8b 55 e0 mov -0x20(%rbp),%edx
445021: 48 8b 45 f0 mov -0x10(%rbp),%rax
445025: 48 8b 4d d8 mov -0x28(%rbp),%rcx
445029: 48 89 ce mov %rcx,%rsi
44502c: 48 89 c7 mov %rax,%rdi
44502f: e8 2c 40 c0 ff callq 49060 <memcpy at plt>
445034: b8 00 00 00 00 mov $0x0,%eax
445039: c9 leaveq
44503a: c3 retq
Post change:
0000000000444fce <u_upload_data>:
445017: 75 16 jne 44502f <u_upload_data+0x61>
445019: 8b 55 e0 mov -0x20(%rbp),%edx
44501c: 48 8b 45 f0 mov -0x10(%rbp),%rax
445020: 48 8b 4d d8 mov -0x28(%rbp),%rcx
445024: 48 89 ce mov %rcx,%rsi
445027: 48 89 c7 mov %rax,%rdi
44502a: e8 31 40 c0 ff callq 49060 <memcpy at plt>
44502f: 8b 45 fc mov -0x4(%rbp),%eax
445032: c9 leaveq
445033: c3 retq
With likely:
0000000000444fce <u_upload_data>:
44501d: 48 85 c0 test %rax,%rax
445020: 74 16 je 445038 <u_upload_data+0x6a>
445022: 8b 55 e0 mov -0x20(%rbp),%edx
445025: 48 8b 45 f0 mov -0x10(%rbp),%rax
445029: 48 8b 4d d8 mov -0x28(%rbp),%rcx
44502d: 48 89 ce mov %rcx,%rsi
445030: 48 89 c7 mov %rax,%rdi
445033: e8 28 40 c0 ff callq 49060 <memcpy at plt>
445038: 8b 45 fc mov -0x4(%rbp),%eax
44503b: c9 leaveq
44503c: c3 retq
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140522/330935fb/attachment.html>
More information about the mesa-dev
mailing list