[Mesa-dev] [PATCH v3 2/6] anv: Add a helper for doing mass allocations

Chris Wilson chris at chris-wilson.co.uk
Fri Apr 7 23:57:02 UTC 2017


On Fri, Apr 07, 2017 at 04:30:49PM -0700, Jason Ekstrand wrote:
>    On Fri, Apr 7, 2017 at 3:19 PM, Chris Wilson <chris at chris-wilson.co.uk>
>    wrote:
> 
>      On Fri, Apr 07, 2017 at 02:41:13PM -0700, Jason Ekstrand wrote:
>      >    On Fri, Apr 7, 2017 at 1:26 PM, Chris Wilson <chris at chris-wilson.co.uk>
>      >    wrote:
>      >
>      >      On Fri, Apr 07, 2017 at 12:55:53PM -0700, Jason Ekstrand wrote:
>      >      > +#define _ANV_MULTIALLOC_UPDATE_POINTER(_i) \
>      >      > +   if ((_i) < ma->ptr_count) \
>      >      > +      *ma->ptrs[_i] = ptr + (uintptr_t)*ma->ptrs[_i]
>      >      > +   _ANV_MULTIALLOC_UPDATE_POINTER(0);
>      >      > +   _ANV_MULTIALLOC_UPDATE_POINTER(1);
>      >      > +   _ANV_MULTIALLOC_UPDATE_POINTER(2);
>      >      > +   _ANV_MULTIALLOC_UPDATE_POINTER(3);
>      >      > +   _ANV_MULTIALLOC_UPDATE_POINTER(4);
>      >      > +   _ANV_MULTIALLOC_UPDATE_POINTER(5);
>      >      > +   _ANV_MULTIALLOC_UPDATE_POINTER(6);
>      >      > +   _ANV_MULTIALLOC_UPDATE_POINTER(7);
>      >      > +#undef _ANV_MULTIALLOC_UPDATE_POINTER
>      >
>      >      #define _ANV_MULTIALLOC_UPDATE_POINTER(_i) \
>      >         case _i + 1: *ma->ptrs[_i] = ptr + (uintptr_t)*ma->ptrs[_i]
>      >
>      >      switch (ma->ptr_count) {
>      >      _ANV_MULTIALLOC_UPDATE_POINTER(7);
>      >      _ANV_MULTIALLOC_UPDATE_POINTER(6);
>      >      _ANV_MULTIALLOC_UPDATE_POINTER(5);
>      >      _ANV_MULTIALLOC_UPDATE_POINTER(4);
>      >      _ANV_MULTIALLOC_UPDATE_POINTER(3);
>      >      _ANV_MULTIALLOC_UPDATE_POINTER(2);
>      >      _ANV_MULTIALLOC_UPDATE_POINTER(1);
>      >      _ANV_MULTIALLOC_UPDATE_POINTER(0);
>      >      }
>      >
>      >      #undef _ANV_MULTIALLOC_UPDATE_POINTER
>      >
>      >    If ma->ptr_count is constant, they generate exactly the same code. If it
>      >    isn't (i.e. if one of the multialloc_adds is predicated), then they still
>      >    generate basically the same code with the code for the if version being
>      >    slightly more straightforward.
> 
>      Took a look at this with https://godbolt.org/g/UwrMk1
> 
>    Weird... That's not at all what I'm seeing with my demo file.  In fact,
>    when I try to compile your demo file with GCC on my local machine, it
>    reduces the entire thing down to less than a dozen instructions.

Yes, if I force inline the add, gcc and clang both realise that the
function doesn't use any of the values and discard everything. In the
end, gcc actually generates very smart code.

consume_pointer:
        movl    $0, (%rdi)
        ret
main:
        subq    $8, %rsp
        movl    $200, %edi
        call    malloc
        testq   %rax, %rax
        je      .L5
        movq    %rax, %rdi
        call    consume_pointer
        leaq    4(%rax), %rdi
        call    consume_pointer
        leaq    72(%rax), %rdi
        call    consume_pointer
        xorl    %eax, %eax
.L3:
        addq    $8, %rsp
        ret
.L5:
        orl     $-1, %eax
        jmp     .L3

It's generated a single allocation, and yet still passed around the
various offsets within that block without having to store the offsets.
anv_multialloc_add() definitely needs __attribute__((always_inline)).
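
For illustration, a minimal sketch of the pattern under discussion. The demo
file behind the godbolt link is not reproduced in this thread, so everything
here is an assumption: the sketch_* names are hypothetical stand-ins for the
anv_multialloc helpers, a plain loop replaces the unrolled update macro for
brevity, and the sizes are picked only so that the offsets come out as 0, 4
and 72 within a 200 byte block, like the listing above. The always_inline
attribute is the GCC/Clang spelling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_PTRS 8

/* Hypothetical stand-in for the multialloc state; not the real anv struct. */
struct sketch_multialloc {
   size_t size;                  /* running total in bytes */
   unsigned ptr_count;           /* number of recorded sub-allocations */
   void **ptrs[SKETCH_MAX_PTRS]; /* where to write each final pointer */
};

/* Record one sub-allocation. Force-inlined so that the compiler can see
 * ptr_count and every offset as a constant at the call site.
 * align must be a power of two. */
static inline __attribute__((always_inline)) void
sketch_multialloc_add(struct sketch_multialloc *ma, void **ptr,
                      size_t size, size_t align)
{
   assert(ma->ptr_count < SKETCH_MAX_PTRS);
   size_t offset = (ma->size + align - 1) & ~(align - 1);
   ma->size = offset + size;
   /* Stash the offset in the caller's pointer until the block exists. */
   *ptr = (void *)(uintptr_t)offset;
   ma->ptrs[ma->ptr_count++] = ptr;
}

/* Do the single allocation and turn every stashed offset into a pointer. */
static inline __attribute__((always_inline)) void *
sketch_multialloc_alloc(struct sketch_multialloc *ma)
{
   char *base = malloc(ma->size);
   if (base == NULL)
      return NULL;
   for (unsigned i = 0; i < ma->ptr_count; i++)
      *ma->ptrs[i] = base + (uintptr_t)*ma->ptrs[i];
   return base;
}

/* Returns 0 if the three pointers land at offsets 0, 4 and 72. */
int
sketch_demo(void)
{
   struct sketch_multialloc ma = { 0 };
   void *a, *b, *c;

   sketch_multialloc_add(&ma, &a, 4, 4);
   sketch_multialloc_add(&ma, &b, 68, 4);
   sketch_multialloc_add(&ma, &c, 128, 8);

   char *base = sketch_multialloc_alloc(&ma);
   if (base == NULL)
      return -1;

   int ok = ma.size == 200 &&
            (char *)a == base &&
            (char *)b == base + 4 &&
            (char *)c == base + 72;
   free(base);
   return ok ? 0 : 1;
}
```

With both helpers inlined, an optimising compiler has what it needs to fold
the bookkeeping down to constant offsets from the malloc result, though the
exact output will depend on compiler and flags.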
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

