[Mesa-dev] [PATCH 1/1] nir: Use a freelist in nir_opt_dce to avoid spamming ralloc

Dieter Nützel Dieter at nuetzel-hh.de
Wed Mar 14 19:58:35 UTC 2018


Hello Thomas,

is this useful even after '[Mesa-dev] [PATCH 0/2] V2: Use hash table 
cloning in copy propagation' landed?

I'm running both together with Dave's '[Mesa-dev] [PATCH] radv/winsys: 
replace bo list searchs with a hash table.' patch.

Dieter

Am 24.01.2018 08:33, schrieb Thomas Helland:
> 2018-01-21 23:58 GMT+01:00 Eric Anholt <eric at anholt.net>:
>> Thomas Helland <thomashelland90 at gmail.com> writes:
>> 
>>> Also, allocate worklist_elem in groups of 20, to reduce the burden of
>>> allocation. Do not use rzalloc, as there is no need. This lets us drop
>>> the number of calls to ralloc from approximately 10% of all calls to
>>> ralloc (130 000 calls), down to a mere 2000 calls to ralloc_array_size.
>>> This cuts the runtime of shader-db by 1%, while at the same time
>>> reducing the number of stalled cycles, executed cycles, and executed
>>> instructions by about 1% as reported by perf. I did a five-run
>>> benchmark pre and post and got a statistical variance less than 0.1%
>>> pre and post. This was with i965's ir validation polluting the
>>> benchmark, so the numbers are even better in release builds.
>>> 
>>> Performance change as found with perf-diff:
>>> 4.74%     -0.23%  libc-2.26.so            [.] _int_malloc
>>> 1.88%     -0.21%  libc-2.26.so            [.] malloc
>>> 2.27%     +0.16%  libmesa_dri_drivers.so  [.] match_value.part.7
>>> 2.95%     -0.12%  libc-2.26.so            [.] _int_free
>>>           +0.11%  libmesa_dri_drivers.so  [.] worklist_push
>>> 1.22%     -0.08%  libc-2.26.so            [.] malloc_consolidate
>>> 0.16%     -0.06%  libmesa_dri_drivers.so  [.] mark_live_cb
>>> 1.21%     +0.06%  libmesa_dri_drivers.so  [.] match_expression.part.6
>>> 0.75%     -0.05%  libc-2.26.so            [.] cfree@GLIBC_2.2.5
>>> 0.50%     -0.05%  libmesa_dri_drivers.so  [.] ralloc_size
>>> 0.57%     +0.04%  libmesa_dri_drivers.so  [.] nir_replace_instr
>>> 1.29%     -0.04%  libmesa_dri_drivers.so  [.] unsafe_free
>> 
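The freelist idea from the commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: it uses plain malloc() as a stand-in for ralloc, and the names struct elem, struct elem_pool, pool_get(), pool_put() and pool_finish() are hypothetical. Elements are carved out of chunks of 20, recycled through a singly linked freelist, and handed out without zeroing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ELEMS_PER_CHUNK 20   /* group size mentioned in the commit message */

/* Hypothetical element type; the payload stands in for an instruction. */
struct elem {
   struct elem *next;        /* links the element into a list or the freelist */
   void *payload;
};

struct chunk {
   struct chunk *next;       /* chunks are chained so they can be released */
   struct elem elems[ELEMS_PER_CHUNK];
};

struct elem_pool {
   struct chunk *chunks;     /* every chunk allocated so far */
   struct elem *free;        /* elements ready for reuse */
   unsigned num_allocs;      /* number of malloc() calls, for illustration */
};

static void pool_init(struct elem_pool *p)
{
   p->chunks = NULL;
   p->free = NULL;
   p->num_allocs = 0;
}

/* Take an element from the freelist, allocating a new chunk only when the
 * freelist is empty. The payload is deliberately left uninitialized. */
static struct elem *pool_get(struct elem_pool *p)
{
   if (!p->free) {
      struct chunk *c = malloc(sizeof(*c));
      if (!c)
         return NULL;
      c->next = p->chunks;
      p->chunks = c;
      p->num_allocs++;
      for (int i = 0; i < ELEMS_PER_CHUNK; i++) {
         c->elems[i].next = p->free;
         p->free = &c->elems[i];
      }
   }
   struct elem *e = p->free;
   p->free = e->next;
   e->next = NULL;
   return e;
}

/* Return an element to the freelist instead of freeing it. */
static void pool_put(struct elem_pool *p, struct elem *e)
{
   e->next = p->free;
   p->free = e;
}

/* Release all chunks at once when the pass is done. */
static void pool_finish(struct elem_pool *p)
{
   while (p->chunks) {
      struct chunk *c = p->chunks;
      p->chunks = c->next;
      free(c);
   }
   p->free = NULL;
}
```

With this layout, the allocator is called once per 20 elements at most, and not at all while recycled elements are available.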
>> I'm curious, since a NIR instruction worklist seems like a generally
>> useful thing to have:
>> 
>> Could nir_worklist.c keep the implementation of this?
>> 
>> Also, I wonder if it wouldn't be even better to have a u_dynarray of
>> instructions in the worklist, with push/pop on the end of the array, and
>> a struct set tracking the instructions in the array to avoid
>> double-adding.  I actually don't know if that would be better or not, so
>> I'd be happy with the worklist management just moved to nir_worklist.c.
> 
> I'll look into this to see what I can do. nir_worklist.c at this time has
> only a block worklist. This numbers all the blocks, uses a bitset for
> checking if the item is present, and uses an array with an index pointing
> to the start of the queue of blocks in the buffer.
> 
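The scheme described above for numbered items (a presence bitset plus an array with a start index) might look roughly like this. It is a standalone sketch, not the contents of nir_worklist.c; struct index_worklist and the iw_* functions are hypothetical names, and items are represented only by their index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Worklist over items numbered 0..size-1 (hypothetical layout). */
struct index_worklist {
   unsigned size;       /* number of numbered items, also the buffer size */
   unsigned count;      /* items currently queued */
   unsigned start;      /* position of the queue head in the buffer */
   uint32_t *present;   /* one bit per index: is it queued right now? */
   unsigned *items;     /* circular buffer of queued indices */
};

/* num_items must be nonzero. Returns false on allocation failure. */
static bool iw_init(struct index_worklist *w, unsigned num_items)
{
   w->size = num_items;
   w->count = 0;
   w->start = 0;
   w->present = calloc((num_items + 31) / 32, sizeof(uint32_t));
   w->items = malloc(num_items * sizeof(unsigned));
   return w->present && w->items;
}

/* Append an index unless the bitset says it is already queued. */
static void iw_push_tail(struct index_worklist *w, unsigned index)
{
   assert(index < w->size);
   uint32_t bit = 1u << (index % 32);
   if (w->present[index / 32] & bit)
      return;
   w->present[index / 32] |= bit;
   w->items[(w->start + w->count) % w->size] = index;
   w->count++;
}

/* Remove and return the index at the head of the queue. */
static unsigned iw_pop_head(struct index_worklist *w)
{
   assert(w->count > 0);
   unsigned index = w->items[w->start];
   w->start = (w->start + 1) % w->size;
   w->count--;
   w->present[index / 32] &= ~(1u << (index % 32));
   return index;
}

static void iw_finish(struct index_worklist *w)
{
   free(w->present);
   free(w->items);
}
```

Because each index can be queued at most once, the buffer never needs more than one slot per numbered item, so it can be sized once up front.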
> The same scheme could be easily used for ssa-defs, as these are
> also numbered. I actually did this for the VRP pass I wrote years ago.
> 
> However, for instructions we do not have a way of numbering them,
> so a different scheme would have to be used. A dynarray + set type
> of thing, as you're suggesting, might get us where we want.
> I'll see what I can come up with.
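For items that are not numbered, the "dynarray + set" idea discussed in this thread could be sketched as below. This is an assumption-laden illustration rather than anything from Mesa: it does not use u_dynarray or struct set, but a realloc-grown array and a small open-addressing pointer set written from scratch, and struct instr_worklist and the wl_* functions are hypothetical names. Items are opaque non-NULL pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct instr_worklist {
   void **items;        /* growable array, push/pop at the end */
   size_t count;
   void **slots;        /* pointer set: linear probing, NULL means empty */
   size_t num_slots;    /* zero or a power of two, at least 2 * count */
};

static size_t wl_home(const struct instr_worklist *w, const void *p)
{
   return (size_t)(((uintptr_t)p >> 3) * 2654435761u) & (w->num_slots - 1);
}

/* Slot holding p, or the empty slot where p would be stored. */
static size_t wl_find(const struct instr_worklist *w, const void *p)
{
   size_t i = wl_home(w, p);
   while (w->slots[i] && w->slots[i] != p)
      i = (i + 1) & (w->num_slots - 1);
   return i;
}

/* Double the set, size the array to half of it, and reinsert everything. */
static bool wl_grow(struct instr_worklist *w)
{
   size_t n = w->num_slots ? w->num_slots * 2 : 16;
   void **items = realloc(w->items, (n / 2) * sizeof(*items));
   if (!items)
      return false;
   w->items = items;
   void **slots = calloc(n, sizeof(*slots));
   if (!slots)
      return false;
   free(w->slots);
   w->slots = slots;
   w->num_slots = n;
   for (size_t i = 0; i < w->count; i++)
      w->slots[wl_find(w, w->items[i])] = w->items[i];
   return true;
}

/* Push p unless it is already queued. Returns false only if out of memory. */
static bool wl_push(struct instr_worklist *w, void *p)
{
   assert(p != NULL);
   if (w->count * 2 >= w->num_slots && !wl_grow(w))
      return false;
   size_t i = wl_find(w, p);
   if (w->slots[i] == p)
      return true;
   w->slots[i] = p;
   w->items[w->count++] = p;
   return true;
}

/* Pop the most recently pushed item, or NULL if the worklist is empty. */
static void *wl_pop(struct instr_worklist *w)
{
   if (w->count == 0)
      return NULL;
   void *p = w->items[--w->count];
   size_t mask = w->num_slots - 1;
   size_t hole = wl_find(w, p);
   w->slots[hole] = NULL;
   /* Backward-shift deletion: close the gap so later probes still succeed. */
   for (size_t j = (hole + 1) & mask; w->slots[j]; j = (j + 1) & mask) {
      size_t home = wl_home(w, w->slots[j]);
      if (((j - home) & mask) >= ((j - hole) & mask)) {
         w->slots[hole] = w->slots[j];
         w->slots[j] = NULL;
         hole = j;
      }
   }
   return p;
}

static void wl_finish(struct instr_worklist *w)
{
   free(w->items);
   free(w->slots);
}
```

A zero-initialized struct is a valid empty worklist here. Whether this beats a linked list with a freelist is left open in the thread; the sketch only shows the shape of the data structure.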