[Mesa-dev] [PATCH 1/2] glsl: Don't rehash the values when copying to new table
Thomas Helland
thomashelland90 at gmail.com
Mon Jan 2 12:11:43 UTC 2017
That is a very nice suggestion!
I'll have a look at that, and see if I can hack a quick implementation.
It would probably be much better than my multi-map approach.
It felt like sort of a hack-fix on the current algorithm.
Doing all replacements in one run would also be really nice,
as my investigations have found copy propagation to be one of the
few passes that benefit from multiple runs of the pass.
It is not unlikely that some of the other passes that benefit from 3+
runs of the pass get those late effects simply because copy propagation
has not fully completed the work it can do until the second pass.
On Dec 31, 2016 05:39, "Connor Abbott" <cwabbott0 at gmail.com> wrote:
On Fri, Dec 30, 2016 at 10:04 AM, Vladislav Egorov <vegorov180 at gmail.com>
wrote:
> Oh, thanks! A very nice torture case. However, it seems to be related to
> another problem. The test case has no IF or LOOP blocks, so it can't be
> possibly improved with speeding up the cloning of the table (the torture
> case can be modified with a few ifs and fors on top though...).
>
> Anyway, on Gallium the time goes down from 1.91s before my WIP patch to
> 0.10s after my patch-series mostly due to improvements in
> glsl_to_tgsi_visitor::copy_propagation(). On Intel I don't see much of
> improvement however: 157s (!) -> 132s.
>
> Seems to be another case of linear scanning of the ACP table. Mesa has
> several variations of the same copy propagation algorithm, and all of them
> (with exception do_copy_propagation_elements that uses a hash-table)
> linearly scan the entire table when they want to kill all the copies of a
> variable. Which makes the implementation O(N^2) and blows up execution
time
> on huge functions like the ones from shadertoy. However most of the time
it
> still spends in tree_grafting that also uses all types of linear scans and
> tree walks instead of lookup tables (ir_expression::get_num_operands()
gets
> executed > 6 billion times!) -- I'll try to look into it.
Since the problem seemed interesting, I thought about it a little bit
and realized that we could do something even better than what
copy_propagation_elements does. Instead of having a hash table from
ir_variables to ir_variables, we could have a hash table from
ir_variable to acp_entry, where acp_entry contains the original
variable as well as the acp_entry (*not* just the variable) we want to
replace it with. That is, when we see a copy where we've never seen
the lhs or rhs before, we insert two new acp_entry's and make one
point to the other. To kill a variable, we would simply mark its
associated acp_entry as dead and remove it from the table. When we go
to look up a variable's replacement, we check if the acp_entry for the
replacement is dead and ignore it if it is. That should make
everything O(1) (assuming hash table lookup is O(1)) while being much
simpler and probably faster than keeping a separate hash table of rhs
variables.
Another thing that would be potentially interesting is to recursively
replace variables. Right now, with something like:
bar = foo
baz = bar
... = baz
we need two iterations of copy propagation in order to replace "baz"
with "foo" on the rhs. If instead of looking up baz -> bar we looped
through the acp_entry's until the replacement is killed or NULL, we
could look up baz -> bar -> foo and do the whole thing in one shot.
That might make some shaders require less iterations to reach a fixed
point, helping compile times further.
I haven't thought too much about copy_propagation_elements, but we
could probably adapt it to do a similar thing too. Of course, we'd get
less of a benefit out of that.
>
> Also, reading the bug report, I don't understand why did you treat
> do_copy_propagation_elements() and do_copy_propagation() differently -- in
> one case added hash-table for both sides of COPY(to, from), in another
only
> for one side?
>
>
> 30.12.2016 09:08, Tapani Pälli пишет:
>
>>
>>
>> On 12/30/2016 05:53 AM, Vladislav Egorov wrote:
>>>
>>> I've looked into it recently (I'm working on series of many various
>>> trivial optimizations)
>>>
>>> and it's faster to just memcpy() it. Just throwing out superfluous
>>> hashing still keeps slow
>>>
>>> hash-table insertion around -- with resizing, rehashing, memory
>>> allocation/deallocation, internal
>>>
>>> hash-function through integer division, collisions and so on. It
>>> produces a nice speed improvement
>>>
>>> actually. It's possible to explore approaches without any copying at
>>> LOOP/IF entering at all,
>>>
>>> but I am not sure it will improve performance.
>>
>>
>> When profiling copy_propagation(_elements) pass you can use Martina's
>> testcase from this bug:
>>
>> https://bugs.freedesktop.org/show_bug.cgi?id=94477
>>
>> We still have a very long compile time for the WebGL case mentioned in
the
>> bug so would be cool to have some optimizations there.
>>
>>
>>>
>>> 30.12.2016 02:49, Thomas Helland пишет:
>>>>
>>>> Really, we should have some kind of function for copying the whole
>>>> table,
>>>> but this will work for now.
>>>> ---
>>>> src/compiler/glsl/opt_copy_propagation.cpp | 6 ++++--
>>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/src/compiler/glsl/opt_copy_propagation.cpp
>>>> b/src/compiler/glsl/opt_copy_propagation.cpp
>>>> index 247c498..e9f82e0 100644
>>>> --- a/src/compiler/glsl/opt_copy_propagation.cpp
>>>> +++ b/src/compiler/glsl/opt_copy_propagation.cpp
>>>> @@ -210,7 +210,8 @@
>>>> ir_copy_propagation_visitor::handle_if_block(exec_list *instructions)
>>>> /* Populate the initial acp with a copy of the original */
>>>> struct hash_entry *entry;
>>>> hash_table_foreach(orig_acp, entry) {
>>>> - _mesa_hash_table_insert(acp, entry->key, entry->data);
>>>> + _mesa_hash_table_insert_pre_hashed(acp, entry->hash,
>>>> + entry->key, entry->data);
>>>> }
>>>> visit_list_elements(this, instructions);
>>>> @@ -259,7 +260,8 @@ ir_copy_propagation_visitor::handle_loop(ir_loop
>>>> *ir, bool keep_acp)
>>>> if (keep_acp) {
>>>> struct hash_entry *entry;
>>>> hash_table_foreach(orig_acp, entry) {
>>>> - _mesa_hash_table_insert(acp, entry->key, entry->data);
>>>> + _mesa_hash_table_insert_pre_hashed(acp, entry->hash,
>>>> + entry->key, entry->data);
>>>> }
>>>> }
>>>>
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
_______________________________________________
mesa-dev mailing list
mesa-dev at lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170102/b6ea80bc/attachment.html>
More information about the mesa-dev
mailing list