# [Mesa-dev] [PATCH 0/8] Hash table and hash set reworking

Thomas Helland thomashelland90 at gmail.com
Thu Mar 5 17:36:51 PST 2015

```2015-02-28 17:05 GMT+01:00 Jason Ekstrand <jason at jlekstrand.net>:
>
> On Feb 28, 2015 4:55 AM, "Thomas Helland" <thomashelland90 at gmail.com> wrote:
>>
>> So here comes my hash-table series mentioned earlier.
>>
>> So, first of all, there's some issues.
>> I've been strugling with hitting assertion failures.
>> The table returns null at times when it apparently should not.
>> It occurs after patch 1, and is fixed by patch 2.
>> It also occured when I was tweaking the amount of free space in the
>> table.
>>
>> With this series it should be tweaked and working (apart from patch 1),
>> but it would be wonderfull if people gave it a run.
>> I'm suspecting we might have some threading issues,
>> but I don't really have much experience with threading in C/C++,
>> so I haven't really debugged it to much.
>
> It sounds like we need more unit tests.  The ones we currently have are
> pretty sparse (as seen by the double-insertion bug I found recently).
> Please figure out why the first patch fails and write a unit test to trigger
> that.

What happens is that we fail to find an empty slot in the table.
(We roll around back to the starting address, and give up)
Should we add an assertion to flag that we fail to insert?

The Wikipedia article on quadratic probing sheds some light
on what is actually going on here, and somehow I managed
to not notice it when I first read it. Quick summary:

If we assume a form of:
h(k,i) = k + c1 * i + c2 * i*i mod m
(h is the hash, k is the starting hash, m is size)

Then
"For prime m > 2, most choices of c1 and c2 will make h(k,i)
distinct for i in [0, (m-1)/2]. Such choices include c1 = c2 = 1/2,
c1 = c2 = 1, and c1 = 0, c2 = 1. Because there are only about
m/2 distinct probes for a given element, it is difficult to guarantee
that insertions will succeed when the load factor is > 1/2."

However, if the table is power-of-two sized:
"... a good choice for the constants are c1 = c2 = 1/2,
as the values of h(k,i) for i in [0,m-1] are all distinct."

So this is not an issue if I fix up the hashing somewhat to use
c1 = c2 = 1/2 as suggested, and we use 2^n table-size.
In fact, it should guarantee that we don't fail to insert!

I'll post a V2 of the series that does this, and squashes
it together with the power-of-two change, to avoid regressions.

Alternatively, we can use a different type of probing,
like linear probing or robin-hood hashing.
These have worse clustering behavior however.

I'm a bit unsure how to write a test for this that works.
I tried modifying the insert_many test to use the hash_function.
My hope was that this should cause clustering and failure.
Sadly this didn't trigger the fault even with 10'000 keys.

Is there some way we can save the shader-ir, and then replay it?
Then we can replay one of the shaders that trigger the fault as a test.
Surely, the best "unit test" is to run a full shader-db run.
This really trashes the table quite hard.

Any suggestions would be appreciated.

>
>> I did not have any good testcases, so a shader-db run had to do.
>> If anyone knows of any cpu-bound applications that specifically
>> hit the hash-table I would be happy to hear about it.
>
> If you've got an Intel card please also test and give shader-db results with
> INTEL_USE_NIR=1.  The NIR code thrashes the hash table/set code more than
> anything else in the driver.  NIR also uses different access patterns than
> the rest of the driver since most of the hash table lookups outside of NIR
> are for glUint  -> object lookups in entry points.
>
> If this is shown to have good improvements, it might be worth looking at the
> C++-based hash table we use in the GLSL compiler which doesn't currently use
> the implementation in util.
>
>> These are results from running shader-db with oprofile after each commit.
>> Function           master comm1  comm2  comm3  comm4  comm5  comm6  comm7
>> comm8
>> mesa_hash_data     2.81   ?      3.09   3.05   2.99   3.23   3.25   3.11
>> 3.12
>> hash_table_insert  4.28   ?      2.57   2.58   2.58   2.71   2.52   2.52
>> 2.50
>> hash_table_search  4.59   ?      2.67   2.72   2.70   2.87   2.64   2.64
>> 2.59
>> set_add            2.69   ?      2.81   2.80   2.70   1.69   1.65   1.74
>> 1.72
>> set_search         2.75   ?      2.84   2.86   2.74   2.11   2.07   2.08
>> 2.09
>> runtime            175    ?      170    162    165    157    160    160
>> 164
>>
>> With regards to cheaper hash-functions:
>> It seems this is a case of "much pain for no gain".
>> Looking at this site[1] it looks like most popular
>> hashes have about the same instruction cost
>> for smaller amounts of data(10 bytes).
>> I tried copying the C-code example from the
>> Wikipedia-article on Murmur-hash 3, and using this.
>> It did show a speed-up, but not close to what I would expect.
>>
>> It should also be noted that the crc32c instruction has a latency
>> of three cycles on older processors[2].
>> That article also has a different implementation of crc32c
>> with fallback that I have not tried.
>>
>> [1] https://github.com/rurban/smhasher
>> [2]
>> http://stackoverflow.com/questions/17645167/implementing-sse-4-2s-crc32c-in-software
>>
>> Thomas Helland (8):
>>   util: Change hash_table to use quadratic probing.
>>   util: Change table size to be power of two
>>   glsl: Change ir_const_expr to use util/hash_table
>>   util: Change util/set to use quadratic probing
>>   util: Change set to use power-of-two sized table
>>   util: Change size of table to have 23% free
>>   util: Add murmur3 hashing function
>>
>>  src/glsl/ir_constant_expression.cpp |  20 +++---
>>  src/mesa/x86/common_x86.c           |   4 ++
>>  src/mesa/x86/common_x86_features.h  |   8 +++
>>  src/util/crc32c_hw.h                |  67 ++++++++++++++++++++
>>  src/util/hash_table.c               | 120
>> +++++++++++++++++++-----------------
>>  src/util/hash_table.h               |  55 ++++++++++++++++-
>>  src/util/set.c                      |  97 ++++++++++++++---------------
>>  src/util/set.h                      |   1 -
>>  8 files changed, 258 insertions(+), 114 deletions(-)
>>  create mode 100644 src/util/crc32c_hw.h
>>
>> --
>> 2.2.1
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
```