[Mesa-dev] [PATCH] llvmpipe: Optimize new fs state setup
Roland Scheidegger
sroland at vmware.com
Thu Jun 30 07:36:47 PDT 2011
Am 30.06.2011 12:14, schrieb Jose Fonseca:
>
>
> ----- Original Message -----
>> Hmm.
>> Forgive my ignorance, but isn't memcmp() on structs pretty prone to
>> give
>> incorrect != results, given that there may be padding between members
>> in
>> structs and that IIRC gcc struct assignment is member-wise.
>
> There's no alternative to bitwise comparison on C:
>
> $ cat cmp.c
> struct foo {
> int a;
> int b;
> };
>
>
> int cmp(const struct foo *a, const struct foo *b) {
> return *a == *b;
> }
> $ gcc -c cmp.c
> cmp.c: In function ‘cmp’:
> cmp.c:7:12: error: invalid operands to binary == (have ‘const struct foo’ and ‘const struct foo’)
>
>> What happens if there's padding between the jit_context and variant
>> members of struct lp_rast_state?
>>
>> I seem to recall hitting similar issues a number of times in the
>> past.
>
> I recall that as well, but my memory is the other way around: struct assignment is considerer harmful because it breaks memcmp. Instead all structures should be initialized with memset(0) first, and always copied with memcpy. This should ensure that padding doesn't get clobbered.
>
> But now that you mention it, I'm not still 100% that unsed bits on bitfields are preserved like that or not, when being assigned. We probably should ensure that the all bits in bitfields are used, using "reserved" members, so that the zeros there stay zero.
We've definitely hit issues like that in the past - I think if you use
struct assignment you'll need to initialize the dst struct to 0
initially (but only once - even though the padding is probably undefined
after such an assignment, any implementation should either copy
everything including the padding or leave padding alone).
I don't think anything else will touch the unused parts, though I guess
it might be possible for instance if a 32bit int is assigned to a 16 bit
bitfield which has padding after it.
But generally using memcmp/memcpy should work ok, and it gives the
compiler all the information it needs to do it fast. Well if it uses it
or not is another question... I think the problem with gcc is that when
it inserts the comparisons with repz cmpsb it knows alignment and size
to copy but doesn't know the result is only used as a strict comparison
- that makes it impossible to generate really optimized code there (as
you need some byte comparison to get correct memcmp semantics unless you
use bswap or do dword comparison then byte comparison on a non-match,
both of which are probably slower if you expect the comparison to fail
on first byte which is the case for instance in substring searches) and
later it can't optimize that into something more sensible. So this might
not be trivial to fix in gcc. Too bad no builtin is available which only
returns true/false...
Roland
More information about the mesa-dev
mailing list