[Mesa-dev] Reducing get.c size (and get_es1.c and get_es2.c)
Kristian Høgsberg
krh at bitplanet.net
Thu May 6 15:27:39 PDT 2010
2010/5/6 Brian Paul <brianp at vmware.com>:
> Kristian Høgsberg wrote:
>>
>> Hi,
>>
>> Ok, I suppose this is not the most pressing issue in mesa, but I was
>> toying with an idea of how to reduce get.c size and integrate
>> get_es1.c and get_es2.c and I had to try it out. Of course it ended
>> up being a bigger project and took a couple of days, but in the end I
>> think it turned out to be a worthwhile effort. The result is the two
>> patches on the get-optimagix branch in my personal mesa repo:
>>
>> http://cgit.freedesktop.org/~krh/mesa/log/?h=get-optimagix
>>
>> The basic idea is that most getters just look up an int somewhere in
>> GLcontext and then convert it to a bool or float according to which of
>> glGetIntegerv() glGetBooleanv() etc is being called. Instead of
>> generating code to do this, we can just record the enum value and the
>> offset into GLcontext in an array of structs. Then in glGet*(), we
>> lookup the struct for the enum in question, and use the offset to get
>> the int we need.
>>
>> Of course, sometimes we need to look up a float, a boolean, a bit in a
>> bitfield, a matrix or other types, so we need to track the type of
>> the value in GLcontext. And sometimes the value isn't in GLcontext
>> but in the drawbuffer, the array object, current texture unit, or
>> maybe it's a computed value. So we need to also track where or how to
>> find the value. Finally, we sometimes need to check that one of a
>> number of extensions are enabled, the gl version or flush or call
>> _mesa_update_state(). This is done by attaching optional extra
>> information to the value description struct, it's sort of like an
>> array of opcodes that describe extra checks or actions.
>>
>> Putting all this together we end up with struct value_desc in the
>> patch, and with a couple of macros to help, the table of struct
>> value_desc is about as concise as the specification in the python
>> code.
>>
>> All we need now is a way to look up the value struct from the enum.
>> The code generated by gcc for the current generated big switch
>> statement is a big, balanced, open coded if/else tree (I'm giving gcc
>> the benefit of the doubt here, I didn't validate that the tree was
>> balanced). It would be natural to sort the new enum table and use
>> bsearch(), but I decided to use a read-only hash table instead.
>> bsearch() has a nice guaranteed worst case performance, but we're also
>> guaranteed to hit that worst case (log2(n) iterations) for about half
>> the enums. Instead, using a simple, direct hashing hash table, we can
>> find the enum on the first try for 80% of the enums, 1 collision for
>> 10% and never more than 5 collisions for any enum (typical numbers).
>> And the code is very simple, even though it feels a little magic.
>>
>> Benefits:
>>
>> 1) Smaller. Much smaller. Generated code is much bigger than the
>> corresponding data tables. Looking at an i965 DRI driver with GLES1
>> and GLES2 APIs enabled we get:
>>
>> [krh at hinata mesa]$ size lib/i965_dri*.so
>> text data bss dec hex filename
>> 2658275 29132 61664 2749071 29f28f lib/i965_dri_old.so
>> 2505275 36980 63712 2605967 27c38f lib/i965_dri.so
>>
>> That is, a 140kb difference, or a 5% size reduction. And since the
>> reduction is in libmesa.a, it applies to all DRI drivers, which adds
>> up to a nice space savings if you're to squeeze 14 DRI drivers onto a
>> live CD (looking at Fedoras mesa-dri-drivers RPM).
>
> The size savings is probably even greater when compared to what's in Mesa
> 7.8 since I consolidated some error handling code in master a few weeks ago.
>
>
>> 2) Faster; the hash table will find the enum in zero to one
>> iterations most of the time and never more that five. Of course, this
>> is all academic, since glGet*() aren't typically in any kind of
>> hotpath, but it's nice to just verify that we're not replacing get.c
>> with something slower.
>>
>> 2) No code-generation, the C file *is* the spec and is about as
>> concise as the python script was.
>>
>> 3) A non-hacky glGetDoublev(). The current implementation calls
>> glGetFloatv() with a local variable array, which it fills with the
>> magic value -1234.5 to be able to determine how many values was
>> returned from glGetFloatv(). So if your matrix has an entry with the
>> value -1234.5 you're out of luck.
>>
>> 4) A clean way to integrate get.c, get-es1.c and get-es2.c. We can
>> initialize the hash table with the values that are valid for the API
>> we're initializing and use the same _mesa_Get*() entry points to
>> implement the glGet* functions for the different APIs.
>>
>> Drawbacks:
>>
>> 1) Uhm, regressions? I went back and double checked the new get.c
>> against the enum list in get_gen.py after finishing the patch. While
>> I didn't find any inconsistencies, it's a long list and I may have
>> overlooked something. I'm running piglit on it now, but I suspect
>> I'll have to add a few testcases to hit the different code paths in
>> the new glGet*() implementation.
>>
>> 2) More complex code (though if you consider the get-gen.py script,
>> it's probably about the same total complexity as the current
>> solution).
>>
>> Let me know what you think about this - I'd like to merge it once I've
>> tested it a bit.
>
> Sounds good. If you could send me the new get.[ch] files I could do a bit
> more testing here.
I can send it to you if you really want, but there are a few changes
in other files (for exampel, to initialize the hash table in
one_time_init()). It's probably easier if you just pull the branch
from my git tree. The fastest way to do this is something like:
[krh at hinata mesa-7.8]$ git fetch
git://people.freedesktop.org/~krh/mesa get-optimagix
[krh at hinata mesa-7.8]$ git checkout -b test FETCH_HEAD
from a mesa checkout, which will create a new branch called test where
you can take a look at the new get.c.
Kristian
More information about the mesa-dev
mailing list