[Mesa-dev] mesa: Reduce libGL.so binary size by about 15%
Andreas Boll
andreas.boll.dev at gmail.com
Thu Jan 21 06:18:00 PST 2016
I've pushed this patch.
Thanks,
Andreas
2016-01-21 1:35 GMT+01:00 Jeremy Huddleston Sequoia <jeremyhu at apple.com>:
> Sorry, I thought I responded to this a while ago.
>
> Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu at apple.com>
> Tested-by: Jeremy Huddleston Sequoia <jeremyhu at apple.com>
>
> --Jeremy
>
>> On Jan 19, 2016, at 05:21, Andreas Boll <andreas.boll.dev at gmail.com> wrote:
>>
>> Jeremy, did you have a chance to test this patch?
>> This patch would be still useful for OS X. For non-OS X this patch [1]
>> reduces the size of libGL.so further more.
>>
>> Thanks,
>> Andreas
>>
>> [1] https://patchwork.freedesktop.org/patch/70372/
>>
>> 2015-09-28 19:46 GMT+02:00 Jeremy Huddleston Sequoia <jeremyhu at apple.com>:
>>> I'll give it a go.
>>>
>>> It is still needed on OS X (and I think Windows). It's just not used by the X server any more.
>>>
>>> --Jeremy
>>>
>>>> On Sep 28, 2015, at 10:26, Arlie Davis <arlied at google.com> wrote:
>>>>
>>>> I tried building Mesa on OS X, but I'm not nearly as familiar with development on OS X, so I wasn't able to get it to build. If someone could build / test that on OS X, it would certainly give more confidence in its correctness. I *think* the generated code is correct, but you know how that is.
>>>>
>>>> If there is no need for this on OS X any longer, then the best thing might be to remove it entirely.
>>>>
>>>> On Sat, Sep 26, 2015 at 5:56 PM, Jeremy Huddleston Sequoia <jeremyhu at apple.com> wrote:
>>>> Reviewing diffs of code that generates code is always ick. =(
>>>>
>>>> This *looks* right to me, but has it been given a beating for correctness? If not, let me know, and I'll give it a whirl when I have some cycles.
>>>>
>>>> Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu at apple.com>
>>>>
>>>> ---
>>>>
>>>> You're right that this used to be use in xserver as well, but that was removed in:
>>>>
>>>> commit e61e19959d9138d5b81b1f25b7aa3e257918170d
>>>> Author: Adam Jackson <ajax at redhat.com>
>>>> Date: Tue Dec 3 13:45:43 2013 -0500
>>>>
>>>> xquartz/glx: Convert to non-glapi dispatch
>>>>
>>>> CGL doesn't have anything like glXGetProcAddress, and the old code just
>>>> called down to dlsym in any case. It's a little mind-warping since
>>>> dlopening a framework actually loads multiple dylibs, but that's just
>>>> how OSX rolls.
>>>>
>>>> Signed-off-by: Adam Jackson <ajax at redhat.com>
>>>> Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu at apple.com>
>>>>
>>>>
>>>>> On Sep 22, 2015, at 15:55, Ian Romanick <idr at freedesktop.org> wrote:
>>>>>
>>>>> On 09/17/2015 03:19 PM, Arlie Davis wrote:
>>>>>> Ok, here's v2 of the change, with the suggested edits.
>>>>>
>>>>> So... I think this code is fine, and I admire the effort. I have a
>>>>> couple concerns.
>>>>>
>>>>> 1. We have no way to test this, so it's quite possible something was broken.
>>>>>
>>>>> 2. This function is only used in the OSX builds. Jeremy is the
>>>>> maintainer for those builds, so I've added him to the CC list.
>>>>>
>>>>> For every non-OSX build, we should just stop linking
>>>>> src/mapi/glapi/glapi_gentable.c. I thought maybe the X sever used it,
>>>>> but I couldn't find any evidence of that.
>>>>>
>>>>> If this is still a viable route, I have a few suggestions of follow-on
>>>>> patches...
>>>>>
>>>>> I guess this patch is
>>>>>
>>>>> Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>
>>>>>
>>>>> but I really think we need to get Jeremy's approval before pushing it.
>>>>>
>>>>>> From 5f393faa058f453408dfc640eecae3fe6335dfed Mon Sep 17 00:00:00 2001
>>>>>> From: Arlie Davis <arlied at google.com>
>>>>>> Date: Tue, 15 Sep 2015 09:58:34 -0700
>>>>>> Subject: [PATCH] This patch significantly reduces the size of the libGL.so
>>>>>> binary. It does not change the (externally visible) behavior of libGL.so at
>>>>>> all.
>>>>>>
>>>>>> gl_gentable.py generates a function, _glapi_create_table_from_handle.
>>>>>> This function allocates a large dispatch table, consisting of 1300 or so
>>>>>> function pointers, and fills this dispatch table by doing symbol lookups
>>>>>> on a given shared library. Previously, gl_gentable.py would generate a
>>>>>> single, very large _glapi_create_table_from_handle function, with a short
>>>>>> cluster of lines for each entry point (function). The idiom it generates
>>>>>> was a NULL check, a call to snprintf, a call to dlsym / GetProcAddress,
>>>>>> and then a store into the dispatch table. Since this function processes
>>>>>> a large number of entry points, this code is duplicated many times over.
>>>>>>
>>>>>> We can encode the same information much more compactly, by using a lookup
>>>>>> table. The previous total size of _glapi_create_table_from_handle on x64
>>>>>> was 125848 bytes. By using a lookup table, the size of
>>>>>> _glapi_create_table_from_handle (and the related lookup tables) is reduced
>>>>>> to 10840 bytes. In other words, this enormous function is reduced by 91%.
>>>>>> The size of the entire libGL.so binary (measured when stripped) itself drops
>>>>>> by 15%.
>>>>>>
>>>>>> So the purpose of this change is to reduce the binary size, which frees up
>>>>>> disk space, memory, etc.
>>>>>> ---
>>>>>> src/mapi/glapi/gen/gl_gentable.py | 57 ++++++++++++++++++++++++++++-----------
>>>>>> 1 file changed, 41 insertions(+), 16 deletions(-)
>>>>>>
>>>>>> diff --git a/src/mapi/glapi/gen/gl_gentable.py b/src/mapi/glapi/gen/gl_gentable.py
>>>>>> index 1b3eb72..7cd475a 100644
>>>>>> --- a/src/mapi/glapi/gen/gl_gentable.py
>>>>>> +++ b/src/mapi/glapi/gen/gl_gentable.py
>>>>>> @@ -113,6 +113,9 @@ __glapi_gentable_set_remaining_noop(struct _glapi_table *disp) {
>>>>>> dispatch[i] = p.v;
>>>>>> }
>>>>>>
>>>>>> +"""
>>>>>> +
>>>>>> +footer = """
>>>>>> struct _glapi_table *
>>>>>> _glapi_create_table_from_handle(void *handle, const char *symbol_prefix) {
>>>>>> struct _glapi_table *disp = calloc(_glapi_get_dispatch_table_size(), sizeof(_glapi_proc));
>>>>>> @@ -123,27 +126,28 @@ _glapi_create_table_from_handle(void *handle, const char *symbol_prefix) {
>>>>>>
>>>>>> if(symbol_prefix == NULL)
>>>>>> symbol_prefix = "";
>>>>>> -"""
>>>>>>
>>>>>> -footer = """
>>>>>> - __glapi_gentable_set_remaining_noop(disp);
>>>>>> -
>>>>>> - return disp;
>>>>>> -}
>>>>>> -"""
>>>>>> + /* Note: This code relies on _glapi_table_func_names being sorted by the
>>>>>> + * entry point index of each function.
>>>>>> + */
>>>>>> + for (int func_index = 0; func_index < GLAPI_TABLE_COUNT; ++func_index) {
>>>>>> + const char *name = _glapi_table_func_names[func_index];
>>>>>> + void ** procp = &((void **)disp)[func_index];
>>>>>>
>>>>>> -body_template = """
>>>>>> - if(!disp->%(name)s) {
>>>>>> - void ** procp = (void **) &disp->%(name)s;
>>>>>> - snprintf(symboln, sizeof(symboln), "%%s%(entry_point)s", symbol_prefix);
>>>>>> + snprintf(symboln, sizeof(symboln), \"%s%s\", symbol_prefix, name);
>>>>>> #ifdef _WIN32
>>>>>> *procp = GetProcAddress(handle, symboln);
>>>>>> #else
>>>>>> *procp = dlsym(handle, symboln);
>>>>>> #endif
>>>>>> }
>>>>>> + __glapi_gentable_set_remaining_noop(disp);
>>>>>> +
>>>>>> + return disp;
>>>>>> +}
>>>>>> """
>>>>>>
>>>>>> +
>>>>>> class PrintCode(gl_XML.gl_print_base):
>>>>>>
>>>>>> def __init__(self):
>>>>>> @@ -180,12 +184,33 @@ class PrintCode(gl_XML.gl_print_base):
>>>>>>
>>>>>>
>>>>>> def printBody(self, api):
>>>>>> - for f in api.functionIterateByOffset():
>>>>>> - for entry_point in f.entry_points:
>>>>>> - vars = { 'entry_point' : entry_point,
>>>>>> - 'name' : f.name }
>>>>>>
>>>>>> - print body_template % vars
>>>>>> + # Determine how many functions have a defined offset.
>>>>>> + func_count = 0
>>>>>> + for f in api.functions_by_name.itervalues():
>>>>>> + if f.offset != -1:
>>>>>> + func_count += 1
>>>>>> +
>>>>>> + # Build the mapping from offset to function name.
>>>>>> + funcnames = [None] * func_count
>>>>>> + for f in api.functions_by_name.itervalues():
>>>>>> + if f.offset != -1:
>>>>>> + if not (funcnames[f.offset] is None):
>>>>>> + raise Exception("Function table has more than one function with same offset (offset %d, func %s)" % (f.offset, f.name))
>>>>>> + funcnames[f.offset] = f.name
>>>>>> +
>>>>>> + # Check that the table has no gaps. We expect a function at every offset,
>>>>>> + # and the code which generates the table relies on this.
>>>>>> + for i in xrange(0, func_count):
>>>>>> + if funcnames[i] is None:
>>>>>> + raise Exception("Function table has no function at offset %d" % (i))
>>>>>> +
>>>>>> + print "#define GLAPI_TABLE_COUNT %d" % func_count
>>>>>> + print "static const char * const _glapi_table_func_names[GLAPI_TABLE_COUNT] = {"
>>>>>> + for i in xrange(0, func_count):
>>>>>> + print " /* %5d */ \"%s\"," % (i, funcnames[i])
>>>>>> + print "};"
>>>>>> +
>>>>>> return
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 09/16/2015 07:07 AM, Matt Turner wrote:
>>>>>>> On Tue, Sep 15, 2015 at 10:38 AM, Arlie Davis <arlied at google.com> wrote:
>>>>>>>>
>>>>>>>> Hello! I noticed an inefficiency in libGL.so, so I thought I'd take a
>>>>>>>> stab at fixing it. This is my first patch submitted to mesa-dev, so
>>>>>>>> if I'm doing anything dumb, let me know. I can't use git send-email,
>>>>>>>> but I've formatted the patch using git format-patch, which should
>>>>>>>> hopefully produce similar output. The patch text (below) describes
>>>>>>>> the inefficiency and the improvement.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> From 0abde226eed1b9f6052193f36f6cdc060698f621 Mon Sep 17 00:00:00 2001
>>>>>>>> From: Arlie Davis <arlied at google.com>
>>>>>>>> Date: Tue, 15 Sep 2015 09:58:34 -0700
>>>>>>>> Subject: [PATCH] This patch significantly reduces the size of the libGL.so
>>>>>>>> binary. It does not change the (externally visible) behavior of libGL.so at
>>>>>>>> all.
>>>>>>>>
>>>>>>>> gl_gentable.py generates a function, _glapi_create_table_from_handle.
>>>>>>>> This function allocates a large dispatch table, consisting of 1300 or so
>>>>>>>> function pointers, and fills this dispatch table by doing symbol lookups
>>>>>>>> on a given shared library. Previously, gl_gentable.py would generate a
>>>>>>>> single, very large _glapi_create_table_from_handle function, with a short
>>>>>>>> cluster of lines for each entry point (function). The idiom it generates
>>>>>>>> was a NULL check, a call to snprintf, a call to dlsym / GetProcAddress,
>>>>>>>> and then a store into the dispatch table. Since this function processes
>>>>>>>> a large number of entry points, this code is duplicated many times over.
>>>>>>>>
>>>>>>>> We can encode the same information much more compactly, by using a lookup
>>>>>>>> table. The previous total size of _glapi_create_table_from_handle on x64
>>>>>>>> was 125848 bytes. By using a lookup table, the size of
>>>>>>>> _glapi_create_table_from_handle (and the related lookup tables) is reduced
>>>>>>>> to 10840 bytes. In other words, this enormous function is reduced by 91%.
>>>>>>>> The size of the entire libGL.so binary (measured when stripped) itself drops
>>>>>>>> by 15%.
>>>>>>>>
>>>>>>>> So the purpose of this change is to reduce the binary size, which frees up
>>>>>>>> disk space, memory, etc.
>>>>>>>> ---
>>>>>>>
>>>>>>> Seems like a nice change. size lib/libGL.so.1.2.0 on my system shows
>>>>>>>
>>>>>>> text data bss dec hex filename
>>>>>>> 604031 11360 2792 618183 96ec7 lib/libGL.so.1.2.0 before
>>>>>>> 490751 21920 2792 515463 7dd87 lib/libGL.so.1.2.0 after
>>>>>>>
>>>>>>> Feel free to include that in the commit message.
>>>>>>>
>>>>>>>> src/mapi/glapi/gen/gl_gentable.py | 56 ++++++++++++++++++++++++++++-----------
>>>>>>>> 1 file changed, 40 insertions(+), 16 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/src/mapi/glapi/gen/gl_gentable.py b/src/mapi/glapi/gen/gl_gentable.py
>>>>>>>> index 1b3eb72..2563b6b 100644
>>>>>>>> --- a/src/mapi/glapi/gen/gl_gentable.py
>>>>>>>> +++ b/src/mapi/glapi/gen/gl_gentable.py
>>>>>>>> @@ -113,6 +113,9 @@ __glapi_gentable_set_remaining_noop(struct _glapi_table *disp) {
>>>>>>>> dispatch[i] = p.v;
>>>>>>>> }
>>>>>>>>
>>>>>>>> +"""
>>>>>>>> +
>>>>>>>> +footer = """
>>>>>>>> struct _glapi_table *
>>>>>>>> _glapi_create_table_from_handle(void *handle, const char *symbol_prefix) {
>>>>>>>> struct _glapi_table *disp = calloc(_glapi_get_dispatch_table_size(), sizeof(_glapi_proc));
>>>>>>>> @@ -123,27 +126,27 @@ _glapi_create_table_from_handle(void *handle, const char *symbol_prefix) {
>>>>>>>>
>>>>>>>> if(symbol_prefix == NULL)
>>>>>>>> symbol_prefix = "";
>>>>>>>> -"""
>>>>>>>>
>>>>>>>> -footer = """
>>>>>>>> - __glapi_gentable_set_remaining_noop(disp);
>>>>>>>> -
>>>>>>>> - return disp;
>>>>>>>> -}
>>>>>>>> -"""
>>>>>>>> + /* Note: This code relies on _glapi_table_func_names being sorted by the
>>>>>>>> + entry point index of each function. */
>>>>>>>
>>>>>>> Mesa style puts the */ on its own line for multiline comments.
>>>>>>>
>>>>>>>> + for (int func_index = 0; func_index < GLAPI_TABLE_COUNT; ++func_index) {
>>>>>>>> + const char* name = _glapi_table_func_names[func_index];
>>>>>>>
>>>>>>> * goes with the var name, not the type. That is, "char* " should be "char *"
>>>>>>>
>>>>>>>> + void ** procp = &((void **)disp)[func_index];
>>>>>>>>
>>>>>>>> -body_template = """
>>>>>>>> - if(!disp->%(name)s) {
>>>>>>>
>>>>>>> We're removing the null check. Is that okay to do?
>>>>>>>
>>>>>>>> - void ** procp = (void **) &disp->%(name)s;
>>>>>>>> - snprintf(symboln, sizeof(symboln), "%%s%(entry_point)s", symbol_prefix);
>>>>>>>> + snprintf(symboln, sizeof(symboln), \"%s%s\", symbol_prefix, name);
>>>>>>>> #ifdef _WIN32
>>>>>>>> *procp = GetProcAddress(handle, symboln);
>>>>>>>> #else
>>>>>>>> *procp = dlsym(handle, symboln);
>>>>>>>> #endif
>>>>>>>> }
>>>>>>>> + __glapi_gentable_set_remaining_noop(disp);
>>>>>>>> +
>>>>>>>> + return disp;
>>>>>>>> +}
>>>>>>>> """
>>>>>>>>
>>>>>>>> +
>>>>>>>> class PrintCode(gl_XML.gl_print_base):
>>>>>>>>
>>>>>>>> def __init__(self):
>>>>>>>> @@ -180,12 +183,33 @@ class PrintCode(gl_XML.gl_print_base):
>>>>>>>>
>>>>>>>>
>>>>>>>> def printBody(self, api):
>>>>>>>> - for f in api.functionIterateByOffset():
>>>>>>>> - for entry_point in f.entry_points:
>>>>>>>> - vars = { 'entry_point' : entry_point,
>>>>>>>> - 'name' : f.name }
>>>>>>>>
>>>>>>>> - print body_template % vars
>>>>>>>> + # Determine how many functions have a defined offset.
>>>>>>>> + func_count = 0
>>>>>>>> + for f in api.functions_by_name.itervalues():
>>>>>>>> + if f.offset != -1:
>>>>>>>> + func_count += 1
>>>>>>>> +
>>>>>>>> + # Build the mapping from offset to function name.
>>>>>>>> + funcnames = [None] * func_count
>>>>>>>> + for f in api.functions_by_name.itervalues():
>>>>>>>> + if f.offset != -1:
>>>>>>>> + if not (funcnames[f.offset] is None):
>>>>>>>> + raise Exception("Function table has more than one function with same offset (offset %d, func %s)" % (f.offset, f.name))
>>>>>>>> + funcnames[f.offset] = f.name
>>>>>>>> +
>>>>>>>> + # Check that the table has no gaps. We expect a function at every offset,
>>>>>>>> + # and the code which generates the table relies on this.
>>>>>>>> + for i in range(0, func_count):
>>>>>>>
>>>>>>> I don't know much python, but I think you want to use xrange instead of range.
>>>>>>>
>>>>>>>> + if funcnames[i] is None:
>>>>>>>> + raise Exception("Function table has no function at offset %d" % (i))
>>>>>>>> +
>>>>>>>> + print "#define GLAPI_TABLE_COUNT %d" % func_count
>>>>>>>> + print "static const char* const _glapi_table_func_names[GLAPI_TABLE_COUNT] = {"
>>>>>>>
>>>>>>> "char*" -> "char *"
>>>>>>>
>>>>>>>> + for i in range(0, func_count):
>>>>>>>
>>>>>>> xrange instead of range, I think.
>>>>>>>
>>>>>>>> + print " /* %5d */ \"%s\"," % (i, funcnames[i])
>>>>>>>> + print "};"
>>>>>>>> +
>>>>>>>> return
>>>>>> _______________________________________________
>>>>>> mesa-dev mailing list
>>>>>> mesa-dev at lists.freedesktop.org
>>>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>
>
More information about the mesa-dev
mailing list