[Mesa-dev] mesa: Reduce libGL.so binary size by about 15%

Jeremy Huddleston Sequoia jeremyhu at apple.com
Mon Sep 28 10:46:03 PDT 2015


I'll give it a go.

It is still needed on OS X (and I think Windows).  It's just not used by the X server any more.

--Jeremy

> On Sep 28, 2015, at 10:26, Arlie Davis <arlied at google.com> wrote:
> 
> I tried building Mesa on OS X, but I'm not nearly as familiar with development on OS X, so I wasn't able to get it to build.  If someone could build / test that on OS X, it would certainly give more confidence in its correctness.  I *think* the generated code is correct, but you know how that is.
> 
> If there is no need for this on OS X any longer, then the best thing might be to remove it entirely.
> 
> On Sat, Sep 26, 2015 at 5:56 PM, Jeremy Huddleston Sequoia <jeremyhu at apple.com> wrote:
> Reviewing diffs of code that generates code is always ick. =(
> 
> This *looks* right to me, but has it been given a beating for correctness?  If not, let me know, and I'll give it a whirl when I have some cycles.
> 
> Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu at apple.com>
> 
> ---
> 
> You're right that this used to be use in xserver as well, but that was removed in:
> 
> commit e61e19959d9138d5b81b1f25b7aa3e257918170d
> Author: Adam Jackson <ajax at redhat.com>
> Date:   Tue Dec 3 13:45:43 2013 -0500
> 
>     xquartz/glx: Convert to non-glapi dispatch
> 
>     CGL doesn't have anything like glXGetProcAddress, and the old code just
>     called down to dlsym in any case.  It's a little mind-warping since
>     dlopening a framework actually loads multiple dylibs, but that's just
>     how OSX rolls.
> 
>     Signed-off-by: Adam Jackson <ajax at redhat.com>
>     Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu at apple.com>
> 
> 
> > On Sep 22, 2015, at 15:55, Ian Romanick <idr at freedesktop.org> wrote:
> >
> > On 09/17/2015 03:19 PM, Arlie Davis wrote:
> >> Ok, here's v2 of the change, with the suggested edits.
> >
> > So... I think this code is fine, and I admire the effort.  I have a
> > couple concerns.
> >
> > 1. We have no way to test this, so it's quite possible something was broken.
> >
> > 2. This function is only used in the OSX builds.  Jeremy is the
> > maintainer for those builds, so I've added him to the CC list.
> >
> > For every non-OSX build, we should just stop linking
> > src/mapi/glapi/glapi_gentable.c.  I thought maybe the X sever used it,
> > but I couldn't find any evidence of that.
> >
> > If this is still a viable route, I have a few suggestions of follow-on
> > patches...
> >
> > I guess this patch is
> >
> > Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>
> >
> > but I really think we need to get Jeremy's approval before pushing it.
> >
> >> From 5f393faa058f453408dfc640eecae3fe6335dfed Mon Sep 17 00:00:00 2001
> >> From: Arlie Davis <arlied at google.com>
> >> Date: Tue, 15 Sep 2015 09:58:34 -0700
> >> Subject: [PATCH] This patch significantly reduces the size of the libGL.so
> >> binary. It does not change the (externally visible) behavior of libGL.so at
> >> all.
> >>
> >> gl_gentable.py generates a function, _glapi_create_table_from_handle.
> >> This function allocates a large dispatch table, consisting of 1300 or so
> >> function pointers, and fills this dispatch table by doing symbol lookups
> >> on a given shared library.  Previously, gl_gentable.py would generate a
> >> single, very large _glapi_create_table_from_handle function, with a short
> >> cluster of lines for each entry point (function).  The idiom it generates
> >> was a NULL check, a call to snprintf, a call to dlsym / GetProcAddress,
> >> and then a store into the dispatch table.  Since this function processes
> >> a large number of entry points, this code is duplicated many times over.
> >>
> >> We can encode the same information much more compactly, by using a lookup
> >> table.  The previous total size of _glapi_create_table_from_handle on x64
> >> was 125848 bytes.  By using a lookup table, the size of
> >> _glapi_create_table_from_handle (and the related lookup tables) is reduced
> >> to 10840 bytes.  In other words, this enormous function is reduced by 91%.
> >> The size of the entire libGL.so binary (measured when stripped) itself drops
> >> by 15%.
> >>
> >> So the purpose of this change is to reduce the binary size, which frees up
> >> disk space, memory, etc.
> >> ---
> >> src/mapi/glapi/gen/gl_gentable.py | 57 ++++++++++++++++++++++++++++-----------
> >> 1 file changed, 41 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/src/mapi/glapi/gen/gl_gentable.py b/src/mapi/glapi/gen/gl_gentable.py
> >> index 1b3eb72..7cd475a 100644
> >> --- a/src/mapi/glapi/gen/gl_gentable.py
> >> +++ b/src/mapi/glapi/gen/gl_gentable.py
> >> @@ -113,6 +113,9 @@ __glapi_gentable_set_remaining_noop(struct _glapi_table *disp) {
> >>             dispatch[i] = p.v;
> >> }
> >>
> >> +"""
> >> +
> >> +footer = """
> >> struct _glapi_table *
> >> _glapi_create_table_from_handle(void *handle, const char *symbol_prefix) {
> >>     struct _glapi_table *disp = calloc(_glapi_get_dispatch_table_size(), sizeof(_glapi_proc));
> >> @@ -123,27 +126,28 @@ _glapi_create_table_from_handle(void *handle, const char *symbol_prefix) {
> >>
> >>     if(symbol_prefix == NULL)
> >>         symbol_prefix = "";
> >> -"""
> >>
> >> -footer = """
> >> -    __glapi_gentable_set_remaining_noop(disp);
> >> -
> >> -    return disp;
> >> -}
> >> -"""
> >> +    /* Note: This code relies on _glapi_table_func_names being sorted by the
> >> +     * entry point index of each function.
> >> +     */
> >> +    for (int func_index = 0; func_index < GLAPI_TABLE_COUNT; ++func_index) {
> >> +        const char *name = _glapi_table_func_names[func_index];
> >> +        void ** procp = &((void **)disp)[func_index];
> >>
> >> -body_template = """
> >> -    if(!disp->%(name)s) {
> >> -        void ** procp = (void **) &disp->%(name)s;
> >> -        snprintf(symboln, sizeof(symboln), "%%s%(entry_point)s", symbol_prefix);
> >> +        snprintf(symboln, sizeof(symboln), \"%s%s\", symbol_prefix, name);
> >> #ifdef _WIN32
> >>         *procp = GetProcAddress(handle, symboln);
> >> #else
> >>         *procp = dlsym(handle, symboln);
> >> #endif
> >>     }
> >> +    __glapi_gentable_set_remaining_noop(disp);
> >> +
> >> +    return disp;
> >> +}
> >> """
> >>
> >> +
> >> class PrintCode(gl_XML.gl_print_base):
> >>
> >>     def __init__(self):
> >> @@ -180,12 +184,33 @@ class PrintCode(gl_XML.gl_print_base):
> >>
> >>
> >>     def printBody(self, api):
> >> -        for f in api.functionIterateByOffset():
> >> -            for entry_point in f.entry_points:
> >> -                vars = { 'entry_point' : entry_point,
> >> -                         'name' : f.name }
> >>
> >> -                print body_template % vars
> >> +        # Determine how many functions have a defined offset.
> >> +        func_count = 0
> >> +        for f in api.functions_by_name.itervalues():
> >> +            if f.offset != -1:
> >> +                func_count += 1
> >> +
> >> +        # Build the mapping from offset to function name.
> >> +        funcnames = [None] * func_count
> >> +        for f in api.functions_by_name.itervalues():
> >> +            if f.offset != -1:
> >> +                if not (funcnames[f.offset] is None):
> >> +                    raise Exception("Function table has more than one function with same offset (offset %d, func %s)" % (f.offset, f.name))
> >> +                funcnames[f.offset] = f.name
> >> +
> >> +        # Check that the table has no gaps.  We expect a function at every offset,
> >> +        # and the code which generates the table relies on this.
> >> +        for i in xrange(0, func_count):
> >> +            if funcnames[i] is None:
> >> +                raise Exception("Function table has no function at offset %d" % (i))
> >> +
> >> +        print "#define GLAPI_TABLE_COUNT %d" % func_count
> >> +        print "static const char * const _glapi_table_func_names[GLAPI_TABLE_COUNT] = {"
> >> +        for i in xrange(0, func_count):
> >> +            print "    /* %5d */ \"%s\"," % (i, funcnames[i])
> >> +        print "};"
> >> +
> >>         return
> >>
> >>
> >>
> >>
> >>
> >>
> >> On 09/16/2015 07:07 AM, Matt Turner wrote:
> >>> On Tue, Sep 15, 2015 at 10:38 AM, Arlie Davis <arlied at google.com> wrote:
> >>>>
> >>>> Hello!  I noticed an inefficiency in libGL.so, so I thought I'd take a
> >>>> stab at fixing it.  This is my first patch submitted to mesa-dev, so
> >>>> if I'm doing anything dumb, let me know.  I can't use git send-email,
> >>>> but I've formatted the patch using git format-patch, which should
> >>>> hopefully produce similar output.  The patch text (below) describes
> >>>> the inefficiency and the improvement.
> >>>>
> >>>>
> >>>>
> >>>> From 0abde226eed1b9f6052193f36f6cdc060698f621 Mon Sep 17 00:00:00 2001
> >>>> From: Arlie Davis <arlied at google.com>
> >>>> Date: Tue, 15 Sep 2015 09:58:34 -0700
> >>>> Subject: [PATCH] This patch significantly reduces the size of the libGL.so
> >>>>  binary. It does not change the (externally visible) behavior of libGL.so at
> >>>>  all.
> >>>>
> >>>> gl_gentable.py generates a function, _glapi_create_table_from_handle.
> >>>> This function allocates a large dispatch table, consisting of 1300 or so
> >>>> function pointers, and fills this dispatch table by doing symbol lookups
> >>>> on a given shared library.  Previously, gl_gentable.py would generate a
> >>>> single, very large _glapi_create_table_from_handle function, with a short
> >>>> cluster of lines for each entry point (function).  The idiom it generates
> >>>> was a NULL check, a call to snprintf, a call to dlsym / GetProcAddress,
> >>>> and then a store into the dispatch table.  Since this function processes
> >>>> a large number of entry points, this code is duplicated many times over.
> >>>>
> >>>> We can encode the same information much more compactly, by using a lookup
> >>>> table.  The previous total size of _glapi_create_table_from_handle on x64
> >>>> was 125848 bytes.  By using a lookup table, the size of
> >>>> _glapi_create_table_from_handle (and the related lookup tables) is reduced
> >>>> to 10840 bytes.  In other words, this enormous function is reduced by 91%.
> >>>> The size of the entire libGL.so binary (measured when stripped) itself drops
> >>>> by 15%.
> >>>>
> >>>> So the purpose of this change is to reduce the binary size, which frees up
> >>>> disk space, memory, etc.
> >>>> ---
> >>>
> >>> Seems like a nice change. size lib/libGL.so.1.2.0 on my system shows
> >>>
> >>>    text   data    bss    dec    hex filename
> >>>  604031  11360   2792 618183  96ec7 lib/libGL.so.1.2.0 before
> >>>  490751  21920   2792 515463  7dd87 lib/libGL.so.1.2.0 after
> >>>
> >>> Feel free to include that in the commit message.
> >>>
> >>>>  src/mapi/glapi/gen/gl_gentable.py | 56 ++++++++++++++++++++++++++++-----------
> >>>>  1 file changed, 40 insertions(+), 16 deletions(-)
> >>>>
> >>>> diff --git a/src/mapi/glapi/gen/gl_gentable.py b/src/mapi/glapi/gen/gl_gentable.py
> >>>> index 1b3eb72..2563b6b 100644
> >>>> --- a/src/mapi/glapi/gen/gl_gentable.py
> >>>> +++ b/src/mapi/glapi/gen/gl_gentable.py
> >>>> @@ -113,6 +113,9 @@ __glapi_gentable_set_remaining_noop(struct _glapi_table *disp) {
> >>>>              dispatch[i] = p.v;
> >>>>  }
> >>>>
> >>>> +"""
> >>>> +
> >>>> +footer = """
> >>>>  struct _glapi_table *
> >>>>  _glapi_create_table_from_handle(void *handle, const char *symbol_prefix) {
> >>>>      struct _glapi_table *disp = calloc(_glapi_get_dispatch_table_size(), sizeof(_glapi_proc));
> >>>> @@ -123,27 +126,27 @@ _glapi_create_table_from_handle(void *handle, const char *symbol_prefix) {
> >>>>
> >>>>      if(symbol_prefix == NULL)
> >>>>          symbol_prefix = "";
> >>>> -"""
> >>>>
> >>>> -footer = """
> >>>> -    __glapi_gentable_set_remaining_noop(disp);
> >>>> -
> >>>> -    return disp;
> >>>> -}
> >>>> -"""
> >>>> +    /* Note: This code relies on _glapi_table_func_names being sorted by the
> >>>> +       entry point index of each function. */
> >>>
> >>> Mesa style puts the */ on its own line for multiline comments.
> >>>
> >>>> +    for (int func_index = 0; func_index < GLAPI_TABLE_COUNT; ++func_index) {
> >>>> +        const char* name = _glapi_table_func_names[func_index];
> >>>
> >>> * goes with the var name, not the type. That is, "char* " should be "char *"
> >>>
> >>>> +        void ** procp = &((void **)disp)[func_index];
> >>>>
> >>>> -body_template = """
> >>>> -    if(!disp->%(name)s) {
> >>>
> >>> We're removing the null check. Is that okay to do?
> >>>
> >>>> -        void ** procp = (void **) &disp->%(name)s;
> >>>> -        snprintf(symboln, sizeof(symboln), "%%s%(entry_point)s", symbol_prefix);
> >>>> +        snprintf(symboln, sizeof(symboln), \"%s%s\", symbol_prefix, name);
> >>>>  #ifdef _WIN32
> >>>>          *procp = GetProcAddress(handle, symboln);
> >>>>  #else
> >>>>          *procp = dlsym(handle, symboln);
> >>>>  #endif
> >>>>      }
> >>>> +    __glapi_gentable_set_remaining_noop(disp);
> >>>> +
> >>>> +    return disp;
> >>>> +}
> >>>>  """
> >>>>
> >>>> +
> >>>>  class PrintCode(gl_XML.gl_print_base):
> >>>>
> >>>>      def __init__(self):
> >>>> @@ -180,12 +183,33 @@ class PrintCode(gl_XML.gl_print_base):
> >>>>
> >>>>
> >>>>      def printBody(self, api):
> >>>> -        for f in api.functionIterateByOffset():
> >>>> -            for entry_point in f.entry_points:
> >>>> -                vars = { 'entry_point' : entry_point,
> >>>> -                         'name' : f.name }
> >>>>
> >>>> -                print body_template % vars
> >>>> +        # Determine how many functions have a defined offset.
> >>>> +        func_count = 0
> >>>> +        for f in api.functions_by_name.itervalues():
> >>>> +            if f.offset != -1:
> >>>> +                func_count += 1
> >>>> +
> >>>> +        # Build the mapping from offset to function name.
> >>>> +        funcnames = [None] * func_count
> >>>> +        for f in api.functions_by_name.itervalues():
> >>>> +            if f.offset != -1:
> >>>> +                if not (funcnames[f.offset] is None):
> >>>> +                    raise Exception("Function table has more than one function with same offset (offset %d, func %s)" % (f.offset, f.name))
> >>>> +                funcnames[f.offset] = f.name
> >>>> +
> >>>> +        # Check that the table has no gaps.  We expect a function at every offset,
> >>>> +        # and the code which generates the table relies on this.
> >>>> +        for i in range(0, func_count):
> >>>
> >>> I don't know much python, but I think you want to use xrange instead of range.
> >>>
> >>>> +            if funcnames[i] is None:
> >>>> +                raise Exception("Function table has no function at offset %d" % (i))
> >>>> +
> >>>> +        print "#define GLAPI_TABLE_COUNT %d" % func_count
> >>>> +        print "static const char* const _glapi_table_func_names[GLAPI_TABLE_COUNT] = {"
> >>>
> >>> "char*" -> "char *"
> >>>
> >>>> +        for i in range(0, func_count):
> >>>
> >>> xrange instead of range, I think.
> >>>
> >>>> +            print "    /* %5d */ \"%s\"," % (i, funcnames[i])
> >>>> +        print "};"
> >>>> +
> >>>>          return
> >> _______________________________________________
> >> mesa-dev mailing list
> >> mesa-dev at lists.freedesktop.org
> >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >>
> >
> 
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 4109 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150928/96f667ce/attachment-0001.bin>


More information about the mesa-dev mailing list