[Mesa-dev] [PATCH 1/9] mesa: Handle common extension checks with more compact code

Nanley Chery nanleychery at gmail.com
Mon May 22 20:25:20 UTC 2017


On Fri, May 19, 2017 at 04:29:16PM -0700, Ian Romanick wrote:
> On 05/19/2017 10:28 AM, Nanley Chery wrote:
> > On Fri, May 19, 2017 at 06:38:03AM -0700, Ian Romanick wrote:
> >> From: Ian Romanick <ian.d.romanick at intel.com>
> >>
> >> The previous code handled everything with the general case.  I noticed
> >> that every time I converted an open-coded check to use a
> >> _mesa_has_EXT_foo() function, the text size of the driver increased.
> >>
> >> Almost all extensions only care what the current context API is, and
> >> the version does not matter.  Handle those using more compact checks.
> >>
> >>    text	   data	    bss	    dec	    hex	filename
> >> 7037675	 235248	  37280	7310203	 6f8b7b	32-bit i965_dri.so before
> >> 7034307	 235248	  37280	7306835	 6f7e53	32-bit i965_dri.so after
> >> 6679695	 303400	  50608	7033703	 6b5367	64-bit i965_dri.so before
> >> 6676143	 303400	  50608	7030151	 6b4587	64-bit i965_dri.so after
> > 
> > Hi Ian,
> > 
> > I wrote a patch some time ago that reduces the cost of the extension
> > checks by a lot more with less code. The only thing I think may need
> > addressing is endianness. Would you consider using it instead if I
> > reworked it and sent it out to the list? You can find it here:
> > https://cgit.freedesktop.org/~nchery/mesa/commit/?h=1/ext/optimize&id=a02d88eba1d3129b27d3b5e6aaa976c3ca20cf79
> 
> I was not able to reproduce that result on current Mesa.  I had a lot
> of trouble believing that more than 18% of our driver binary was
> extension check code.  I also had a sick feeling that may have just
> been the first stage of grief talking... :)  Are you able to reproduce
> your original result?
> 

I'm actually not able to reproduce my results anymore. Also, after
fixing endianness with more bit shifting, I was only able to save ~400B
of .text (using your build flags). I retract my earlier statement.

> I also tried a similar patch to yours that wouldn't have endianness
> problems:
> 
> #define EXT(name_str, driver_cap, gll, glc, es1, es2, ...)                     \
> static inline bool                                                             \
> _mesa_has_##name_str(const struct gl_context *ctx)                             \
> {                                                                              \
>    static const uint8_t ver[4] = { (uint8_t)gll, (uint8_t)es1, (uint8_t)es2, (uint8_t)glc }; \
>    return ctx->Extensions.driver_cap &&                                        \
>           (ctx->Extensions.Version >= ver[ctx->API]);                          \
> }
> 
> Here's what I got for all four methods:
> 
> 7037675	 235248	  37280	7310203	 6f8b7b	32-bit i965_dri.so before
> 7034307	 235248	  37280	7306835	 6f7e53	32-bit i965_dri.so after idr
> 7038343	 235248	  37280	7310871	 6f8e17	32-bit i965_dri.so after Nanley
> 7036271	 235248	  37280	7308799	 6f85ff	32-bit i965_dri.so w/arrays
> 
> 6679695	 303400	  50608	7033703	 6b5367	64-bit i965_dri.so before
> 6676143	 303400	  50608	7030151	 6b4587	64-bit i965_dri.so after idr
> 6684767	 303400	  50608	7038775	 6b6737	64-bit i965_dri.so after Nanley
> 6678567	 303400	  50608	7032575	 6b4eff	64-bit i965_dri.so w/arrays
> 
> For reference, I build with the same flags that Fedora 23 (I think?)
> used for release builds:
> 
> -O2 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches
> 
> And either "-m64 -mtune=generic" or "-m32 -march=i686 -mtune=atom
> -fasynchronous-unwind-tables" depending on the platform.
> --enable-debug is not passed to configure.
> 
> Even if I were able to reproduce your original result, there are still
> two cases where your approach may generate more code than is necessary:
> 
> 1. All of the APIs that support the extension use dummy_true.  There
> are many examples of this, but I don't think there are many matching
> users of _mesa_has_XXX_foo().
> 
> 2. All of the APIs support the extension in any version.
> GL_EXT_polygon_offset_clamp is an example.
> 
> I think we can blend the strengths to get something even better.
> 

Agreed.

-Nanley

> I missed that glsl_parser_extras.cpp has its own implementation of the
> has_XXX_foo() functions that take the API and version as explicit
> parameters.  Your patch predates that change.  There's room for some
> modest savings there too.
> 
> I'll send a couple follow-up patches soon.
> 
> > Thanks,
> > Nanley
> 


More information about the mesa-dev mailing list