[Mesa-dev] [PATCH 1/9] mesa: Handle common extension checks with more compact code
Ian Romanick
idr at freedesktop.org
Fri May 19 23:29:16 UTC 2017
On 05/19/2017 10:28 AM, Nanley Chery wrote:
> On Fri, May 19, 2017 at 06:38:03AM -0700, Ian Romanick wrote:
>> From: Ian Romanick <ian.d.romanick at intel.com>
>>
>> The previous code handled everything with the general case. I noticed
>> that every time I converted an open-coded check to use a
>> _mesa_has_EXT_foo() function, the text size of the driver increased.
>>
>> Almost all extensions only care what the current context API is, and
>> the version does not matter. Handle those using more compact checks.
>>
>> text data bss dec hex filename
>> 7037675 235248 37280 7310203 6f8b7b 32-bit i965_dri.so before
>> 7034307 235248 37280 7306835 6f7e53 32-bit i965_dri.so after
>> 6679695 303400 50608 7033703 6b5367 64-bit i965_dri.so before
>> 6676143 303400 50608 7030151 6b4587 64-bit i965_dri.so after
>
> Hi Ian,
>
> I wrote a patch some time ago that reduces the cost of the extension
> checks by a lot more with less code. The only thing I think may need
> addressing is endianness. Would you consider using it instead if I
> reworked it and sent it out to the list? You can find it here:
> https://cgit.freedesktop.org/~nchery/mesa/commit/?h=1/ext/optimize&id=a02d88eba1d3129b27d3b5e6aaa976c3ca20cf79
I was not able to reproduce that result on current Mesa. I had a lot
of trouble believing that more than 18% of our driver binary was
extension check code. I also had a sick feeling that may have just
been the first stage of grief talking... :) Are you able to reproduce
your original result?
I also tried a similar patch to yours that wouldn't have endianness
problems:
#define EXT(name_str, driver_cap, gll, glc, es1, es2, ...) \
static inline bool \
_mesa_has_##name_str(const struct gl_context *ctx) \
{ \
static const uint8_t ver[4] = { (uint8_t)gll, (uint8_t)es1, (uint8_t)es2, (uint8_t)glc }; \
return ctx->Extensions.driver_cap && \
(ctx->Extensions.Version >= ver[ctx->API]); \
}
Here's what I got for all four methods:
7037675 235248 37280 7310203 6f8b7b 32-bit i965_dri.so before
7034307 235248 37280 7306835 6f7e53 32-bit i965_dri.so after idr
7038343 235248 37280 7310871 6f8e17 32-bit i965_dri.so after Nanley
7036271 235248 37280 7308799 6f85ff 32-bit i965_dri.so w/arrays
6679695 303400 50608 7033703 6b5367 64-bit i965_dri.so before
6676143 303400 50608 7030151 6b4587 64-bit i965_dri.so after idr
6684767 303400 50608 7038775 6b6737 64-bit i965_dri.so after Nanley
6678567 303400 50608 7032575 6b4eff 64-bit i965_dri.so w/arrays
For reference, I build with the same flags that Fedora 23 (I think?)
used for release builds:
-O2 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches
And either "-m64 -mtune=generic" or "-m32 -march=i686 -mtune=atom
-fasynchronous-unwind-tables" depending on the platform.
--enable-debug is not passed to configure.
Even if I were able to reproduce your original result, there are still
two cases where your approach may generate more code than is necessary:
1. All of the APIs that support the extension use dummy_true. There
are many examples of this, but I don't think there are many matching
users of _mesa_has_XXX_foo().
2. All of the APIs support the extension in any version.
GL_EXT_polygon_offset_clamp is an example.
I think we can blend the strengths to get something even better.
I missed that glsl_parser_extras.cpp has its own implementation of the
has_XXX_foo() functions that take the API and version as explicit
parameters. Your patch predates that change. There's room for some
modest savings there too.
I'll send a couple follow-up patches soon.
> Thanks,
> Nanley
More information about the mesa-dev
mailing list