[Mesa-dev] [PATCH] llvmpipe: Optimize new fs state setup
Keith Whitwell
keithw at vmware.com
Thu Jun 30 02:09:17 PDT 2011
On Thu, 2011-06-30 at 03:36 +0200, Roland Scheidegger wrote:
> Ok in fact there's a gcc bug about memcmp:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> In short gcc's memcmp builtin is totally lame and loses to glibc's
> memcmp (including call overhead, no knowledge about alignment etc.) even
> when comparing only very few bytes (and loses BIG time for lots of bytes
> to compare). Oops. Well at least if the strings are the same (I'd guess
> if the first byte is different it's hard to beat the gcc builtin...).
> So this is really a gcc bug. The bug is quite old though with no fix in
> sight apparently so might need to think about some workaround (but just
> not doing the comparison doesn't look like the right idea, since
> apparently it would be faster with the comparison if gcc's memcmp got
> fixed).
Looking at the struct again (it's been a while), it seems like it could
be rearranged to be variable-sized and on average significantly smaller:
struct lp_rast_state {
   struct lp_jit_context jit_context;
   struct lp_fragment_shader_variant *variant;
};

struct lp_jit_context {
   const float *constants;
   float alpha_ref_value;
   uint32_t stencil_ref_front, stencil_ref_back;
   uint8_t *blend_color;
   struct lp_jit_texture textures[PIPE_MAX_SAMPLERS];
};
If we moved the jit_context part behind "variant", we could then
hopefully take advantage of the fact that most of those lp_jit_texture
structs are not in use, and only store and compare the ones that are.
That would save time on the memcmp *and* space in the binned data.
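
A rough sketch of that idea, not the actual llvmpipe code: the types
below are simplified stand-ins, and the num_textures field, the
MAX_SAMPLERS value and both helper functions are assumptions made up for
illustration. The fixed-size fields come first and the texture array
last, so the comparison (and the copy into the bins) only needs to cover
the textures that are actually in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SAMPLERS 16   /* stand-in for PIPE_MAX_SAMPLERS (assumed value) */

/* Hypothetical, much simplified stand-in for struct lp_jit_texture. */
struct jit_texture {
   uint32_t width, height, depth, last_level;
};

/*
 * Hypothetical rearranged state: fixed part first, textures last.
 * Fields are ordered so that common ABIs insert no padding, because
 * memcmp would otherwise compare indeterminate padding bytes.
 */
struct rast_state {
   const void *variant;        /* stand-in for the shader variant pointer */
   const float *constants;
   const uint8_t *blend_color;
   float alpha_ref_value;
   uint32_t stencil_ref_front, stencil_ref_back;
   uint32_t num_textures;      /* assumed new field: textures in use */
   struct jit_texture textures[MAX_SAMPLERS];
};

/* Number of bytes that are meaningful for a state with n textures. */
static size_t
rast_state_used_size(unsigned n)
{
   assert(n <= MAX_SAMPLERS);
   return offsetof(struct rast_state, textures)
          + n * sizeof(struct jit_texture);
}

/* Compare only the fixed part plus the textures that are in use. */
static int
rast_state_equal(const struct rast_state *a, const struct rast_state *b)
{
   if (a->num_textures != b->num_textures)
      return 0;
   return memcmp(a, b, rast_state_used_size(a->num_textures)) == 0;
}
```

With two textures bound, the compared (and binned) size drops from
sizeof(struct rast_state) to the header plus two texture entries, and
stale data in the unused slots no longer causes spurious mismatches.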
It's weird this wasn't showing up in past profiling.
Keith