[Mesa-dev] [PATCH] llvmpipe: Optimize new fs state setup

Jose Fonseca jfonseca at vmware.com
Thu Jun 30 03:27:35 PDT 2011



----- Original Message -----
> On Thu, 2011-06-30 at 03:36 +0200, Roland Scheidegger wrote:
> > Ok in fact there's a gcc bug about memcmp:
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > In short gcc's memcmp builtin is totally lame and loses to glibc's
> > memcmp (including call overhead, no knowledge about alignment etc.)
> > even
> > when comparing only very few bytes (and loses BIG time for lots of
> > bytes
> > to compare). Oops. Well at least if the strings are the same (I'd
> > guess
> > if the first byte is different it's hard to beat the gcc
> > builtin...).
> > So this is really a gcc bug. The bug is quite old though with no
> > fix in
> > sight apparently so might need to think about some workaround (but
> > just
> > not doing the comparison doesn't look like the right idea, since
> > apparently it would be faster with the comparison if gcc's memcmp
> > got
> > fixed).
> 
> Looking at the struct again (it's been a while), it seems like it
> could
> be rearranged to be variable-sized and on average significantly
> smaller:
> 
> struct lp_rast_state {
>    struct lp_jit_context jit_context;
>    struct lp_fragment_shader_variant *variant;
> };
> 
> struct lp_jit_context {
>    const float *constants;
>    float alpha_ref_value;
>    uint32_t stencil_ref_front, stencil_ref_back;
>    uint8_t *blend_color;
>    struct lp_jit_texture textures[PIPE_MAX_SAMPLERS];
> };
> 
> If we moved the jit_context part behind "variant", and then hopefully
> note that most of those lp_jit_texture structs are not in use.  That
> would save time on the memcmp *and* space in the binned data.

Yeah, sounds a good idea.

But there's some subtletly to computing the number of textures: it can't be just the NULL textures, because they may be reffered by the JIT code, which has no NULL checks and  relies on the state setup to provide storage for all textures, or dummy memory if one is not bound.

I think a better idea would be:
- split the texture/sampler state
- to make the lp_jit_context::textures an array of pointers, and put the struct lp_jit_texture in the pipe_texture object themselves
- to make the lp_jit_context::samplers an array of pointers, and put the struct lp_jit_sampler in the pipe_sampler_state CSO

struct lp_jit_context {
    struct lp_jit_texture *textures[PIPE_MAX_SAMPLERS];
    struct lp_jit_sampler *samplers[PIPE_MAX_SAMPLERS];
};

struct lp_jit_texture
{
   uint32_t width;
   uint32_t height;
   uint32_t depth;
   uint32_t first_level;
   uint32_t last_level;
   uint32_t row_stride[LP_MAX_TEXTURE_LEVELS];
   uint32_t img_stride[LP_MAX_TEXTURE_LEVELS];
   const void *data[LP_MAX_TEXTURE_LEVELS];
   /* sampler state, actually */
   float min_lod;
   float max_lod;
   float lod_bias;
   float border_color[4];
};

struct lp_jit_sampler
{
   float min_lod;
   float max_lod;
   float lod_bias;
   float border_color[4];
};


Jose


More information about the mesa-dev mailing list