[Mesa-dev] [PATCH] llvmpipe: Optimize new fs state setup

Roland Scheidegger sroland at vmware.com
Wed Jun 29 11:33:15 PDT 2011


Ohh that's interesting, you'd think the comparison shouldn't be that
expensive (though I guess in ipers case the comparison is never true).
memcmp is quite extensively used everywhere. Maybe we could replace that
with something faster (since we only ever care if the blocks are the
same but not care about the lexographic ordering and always compare
whole structs, should compare dwords instead of bytes for a 4 time
speedup)? Or isn't that the reason cmpsb instead of cmpsd is used?
Also I guess it would help if the values which are more likely to be
unequal are first in the struct (if we can tell that).
Of course though if it's unlikely to be the same as the compared value
anyway not comparing at all still might be a win (here).

Roland

Am 29.06.2011 19:19, schrieb Adam Jackson:
> Perversely, do this by eliminating the comparison between stored and
> current fs state.  On ipers, a perf trace showed try_update_scene_state
> using 31% of a CPU, and 98% of that was in 'repz cmpsb', ie, the memcmp.
> Taking that out takes try_update_scene_state down to 6.5% of the
> profile; more importantly, ipers goes from 10 to 14fps and gears goes
> from 790 to 830fps.
> 
> Signed-off-by: Adam Jackson <ajax at redhat.com>
> ---
>  src/gallium/drivers/llvmpipe/lp_setup.c |   61 ++++++++++++++-----------------
>  1 files changed, 27 insertions(+), 34 deletions(-)
> 
> diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c b/src/gallium/drivers/llvmpipe/lp_setup.c
> index cbe06e5..9118db5 100644
> --- a/src/gallium/drivers/llvmpipe/lp_setup.c
> +++ b/src/gallium/drivers/llvmpipe/lp_setup.c
> @@ -839,42 +839,35 @@ try_update_scene_state( struct lp_setup_context *setup )
>        setup->dirty |= LP_SETUP_NEW_FS;
>     }
>  
> -
>     if (setup->dirty & LP_SETUP_NEW_FS) {
> -      if (!setup->fs.stored ||
> -          memcmp(setup->fs.stored,
> -                 &setup->fs.current,
> -                 sizeof setup->fs.current) != 0)
> -      {
> -         struct lp_rast_state *stored;
> -         uint i;
> -         
> -         /* The fs state that's been stored in the scene is different from
> -          * the new, current state.  So allocate a new lp_rast_state object
> -          * and append it to the bin's setup data buffer.
> -          */
> -         stored = (struct lp_rast_state *) lp_scene_alloc(scene, sizeof *stored);
> -         if (!stored) {
> -            assert(!new_scene);
> -            return FALSE;
> -         }
> +      struct lp_rast_state *stored;
> +      uint i;
> +      
> +      /* The fs state that's been stored in the scene is different from
> +       * the new, current state.  So allocate a new lp_rast_state object
> +       * and append it to the bin's setup data buffer.
> +       */
> +      stored = (struct lp_rast_state *) lp_scene_alloc(scene, sizeof *stored);
> +      if (!stored) {
> +         assert(!new_scene);
> +         return FALSE;
> +      }
>  
> -         memcpy(stored,
> -                &setup->fs.current,
> -                sizeof setup->fs.current);
> -         setup->fs.stored = stored;
> -         
> -         /* The scene now references the textures in the rasterization
> -          * state record.  Note that now.
> -          */
> -         for (i = 0; i < Elements(setup->fs.current_tex); i++) {
> -            if (setup->fs.current_tex[i]) {
> -               if (!lp_scene_add_resource_reference(scene,
> -                                                    setup->fs.current_tex[i],
> -                                                    new_scene)) {
> -                  assert(!new_scene);
> -                  return FALSE;
> -               }
> +      memcpy(stored,
> +             &setup->fs.current,
> +             sizeof setup->fs.current);
> +      setup->fs.stored = stored;
> +      
> +      /* The scene now references the textures in the rasterization
> +       * state record.  Note that now.
> +       */
> +      for (i = 0; i < Elements(setup->fs.current_tex); i++) {
> +         if (setup->fs.current_tex[i]) {
> +            if (!lp_scene_add_resource_reference(scene,
> +                                                 setup->fs.current_tex[i],
> +                                                 new_scene)) {
> +               assert(!new_scene);
> +               return FALSE;
>              }
>           }
>        }



More information about the mesa-dev mailing list