[Mesa-dev] [PATCH 3/8] i965: Store the atoms directly in the context

Kenneth Graunke kenneth at whitecape.org
Sun Dec 21 21:11:47 PST 2014


On Friday, December 19, 2014 02:20:54 PM Ian Romanick wrote:
> From: Ian Romanick <ian.d.romanick at intel.com>
> 
> Instead of having an extra pointer indirection in one of the hottest
> loops in the driver.
> 
> On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
> for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
> Gl32Batch7:
> 
> 32-bit: Difference at 95.0% confidence 1.98515% +/- 0.20814% (n=40)
> 64-bit: Difference at 95.0% confidence 1.5163% +/- 0.811016% (n=60)
> 
> Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
> ---
>  src/mesa/drivers/dri/i965/brw_context.h      |  2 +-
>  src/mesa/drivers/dri/i965/brw_state_upload.c | 19 ++++++++++++++++---
>  2 files changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> index 92eb022..5f5f807 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -1384,7 +1384,7 @@ struct brw_context
>     } perfmon;
>  
>     int num_atoms;
> -   const struct brw_tracked_state **atoms;
> +   const struct brw_tracked_state atoms[64];

Making this atoms[57] would save memory, and the static asserts make it easy
to remember to increase the size if we ever add additional atoms.

Honestly, it seems like there should be a better way to do this.  Currently,
the atoms list only depends on the generation number, and thus is identical
for all GL contexts.  Storing an additional 1.5 kB of redundant data in each
GL context seems a bit wasteful.

Would it help to make brw_upload_state cache the array and size?

void
brw_upload_state(struct brw_context *brw)
{
   static const struct brw_tracked_state *atoms = brw->atoms;
   static const int num_atoms = brw->num_atoms;

   ...

   for (i = 0; i < num_atoms; i++) {
      ...use atoms[i]...
   }

   ...
}

It seems like that would achieve the same effect - one indirection.

One downside to that approach is that we couldn't have separate atom
lists for API_CORE vs. API_COMPAT (while we could using your approach).
(We could skip stippling for core contexts, and skip GS/HS/DS for legacy
contexts, which might help a tiny bit...)

>     /* If (INTEL_DEBUG & DEBUG_BATCH) */
>     struct {
> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c
> index a579781..9a5d49a 100644
> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> @@ -357,6 +357,11 @@ void brw_init_state( struct brw_context *brw )
>     const struct brw_tracked_state **atoms;
>     int num_atoms;
>  
> +   STATIC_ASSERT(ARRAY_SIZE(gen4_atoms) <= ARRAY_SIZE(brw->atoms));
> +   STATIC_ASSERT(ARRAY_SIZE(gen6_atoms) <= ARRAY_SIZE(brw->atoms));
> +   STATIC_ASSERT(ARRAY_SIZE(gen7_atoms) <= ARRAY_SIZE(brw->atoms));
> +   STATIC_ASSERT(ARRAY_SIZE(gen8_atoms) <= ARRAY_SIZE(brw->atoms));

I do like the static asserts.

> +
>     brw_init_caches(brw);
>  
>     if (brw->gen >= 8) {
> @@ -373,9 +378,17 @@ void brw_init_state( struct brw_context *brw )
>        num_atoms = ARRAY_SIZE(gen4_atoms);
>     }
>  
> -   brw->atoms = atoms;
>     brw->num_atoms = num_atoms;
>  
> +   /* This is to work around brw_context::atoms being declared const.  We want
> +    * it to be const, but it needs to be initialized somehow!
> +    */
> +   struct brw_tracked_state *context_atoms =
> +      (struct brw_tracked_state *) &brw->atoms[0];

Ugly, but... not sure what else to suggest.

I suppose this patch is fine.  It does offer a small CPU overhead reduction in
the hottest path, for a small memory usage penalty.  I think we can do better,
but I don't know that it makes sense to block your patch over nebulous ideas
for future improvements.  We can do those in the future...

Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

> +
> +   for (int i = 0; i < num_atoms; i++)
> +      context_atoms[i] = *atoms[i];
> +
>     while (num_atoms--) {
>        assert((*atoms)->dirty.mesa | (*atoms)->dirty.brw);
>        assert((*atoms)->emit);
> @@ -607,7 +620,7 @@ void brw_upload_state(struct brw_context *brw)
>        prev = *state;
>  
>        for (i = 0; i < brw->num_atoms; i++) {
> -	 const struct brw_tracked_state *atom = brw->atoms[i];
> +	 const struct brw_tracked_state *atom = &brw->atoms[i];
>  	 struct brw_state_flags generated;
>  
>  	 if (check_state(state, &atom->dirty)) {
> @@ -627,7 +640,7 @@ void brw_upload_state(struct brw_context *brw)
>     }
>     else {
>        for (i = 0; i < brw->num_atoms; i++) {
> -	 const struct brw_tracked_state *atom = brw->atoms[i];
> +	 const struct brw_tracked_state *atom = &brw->atoms[i];
>  
>  	 if (check_state(state, &atom->dirty)) {
>  	    atom->emit(brw);
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20141221/7ec644aa/attachment-0001.sig>


More information about the mesa-dev mailing list