[Intel-gfx] [PATCH] drm/i915: Fixed VS thread distribution between slices

Kenneth Graunke kenneth at whitecape.org
Tue Apr 10 20:38:32 CEST 2012

On 04/09/2012 09:10 PM, Ben Widawsky wrote:
> From: "bernard.r.kilarski"<your-user-name at intel.com>
> Full disclosure: my IVB hangs on nexuiz both before, and after this patch, and
> I haven't done any debug
> This patch was given to me by Bernard, by way of Daniel. The patch came
> with very little description, and I haven't really spent too much time
> in the spec to make much sense out of it.
> The improvement (IVB only) with my nexuiz configuration is almost
> unbelievable. It has a more humble improvement on OpenArena. I'll throw
> other tests at it, and try to debug the hangs a bit, but given the
> nexuiz results, I was compelled to post this sooner, rather than later.
> I see about a 2% improvement on OA
> and about a 40$% improvement on nexuiz
> CC: Daniel Vetter<daniel.vetter at ffwll.ch>
> Signed-off-by: Ben Widawsky<benjamin.widawsky at intel.com>
> ---
>   drivers/gpu/drm/i915/i915_reg.h      |    3 +++
>   drivers/gpu/drm/i915/intel_display.c |    3 +++
>   2 files changed, 6 insertions(+)
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index cb55444..755bdff 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -682,6 +682,9 @@
>   #define GEN6_BSD_RNCID			0x12198
> +#define GEN7_FF_THREAD_MODE		0x20a0
> +#define GEN7_FF_THREAD_SIMPLE_SCHED	0x28a00010
> +
>   /*
>    * Framebuffer compression (915+ only)
>    */
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 6ec81b4..bbd5f46 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -8893,6 +8893,9 @@ static void ivybridge_init_clock_gating(struct drm_device *dev)
>   		intel_flush_display_plane(dev_priv, pipe);
>   	}
> +
> +	/* Fix vertex load balancing */
>   }
>   static void valleyview_init_clock_gating(struct drm_device *dev)

I definitely approve of this patch.  It appears to make the hardware use 
a simple scheduler, as opposed to a complex one which is default.

The default scheduler has this lovely comment: "Note: this will cause 
possible corruption if input handles are reused due to instancing or 
topologies that reuse vertices (i.e. strips and fans)".

That is frightening.  We use instancing, tristrips, and trifans all the 
time.  While I haven't observed a problem, I don't want this to come 
back and bite us later.  The other scheduling options in VS Thread 
Dispatch Mode are useless.  Simple scheduling appears to be the only 
option that's safe.  And apparently it performs well too :)

So, as is, I think this qualifies for a:

Cc: stable at kernel.org
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

However!  I would like to be more paranoid and program it to 0x28a00000.
In other words, the IVB default minus bit 4 and bit 12:

     0x28a01010 & ~(1 << 4 | 1 << 12)

Bit 12 is the VS Thread Dispatch Override, which Bernard's patch 
cleared.  Bit 4 is the equivalent bit for DS (Domain Shaders).  It also 
has the scary comment about instancing and strips/fans not working.

Likewise, bit 16 also should be off (but is already off by default).

Since domain shaders are an OpenGL 4.0 feature, this shouldn't matter in 
the short term...but I doubt any of us will remember to come back here 
and change these bits when we eventually need them.

More information about the Intel-gfx mailing list