[Mesa-dev] [PATCH 09/13] i965: Only use the SIMD16 program for per-sample shading on Broadwell.
Anuj Phogat
anuj.phogat at gmail.com
Wed Feb 19 13:32:52 PST 2014
On Wed, Feb 19, 2014 at 2:04 AM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> This is a straight port from gen7_wm_state.c; I haven't looked into
> whether we can do both.
>
I verified that this restriction still holds on BDW. See the following
section of the documentation:
3D Pipeline Stages > Pixel > Pixel Shader Thread Generation >
Pixel Grouping (Dispatch Size) Control
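
For anyone following along, the dispatch selection that the patch below
implements can be summarized roughly as follows. This is only a sketch
and not the driver code: the SKETCH_* constants, the struct, and the
function name are hypothetical stand-ins, and the real enable bits and
GRF shifts are the GEN7_PS_* definitions used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GEN7_PS_{8,16}_DISPATCH_ENABLE bits.
 * The real values are defined by the driver and are not reproduced here. */
enum {
   SKETCH_PS_8_DISPATCH  = 1u << 0,
   SKETCH_PS_16_DISPATCH = 1u << 1,
};

struct sketch_dispatch {
   uint32_t enables;       /* dispatch widths to enable */
   bool simd16_in_slot0;   /* slot 0 (kernel pointer and start GRF)
                            * describes the SIMD16 program */
};

/* Mirrors the branch structure of the patch:
 *  - no SIMD16 program:              SIMD8 only
 *  - SIMD16 program, 1x per sample:  SIMD8 and SIMD16 together
 *  - SIMD16 program, >1 invocations: SIMD16 only, placed in slot 0
 */
static struct sketch_dispatch
sketch_choose_dispatch(bool have_simd16, int min_invocations_per_fragment)
{
   struct sketch_dispatch d = { 0, false };

   assert(min_invocations_per_fragment >= 1);

   if (!have_simd16) {
      d.enables = SKETCH_PS_8_DISPATCH;
   } else if (min_invocations_per_fragment == 1) {
      d.enables = SKETCH_PS_8_DISPATCH | SKETCH_PS_16_DISPATCH;
   } else {
      d.enables = SKETCH_PS_16_DISPATCH;
      d.simd16_in_slot0 = true;
   }

   return d;
}
```

The one non-obvious case is the last branch: with per-sample shading
and a SIMD16 program available, only SIMD16 is enabled, so the SIMD16
kernel offset and start GRF go into the first slot rather than the
third.
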
> v2: Actually do it right.
>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
> src/mesa/drivers/dri/i965/gen8_ps_state.c | 38 ++++++++++++++++++++++++-------
> 1 file changed, 30 insertions(+), 8 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c b/src/mesa/drivers/dri/i965/gen8_ps_state.c
> index 57bf053..a834b85 100644
> --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
> @@ -183,10 +183,6 @@ upload_ps_state(struct brw_context *brw)
>     if (brw->wm.prog_data->nr_params > 0)
>        dw6 |= GEN7_PS_PUSH_CONSTANT_ENABLE;
>
> -   dw6 |= GEN7_PS_8_DISPATCH_ENABLE;
> -   if (brw->wm.prog_data->prog_offset_16)
> -      dw6 |= GEN7_PS_16_DISPATCH_ENABLE;
> -
>     /* From the documentation for this packet:
>      * "If the PS kernel does not need the Position XY Offsets to
>      * compute a Position Value, then this field should be programmed
> @@ -205,13 +201,39 @@ upload_ps_state(struct brw_context *brw)
>     else
>        dw6 |= GEN7_PS_POSOFFSET_NONE;
>
> -   dw7 |=
> -      brw->wm.prog_data->first_curbe_grf << GEN7_PS_DISPATCH_START_GRF_SHIFT_0 |
> -      brw->wm.prog_data->first_curbe_grf_16<< GEN7_PS_DISPATCH_START_GRF_SHIFT_2;
> +   /* In case of non 1x per sample shading, only one of SIMD8 and SIMD16
> +    * should be enabled. We do 'SIMD16 only' dispatch if a SIMD16 shader
> +    * is successfully compiled. In majority of the cases that bring us
> +    * better performance than 'SIMD8 only' dispatch.
> +    */
> +   int min_invocations_per_fragment =
> +      _mesa_get_min_invocations_per_fragment(ctx, brw->fragment_program, false);
> +   assert(min_invocations_per_fragment >= 1);
> +
> +   if (brw->wm.prog_data->prog_offset_16) {
> +      dw6 |= GEN7_PS_16_DISPATCH_ENABLE;
> +      if (min_invocations_per_fragment == 1) {
> +         dw6 |= GEN7_PS_8_DISPATCH_ENABLE;
> +         dw7 |= (brw->wm.prog_data->first_curbe_grf <<
> +                 GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
> +         dw7 |= (brw->wm.prog_data->first_curbe_grf_16 <<
> +                 GEN7_PS_DISPATCH_START_GRF_SHIFT_2);
> +      } else {
> +         dw7 |= (brw->wm.prog_data->first_curbe_grf_16 <<
> +                 GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
> +      }
> +   } else {
> +      dw6 |= GEN7_PS_8_DISPATCH_ENABLE;
> +      dw7 |= (brw->wm.prog_data->first_curbe_grf <<
> +              GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
> +   }
>
>     BEGIN_BATCH(12);
>     OUT_BATCH(_3DSTATE_PS << 16 | (12 - 2));
> -   OUT_BATCH(brw->wm.base.prog_offset);
> +   if (brw->wm.prog_data->prog_offset_16 && min_invocations_per_fragment > 1)
> +      OUT_BATCH(brw->wm.base.prog_offset + brw->wm.prog_data->prog_offset_16);
> +   else
> +      OUT_BATCH(brw->wm.base.prog_offset);
>     OUT_BATCH(0);
>     OUT_BATCH(dw3);
>     if (brw->wm.prog_data->total_scratch) {
> --
> 1.8.4.2
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev