[Mesa-dev] [PATCH 3/7] i965/fs: Use the sampler for FS pull constant loading on Ivybridge.

Paul Berry stereotype441 at gmail.com
Wed Sep 19 15:10:48 PDT 2012


On 19 September 2012 13:27, Kenneth Graunke <kenneth at whitecape.org> wrote:

> Data port reads are absurdly slow on Ivybridge due to cache issues.
>
> The LD message ignores the sampler unit index and SAMPLER_STATE pointer,
> instead relying on hard-wired default state.  Thus, there's no need to
> worry about running out of sampler units or providing SAMPLER_STATE;
> this small patch should be all that's required.
>
> NOTE: This is a candidate for all release branches.
>

Given that this affects only performance and not correctness, I'm having
trouble convincing myself that this patch should be a candidate for release
branches.  Don't we usually try to restrict release cherry-picks to things
like rendering issues and avoiding GPU hangs?


>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h        |  3 +++
>  src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 36 ++++++++++++++++++++++++++++++-
>  2 files changed, 38 insertions(+), 1 deletion(-)
>
> I did this a long time ago for VS pull constant loading, which resulted in
> a 2-5x speedup for certain benchmarks.  Apparently at the time I never got
> FS pull constant loading working, and didn't have a benchmark that needed
> it, so I never finished and pushed it.
>
> Now I have a game that needs it.  No concrete data as I haven't figured out
> how to get consistent FPS numbers out of it.
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index e69de31..b5f2152 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -295,6 +295,9 @@ public:
>     void generate_pull_constant_load(fs_inst *inst, struct brw_reg dst,
>                                     struct brw_reg index,
>                                     struct brw_reg offset);
> +   void gen7_generate_pull_constant_load(fs_inst *inst, struct brw_reg dst,
> +                                         struct brw_reg index,
> +                                         struct brw_reg offset);
>     void generate_mov_dispatch_to_flags();
>
>     void emit_dummy_fs();
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> index 5900c0e..4059660 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> @@ -585,6 +585,37 @@ fs_visitor::generate_unspill(fs_inst *inst, struct brw_reg dst)
>  }
>
>  void
> +fs_visitor::gen7_generate_pull_constant_load(fs_inst *inst, struct brw_reg dst,
> +                                             struct brw_reg index,
> +                                             struct brw_reg offset)
> +{
> +   assert(intel->gen == 7);
> +   assert(index.file == BRW_IMMEDIATE_VALUE &&
> +         index.type == BRW_REGISTER_TYPE_UD);
> +   assert(offset.file == BRW_IMMEDIATE_VALUE &&
> +         offset.type == BRW_REGISTER_TYPE_UD);
> +   uint32_t surf_index = index.dw1.ud;
> +   uint32_t read_offset = offset.dw1.ud;
> +
> +   /* offset is an IMM; SEND needs to be from a GRF. */
> +   offset = retype(brw_vec8_grf(127, 0), BRW_REGISTER_TYPE_UD);
> +   brw_MOV(p, offset, brw_imm_ud(read_offset / 16));
> +
> +   brw_instruction *insn = brw_next_insn(p, BRW_OPCODE_SEND);
> +   brw_set_dest(p, insn, dst);
> +   brw_set_src0(p, insn, offset);
> +   brw_set_sampler_message(p, insn,
> +                           surf_index,
> +                           0, /* LD message ignores sampler unit */
> +                           GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
> +                           1, /* rlen */
> +                           1, /* mlen */
> +                           false, /* no header */
> +                           BRW_SAMPLER_SIMD_MODE_SIMD4X2,
> +                           0);
> +}
> +
> +void
>  fs_visitor::generate_pull_constant_load(fs_inst *inst, struct brw_reg dst,
>                                         struct brw_reg index,
>                                         struct brw_reg offset)
> @@ -980,7 +1011,10 @@ fs_visitor::generate_code()
>          break;
>
>        case FS_OPCODE_PULL_CONSTANT_LOAD:
> -        generate_pull_constant_load(inst, dst, src[0], src[1]);
> +        if (intel->gen == 7)
> +           gen7_generate_pull_constant_load(inst, dst, src[0], src[1]);
> +        else
> +           generate_pull_constant_load(inst, dst, src[0], src[1]);
>          break;
>
>        case FS_OPCODE_FB_WRITE:
> --
> 1.7.11.4
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
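
For anyone following the offset handling in the quoted patch: the
MOV loads read_offset / 16 into the message register, which suggests
the incoming offset is in bytes and the LD message addresses the
surface in 16-byte (vec4) elements.  That reading is an assumption
drawn from the division, not something the patch states.  A minimal
sketch of the conversion follows; byte_offset_to_vec4_index is a
hypothetical helper, not a Mesa function.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Mesa): converts a byte offset into
// a 16-byte element index, mirroring the "read_offset / 16" expression
// in the quoted patch.  Assumes the offset is expressed in bytes.
// Integer division truncates, so an offset that is not a multiple of
// 16 maps to the element that contains it.
static uint32_t byte_offset_to_vec4_index(uint32_t read_offset)
{
   const uint32_t bytes_per_vec4 = 16; // four 32-bit components
   return read_offset / bytes_per_vec4;
}
```

Under that assumption, a byte offset of 32 selects element 2, and an
unaligned offset such as 40 also selects element 2.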