[Mesa-dev] [PATCH 3/4] i965/fs: Simplify texture destination fixups
Jason Ekstrand
jason at jlekstrand.net
Wed May 4 00:47:37 UTC 2016
On Tue, May 3, 2016 at 4:54 PM, Francisco Jerez <currojerez at riseup.net>
wrote:
> Jason Ekstrand <jason at jlekstrand.net> writes:
>
> > On Tue, May 3, 2016 at 4:24 PM, Jason Ekstrand <jason at jlekstrand.net>
> wrote:
> >
> >>
> >>
> >> On Tue, May 3, 2016 at 4:18 PM, Kenneth Graunke <kenneth at whitecape.org>
> >> wrote:
> >>
> >>> On Tuesday, May 3, 2016 3:00:26 PM PDT Jason Ekstrand wrote:
> >>> > There are a few different fixups that we have to do for texture
> >>> > destinations that re-arrange channels, fix hardware vs. API
> mismatches,
> >>> or
> >>> > just shrink the result to fit in the NIR destination. These were all
> >>> being
> >>> > done in a somewhat haphazard manner. This commit replaces all of the
> >>> > shuffling with a single LOAD_PAYLOAD operation at the end and makes
> it
> >>> much
> >>> > easier to insert fixups between the texture instruction itself and
> the
> >>> > LOAD_PAYLOAD.
> >>> >
> >>> > Shader-db results on Haswell:
> >>> >
> >>> > total instructions in shared programs: 6227035 -> 6226669 (-0.01%)
> >>> > instructions in affected programs: 19119 -> 18753 (-1.91%)
> >>> > helped: 85
> >>> > HURT: 0
> >>> >
> >>> > total cycles in shared programs: 56491626 -> 56476126 (-0.03%)
> >>> > cycles in affected programs: 672420 -> 656920 (-2.31%)
> >>> > helped: 92
> >>> > HURT: 42
> >>> > ---
> >>> > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 32 ++++++++++
> >>> +---------------------
> >>> > 1 file changed, 11 insertions(+), 21 deletions(-)
> >>> >
> >>> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >>> b/src/mesa/drivers/
> >>> dri/i965/brw_fs_nir.cpp
> >>> > index ebc54ad..e0a76b4 100644
> >>> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >>> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >>> > @@ -3311,43 +3311,33 @@ fs_visitor::nir_emit_texture(const fs_builder
> >>> &bld,
> >>> nir_tex_instr *instr)
> >>> > emit_gen6_gather_wa(key_tex->gen6_gather_wa[texture], dst);
> >>> > }
> >>> >
> >>> > - if (instr->op == nir_texop_query_levels) {
> >>> > - /* # levels is in .w */
> >>> > - dst = offset(dst, bld, 3);
> >>> > - }
> >>> > + fs_reg *nir_dest = ralloc_array(mem_ctx, fs_reg, dest_size);
> >>>
> >>> Why not just stack allocate this using the pessimal size?
> >>>
> >>> fs_reg nir_dest[4];
> >>>
> >>
> >> Good point! Fixed locally.
> >>
> >
> > Never mind. It's because bld.LOAD_PAYLOAD doesn't make a copy of the
> > array, it just copies the pointer into the fs_inst.
> >
> fs_builder::LOAD_PAYLOAD just passes the pointer to the fs_inst
> constructor which does the copy itself when you use the variable-source
> variant -- I fixed that a while ago precisely in order to be able to
> allocate source arrays in the stack. ;)
>
Yup. I made the change locally (after double and tripple-checking what
curro just said) and verified with piglit.
--Jason
> >
> >> > + for (unsigned i = 0; i < dest_size; i++)
> >>> > + nir_dest[i] = offset(dst, bld, i);
> >>> >
> >>> > bool is_cube_array = instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE
> &&
> >>> > instr->is_array;
> >>> >
> >>> > - /* fixup #layers for cube map arrays */
> >>> > - if (instr->op == nir_texop_txs && (devinfo->gen < 7 ||
> >>> is_cube_array)) {
> >>> > + if (instr->op == nir_texop_query_levels) {
> >>> > + /* # levels is in .w */
> >>> > + nir_dest[0] = offset(dst, bld, 3);
> >>> > + } else if (instr->op == nir_texop_txs && dest_size >= 3 &&
> >>> > + (devinfo->gen < 7 || is_cube_array)) {
> >>> > fs_reg depth = offset(dst, bld, 2);
> >>> > fs_reg fixed_depth = vgrf(glsl_type::int_type);
> >>> >
> >>> > if (is_cube_array) {
> >>> > + /* fixup #layers for cube map arrays */
> >>> > bld.emit(SHADER_OPCODE_INT_QUOTIENT, fixed_depth, depth,
> >>> brw_imm_d(6));
> >>> > } else if (devinfo->gen < 7) {
> >>> > /* Gen4-6 return 0 instead of 1 for single layer surfaces.
> */
> >>> > bld.emit_minmax(fixed_depth, depth, brw_imm_d(1),
> >>> BRW_CONDITIONAL_GE);
> >>> > }
> >>> >
> >>> > - fs_reg *fixed_payload = ralloc_array(mem_ctx, fs_reg, inst-
> >>> >regs_written);
> >>> > - int components = inst->regs_written / (inst->exec_size / 8);
> >>> > - for (int i = 0; i < components; i++) {
> >>> > - if (i == 2) {
> >>> > - fixed_payload[i] = fixed_depth;
> >>> > - } else {
> >>> > - fixed_payload[i] = offset(dst, bld, i);
> >>> > - }
> >>> > - }
> >>> > - bld.LOAD_PAYLOAD(dst, fixed_payload, components, 0);
> >>> > + nir_dest[2] = fixed_depth;
> >>> > }
> >>> >
> >>> > - fs_reg nir_dest = get_nir_dest(instr->dest);
> >>> > - nir_dest.type = dst.type;
> >>> > - emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, bld.dispatch_width(),
> >>> > - nir_dest, dst),
> >>> > - (1 << dest_size) - 1);
> >>> > + bld.LOAD_PAYLOAD(get_nir_dest(instr->dest), nir_dest, dest_size,
> 0);
> >>> > }
> >>> >
> >>> > void
> >>> >
> >>>
> >>>
> >>
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160503/cf9d3896/attachment-0001.html>
More information about the mesa-dev
mailing list