<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 3, 2016 at 4:54 PM, Francisco Jerez <span dir="ltr"><<a href="mailto:currojerez@riseup.net" target="_blank">currojerez@riseup.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> writes:<br>
<br>
> On Tue, May 3, 2016 at 4:24 PM, Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> wrote:<br>
><br>
>><br>
>><br>
>> On Tue, May 3, 2016 at 4:18 PM, Kenneth Graunke <<a href="mailto:kenneth@whitecape.org">kenneth@whitecape.org</a>><br>
>> wrote:<br>
>><br>
>>> On Tuesday, May 3, 2016 3:00:26 PM PDT Jason Ekstrand wrote:<br>
>>> > There are a few different fixups that we have to do for texture<br>
>>> > destinations that re-arrange channels, fix hardware vs. API mismatches,<br>
>>> or<br>
>>> > just shrink the result to fit in the NIR destination. These were all<br>
>>> being<br>
>>> > done in a somewhat haphazard manner. This commit replaces all of the<br>
>>> > shuffling with a single LOAD_PAYLOAD operation at the end and makes it<br>
>>> much<br>
>>> > easier to insert fixups between the texture instruction itself and the<br>
>>> > LOAD_PAYLOAD.<br>
>>> ><br>
>>> > Shader-db results on Haswell:<br>
>>> ><br>
>>> > total instructions in shared programs: 6227035 -> 6226669 (-0.01%)<br>
>>> > instructions in affected programs: 19119 -> 18753 (-1.91%)<br>
>>> > helped: 85<br>
>>> > HURT: 0<br>
>>> ><br>
>>> > total cycles in shared programs: 56491626 -> 56476126 (-0.03%)<br>
>>> > cycles in affected programs: 672420 -> 656920 (-2.31%)<br>
>>> > helped: 92<br>
>>> > HURT: 42<br>
>>> > ---<br>
>>> > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 32 ++++++++++<br>
>>> +---------------------<br>
>>> > 1 file changed, 11 insertions(+), 21 deletions(-)<br>
>>> ><br>
>>> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp<br>
>>> b/src/mesa/drivers/<br>
>>> dri/i965/brw_fs_nir.cpp<br>
>>> > index ebc54ad..e0a76b4 100644<br>
>>> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp<br>
>>> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp<br>
>>> > @@ -3311,43 +3311,33 @@ fs_visitor::nir_emit_texture(const fs_builder<br>
>>> &bld,<br>
>>> nir_tex_instr *instr)<br>
>>> > emit_gen6_gather_wa(key_tex->gen6_gather_wa[texture], dst);<br>
>>> > }<br>
>>> ><br>
>>> > - if (instr->op == nir_texop_query_levels) {<br>
>>> > - /* # levels is in .w */<br>
>>> > - dst = offset(dst, bld, 3);<br>
>>> > - }<br>
>>> > + fs_reg *nir_dest = ralloc_array(mem_ctx, fs_reg, dest_size);<br>
>>><br>
>>> Why not just stack allocate this using the pessimal size?<br>
>>><br>
>>> fs_reg nir_dest[4];<br>
>>><br>
>><br>
>> Good point! Fixed locally.<br>
>><br>
><br>
> Never mind. It's because bld.LOAD_PAYLOAD doesn't make a copy of the<br>
> array, it just copies the pointer into the fs_inst.<br>
><br>
</div></div>fs_builder::LOAD_PAYLOAD just passes the pointer to the fs_inst<br>
constructor which does the copy itself when you use the variable-source<br>
variant -- I fixed that a while ago precisely in order to be able to<br>
allocate source arrays in the stack. ;)<br></blockquote><div><br></div><div>Yup. I made the change locally (after double and tripple-checking what curro just said) and verified with piglit.<br></div><div>--Jason<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5">
><br>
>> > + for (unsigned i = 0; i < dest_size; i++)<br>
>>> > + nir_dest[i] = offset(dst, bld, i);<br>
>>> ><br>
>>> > bool is_cube_array = instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE &&<br>
>>> > instr->is_array;<br>
>>> ><br>
>>> > - /* fixup #layers for cube map arrays */<br>
>>> > - if (instr->op == nir_texop_txs && (devinfo->gen < 7 ||<br>
>>> is_cube_array)) {<br>
>>> > + if (instr->op == nir_texop_query_levels) {<br>
>>> > + /* # levels is in .w */<br>
>>> > + nir_dest[0] = offset(dst, bld, 3);<br>
>>> > + } else if (instr->op == nir_texop_txs && dest_size >= 3 &&<br>
>>> > + (devinfo->gen < 7 || is_cube_array)) {<br>
>>> > fs_reg depth = offset(dst, bld, 2);<br>
>>> > fs_reg fixed_depth = vgrf(glsl_type::int_type);<br>
>>> ><br>
>>> > if (is_cube_array) {<br>
>>> > + /* fixup #layers for cube map arrays */<br>
>>> > bld.emit(SHADER_OPCODE_INT_QUOTIENT, fixed_depth, depth,<br>
>>> brw_imm_d(6));<br>
>>> > } else if (devinfo->gen < 7) {<br>
>>> > /* Gen4-6 return 0 instead of 1 for single layer surfaces. */<br>
>>> > bld.emit_minmax(fixed_depth, depth, brw_imm_d(1),<br>
>>> BRW_CONDITIONAL_GE);<br>
>>> > }<br>
>>> ><br>
>>> > - fs_reg *fixed_payload = ralloc_array(mem_ctx, fs_reg, inst-<br>
>>> >regs_written);<br>
>>> > - int components = inst->regs_written / (inst->exec_size / 8);<br>
>>> > - for (int i = 0; i < components; i++) {<br>
>>> > - if (i == 2) {<br>
>>> > - fixed_payload[i] = fixed_depth;<br>
>>> > - } else {<br>
>>> > - fixed_payload[i] = offset(dst, bld, i);<br>
>>> > - }<br>
>>> > - }<br>
>>> > - bld.LOAD_PAYLOAD(dst, fixed_payload, components, 0);<br>
>>> > + nir_dest[2] = fixed_depth;<br>
>>> > }<br>
>>> ><br>
>>> > - fs_reg nir_dest = get_nir_dest(instr->dest);<br>
>>> > - nir_dest.type = dst.type;<br>
>>> > - emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, bld.dispatch_width(),<br>
>>> > - nir_dest, dst),<br>
>>> > - (1 << dest_size) - 1);<br>
>>> > + bld.LOAD_PAYLOAD(get_nir_dest(instr->dest), nir_dest, dest_size, 0);<br>
>>> > }<br>
>>> ><br>
>>> > void<br>
>>> ><br>
>>><br>
>>><br>
>><br>
</div></div>> _______________________________________________<br>
> mesa-dev mailing list<br>
> <a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a><br>
> <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br>
</blockquote></div><br></div></div>