<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sat, May 26, 2018 at 12:21 PM, Francisco Jerez <span dir="ltr"><<a href="mailto:currojerez@riseup.net" target="_blank">currojerez@riseup.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> writes:<br>
<br>
> On Sat, May 26, 2018 at 11:10 AM, Francisco Jerez <<a href="mailto:currojerez@riseup.net">currojerez@riseup.net</a>><br>
> wrote:<br>
><br>
>> Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> writes:<br>
>><br>
>> > From: Francisco Jerez <<a href="mailto:currojerez@riseup.net">currojerez@riseup.net</a>><br>
>> ><br>
>> > When we don't have PLN (gen4 and gen11+), we implement LINTERP as either<br>
>> > LINE+MAC or a pair of MADs. In both cases, the accumulator is written<br>
>> > by the first of the two instructions and read by the second. Even<br>
>> > though the accumulator value isn't actually ever used from a logical<br>
>> > instruction perspective, it is trashed so we need to make the scheduler<br>
>> > aware. Otherwise, the scheduler could end up re-ordering instructions<br>
>> > and putting a LINTERP between another an instruction which writes the<br>
>> > accumulator and another which tries to use that result.<br>
>> ><br>
>> > Cc: <a href="mailto:mesa-stable@lists.freedesktop.org">mesa-stable@lists.freedesktop.<wbr>org</a><br>
>> > ---<br>
>> > src/intel/compiler/brw_shader.<wbr>cpp | 3 ++-<br>
>> > 1 file changed, 2 insertions(+), 1 deletion(-)<br>
>> ><br>
>> > diff --git a/src/intel/compiler/brw_<wbr>shader.cpp b/src/intel/compiler/brw_<br>
>> shader.cpp<br>
>> > index 141b64e..dfd2c5c 100644<br>
>> > --- a/src/intel/compiler/brw_<wbr>shader.cpp<br>
>> > +++ b/src/intel/compiler/brw_<wbr>shader.cpp<br>
>> > @@ -984,7 +984,8 @@ backend_instruction::writes_<wbr>accumulator_implicitly(const<br>
>> struct gen_device_info<br>
>> > return writes_accumulator ||<br>
>> > (devinfo->gen < 6 &&<br>
>> > ((opcode >= BRW_OPCODE_ADD && opcode < BRW_OPCODE_NOP) ||<br>
>> > - (opcode >= FS_OPCODE_DDX_COARSE && opcode <=<br>
>> FS_OPCODE_LINTERP)));<br>
>> > + (opcode >= FS_OPCODE_DDX_COARSE && opcode <=<br>
>> FS_OPCODE_LINTERP))) ||<br>
>> > + (opcode == FS_OPCODE_LINTERP && !devinfo->has_pln);<br>
>><br>
>> The devinfo->has_pln condition is technically inaccurate, because even<br>
>> SNB will fall back to the non-PLN path which overwrites the accumulator<br>
>> for certain valid IR, which yeah I'm aware is not *typically*<br>
>> encountered before this series,<br>
><br>
><br>
> I'm pretty sure it's impossible before this series because, without SIMD32,<br>
> the barycentric coordinates always start at g2 and get incremented by 2<br>
> every time. The only other way to get something into the coordinates<br>
> source of the PLN is with a pixel interpolator message. For that, the<br>
> register allocator has a workaround to ensure that it's assigned an even<br>
> register on SNB. One of the early patches in my series replaces the broken<br>
> gen4.5-6 PLN fall-back (it didn't work in SIMD16 because it assumed the<br>
> wrong register layout for coordinates) with an assert and Jenkins is just<br>
> fine with the assert.<br>
><br>
<br>
</div></div>I know the conditions for the non-PLN fall-back are not typically<br>
encountered on Gen5-6, but it's still valid IR, so this implementation<br>
of writes_accumulator_implicitly(<wbr>) relies on the behavior of the<br>
register allocator, the NIR-to-i965 translation pass and the rest of the<br>
visitor being exactly what you expect in every possible codepath for<br>
correctness.</blockquote><div><br></div><div>It's already relied on for correctness because the LINE+MAC fallback that we had before is wrong in SIMD16 (see patch 33).<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">That wasn't the case in my original patch, nor in your<br>
series after PATCHv2 33 because this inaccuracy actually becomes a<br>
problem. Instead of introducing code which we know is dubiously<br>
correct, CC'ing mesa-stable, and then fixing it immediately, why don't<br>
we just do the obviously correct thing from the start?<br><div class="HOEnZb"><div class="h5"></div></div></blockquote><div><br></div><div>Sure, we can squash them together if you really want. But if you're concerned about this restriction being valid in live code, then 33/53 also needs to go to stable. I'm fine with that if you're unconvinced by my argument that LINTERP with an odd coordinate never occurs.<br></div><div><br></div><div>--Jason<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">
><br>
>> but why make things more inaccurate than<br>
>> the original only to revert back to a devinfo->gen based check in<br>
>> PATCHv2 33?<br>
>><br>
><br>
> See above.<br>
><br>
><br>
>> I think I'd squash the last hunk of PATCHv2 33 into this one which would<br>
>> give you something functionally equivalent to v1 but updated to handle<br>
>> Gen11 correctly.<br>
>><br>
>> > }<br>
>> ><br>
>> > bool<br>
>> > --<br>
>> > 2.5.0.400.gff86faf<br>
>><br>
</div></div></blockquote></div><br></div></div>