[Mesa-dev] [PATCH 2/3] i965: Lower min/max after optimization on Gen4/5.
Francisco Jerez
currojerez at riseup.net
Sat Feb 13 23:30:40 UTC 2016
Matt Turner <mattst88 at gmail.com> writes:
> On Sat, Feb 13, 2016 at 2:02 PM, Francisco Jerez <currojerez at riseup.net> wrote:
>> Matt Turner <mattst88 at gmail.com> writes:
>>
>>> On Thu, Feb 11, 2016 at 4:41 PM, Matt Turner <mattst88 at gmail.com> wrote:
>>>> Gen4/5's SEL instruction cannot use conditional modifiers, so min/max
>>>> are implemented as CMP + SEL. Handling that after optimization lets us
>>>> CSE more.
>>>>
>>>> On Ironlake:
>>>>
>>>> total instructions in shared programs: 6426035 -> 6422753 (-0.05%)
>>>> instructions in affected programs: 326604 -> 323322 (-1.00%)
>>>> helped: 1411
>>>>
>>>> total cycles in shared programs: 129184700 -> 129101586 (-0.06%)
>>>> cycles in affected programs: 18950290 -> 18867176 (-0.44%)
>>>> helped: 2419
>>>> HURT: 328
>>>> ---
>>>> src/mesa/drivers/dri/i965/brw_fs.cpp | 37 +++++++++++++++++++++++++
>>>> src/mesa/drivers/dri/i965/brw_fs.h | 1 +
>>>> src/mesa/drivers/dri/i965/brw_fs_builder.h | 10 ++-----
>>>> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 20 +++-----------
>>>> src/mesa/drivers/dri/i965/brw_vec4.cpp | 38 ++++++++++++++++++++++++++
>>>> src/mesa/drivers/dri/i965/brw_vec4.h | 2 ++
>>>> src/mesa/drivers/dri/i965/brw_vec4_builder.h | 10 ++-----
>>>> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 14 ++--------
>>>> 8 files changed, 88 insertions(+), 44 deletions(-)
>>>>
>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
>>>> index 0ce7ed1..e83f0ba 100644
>>>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>>>> @@ -3475,6 +3475,36 @@ fs_visitor::lower_integer_multiplication()
>>>> return progress;
>>>> }
>>>>
>>>> +bool
>>>> +fs_visitor::lower_minmax()
>>>> +{
>>>> + assert(devinfo->gen < 6);
>>>> +
>>>> + bool progress = false;
>>>> +
>>>> + foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
>>>> + const fs_builder ibld(this, block, inst);
>>>> +
>>>> + if (inst->opcode == BRW_OPCODE_SEL &&
>>>> + inst->predicate == BRW_PREDICATE_NONE) {
>>>> + assert(inst->conditional_mod == BRW_CONDITIONAL_GE ||
>>>> + inst->conditional_mod == BRW_CONDITIONAL_L);
>>>
>>> Ken asked at the office if this assertion is necessary. I think it is.
>>> The PRM doesn't say anything about SEL with conditional modifiers
>>> other than .ge or .l.
>>
>> I'm pretty sure it's not, the SEL instruction works fine with other
>> conditional mods, and I've found it moderately useful in the past. And
>> at least the internal hardware docs mention explicitly that conditional
>> mods other than .l and .ge follow the cmp rules (rather than the cmpn
>> rules), which implies they're allowed...
>
> Okay, right. The PRM says "and all other conditional modifiers follow
> the cmp rules."
>
> Which ones are be useful? .z/.nz/.o/.u don't make sense.
>
These are all well-defined. ISTR having used SEL with .o at some point.
> I see that the SEL documentation says
>
> """
> For a sel instruction with a .l or .ge conditional modifier, if one
> source is NaN and the other not NaN, the non-NaN source is the result.
> If both sources are NaNs, the result is NaN. For all other conditional
> modifiers, if either source is NaN then src1 is selected.
> """
>
> So .ge/.l return non-NaN if one source is NaN, while .g/.le propagate NaNs.
>
> We have mistakenly used the wrong conditional modifier before (see
> commit 3b7f683f3).
>
The old conditional modifiers were only "wrong" because some specific
API requires certain NaN propagation behavior for certain built-ins.
It's not wrong to use .g/.le internally, the condmod is not required to
be .l/ge for the consistency of the IR to be guaranteed or to produce
well-formed machine code. Seems rather mean to me to assert on the
condmod being .ge/l. This is the kind of check that belongs in an
API-level integration test (i.e. piglit) rather than in the backend
IMHO.
Which brings me to the question... If the whole point of asserting that
condmod is .ge/l is enforcing the CMPN-style NaN propagation behavior,
why do you translate the instruction into CMP? AFAICT in order to
preserve the semantics of the original instruction you'd have to
translate this into CMPN/SEL for .ge/l with floating-point execution
type, and CMP/SEL in all other cases.
> If there's no code that wants the propagate-NaN behavior now, I'd
> rather we remove the assertion when we expect to use that behavior.
> Otherwise, it may just allow us to make a mistake.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160213/6c087792/attachment-0001.sig>
More information about the mesa-dev
mailing list