[Mesa-dev] [PATCH 1/6] nv50/ir: add LIMM form of mad to gk110
Samuel Pitoiset
samuel.pitoiset at gmail.com
Sat Oct 8 18:17:45 UTC 2016
On 10/08/2016 07:59 PM, Karol Herbst wrote:
> 2016-10-08 18:54 GMT+02:00 Samuel Pitoiset <samuel.pitoiset at gmail.com>:
>> Please, update the prefix.
>>
>> Also, the same comment applies here, and I think the best way is to enable
>> PostRAConstantFoldingPass for nvc0+ in a separate patch at the end of
>> the series. That way you won't break things and mupuf will appreciate it. :)
>>
>
> I think you read the patches in the wrong order. The first two patches
> are the changes in the emitter.
I think you misunderstood what I said. :) But my explanation was not
very precise either.

First, the prefixes for commit messages that touch the emitters are nvc0/ir
for GF100/GK104, gk110/ir for GK110 and gm107/ir for GM107+, but this is not
the most important thing.

With this patch (and the following one) the emitters might use the LIMM
form, but you can't be sure that def == src2. This is why FFMA32I is not
currently enabled on GM107, for example. So, those two patches might break
something (without the rest of the series), and that is bad for bisecting.
It would be nice to avoid such a situation.

So, I guess you want to update the emitters only once you are sure that def
== src2 (when using the LIMM forms). I added a note about that in patch 2.
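To make the def == src2 constraint concrete, here is a minimal sketch of the kind of check that has to hold before the LIMM form is picked. It is not Mesa code: Operand, Mad, Form, canUseLimm and selectForm are hypothetical, simplified stand-ins, and the only thing taken from the patch is that the LIMM path emits two sources and treats the destination as the third one.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins for the nv50_ir operand types.
// None of these names come from Mesa; they only illustrate the constraint.
enum class File { GPR, IMMEDIATE };

struct Operand {
   File file;
   int reg;        // register id, meaningful for GPR operands only
   uint32_t imm;   // raw 32-bit value, meaningful for immediates only
};

struct Mad {       // d = s0 * s1 + s2
   Operand def;
   Operand src[3];
};

enum class Form { Regular, LongImmediate };

// The long-immediate form is only usable when the addend lives in the
// destination register, because the emitter writes just two sources.
bool canUseLimm(const Mad &i)
{
   return i.src[1].file == File::IMMEDIATE &&
          i.def.file == File::GPR &&
          i.src[2].file == File::GPR &&
          i.def.reg == i.src[2].reg;
}

// Assumed policy for this sketch: never pick the LIMM encoding unless the
// constraint is met. "Regular" only means "not LIMM" here.
Form selectForm(const Mad &i)
{
   Form f = canUseLimm(i) ? Form::LongImmediate : Form::Regular;
   assert(f == Form::Regular || i.def.reg == i.src[2].reg);
   return f;
}
```

With a check like this in place before the emitter is reached, a mad whose def and src2 are different registers never takes the two-source path. What the compiler does with the 32-bit immediate in that case is outside the scope of the sketch.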
>
>> On 10/08/2016 05:43 PM, Karol Herbst wrote:
>>>
>>> Signed-off-by: Karol Herbst <karolherbst at gmail.com>
>>> ---
>>> .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 49 ++++++++++++++--------
>>> 1 file changed, 32 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
>>> index ce20ed3..5c28fd4 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
>>> @@ -47,7 +47,7 @@ private:
>>> private:
>>> void emitForm_21(const Instruction *, uint32_t opc2, uint32_t opc1);
>>> void emitForm_C(const Instruction *, uint32_t opc, uint8_t ctg);
>>> - void emitForm_L(const Instruction *, uint32_t opc, uint8_t ctg, Modifier);
>>> + void emitForm_L(const Instruction *, uint32_t opc, uint8_t ctg, Modifier, int sCount = 3);
>>>
>>> void emitPredicate(const Instruction *);
>>>
>>> @@ -364,7 +364,7 @@ CodeEmitterGK110::setImmediate32(const Instruction *i, const int s,
>>>
>>> void
>>> CodeEmitterGK110::emitForm_L(const Instruction *i, uint32_t opc, uint8_t ctg,
>>> - Modifier mod)
>>> + Modifier mod, int sCount)
>>> {
>>> code[0] = ctg;
>>> code[1] = opc << 20;
>>> @@ -373,7 +373,7 @@ CodeEmitterGK110::emitForm_L(const Instruction *i, uint32_t opc, uint8_t ctg,
>>>
>>> defId(i->def(0), 2);
>>>
>>> - for (int s = 0; s < 3 && i->srcExists(s); ++s) {
>>> + for (int s = 0; s < sCount && i->srcExists(s); ++s) {
>>> switch (i->src(s).getFile()) {
>>> case FILE_GPR:
>>> srcId(i->src(s), s ? 42 : 10);
>>> @@ -486,25 +486,40 @@ CodeEmitterGK110::emitNOP(const Instruction *i)
>>> void
>>> CodeEmitterGK110::emitFMAD(const Instruction *i)
>>> {
>>> - assert(!isLIMM(i->src(1), TYPE_F32));
>>> + bool neg1 = (i->src(0).mod ^ i->src(1).mod).neg();
>>>
>>> - emitForm_21(i, 0x0c0, 0x940);
>>> + if (isLIMM(i->src(1), TYPE_F32)) {
>>> + // last source is dst, so force 2 sources
>>> + emitForm_L(i, 0x600, 0x0, Modifier(0), 2);
>>>
>>> - NEG_(34, 2);
>>> - SAT_(35);
>>> - RND_(36, F);
>>> - FTZ_(38);
>>> - DNZ_(39);
>>> + SAT_(3a);
>>> + NEG_(3b, 0);
>>> + NEG_(3c, 2);
>>>
>>> - bool neg1 = (i->src(0).mod ^ i->src(1).mod).neg();
>>> + // neg 1
>>> + if (neg1) {
>>> + code[1] |= 1 << 27;
>>> + }
>>> + } else {
>>> + emitForm_21(i, 0x0c0, 0x940);
>>>
>>> - if (code[0] & 0x1) {
>>> - if (neg1)
>>> - code[1] ^= 1 << 27;
>>> - } else
>>> - if (neg1) {
>>> - code[1] |= 1 << 19;
>>> + NEG_(33, 0);
>>> + NEG_(34, 2);
>>> + SAT_(35);
>>> + RND_(36, F);
>>> +
>>> + // neg 1
>>> + if (code[0] & 0x1) {
>>> + if (neg1)
>>> + code[1] ^= 1 << 27;
>>> + } else
>>> + if (neg1) {
>>> + code[1] |= 1 << 19;
>>> + }
>>> }
>>> +
>>> + FTZ_(38);
>>> + DNZ_(39);
>>> }
>>>
>>> void
>>>
>>