[Mesa-dev] [PATCH] radeon/llvm: Handle TGSI KIL opcode for SI.
Michel Dänzer
michel at daenzer.net
Tue Aug 28 11:11:04 PDT 2012
On Die, 2012-08-28 at 12:07 -0400, Tom Stellard wrote:
> On Tue, Aug 28, 2012 at 04:26:43PM +0200, Michel Dänzer wrote:
> > From: Michel Dänzer <michel.daenzer at amd.com>
> >
> > Fixes piglit fp-kil with radeonsi.
> >
>
> > Signed-off-by: Michel Dänzer <michel.daenzer at amd.com>
> > ---
> > src/gallium/drivers/radeon/SIISelLowering.cpp | 35 +++++++++++++++++++++++++
> > src/gallium/drivers/radeon/SIISelLowering.h | 2 ++
> > src/gallium/drivers/radeon/SIInstructions.td | 7 +++++
> > 3 files changed, 44 insertions(+)
> >
> > diff --git a/src/gallium/drivers/radeon/SIISelLowering.cpp b/src/gallium/drivers/radeon/SIISelLowering.cpp
> > index 092c2fa..f5eac16 100644
> > --- a/src/gallium/drivers/radeon/SIISelLowering.cpp
> > +++ b/src/gallium/drivers/radeon/SIISelLowering.cpp
> > @@ -129,6 +129,9 @@ MachineBasicBlock * SITargetLowering::EmitInstrWithCustomInserter(
> > case AMDGPU::SI_INTERP_CONST:
> > LowerSI_INTERP_CONST(MI, *BB, I);
> > break;
> > + case AMDGPU::SI_KIL:
> > + LowerSI_KIL(MI, *BB, I, MRI);
> > + break;
> > case AMDGPU::SI_V_CNDLT:
> > LowerSI_V_CNDLT(MI, *BB, I, MRI);
> > break;
> > @@ -193,6 +196,38 @@ void SITargetLowering::LowerSI_INTERP_CONST(MachineInstr *MI,
> > MI->eraseFromParent();
> > }
> >
> > +void SITargetLowering::LowerSI_KIL(MachineInstr *MI, MachineBasicBlock &BB,
> > + MachineBasicBlock::iterator I, MachineRegisterInfo & MRI) const
> > +{
> > + /* Clear this pixel from the exec mask if the operand is negative */
>
> Please use // style comments in the LLVM code.
Okay.
> > + BuildMI(BB, I, BB.findDebugLoc(I), TII->get(AMDGPU::V_CMPX_LE_F32_e32),
> > + AMDGPU::VCC)
> > + .addReg(AMDGPU::SREG_LIT_0)
> > + .addOperand(MI->getOperand(0));
> > +
> > + /* If the exec mask is non-zero, skip the next two instructions */
>
> This comment is misleading, because it is branching on the VCC status
> and not the exec mask status.
Right, the VCC and exec mask are identical at this point, but I guess
using S_CBRANCH_EXECNZ would be clearer.
> > + BuildMI(BB, I, BB.findDebugLoc(I), TII->get(AMDGPU::S_CBRANCH_VCCNZ))
> > + .addImm(3)
> > + .addReg(AMDGPU::VCC);
> > +
>
> I'm a little confused about how this is supposed to work. As I understand
> it, the program will branch even if just one of the waves in the wave front
> sets their VCC bit (which in this case means the pixel is not killed).
> Do we also need to export the exec_mask in the very last export of the
> program?
Yes, see si_llvm_emit_epilogue():
/* Specify whether the EXEC mask represents the valid mask */
last_args[1] = lp_build_const_int32(base->gallivm,
si_shader_ctx->type == TGSI_PROCESSOR_FRAGMENT);
I'll follow up with updated patches.
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer
More information about the mesa-dev
mailing list