[Mesa-dev] R600/SI: Intrinsics for derivatives

Michel Dänzer michel at daenzer.net
Mon Jun 10 08:33:11 PDT 2013


On Sat, 2013-06-08 at 20:08 -0400, Tom Stellard wrote:
> On Fri, Jun 07, 2013 at 05:48:05PM -0700, Tom Stellard wrote:
> > On Fri, Jun 07, 2013 at 05:24:42PM +0200, Michel Dänzer wrote:
> > > 
> > > @@ -1544,6 +1562,26 @@ def : Pat <
> > >                     sub3)
> > >  >;
> > >  
> > > +class DDXY <Intrinsic name, bits<4> ldsdelta> : Pat <
> > > +  (name v4f32:$src, imm, imm, imm),
> > > +  (INSERT_SUBREG (INSERT_SUBREG (INSERT_SUBREG (INSERT_SUBREG (v4f32 (IMPLICIT_DEF)),
> > > +    (SI_DD (EXTRACT_SUBREG $src, sub0), (V_LSHLREV_B32_e32 2, (SI_TID)),
> > > +           (V_AND_B32_e32 0xfffffff0, (V_LSHLREV_B32_e32 2, (SI_TID))),
> > > +           ldsdelta), sub0),
> > > +    (SI_DD (EXTRACT_SUBREG $src, sub1), (V_LSHLREV_B32_e32 2, (SI_TID)),
> > > +           (V_AND_B32_e32 0xfffffff0, (V_LSHLREV_B32_e32 2, (SI_TID))),
> > > +           ldsdelta), sub1),
> > > +    (SI_DD (EXTRACT_SUBREG $src, sub2), (V_LSHLREV_B32_e32 2, (SI_TID)),
> > > +           (V_AND_B32_e32 0xfffffff0, (V_LSHLREV_B32_e32 2, (SI_TID))),
> > > +           ldsdelta), sub2),
> > > +    (SI_DD (EXTRACT_SUBREG $src, sub3), (V_LSHLREV_B32_e32 2, (SI_TID)),
> > > +           (V_AND_B32_e32 0xfffffff0, (V_LSHLREV_B32_e32 2, (SI_TID))),
> > > +           ldsdelta), sub3)
> > > +>;
> > 
> > Based on this pattern, I don't think you need to use a ddx/ddy intrinsic
> > here.  All of the instructions you are lowering DDX/DDY to have an
> > equivalent LLVM IR instruction or LLVM intrinsic.
> > 
> > For the DS_READ and DS_WRITE instructions all you need to do is emit
> > load/stores to the local address space and then add patterns for those
> > > in the backend.  As an added bonus this will add support for OpenCL
> > > local address spaces. I think the rest of the instructions are pretty
> > > straightforward (unless I've overlooked something).  Let me know if you have any
> > questions.
> 
> I did overlook something.  You will need to add an intrinsic for thread
> id in order to implement ddx/ddy completely in LLVM IR, but I still
> think it is the best way.

Shoot, I was just happy I finally got all the piglit tests passing. :)
But I agree your suggested approach would be better; I'll give it a go.
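
For illustration, the address arithmetic in the quoted pattern can be modelled
on the host. This is only a rough sketch under assumptions the hunk does not
confirm: that SI_DD writes each thread's value to local memory at the byte
address tid << 2, reads back the value at the quad base ((tid << 2) &
0xfffffff0) and the value ldsdelta dwords above it, and subtracts the two. The
quad layout, the delta values (1 for ddx, 2 for ddy) and the function name
quad_derivative are hypothetical.

```python
# Host-side model of a quad-based derivative that goes through local memory.
# Assumptions (not confirmed by the quoted hunk):
#   - a 2x2 quad is 4 consecutive thread ids: [top-left, top-right,
#     bottom-left, bottom-right]
#   - delta is counted in dwords: 1 for ddx, 2 for ddy
#   - the result is (neighbour value) - (value at the quad base)

def quad_derivative(values, delta):
    """Return one derivative per thread; len(values) must be a multiple of 4."""
    lds = {}
    # Each thread stores its 32-bit value at byte address tid << 2.
    for tid, value in enumerate(values):
        lds[tid << 2] = value

    result = []
    for tid in range(len(values)):
        # Clearing the low 4 address bits yields the first thread of the quad,
        # mirroring V_AND_B32 0xfffffff0 applied to (tid << 2).
        base = (tid << 2) & 0xfffffff0
        result.append(lds[base + (delta << 2)] - lds[base])
    return result


if __name__ == "__main__":
    vals = [1.0, 3.0, 6.0, 8.0, 10.0, 10.5, 12.0, 12.5]
    print(quad_derivative(vals, 1))  # ddx-like difference per quad
    print(quad_derivative(vals, 2))  # ddy-like difference per quad
```

Every thread in a quad ends up with the same difference, which is why the
pattern needs both the per-thread address and the masked quad-base address.
The same three steps (store, two loads, subtract) are what an LLVM IR version
with local address space loads/stores plus a thread id intrinsic would have to
express.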


-- 
Earthling Michel Dänzer           |                   http://www.amd.com
Libre software enthusiast         |          Debian, X and DRI developer

