[Mesa-dev] RFC: Supporting mediump in NIR

Fri May 15 09:22:37 PDT 2015

On Fri, May 15, 2015 at 11:59:25AM -0400, Rob Clark wrote:
> On Fri, May 15, 2015 at 5:39 AM, Topi Pohjolainen
> <topi.pohjolainen at intel.com> wrote:
> > I wanted to kick off discussion on how to support floating point
> > precision qualifiers in NIR. This is purely an optimization for
> > GLES, where one can reduce the number of GPU cycles. At the moment
> > the compiler discards the qualifiers early, when the abstract syntax
> > tree (AST) is transformed into the intermediate representation (IR).
> >
> > Iago added the initial support to IR in order to check that the
> > stages agree on the precision. Naturally I started by rebasing his
> > work on master (I dropped the actual checking part as it didn't
> > automatically fit into master). I realized that it isn't sufficient
> > to have the precision tracked in ir_variable alone. When the IR
> > is further translated into NIR the precision is needed in ir_rvalue
> > as well when NIR decides which opcode to use.
> > Iago's patch isn't needed for the solution here after all; I just
> > included it for context's sake.
> >
> > Now, there are a number of implementation alternatives, I think, both
> > in AST/IR as well as in NIR. I thought I'd play with one approach to
> > provide something "real" to aid the decision making regarding the
> > architecture.
> >
> > I thought that even though fp16 (medium precision float) isn't really
> > a proper type in GLSL, it would be clearer if it were one internally
> > in the compiler. I kept thinking about fp64 and how I would introduce
> > that into NIR.
> > The first step was to do pretty much the same as what Dave Airlie
> > did for doubles (fp64) in the compiler frontend.
> >
> > Then in NIR I decided to introduce new opcodes for half floats
> > instead of modifying the existing float instructions to carry
> > additional information about the precision.
> 
> jfwiw, I can[*] in a lot of cases have precision per operand..  for
> example, add a f32 + f16 with result into f32.  So having separate
> opcodes seems kind of funny.

As the opcode in NIR is chosen solely based on the destination type, I
thought that f32 + f16 would be similar to int + bool producing
int, for example. I thought that implicit conversions would kick in,
with the drivers deciding where a conversion is really needed (an
additional mov) or not. I have to admit though that I haven't
thought all the way through how and where the conversions are produced.
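
To make the idea a bit more concrete, here is a rough sketch in C of
picking the opcode from the destination type alone and flagging the
sources that would need an implicit conversion. Every name in it
(sketch_type, sketch_op, sketch_src and the two functions) is made up
for illustration and is not existing NIR API. Whether a flagged source
really turns into an extra mov would be left to the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types and opcodes, for illustration only; not NIR API. */
enum sketch_type { SKETCH_F16, SKETCH_F32 };
enum sketch_op { SKETCH_OP_FADD, SKETCH_OP_HADD };

struct sketch_src {
   enum sketch_type type;
   bool needs_conv; /* set when the source type differs from the dest */
};

/* The opcode depends only on the destination type. */
static enum sketch_op
sketch_choose_add(enum sketch_type dest)
{
   return dest == SKETCH_F16 ? SKETCH_OP_HADD : SKETCH_OP_FADD;
}

/* Flag every source whose type differs from the destination type and
 * return how many were flagged. A driver could later decide whether
 * each flagged source needs a real conversion instruction.
 */
static unsigned
sketch_mark_conversions(enum sketch_type dest, struct sketch_src *srcs,
                        size_t num_srcs)
{
   unsigned count = 0;

   assert(srcs != NULL || num_srcs == 0);

   for (size_t i = 0; i < num_srcs; i++) {
      srcs[i].needs_conv = srcs[i].type != dest;
      if (srcs[i].needs_conv)
         count++;
   }

   return count;
}
```

For f32 + f16 with an f32 result this picks the full-precision add and
flags only the f16 operand, which is the int + bool analogy from above.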