[Mesa-dev] [RFCv2 00/13] gallium: add support for NIR as alternate IR

Thu Jan 28 14:24:58 PST 2016

fyi, I'm getting rather close to having a v3 of this patchset ready to
send.  Since the last version, I've implemented NIR versions of
tgsi_emulate, and bitmap/drawpix lowering, moved a bunch of stuff out
of st_glsl_to_tgsi, etc.  Still a few things to clean up, but it is,
IMHO, getting reasonably close to mergeable.  Ignoring
variable-indexing[1], I'm down to <40 regressions.

[1] the variable-indexing tests need some more work in the freedreno/ir3
backend, because all of a sudden things are not just arrays of vec4.
The issue is not glsl_to_nir related, and would not apply to other
drivers (which already lower vars to regs).  But for now, to cut down
on the noise in piglit results, I've been running with
-x variable-indexing.

One remaining issue that I've run into is that built-in uniforms don't
get packed in the same way that normal struct uniforms do.  Ie. for
gallium:

   struct {
      vec4 a;
      float b;
      float c;
      float d;
      float e;
   };

gets packed as 5*vec4.  But, for example, gl_FogParameters, which has
conceptually the same structure layout, actually has a physical layout
of 2*vec4:

  static const struct gl_builtin_uniform_element gl_Fog_elements[] = {
     {"color", {STATE_FOG_COLOR}, SWIZZLE_XYZW},
     {"density", {STATE_FOG_PARAMS}, SWIZZLE_XXXX},
     {"start", {STATE_FOG_PARAMS}, SWIZZLE_YYYY},
     {"end", {STATE_FOG_PARAMS}, SWIZZLE_ZZZZ},
     {"scale", {STATE_FOG_PARAMS}, SWIZZLE_WWWW},
  };

In fact, I'm not even sure that it is guaranteed that STATE_FOG_COLOR
and STATE_FOG_PARAMS get allocated to two consecutive vec4's.
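To make the mismatch concrete, here is a rough standalone sketch of the two slot counts.  Everything in it (the enum, struct builtin_elem, struct_slots(), builtin_slots()) is a hypothetical stand-in for illustration, not the real Mesa definitions, and it assumes that each distinct state token occupies exactly one vec4:

```c
/* Hypothetical stand-ins for illustration; NOT the Mesa definitions. */
enum state_token { STATE_FOG_COLOR, STATE_FOG_PARAMS };

struct builtin_elem {
   const char *field;        /* member name within gl_FogParameters */
   enum state_token token;   /* state vec4 that backs this member */
};

static const struct builtin_elem fog_elems[] = {
   {"color",   STATE_FOG_COLOR},
   {"density", STATE_FOG_PARAMS},
   {"start",   STATE_FOG_PARAMS},
   {"end",     STATE_FOG_PARAMS},
   {"scale",   STATE_FOG_PARAMS},
};

/* Normal struct packing as described above: one vec4 slot per member. */
static unsigned struct_slots(unsigned num_members)
{
   return num_members;
}

/* Physical layout (assumed): one vec4 slot per distinct state token. */
static unsigned builtin_slots(const struct builtin_elem *e, unsigned n)
{
   unsigned count = 0;
   for (unsigned i = 0; i < n; i++) {
      int seen = 0;
      /* Has this token already been counted for an earlier member? */
      for (unsigned j = 0; j < i; j++) {
         if (e[j].token == e[i].token)
            seen = 1;
      }
      if (!seen)
         count++;
   }
   return count;
}
```

With the five fog members, struct_slots(5) gives 5 while builtin_slots(fog_elems, 5) gives 2, which is the 5*vec4 vs 2*vec4 difference above.  Note the sketch says nothing about whether the two state vec4's end up adjacent.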

What I'm thinking of at the moment is to write a lowering pass to run
before nir_lower_io, which would replace built-in uniform var's w/ the
equiv physical state elements.  Ie. replace "gl_FogParameters gl_Fog;"
with "vec4 fog.color;" plus "vec4 fog.params;", and make corresponding
fixups to the load_var's.
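Roughly, the fixup for each member access would boil down to a lookup like the one sketched here.  This is only an illustration of the mapping, not NIR code: struct lowered_ref, fog_map and lower_fog_field() are made-up names, and the component numbers simply follow the swizzles in gl_Fog_elements (XXXX -> 0, YYYY -> 1, etc):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration only; not Mesa's or NIR's. */
enum phys_var { FOG_COLOR, FOG_PARAMS };

struct lowered_ref {
   enum phys_var var;   /* which replacement vec4 uniform to load */
   int comp;            /* component to read, or -1 for the whole vec4 */
};

struct field_map {
   const char *field;
   struct lowered_ref ref;
};

static const struct field_map fog_map[] = {
   {"color",   {FOG_COLOR,  -1}},   /* SWIZZLE_XYZW */
   {"density", {FOG_PARAMS,  0}},   /* SWIZZLE_XXXX */
   {"start",   {FOG_PARAMS,  1}},   /* SWIZZLE_YYYY */
   {"end",     {FOG_PARAMS,  2}},   /* SWIZZLE_ZZZZ */
   {"scale",   {FOG_PARAMS,  3}},   /* SWIZZLE_WWWW */
};

/* Returns 1 and fills *out if 'field' is a gl_Fog member, else returns 0. */
static int lower_fog_field(const char *field, struct lowered_ref *out)
{
   for (size_t i = 0; i < sizeof(fog_map) / sizeof(fog_map[0]); i++) {
      if (strcmp(fog_map[i].field, field) == 0) {
         *out = fog_map[i].ref;
         return 1;
      }
   }
   return 0;
}
```

So a load of gl_Fog.end would become a load of component 2 of the fog.params vec4, and gl_Fog.color a load of the whole fog.color vec4.  A real pass would presumably build this table from the built-in uniform descriptions rather than hard-coding it.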

Thoughts?  Better ideas?

This would, I think, need to live outside of nir, since it would need
access to _mesa_builtin_uniform_desc.  So not quite sure where it
should live.  Maybe src/glsl/glsl_nir_lower_builtin_uniforms.cpp?

btw, I *think* this issue only applies to built-in uniforms, but if it
also applies to inputs/outputs/etc pls let me know.

BR,
-R

On Sun, Nov 8, 2015 at 3:12 PM, Rob Clark <robdclark at gmail.com> wrote:
> From: Rob Clark <robclark at freedesktop.org>
>
> Things have progressed somewhat since the initial RFC, to the point
> that all sorts of common things are working (glmark2, xonotic, stk,
> etc), and piglit is *mostly* working (~330 regressions or so)..
>
> (This is with both VS and FS converted, fwiw, compared to initial RFC
> which was only using glsl_to_nir for VS.)
>
> Some of the remaining piglit regressions might be bugs in ir3.. still
> tracking down some assumptions about everything being vec4's (which is
> true in the glsl->tgsi->nir path but not in the glsl->nir path).
>
> Still some cleanup needed, now that I think I have a reasonable grasp
> on how things should work.  But I think not too early to start getting
> some comments.
>
> I still need to ditch the anon union in pipe_shader_state, but was
> planning to leave flag-day rename-everything changes until closer to
> being ready to merge to avoid getting bogged down in rebase conflicts.
>
> This is based on top of nir_clone, nir_shader refcnt'ing and a few
> other in-flight patches which are not part of this patchset.  For the
> complete branch see:
>
> https://github.com/freedreno/mesa/commits/wip-gallium-skip-tgsi
>
> Rob Clark (13):
>   gallium: refactor pipe_shader_state to support multiple IR's
>   gallium: add NIR as a possible IR
>   nir: allow pre-resolved sampler uniform locations
>   nir: add lowering pass for y-transform
>   gallium/auxiliary: introduce nir_emulate
>   mesa/st: add support for NIR as possible driver IR
>   freedreno/ir3: add support for NIR as preferred IR
>   freedreno/ir3: fix const_index handling for uniforms
>   freedreno/ir3: handle large inputs/outputs
>   freedreno/ir3: support load_front_face intrinsic
>   freedreno/ir3: handle tex instrs w/ const offset
>   freedreno/ir3: don't ignore local vars
>   HACK: freedreno/a4xx: workaround glsl_to_nir hang..
>
>  src/gallium/auxiliary/Makefile.sources             |   2 +
>  src/gallium/auxiliary/hud/hud_context.c            |  14 +-
>  src/gallium/auxiliary/nir/nir_emulate.c            | 139 +++++++
>  src/gallium/auxiliary/nir/nir_emulate.h            |  34 ++
>  src/gallium/auxiliary/postprocess/pp_run.c         |   4 +-
>  src/gallium/auxiliary/tgsi/tgsi_ureg.c             |   6 +-
>  src/gallium/auxiliary/util/u_simple_shaders.c      |  42 ++-
>  src/gallium/auxiliary/util/u_tests.c               |   7 +-
>  src/gallium/drivers/freedreno/a4xx/fd4_program.c   |   4 +-
>  src/gallium/drivers/freedreno/freedreno_screen.c   |   5 +-
>  src/gallium/drivers/freedreno/freedreno_util.h     |   1 +
>  .../drivers/freedreno/ir3/ir3_compiler_nir.c       | 110 ++++--
>  src/gallium/drivers/freedreno/ir3/ir3_shader.c     |  16 +-
>  src/gallium/include/pipe/p_defines.h               |  13 +-
>  src/gallium/include/pipe/p_state.h                 |  27 +-
>  src/glsl/Makefile.sources                          |   1 +
>  src/glsl/nir/nir.h                                 |  12 +
>  src/glsl/nir/nir_lower_samplers.c                  |  23 +-
>  src/glsl/nir/nir_lower_wpos_ytransform.c           | 320 ++++++++++++++++
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp         | 410 ++++++++++++++++++++-
>  src/mesa/state_tracker/st_glsl_to_tgsi.h           |   5 +
>  src/mesa/state_tracker/st_program.c                | 118 +++++-
>  src/mesa/state_tracker/st_program.h                |   6 +
>  23 files changed, 1247 insertions(+), 72 deletions(-)
>  create mode 100644 src/gallium/auxiliary/nir/nir_emulate.c
>  create mode 100644 src/gallium/auxiliary/nir/nir_emulate.h
>  create mode 100644 src/glsl/nir/nir_lower_wpos_ytransform.c
>
> --
> 2.5.0
>