[Mesa-dev] [PATCH 0/9] nvc0: ARB_shader_ballot for Kepler+

Boyan Ding boyan.j.ding at gmail.com
Sat Apr 8 09:51:12 UTC 2017


This series implements ARB_shader_ballot for Kepler+. I have tested
it on GK208, where 8 of the 9 piglit execution tests pass against current
master. The one failure is caused by the test's incorrect assumption
about thread group sizes smaller than 64, which is the case on NVIDIA
hardware. Other architectures (GK104 and GM107) are untested because
I lack the hardware, but I have checked the code generated for both
architectures, and it looks correct.

Patches 1-4 implement OP_SHFL emission, with a fix for nvc0 in patch 2.
Patch 5 extends nv50 IR's OP_VOTE to translate readFirstInvocationARB.
Patches 6-8 hook up the logic with TGSI, and the last patch enables the
extension.
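
For readers unfamiliar with the extension, the following is a CPU-side
sketch of what ballotARB and readFirstInvocationARB compute. It is not
code from this series. It assumes a thread group of 32 lanes, and the
names kLanes, ballot, lowestSetBit and readFirstInvocation are
hypothetical. It also assumes that the set of active lanes can be
obtained as a ballot over a constant-true predicate, which is one
plausible reading of why patch 5 allows a 0/1 immediate as the source
of OP_VOTE.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed thread group (warp) size; hypothetical constant for this sketch.
constexpr unsigned kLanes = 32;

// Emulates ballotARB: bit i of the result is set when lane i is active
// and its predicate is true. The result is 64 bits wide, so with 32
// lanes the upper 32 bits are always zero.
uint64_t ballot(uint32_t activeMask, const std::array<bool, kLanes>& pred) {
    uint64_t result = 0;
    for (unsigned lane = 0; lane < kLanes; ++lane) {
        const bool active = ((activeMask >> lane) & 1u) != 0;
        if (active && pred[lane])
            result |= uint64_t{1} << lane;
    }
    return result;
}

// Returns the index of the lowest set bit. The mask must be non-zero.
unsigned lowestSetBit(uint64_t mask) {
    assert(mask != 0);
    unsigned index = 0;
    while (((mask >> index) & 1u) == 0)
        ++index;
    return index;
}

// Emulates readFirstInvocationARB: every active lane receives the value
// held by the lowest-numbered active lane. At least one lane must be active.
uint32_t readFirstInvocation(uint32_t activeMask,
                             const std::array<uint32_t, kLanes>& values) {
    std::array<bool, kLanes> allTrue;
    allTrue.fill(true);
    // A ballot over a constant-true predicate yields the active-lane mask.
    const uint64_t active = ballot(activeMask, allTrue);
    return values[lowestSetBit(active)];
}
```

With lanes 0-3 inactive (activeMask 0xFFFFFFF0) and values[i] = 100 + i,
readFirstInvocation returns 104, the value held by lane 4.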

Boyan Ding (9):
  gm107/ir: Emit third src 'bound' and optional predicate output of SHFL
  nvc0/ir: Properly handle a "split form" of predicate destination
  nvc0/ir: Emit OP_SHFL
  gk110/ir: Emit OP_SHFL
  nvc0/ir: Allow 0/1 immediate value as source of OP_VOTE
  nvc0/ir: Add SV_LANEMASK_* system values.
  nvc0/ir: Implement TGSI_SEMANTIC_SUBGROUP_*
  nvc0/ir: Implement TGSI_OPCODE_BALLOT and TGSI_OPCODE_READ_*
  nvc0: Enable ARB_shader_ballot on Kepler+

 docs/features.txt                                  |  2 +-
 docs/relnotes/17.1.0.html                          |  2 +-
 src/gallium/drivers/nouveau/codegen/nv50_ir.h      |  5 ++
 .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 76 +++++++++++++++++-
 .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 47 +++++++++--
 .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp  | 90 ++++++++++++++++++++--
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 55 +++++++++++++
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c     |  3 +-
 8 files changed, 260 insertions(+), 20 deletions(-)

-- 
2.12.1


