[Mesa-dev] [PATCH v2 00/78] i965: A new vec4 backend based on NIR

Thu Jul 23 03:16:40 PDT 2015

Hi,

This is the second version of the series that adds a new vec4 backend for i965 based on NIR. The first series was sent some weeks ago [1] and went through review by Jason and Kenneth (thanks a lot!). This series is a result of addressing issues detected during that feedback.

This series also adds support for the NIR->vec4 pass on geometry shaders and on ARB_vertex_programs. Both supports were work-in-progress by the time we sent the first version, and are now completed. The patch-sets for GS and ARB_vertex_program were added at the end of the series.

Like the first version, we tested the backend on gen6 and gen7 (specifically, SNB, IVB and HSW); and with both piglit and dEQP. No regressions observed with piglit. The issues related with register spilling that we had in the first version have been fixed.

With dEQP, we observe a regression that is causing the following 3 tests to fail:
dEQP-GLES3.functional.shaders.loops.for_dynamic_iterations.vector_counter_vertex
dEQP-GLES3.functional.shaders.loops.while_dynamic_iterations.vector_counter_vertex
dEQP-GLES3.functional.shaders.loops.do_while_dynamic_iterations.vector_counter_vertex

We have a patch that solves those specific cases, but we would like to discuss it independently, before including it in the series. We believe these regressions should not block the review of the series.

As usual, a git tree is available at https://github.com/Igalia/mesa/tree/nir-vec4-v2 to ease testing.

cheers,
Eduardo

[1] http://lists.freedesktop.org/archives/mesa-dev/2015-June/087448.html

Alejandro Piñeiro (9):
  i965/vec4: Redefine make_reg_for_system_value() to allow reuse in
    NIR->vec4 pass
  i965/nir/vec4: Add setup for system values
  i965/nir/vec4: Implement intrinsics that load system values
  i965/nir/vec4: Implement atomic counter intrinsics (read, inc and dec)
  i965/nir: Disable alu_to_scalar pass on non-scalar shaders
  i965/vec4: Change vec4_visitor::swizzle_result() method to allow reuse
  i965/vec4: Add a new dst_reg constructor accepting a brw_reg_type
  i965/ir/vec4: Refactor visit(ir_texture *ir)
  i965/nir/vec4: Add implementation of nir_emit_texture()

Antia Puentes (36):
  i965/nir/vec4: Implement loading values from an UBO
  i965/nir/vec4: Prepare source and destination registers for ALU
    operations
  i965/nir/vec4: Implement single-element "mov" operations
  i965/nir/vec4: Lower "vecN" instructions and mark them unreachable
  i965/nir/vec4: Implement int<->float format conversion ops
  i965/nir/vec4: Implement the addition operation
  i965/nir/vec4: Implement multiplication
  i965/vec4: Return the last emitted instruction in emit_math()
  i965/nir/vec4: Implement more math operations
  i965/nir/vec4: Implement carry/borrow for addition/subtraction
  i965/nir/vec4: Implement various rounding functions
  i965/vec4: Return the emitted instruction in emit_minmax()
  i965/nir/vec4: Implement min/max operations
  i965/nir/vec4: Derivatives are not allowed in VS
  i965/nir: Add utility method for comparisons
  i965/nir/vec4: Implement non-vector comparison ops
  i965/nir/vec4: Implement equality ops on vectors
  i965/nir/vec4: Implement non-equality ops on vectors
  i965/nir/vec4: Implement logical operators
  i965/nir/vec4: Implement "bool<->int,float" format conversion
  i965/nir/vec4: "noise" ops should already be lowered
  i965/nir/vec4: Implement pack/unpack operations
  i965/nir/vec4: Implement bit operations
  i965/nir/vec4: Implement the "sign" operation
  i965/nir/vec4: Implement "shift" operations
  i965/nir/vec4: Implement floating-point fused multiply-add
  i965/vec4: Return the emitted instruction in emit_lrp()
  i965/nir/vec4: Implement linear interpolation
  i965/nir/vec4: Implement conditional select
  i965/nir/vec4: Implement the dot product operation
  i965/nir/vec4: Implement vector "any" operation
  i965/nir/vec4: Mark as unreachable ops that should be already lowered
  i965/vec4: Enable NIR-vec4 pass on ARB_vertex_programs
  i965/vec4: Generate missing NIR for fixed-function vertex programs
  i965/nir/vec4: Handle uniforms on vertex programs
  i965/vec4: Handle uniform and GRF array access on vertex programs
    (NIR)

Eduardo Lima Mitev (20):
  i965/nir/vec4: Add implementation placeholders for a new NIR->vec4
    pass
  i965/nir/vec4: Select between new nir_vec4 or current vec4_visitor
    code-paths
  i965/vec4: Move type_size() method to brw_vec4_visitor class
  i965/nir/vec4: Add setup of input variables in NIR->vec4 pass
  i965/nir/vec4: Add shader function implementation
  i965/nir: Pass a is_scalar boolean to brw_create_nir()
  i965/nir/vec4: Implement load_const intrinsic
  i965/nir: Move brw_type_for_nir_type() to brw_nir to allow reuse
  i965/vec4: Add auxiliary func to build a writemask from a component
    size
  i965/nir/vec4: Add get_nir_dst() and get_nir_src() methods
  i965/nir/vec4: Implement loop statements (nir_cf_node_loop)
  i965/nir/vec4: Implement load_input intrinsic
  i965/vec4: Make sure that register types always match during
    emit_urb_slot()
  nir-lower_io: Store data.location instead, in const_index[0] of
    store_output
  i965/nir/vec4: Implement store_output intrinsic
  i965/nir/vec4: Implement nir_emit_jump
  i965/nir: Add new utility method brw_glsl_base_type_for_nir_type()
  i965/vec4: Move is_high_sample() method to vec4_visitor class
  i965/vec4: Change vec4_visitor::emit_mcs_fetch() method to allow reuse
  i965/vec4: Change vec4_visitor::gather_channel() method to allow reuse

Iago Toral Quiroga (11):
  i965/nir/vec4: Add setup of uniform variables
  nir/nir_lower_io: Add vec4 support
  i965/nir: Dot not assign direct uniform locations first for vec4-based
    shaders
  i965/nir/vec4: Implement conditional statements (nir_cf_node_if)
  i965/nir/vec4: Implement load_uniform intrinsic
  i965/nir: Enable NIR-vec4 pass on geometry shaders
  i965/gs: Refactor ir_emit_vertex and ir_end_primitive
  i965/nir/gs: Handle geometry shaders inputs
  i965/nir/gs: Implement EmitVertex and EndPrimitive
  i965/nir/gs: Implement support for gl_InvocationID system value
  i965/nir: Do not scalarize phis in non-scalar setups

Samuel Iglesias Gonsalvez (2):
  nir: Fix output swizzle in get_mul_for_src
  i965/gs/gen6: Refactor ir_emit_vertex and ir_end_primitive for gen6

 src/glsl/nir/nir.h                                |   18 +-
 src/glsl/nir/nir_lower_io.c                       |   99 +-
 src/glsl/nir/nir_opt_peephole_ffma.c              |   13 +-
 src/mesa/drivers/dri/i965/Makefile.sources        |    2 +
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp          |   34 +-
 src/mesa/drivers/dri/i965/brw_ir_vec4.h           |    2 +
 src/mesa/drivers/dri/i965/brw_nir.c               |   93 +-
 src/mesa/drivers/dri/i965/brw_nir.h               |    8 +-
 src/mesa/drivers/dri/i965/brw_program.c           |    6 +-
 src/mesa/drivers/dri/i965/brw_reg.h               |    6 +
 src/mesa/drivers/dri/i965/brw_shader.cpp          |   24 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp            |   53 +-
 src/mesa/drivers/dri/i965/brw_vec4.h              |   82 +-
 src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp     |  118 ++
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |   27 +-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h   |    9 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp        | 1507 +++++++++++++++++++++
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp    |  470 ++++---
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |    5 +-
 src/mesa/drivers/dri/i965/brw_vs.h                |    3 +-
 src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp     |   13 +-
 src/mesa/drivers/dri/i965/gen6_gs_visitor.h       |    2 +
 22 files changed, 2263 insertions(+), 331 deletions(-)
 create mode 100644 src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp
 create mode 100644 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp

-- 
2.1.4