[Mesa-dev] [PATCH 00/36] Computer shader shared variables

Lofstedt, Marta marta.lofstedt at intel.com
Mon Nov 16 03:10:22 PST 2015


I can confirm that this patch set does not cause any regressions in the GLES 3.1 CTS tests on HSW and BDW.

> -----Original Message-----
> From: Justen, Jordan L
> Sent: Saturday, November 14, 2015 10:44 PM
> To: mesa-dev at lists.freedesktop.org
> Cc: Kristian Høgsberg Kristensen; Lofstedt, Marta; Palli, Tapani; Justen, Jordan L
> Subject: [PATCH 00/36] Computer shader shared variables
> 
> git://people.freedesktop.org/~jljusten/mesa cs-shared-variables-v1
> http://patchwork.freedesktop.org/bundle/jljusten/cs-shared-variables-v1
> 
> Patches 1 - 13:
> 
>  * Rebased curro's "i965: L3 cache partitioning." (sent Sept 6)
> 
> Patches 14 - 19:
> 
>  * Rework lower_ubo_reference to allow code sharing with
>    lower_shared_reference
> 
> Patches 20 - 28:
> 
>  * Add shared variable support for i965. Add lower_shared_reference,
>    which works similarly to lower_ubo_reference for SSBOs, except that
>    it merges all shared variables into one shared variable region
>    (rather than the separate BOs that SSBOs allow).
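
To illustrate the single-region layout described above, here is a minimal sketch of how each shared variable might be given a byte offset inside one region. The SharedVar struct and the align_up and assign_shared_offsets helpers are hypothetical names made up for this example, not code from lower_shared_reference.cpp, and the sizes and alignments are assumed to be supplied by the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one shared variable; not a Mesa type.
struct SharedVar {
    std::string name;
    uint32_t size;    // size in bytes (assumed known to the caller)
    uint32_t align;   // required alignment in bytes, a power of two
    uint32_t offset;  // filled in by assign_shared_offsets()
};

// Round value up to the next multiple of a power-of-two alignment.
static uint32_t align_up(uint32_t value, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

// Pack every variable into one region, in declaration order, and
// return the total size of that region in bytes.
static uint32_t assign_shared_offsets(std::vector<SharedVar> &vars)
{
    uint32_t next = 0;
    for (SharedVar &v : vars) {
        next = align_up(next, v.align);
        v.offset = next;
        next += v.size;
    }
    return next;
}
```

With this sketch, variables of 4, 12 and 4 bytes with alignments of 4, 16 and 4 bytes land at offsets 0, 16 and 28, giving a 32-byte region. A load or store would then address the region base plus the variable's offset, instead of selecting a separate buffer as with SSBOs.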
> 
> Patches 29 - 36:
> 
>  * Add atomic support for shared variables on i965, which is
>    implemented similarly to SSBOs.
> 
> On Ivy Bridge, this series fixes several piglit and OpenGL ES 3.1 CTS tests:
> 
>  * spec/arb_compute_shader/compiler/shared-atomics.comp: fail -> pass
>  * spec/arb_compute_shader/execution/shared-atomic: crash -> pass
>  * spec/arb_compute_shader/execution/simple-barrier: crash -> pass
> 
>  * es31-cts/compute_shader/atomic-case1: fail -> pass
>  * es31-cts/compute_shader/atomic-case3: fail -> pass
>  * es31-cts/compute_shader/shared-indexing: fail -> pass
>  * es31-cts/compute_shader/shared-max: fail -> pass
>  * es31-cts/compute_shader/shared-simple: fail -> pass
>  * es31-cts/compute_shader/shared-struct: fail -> pass
>  * es31-cts/compute_shader/work-group-size: fail -> pass
> 
> Francisco Jerez (13):
>   i965: Define symbolic constants for some useful L3 cache control
>     registers.
>   i965: Keep track of whether LRI is allowed in the context struct.
>   i965: Define state flag to signal that the URB size has been altered.
>   i965/gen8: Don't add workaround bits to PIPE_CONTROL stalls if DC
>     flush is set.
>   i965: Import tables enumerating the set of validated L3
>     configurations.
>   i965: Implement programming of the L3 configuration.
>   i965/hsw: Enable L3 atomics.
>   i965: Implement selection of the closest L3 configuration based on a
>     vector of weights.
>   i965: Calculate appropriate L3 partition weights for the current
>     pipeline state.
>   i965: Implement L3 state atom.
>   i965: Add debug flag to print out the new L3 state during transitions.
>   i965: Work around L3 state leaks during context switches.
>   i965: Hook up L3 partitioning state atom.
> 
> Jordan Justen (23):
>   glsl ubo/ssbo: Use enum to track current buffer access type
>   glsl ubo/ssbo: Split buffer access to insert_buffer_access
>   glsl ubo/ssbo: Add lower_buffer_access class
>   glsl ubo/ssbo: Move is_dereferenced_thing_row_major into
>     lower_buffer_access
>   glsl ubo/ssbo: Move common code into
>     lower_buffer_access::setup_buffer_access
>   glsl: Add default matrix ordering in lower_buffer_access
>   glsl: Don't lower_variable_index_to_cond_assign for shared variables
>   glsl: Add lowering pass for shared variable references
>   nir: Translate glsl shared var load intrinsic to nir intrinsic
>   nir: Translate glsl shared var store intrinsic to nir intrinsic
>   i965: Disable vector splitting on shared variables
>   i965/fs: Handle nir shared variable load intrinsic
>   i965/fs: Handle nir shared variable store intrinsic function
>   i965: Enable shared local memory for CS shared variables
>   i965: Lower shared variable references to intrinsic calls
>   glsl: Allow atomic functions to be used with shared variables
>   glsl: Replace atomic_ssbo and ssbo_atomic with atomic
>   glsl: Check for SSBO variable in SSBO atomic lowering
>   glsl: Translate atomic intrinsic functions on shared variables
>   glsl: Buffer atomics are supported for compute shaders
>   glsl: Disable several optimizations on shared variables
>   nir: Add nir intrinsics for shared variable atomic operations
>   i965/nir: Implement shared variable atomic operations
> 
>  src/glsl/Makefile.sources                          |   2 +
>  src/glsl/ast_function.cpp                          |  18 +-
>  src/glsl/builtin_functions.cpp                     | 236 ++++-----
>  src/glsl/ir_optimization.h                         |   1 +
>  src/glsl/linker.cpp                                |   4 +
>  src/glsl/lower_buffer_access.cpp                   | 565 +++++++++++++++++++++
>  src/glsl/lower_buffer_access.h                     |  72 +++
>  src/glsl/lower_shared_reference.cpp                | 511 +++++++++++++++++++
>  src/glsl/lower_ubo_reference.cpp                   | 536 +++----------------
>  src/glsl/lower_variable_index_to_cond_assign.cpp   |   3 +
>  src/glsl/nir/glsl_to_nir.cpp                       | 131 ++++-
>  src/glsl/nir/nir_intrinsics.h                      |  29 +-
>  src/glsl/opt_constant_propagation.cpp              |   3 +-
>  src/glsl/opt_constant_variable.cpp                 |   3 +-
>  src/glsl/opt_copy_propagation.cpp                  |   3 +-
>  src/mesa/drivers/dri/i965/Makefile.sources         |   1 +
>  src/mesa/drivers/dri/i965/brw_compiler.h           |   1 +
>  src/mesa/drivers/dri/i965/brw_context.h            |  17 +-
>  src/mesa/drivers/dri/i965/brw_cs.c                 |   2 +
>  src/mesa/drivers/dri/i965/brw_defines.h            |   4 +
>  src/mesa/drivers/dri/i965/brw_fs.h                 |   2 +
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp           | 147 ++++++
>  .../drivers/dri/i965/brw_fs_vector_splitting.cpp   |   1 +
>  src/mesa/drivers/dri/i965/brw_pipe_control.c       |   4 +-
>  src/mesa/drivers/dri/i965/brw_shader.cpp           |   3 +
>  src/mesa/drivers/dri/i965/brw_state.h              |   5 +
>  src/mesa/drivers/dri/i965/brw_state_upload.c       |   5 +
>  src/mesa/drivers/dri/i965/gen7_cs_state.c          |  12 +
>  src/mesa/drivers/dri/i965/gen7_l3_state.c          | 545 ++++++++++++++++++++
>  src/mesa/drivers/dri/i965/gen7_urb.c               |   3 +
>  src/mesa/drivers/dri/i965/intel_batchbuffer.c      |   7 +
>  src/mesa/drivers/dri/i965/intel_batchbuffer.h      |   6 +-
>  src/mesa/drivers/dri/i965/intel_debug.c            |   1 +
>  src/mesa/drivers/dri/i965/intel_debug.h            |   1 +
>  src/mesa/drivers/dri/i965/intel_extensions.c       |   8 +-
>  src/mesa/drivers/dri/i965/intel_reg.h              |  53 ++
>  src/mesa/main/mtypes.h                             |   7 +
>  37 files changed, 2352 insertions(+), 600 deletions(-)
>  create mode 100644 src/glsl/lower_buffer_access.cpp
>  create mode 100644 src/glsl/lower_buffer_access.h
>  create mode 100644 src/glsl/lower_shared_reference.cpp
>  create mode 100644 src/mesa/drivers/dri/i965/gen7_l3_state.c
> 
> --
> 2.6.2


