[Mesa-dev] [PATCH 00/17] st/mesa: add compute shaders support

Samuel Pitoiset samuel.pitoiset at gmail.com
Sun Jan 24 13:09:35 PST 2016


Hello,

This series adds core support for ARB_compute_shader, which is required for
OpenGL 4.3. It is based on Ilia's ssbo/atomics work, which introduces
ARB_shader_atomic_counters and ARB_shader_storage_buffer_object.

I have not included the nvc0 changes in this series, but compute currently
somewhat works on Fermi. I still need to fix some failures on Fermi/Kepler,
but they are unrelated to the core API changes.

In piglit, this passes all compute-related tests except two, which depend on
ARB_shader_image_load_store (Ilia is currently working on it).

In dEQP, there are 147 fails and 1333 passes, so this seems to work (more or
less). The piglit and dEQP failures are listed and explained below.
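
To put the dEQP numbers in perspective, here is a quick calculation. It is
only a sketch, and the helper name is made up for illustration: 1333 passes
out of 1333 + 147 = 1480 results is a pass rate of roughly 90%.

```c
#include <assert.h>

/* Pass rate in percent, computed from raw pass/fail counts.
 * Hypothetical helper, not part of piglit or dEQP. */
static double pass_rate_percent(unsigned passes, unsigned fails)
{
    unsigned total = passes + fails;

    assert(total > 0);                 /* avoid dividing by zero */
    return 100.0 * passes / total;     /* promoted to double before dividing */
}
```

Calling pass_rate_percent(1333, 147) gives a value just above 90.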

I am not happy with the last patch of this series, which enables
ARB_compute_shader: PIPE_CAP_COMPUTE is already turned on for some other
drivers even though they don't *really* support compute shaders. Comments are
very welcome here.
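
One way to picture the problem, purely as an illustrative sketch: the types
and names below are made up, and this is neither code from this series nor the
gallium interface. If a driver sets the compute cap for some other kind of
compute program, the cap alone cannot tell the state tracker whether GL
compute shaders will work, so a second piece of information is needed. Here
that second piece is modeled as a bitmask of accepted shader IRs, which is an
assumption made only for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real gallium pipe_screen interface. */
enum fake_ir { FAKE_IR_TGSI = 0, FAKE_IR_NATIVE = 1 };

struct fake_screen {
    bool cap_compute;      /* what a PIPE_CAP_COMPUTE-style query reports */
    unsigned compute_irs;  /* bitmask of IRs accepted for compute programs */
};

/* Expose GL compute only if the cap is set AND the driver accepts the IR
 * the GL state tracker would hand it (assumed to be TGSI in this sketch). */
static bool can_expose_gl_compute(const struct fake_screen *s)
{
    assert(s != NULL);

    if (!s->cap_compute)
        return false;
    return (s->compute_irs & (1u << FAKE_IR_TGSI)) != 0;
}
```

With this model, a screen that reports the cap but only accepts
FAKE_IR_NATIVE would not get ARB_compute_shader, while one that accepts
FAKE_IR_TGSI would.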

Please review, thanks!

Ilia Mirkin (1):
  mesa: make compute maximums reflect driver-provided values

Samuel Pitoiset (16):
  mesa: do not use a constant for MAX_COMPUTE_SHARED_SIZE
  gallium/cso: add support for compute shaders
  gallium: disable compute shaders for meta ops
  gallium: reserve one shader buffer for shared storage
  gallium: add a new interface for pipe_context::launch_grid()
  gallium: add indirect compute parameters to pipe_grid_info
  tgsi/ureg: add shared variables support for compute shaders
  st/mesa: add a second pipeline for compute
  st/mesa: add compute shader states
  st/mesa: add conversion for compute shaders
  st/mesa: add intrinsics for shared variables
  st/mesa: add mappings for compute shader sysvals
  st/mesa: add state validation for compute shaders
  st/mesa: add a state tracker for compute
  st/mesa: implement limits for ARB_compute_shader
  st/mesa: expose ARB_compute_shader when compute is supported

 src/gallium/auxiliary/cso_cache/cso_context.c      |  53 ++++++++
 src/gallium/auxiliary/cso_cache/cso_context.h      |   6 +
 src/gallium/auxiliary/hud/hud_context.c            |   3 +
 src/gallium/auxiliary/postprocess/pp_run.c         |   3 +
 src/gallium/auxiliary/tgsi/tgsi_build.c            |   1 +
 src/gallium/auxiliary/tgsi/tgsi_dump.c             |   2 +
 src/gallium/auxiliary/tgsi/tgsi_text.c             |   3 +
 src/gallium/auxiliary/tgsi/tgsi_ureg.c             |  11 +-
 src/gallium/auxiliary/tgsi/tgsi_ureg.h             |   3 +-
 src/gallium/auxiliary/util/u_blit.c                |   3 +
 src/gallium/drivers/ilo/ilo_gpgpu.c                |   8 +-
 src/gallium/drivers/nouveau/nv50/nv50_compute.c    |  16 +--
 src/gallium/drivers/nouveau/nv50/nv50_context.h    |   3 +-
 .../drivers/nouveau/nv50/nv50_query_hw_sm.c        |  12 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_compute.c    |  19 ++-
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h    |   6 +-
 .../drivers/nouveau/nvc0/nvc0_query_hw_sm.c        |  12 +-
 src/gallium/drivers/nouveau/nvc0/nve4_compute.c    |  10 +-
 src/gallium/drivers/r600/evergreen_compute.c       |  19 ++-
 src/gallium/drivers/radeonsi/si_compute.c          |  33 +++--
 src/gallium/include/pipe/p_context.h               |  17 +--
 src/gallium/include/pipe/p_shader_tokens.h         |   3 +-
 src/gallium/include/pipe/p_state.h                 |  41 +++++-
 src/gallium/include/state_tracker/st_api.h         |   8 ++
 src/gallium/state_trackers/clover/core/kernel.cpp  |  13 +-
 src/gallium/tests/trivial/compute.c                |  11 +-
 src/glsl/builtin_variables.cpp                     |  15 ++-
 src/glsl/glsl_parser_extras.cpp                    |   7 +
 src/glsl/glsl_parser_extras.h                      |   5 +
 src/glsl/main.cpp                                  |   5 +
 src/mesa/Makefile.sources                          |   2 +
 src/mesa/main/config.h                             |  11 --
 src/mesa/main/get_hash_params.py                   |  14 +-
 src/mesa/main/mtypes.h                             |   6 +
 src/mesa/main/shaderapi.c                          |   1 +
 src/mesa/state_tracker/st_atom.c                   |  54 ++++++--
 src/mesa/state_tracker/st_atom.h                   |  10 +-
 src/mesa/state_tracker/st_atom_atomicbuf.c         |  18 +++
 src/mesa/state_tracker/st_atom_constbuf.c          |  46 ++++++-
 src/mesa/state_tracker/st_atom_sampler.c           |   8 ++
 src/mesa/state_tracker/st_atom_shader.c            |  36 +++++
 src/mesa/state_tracker/st_atom_storagebuf.c        |  21 +++
 src/mesa/state_tracker/st_atom_texture.c           |  26 ++++
 src/mesa/state_tracker/st_cb_bitmap.c              |   5 +-
 src/mesa/state_tracker/st_cb_clear.c               |   5 +-
 src/mesa/state_tracker/st_cb_compute.c             |  85 ++++++++++++
 src/mesa/state_tracker/st_cb_compute.h             |  38 ++++++
 src/mesa/state_tracker/st_cb_drawpixels.c          |   7 +-
 src/mesa/state_tracker/st_cb_drawtex.c             |   5 +-
 src/mesa/state_tracker/st_cb_msaa.c                |   2 +-
 src/mesa/state_tracker/st_cb_program.c             |  28 ++++
 src/mesa/state_tracker/st_cb_rasterpos.c           |   2 +-
 src/mesa/state_tracker/st_cb_readpixels.c          |   2 +-
 src/mesa/state_tracker/st_context.c                |   9 ++
 src/mesa/state_tracker/st_context.h                |   4 +
 src/mesa/state_tracker/st_draw.c                   |   4 +-
 src/mesa/state_tracker/st_draw_feedback.c          |   2 +-
 src/mesa/state_tracker/st_extensions.c             |  41 +++++-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp         | 124 ++++++++++++++++-
 src/mesa/state_tracker/st_program.c                | 147 +++++++++++++++++++++
 src/mesa/state_tracker/st_program.h                |  63 +++++++++
 61 files changed, 1024 insertions(+), 153 deletions(-)
 create mode 100644 src/mesa/state_tracker/st_cb_compute.c
 create mode 100644 src/mesa/state_tracker/st_cb_compute.h

** PIGLIT **

MESA_GL_VERSION_OVERRIDE=4.2 ./piglit-run.py -1 --dmesg -t arb_compute_shader tests/all.py
 
spec/arb_compute_shader/built-in constants/gl_MaxComputeImageUniforms: fail
spec/arb_compute_shader/minmax: fail

No image support.

** DEQP **

MESA_GLES_VERSION_OVERRIDE=3.1 ./piglit-run.py -1 --dmesg -t compute tests/deqp_gles31.py

deqp-gles31/functional/compute/basic/copy_image_to_ssbo_large: fail
deqp-gles31/functional/compute/basic/copy_image_to_ssbo_small: fail
deqp-gles31/functional/compute/basic/copy_ssbo_to_image_large: fail
deqp-gles31/functional/compute/basic/copy_ssbo_to_image_small: fail
deqp-gles31/functional/compute/basic/image_barrier_multiple: fail
deqp-gles31/functional/compute/basic/image_barrier_single: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getboolean: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getfloat: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getinteger: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getinteger64: fail

No image support.

deqp-gles31/functional/compute/basic/shared_atomic_op_multiple_groups: fail
deqp-gles31/functional/compute/basic/shared_atomic_op_single_group: fail
deqp-gles31/functional/compute/shared_var/atomic/add/highp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/add/highp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/add/lowp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/add/lowp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/add/mediump_int: fail
deqp-gles31/functional/compute/shared_var/atomic/add/mediump_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/and/highp_int: dmesg-fail
deqp-gles31/functional/compute/shared_var/atomic/and/highp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/and/lowp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/and/lowp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/and/mediump_int: fail
deqp-gles31/functional/compute/shared_var/atomic/and/mediump_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/compswap/highp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/compswap/highp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/compswap/lowp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/compswap/lowp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/compswap/mediump_int: fail
deqp-gles31/functional/compute/shared_var/atomic/compswap/mediump_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/exchange/highp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/exchange/highp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/exchange/lowp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/exchange/lowp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/exchange/mediump_int: fail
deqp-gles31/functional/compute/shared_var/atomic/exchange/mediump_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/max/highp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/max/highp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/max/lowp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/max/lowp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/max/mediump_int: fail
deqp-gles31/functional/compute/shared_var/atomic/max/mediump_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/min/highp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/min/highp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/min/lowp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/min/lowp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/min/mediump_int: fail
deqp-gles31/functional/compute/shared_var/atomic/min/mediump_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/or/highp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/or/highp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/or/lowp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/or/lowp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/or/mediump_int: fail
deqp-gles31/functional/compute/shared_var/atomic/or/mediump_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/xor/highp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/xor/highp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/xor/lowp_int: fail
deqp-gles31/functional/compute/shared_var/atomic/xor/lowp_uint: fail
deqp-gles31/functional/compute/shared_var/atomic/xor/mediump_int: fail
deqp-gles31/functional/compute/shared_var/atomic/xor/mediump_uint: fail

I still need to fix the use of shared variables with atomics on Fermi.

deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_combined_grid_200x200_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_combined_grid_200x200_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_combined_grid_200x200_drawcount_800: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_combined_grid_500x500_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_combined_grid_500x500_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_separate_grid_200x200_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_separate_grid_200x200_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_separate_grid_200x200_drawcount_800: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_separate_grid_500x500_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_separate_grid_500x500_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_1000x1000_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_1000x1000_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_1200x1200_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_1200x1200_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_500x500_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_500x500_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_1000x1000_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_1000x1000_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_1200x1200_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_1200x1200_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_500x500_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_500x500_drawcount_8: fail

Some other indirect draw+compute tests pass; these ones fail for no apparent
reason.

deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec4: fail

We don't know exactly what's going wrong here, but this is definitely related
to precision...

-- 
2.6.4


