[Mesa-dev] [PATCH v2 00/20] add compute shaders support

Samuel Pitoiset samuel.pitoiset at gmail.com
Sat Feb 6 22:04:33 UTC 2016


Hi,

This series adds core support for ARB_compute_shader, which is required for
OpenGL 4.3. It is now based on mesa master, since Ilia has pushed his
ssbo+atomics work.

The nvc0 changes are still in my local tree, but the fermi support is now done.

In piglit, this passes all compute-related tests except two, which are
related to ARB_shader_image_load_store (Ilia is currently working on it).

In dEQP, there are 92 fails and 1383 passes. The fails are listed and
explained below.

Important changes in v2:
 - introduce TGSI_FILE_MEMORY and make use of it instead of TGSI_FILE_BUFFER
 - add PIPE_SHADER_CAP_SUPPORTED_IRS and enable ARB_compute_shader only if the
   underlying driver supports TGSI
 - do not reserve a buffer for shared memory, that's useless

Please review, thanks!

Ilia Mirkin (1):
  mesa: make compute maximums reflect driver-provided values

Samuel Pitoiset (19):
  mesa: do not use a constant for MAX_COMPUTE_SHARED_SIZE
  mesa: store shared size in gl_compute_program
  mesa: add PROGRAM_MEMORY
  gallium/cso: add support for compute shaders
  gallium: add a new interface for pipe_context::launch_grid()
  gallium: add indirect compute parameters to pipe_grid_info
  gallium: add PIPE_SHADER_CAP_SUPPORTED_IRS
  tgsi/ureg: add shared variables support for compute shaders
  st/mesa: add a second pipeline for compute
  st/mesa: add compute shader states
  st/mesa: add conversion for compute shaders
  st/mesa: add intrinsics for shared variables
  st/mesa: keep track of shared memory declarations
  st/mesa: add mappings for compute shader sysvals
  st/mesa: add state validation for compute shaders
  st/mesa: add compute program dispatch callbacks
  st/mesa: implement limits for ARB_compute_shader
  st/mesa: expose ARB_compute_shader when compute is supported
  trace: add all compute related functions

 src/compiler/glsl/builtin_variables.cpp            |  15 ++-
 src/compiler/glsl/glsl_parser_extras.cpp           |   7 +
 src/compiler/glsl/glsl_parser_extras.h             |   5 +
 src/compiler/glsl/main.cpp                         |   5 +
 src/gallium/auxiliary/cso_cache/cso_context.c      |  30 +++++
 src/gallium/auxiliary/cso_cache/cso_context.h      |   4 +
 src/gallium/auxiliary/gallivm/lp_bld_limits.h      |   2 +
 src/gallium/auxiliary/tgsi/tgsi_build.c            |   1 +
 src/gallium/auxiliary/tgsi/tgsi_dump.c             |   5 +
 src/gallium/auxiliary/tgsi/tgsi_exec.h             |   2 +
 src/gallium/auxiliary/tgsi/tgsi_strings.c          |   1 +
 src/gallium/auxiliary/tgsi/tgsi_text.c             |   3 +
 src/gallium/auxiliary/tgsi/tgsi_ureg.c             |  32 +++++
 src/gallium/auxiliary/tgsi/tgsi_ureg.h             |   3 +
 src/gallium/docs/source/screen.rst                 |   2 +
 src/gallium/drivers/freedreno/freedreno_screen.c   |   2 +
 src/gallium/drivers/ilo/ilo_gpgpu.c                |   8 +-
 src/gallium/drivers/ilo/ilo_screen.c               |   2 +
 src/gallium/drivers/nouveau/nv50/nv50_compute.c    |  16 +--
 src/gallium/drivers/nouveau/nv50/nv50_context.h    |   3 +-
 .../drivers/nouveau/nv50/nv50_query_hw_sm.c        |  12 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_compute.c    |  19 ++-
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h    |   6 +-
 .../drivers/nouveau/nvc0/nvc0_query_hw_sm.c        |  12 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c     |   2 +
 src/gallium/drivers/nouveau/nvc0/nve4_compute.c    |  10 +-
 src/gallium/drivers/r300/r300_screen.c             |   4 +
 src/gallium/drivers/r600/evergreen_compute.c       |  19 ++-
 src/gallium/drivers/r600/r600_pipe.c               |   2 +
 src/gallium/drivers/radeonsi/si_compute.c          |  33 +++--
 src/gallium/drivers/radeonsi/si_pipe.c             |   5 +
 src/gallium/drivers/svga/svga_screen.c             |   6 +
 src/gallium/drivers/trace/tr_context.c             |  75 +++++++++++
 src/gallium/drivers/trace/tr_dump_state.c          |  51 +++++++
 src/gallium/drivers/trace/tr_dump_state.h          |   4 +
 src/gallium/drivers/trace/tr_screen.c              |  25 ++++
 src/gallium/drivers/vc4/vc4_screen.c               |   2 +
 src/gallium/include/pipe/p_context.h               |  17 +--
 src/gallium/include/pipe/p_defines.h               |   1 +
 src/gallium/include/pipe/p_shader_tokens.h         |   4 +-
 src/gallium/include/pipe/p_state.h                 |  39 ++++++
 src/gallium/state_trackers/clover/core/kernel.cpp  |  13 +-
 src/gallium/tests/trivial/compute.c                |  11 +-
 src/mesa/Makefile.sources                          |   2 +
 src/mesa/main/config.h                             |  11 --
 src/mesa/main/get_hash_params.py                   |  14 +-
 src/mesa/main/mtypes.h                             |   7 +
 src/mesa/main/shaderapi.c                          |   1 +
 src/mesa/state_tracker/st_atom.c                   |  54 ++++++--
 src/mesa/state_tracker/st_atom.h                   |  11 +-
 src/mesa/state_tracker/st_atom_atomicbuf.c         |  18 +++
 src/mesa/state_tracker/st_atom_constbuf.c          |  46 ++++++-
 src/mesa/state_tracker/st_atom_sampler.c           |   8 ++
 src/mesa/state_tracker/st_atom_shader.c            |  36 +++++
 src/mesa/state_tracker/st_atom_storagebuf.c        |  21 +++
 src/mesa/state_tracker/st_atom_texture.c           |  26 ++++
 src/mesa/state_tracker/st_cb_bitmap.c              |   2 +-
 src/mesa/state_tracker/st_cb_clear.c               |   2 +-
 src/mesa/state_tracker/st_cb_compute.c             |  85 ++++++++++++
 src/mesa/state_tracker/st_cb_compute.h             |  38 ++++++
 src/mesa/state_tracker/st_cb_drawpixels.c          |   4 +-
 src/mesa/state_tracker/st_cb_drawtex.c             |   2 +-
 src/mesa/state_tracker/st_cb_msaa.c                |   2 +-
 src/mesa/state_tracker/st_cb_program.c             |  28 ++++
 src/mesa/state_tracker/st_cb_rasterpos.c           |   2 +-
 src/mesa/state_tracker/st_cb_readpixels.c          |   2 +-
 src/mesa/state_tracker/st_context.c                |   9 ++
 src/mesa/state_tracker/st_context.h                |  12 ++
 src/mesa/state_tracker/st_draw.c                   |   4 +-
 src/mesa/state_tracker/st_draw_feedback.c          |   2 +-
 src/mesa/state_tracker/st_extensions.c             |  39 +++++-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp         | 130 +++++++++++++++++-
 src/mesa/state_tracker/st_program.c                | 147 +++++++++++++++++++++
 src/mesa/state_tracker/st_program.h                |  63 +++++++++
 74 files changed, 1206 insertions(+), 142 deletions(-)
 create mode 100644 src/mesa/state_tracker/st_cb_compute.c
 create mode 100644 src/mesa/state_tracker/st_cb_compute.h

-- 
2.6.4

** PIGLIT **

MESA_GL_VERSION_OVERRIDE=4.2 ./piglit-run.py -1 --dmesg -t arb_compute_shader
tests/all.py
 
 spec/arb_compute_shader/built-in constants/gl_MaxComputeImageUniforms: fail
 spec/arb_compute_shader/minmax: fail

** DEQP **

MESA_GLES_VERSION_OVERRIDE=3.1 ./piglit-run.py -1 --dmesg -t compute
tests/deqp_gles31.py

deqp-gles31/functional/compute/basic/copy_image_to_ssbo_large: fail
deqp-gles31/functional/compute/basic/copy_image_to_ssbo_small: fail
deqp-gles31/functional/compute/basic/copy_ssbo_to_image_large: fail
deqp-gles31/functional/compute/basic/copy_ssbo_to_image_small: fail
deqp-gles31/functional/compute/basic/image_barrier_multiple: fail
deqp-gles31/functional/compute/basic/image_barrier_single: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getboolean: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getfloat: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getinteger: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getinteger64: fail

No image support.

deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_combined_grid_200x200_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_combined_grid_200x200_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_combined_grid_200x200_drawcount_800: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_combined_grid_500x500_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_combined_grid_500x500_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_separate_grid_200x200_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_separate_grid_200x200_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_separate_grid_200x200_drawcount_800: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_separate_grid_500x500_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawarrays_separate_grid_500x500_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_1000x1000_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_1000x1000_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_1200x1200_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_1200x1200_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_500x500_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_500x500_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_1000x1000_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_1000x1000_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_1200x1200_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_1200x1200_drawcount_8: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_500x500_drawcount_1: fail
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_500x500_drawcount_8: fail

Some other indirect draw+compute tests pass; these ones fail for no apparent reason.

deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec4: fail

We don't know exactly what's going wrong here, but this is definitely not related to
the compute support.

