[Mesa-dev] [PATCH 00/11] nvc0: compute shaders for Kepler (GK104)

Samuel Pitoiset samuel.pitoiset at gmail.com
Sat Feb 27 14:01:56 UTC 2016


Hi there,

This series adds support for ARB_compute_shader on Kepler GK104. GK110+ support
is still unstable and need more work.

Almost all dEQP compute tests pass with a ratio of ~97%. As usual, the list of
fails is described below. About piglit, only two tests fail but this is
related to images support.

By the way, the series is built on top of "nvc0: avoid using magic numbers for
the uniform_bo offsets".

Please review,
Thanks!

Samuel Pitoiset (11):
  nvc0: use a different offset for buffers and surfaces
  nvc0: bind driver cb for compute on c7[] for Kepler
  nvc0: bind shader buffers for compute on Kepler
  nvc0: bind constant buffers for compute on Kepler
  nvc0: allow to use more than 7 UBOs for compute on Kepler
  nvc0: bump the number of available UBOs for compute on Kepler
  nvc0: reduce likelihood of collision for real buffers on Kepler
  nvc0: add indirect compute support on Kepler
  nvc0/ir: fix wrong pred emission for ld lock on GK104
  nv50/ir: add atomics support on shared memory for Kepler
  nvc0: enable compute shaders on Kepler (GK104)

 .../drivers/nouveau/codegen/nv50_ir_driver.h       |   2 +
 .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp  |   5 +-
 .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp      | 216 ++++++++++++++--
 .../nouveau/codegen/nv50_ir_lowering_nvc0.h        |  13 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h    |  17 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c    |  18 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c     |   4 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.h     |   1 -
 src/gallium/drivers/nouveau/nvc0/nve4_compute.c    | 282 +++++++++++++++++----
 src/gallium/drivers/nouveau/nvc0/nve4_compute.h    |  27 +-
 10 files changed, 468 insertions(+), 117 deletions(-)

-- 
2.7.1



deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec4: fail

Very likely related to sqrt. Same as Fermi.

deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getboolean: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getfloat: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getinteger: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getinteger64: fail
deqp-gles31/functional/compute/basic/copy_image_to_ssbo_large: crash
deqp-gles31/functional/compute/basic/copy_image_to_ssbo_small: crash
deqp-gles31/functional/compute/basic/copy_ssbo_to_image_large: crash
deqp-gles31/functional/compute/basic/copy_ssbo_to_image_small: crash
deqp-gles31/functional/compute/basic/image_barrier_multiple: crash
deqp-gles31/functional/compute/basic/image_barrier_single: crash

No images support.

deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/isampler2darray: fail
deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/sampler2darrayshadow: fail
deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/sampler2dshadow: fail
deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/sampler3d: fail
deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/samplercubeshadow: fail

I don't exactly know what happens with those tests because all other samplers ones pass.
My assumption is that those fails are related to use float instead of vec4. Not sure if
it's related to the compute support though. Need investigation.

deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_1000x1000_drawcount_5000: crash
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_1000x1000_drawcount_5000: crash

We submit too fast and the kernel kills the pushbuf. Same fails on Fermi, can be fixed later.
Unrelated to the compute support.


More information about the mesa-dev mailing list