[Mesa-dev] [PATCH 00/11] nvc0: compute shaders for Kepler (GK104)
Samuel Pitoiset
samuel.pitoiset at gmail.com
Sat Feb 27 14:01:56 UTC 2016
Hi there,
This series adds support for ARB_compute_shader on Kepler GK104. GK110+ support
is still unstable and needs more work.
Almost all dEQP compute tests pass (~97%). As usual, the failures are listed
below. With piglit, only two tests fail, and both failures are related to the
missing image support.
By the way, the series is built on top of "nvc0: avoid using magic numbers for
the uniform_bo offsets".
Please review,
Thanks!
Samuel Pitoiset (11):
nvc0: use a different offset for buffers and surfaces
nvc0: bind driver cb for compute on c7[] for Kepler
nvc0: bind shader buffers for compute on Kepler
nvc0: bind constant buffers for compute on Kepler
nvc0: allow to use more than 7 UBOs for compute on Kepler
nvc0: bump the number of available UBOs for compute on Kepler
nvc0: reduce likelihood of collision for real buffers on Kepler
nvc0: add indirect compute support on Kepler
nvc0/ir: fix wrong pred emission for ld lock on GK104
nv50/ir: add atomics support on shared memory for Kepler
nvc0: enable compute shaders on Kepler (GK104)
.../drivers/nouveau/codegen/nv50_ir_driver.h | 2 +
.../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 5 +-
.../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 216 ++++++++++++++--
.../nouveau/codegen/nv50_ir_lowering_nvc0.h | 13 +-
src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 17 +-
src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 18 +-
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 4 +-
src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 1 -
src/gallium/drivers/nouveau/nvc0/nve4_compute.c | 282 +++++++++++++++++----
src/gallium/drivers/nouveau/nvc0/nve4_compute.h | 27 +-
10 files changed, 468 insertions(+), 117 deletions(-)
--
2.7.1
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/acos/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/atan2/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/distance/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/ldexp/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/length/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/refract/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/highp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/lowp_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/sqrt/mediump_compute/vec4: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/scalar: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec2: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec3: fail
deqp-gles31/functional/shaders/builtin_functions/precision/tanh/highp_compute/vec4: fail
Very likely related to sqrt. Same as Fermi.
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getboolean: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getfloat: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getinteger: fail
deqp-gles31/functional/state_query/integer/max_compute_image_uniforms_getinteger64: fail
deqp-gles31/functional/compute/basic/copy_image_to_ssbo_large: crash
deqp-gles31/functional/compute/basic/copy_image_to_ssbo_small: crash
deqp-gles31/functional/compute/basic/copy_ssbo_to_image_large: crash
deqp-gles31/functional/compute/basic/copy_ssbo_to_image_small: crash
deqp-gles31/functional/compute/basic/image_barrier_multiple: crash
deqp-gles31/functional/compute/basic/image_barrier_single: crash
No image support.
deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/isampler2darray: fail
deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/sampler2darrayshadow: fail
deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/sampler2dshadow: fail
deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/sampler3d: fail
deqp-gles31/functional/shaders/opaque_type_indexing/sampler/const_literal/compute/samplercubeshadow: fail
I don't know exactly what goes wrong with these tests, because all the other sampler tests pass.
My assumption is that these failures are related to using float instead of vec4. I'm not sure
whether this is related to the compute support, though. Needs investigation.
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_combined_grid_1000x1000_drawcount_5000: crash
deqp-gles31/functional/draw_indirect/compute_interop/large/drawelements_separate_grid_1000x1000_drawcount_5000: crash
We submit too fast and the kernel kills the pushbuf. The same tests fail on Fermi; this can be fixed later.
Unrelated to the compute support.