[Mesa-dev] [PATCH 00/74] ARB_shader_storage_buffer_object (mesa, i965)

Iago Toral Quiroga itoral at igalia.com
Thu May 14 07:06:03 PDT 2015


Hi,

this series brings support for OpenGL 4.3
ARB_shader_storage_buffer_object [1]. It includes the Mesa/GLSL frontend
bits as well as the Intel i965 driver implementation.

The extension provides a new kind of buffer called shader storage buffer
object (SSBO), which is similar to UBOs but:
1. Is writable
2. Allows a number of atomic operations
3. Allows an optional unsized array as the last member of its definition

This series was developed by Samuel Iglesias and myself, based on
initial code by Kristian Høgsberg.

Development branch with the patches and some dependencies (*):
git clone -b itoral-ARB_shader_storage_buffer_object-v1.0 https://github.com/Igalia/mesa.git

(*) The i965 implementation needs to use untyped read/write messages
to implement SSBO reads/writes which are also used in the implementation
of ARB_shader_image_load_store that Curro is working on. The branch
linked above includes these patches from Curro as well as a couple of
general bugfixes (not necessarily SSBO-specific) from Tapani and Antia
that are necessary for correct behavior in some scenarios.

Piglit repository including SSBO tests:
git clone -b arb_shader_storage_buffer_object-v1 https://github.com/Igalia/piglit.git

=== General notes about the implementation ===

Because SSBOs are very similar to UBOs the implementation attempts to
reuse the code we already have for UBOs wherever we can. There is a lot
of code in the GLSL compiler to deal with UBOs, so we do not want to
exactly duplicate that. An "is buffer" flag is added if needed when
reusing UBO data structures so we can tell if a given instance
represents uniforms or buffers.
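
As a rough illustration of the flag idea (not the actual Mesa data
structures), the sketch below shares one descriptor type between uniform
blocks and shader storage blocks and tells them apart with a boolean.
The names block_desc, is_buffer and count_blocks are hypothetical and
chosen only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor shared by uniform blocks and shader storage
 * blocks. Only the flag distinguishes the two kinds.
 */
struct block_desc {
   const char *name;
   bool is_buffer;   /* true: shader storage block, false: uniform block */
};

/* Count each kind of block separately, e.g. so that each count can be
 * checked against its own resource limit.
 */
void
count_blocks(const struct block_desc *blocks, size_t num_blocks,
             unsigned *num_ubos, unsigned *num_ssbos)
{
   assert(num_ubos != NULL && num_ssbos != NULL);
   assert(blocks != NULL || num_blocks == 0);

   *num_ubos = 0;
   *num_ssbos = 0;
   for (size_t i = 0; i < num_blocks; i++) {
      if (blocks[i].is_buffer)
         (*num_ssbos)++;
      else
         (*num_ubos)++;
   }
}
```

Code that does not care about the difference can keep treating every
entry the same way, and only the places that need to distinguish the two
kinds have to look at the flag.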

The lower_ubo_reference pass is also updated to detect SSBO reads
(lowered to ir_binop_ssbo_load expressions) and writes (lowered to a
new IR node ir_ssbo_store that drivers can detect and implement).

Since SSBOs are writeable, various optimization passes had to be altered
accordingly. For example, we cannot kill dead assignments to buffer
variables, or do CSE on ssbo load expressions, etc.
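
To make the dead-assignment case concrete, here is a toy model of the
check. It is a sketch under simplifying assumptions, not the code from
opt_dead_code.cpp: the enum, the struct and the function name are
hypothetical. The point is that a write to a buffer variable is visible
outside the shader, so the absence of later reads in the shader does not
make the assignment dead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical storage classes, loosely modelled on variable modes. */
enum var_mode {
   VAR_TEMPORARY,
   VAR_UNIFORM,
   VAR_BUFFER    /* variable declared inside a 'buffer' block */
};

/* Hypothetical per-variable summary gathered by an earlier analysis. */
struct var_info {
   enum var_mode mode;
   unsigned reads_after_last_write;
};

/* Return true if the last assignment to 'var' may be removed. */
bool
can_kill_assignment(const struct var_info *var)
{
   assert(var != NULL);

   /* Writes to buffer variables land in memory that the application or
    * other shader invocations can read, so they are never dead.
    */
   if (var->mode == VAR_BUFFER)
      return false;

   return var->reads_after_last_write == 0;
}
```

The same reasoning applies to CSE: two loads from the same SSBO location
may return different values if a write happens in between, so they
cannot be treated as a common subexpression.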

Other features include: interactions with ARB_program_interface_query,
support for a new std430 layout mode specific to SSBOs, memory
qualifiers from ARB_shader_image_load_store applicable to buffer
variables (also at layout level) and new atomic operations that can be
used with integer buffer variables.
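
For readers unfamiliar with std430, the most visible difference from
std140 is the array stride of scalars and small vectors: std140 rounds
it up to the alignment of a vec4, std430 does not. The sketch below
computes the stride for arrays of 32-bit scalars and vectors only
(matrices and structs are left out). It reflects my reading of the
layout rules; the function and enum names are hypothetical and not taken
from the patches.

```c
#include <assert.h>

enum packing {
   PACKING_STD140,
   PACKING_STD430
};

/* Base alignment, in bytes, of a vector with 'components' 32-bit
 * components: N for scalars, 2N for 2-component vectors and 4N for
 * 3- and 4-component vectors, with N = 4.
 */
unsigned
vector_base_alignment(unsigned components)
{
   assert(components >= 1 && components <= 4);

   if (components == 1)
      return 4;
   if (components == 2)
      return 8;
   return 16;
}

/* Stride, in bytes, between consecutive elements of an array of such
 * scalars or vectors.
 */
unsigned
array_stride(unsigned components, enum packing packing)
{
   unsigned stride = vector_base_alignment(components);

   /* std140 rounds the element alignment up to that of a vec4. */
   if (packing == PACKING_STD140 && stride < 16)
      stride = 16;

   return stride;
}
```

With these rules a float array has a stride of 16 bytes under std140 and
4 bytes under std430, while a vec3 array has a stride of 16 bytes in
both.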

Notice that NIR is not supported yet, so anyone wanting to test this on
i965 needs to set INTEL_USE_NIR=0.

=== Comments for reviewers ===

The i965 implementation was developed and tested on Haswell and
IvyBridge; other platforms might require small tweaks. Specifically,
the message used in the implementation of the unsized array length()
function requires a header on Skylake.
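
For reference, the value that length() has to produce for an unsized
array can be sketched on the CPU as follows. This follows my
understanding of the extension spec, where the length is derived from
the size of the bound buffer range, the offset of the array within the
block and the array stride, clamped at zero. It is only an illustration
of the arithmetic, not the i965 code, and the function and parameter
names are hypothetical.

```c
#include <assert.h>

/* Number of elements of an unsized array that fit in the bound buffer.
 *
 * buffer_size:  size in bytes of the bound buffer range
 * array_offset: offset in bytes of the unsized array within the block
 * array_stride: distance in bytes between consecutive elements
 */
unsigned
unsized_array_length(unsigned buffer_size, unsigned array_offset,
                     unsigned array_stride)
{
   assert(array_stride > 0);

   /* Clamp at zero if the buffer ends before the array starts. */
   if (buffer_size <= array_offset)
      return 0;

   return (buffer_size - array_offset) / array_stride;
}
```

For example, a 64-byte buffer with the array starting at offset 16 and
a stride of 4 bytes gives a length of 12. Note that the stride depends
on the layout mode of the block.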

Both Mesa and i965 need to give default values for certain things
like the maximum allowed size of a shader storage buffer, the maximum
number of buffer bindings, the maximum number of combined shader storage
blocks, etc. We are not sure what default values we should use for all
of these, so except in a few cases we copied the default values from
uniforms. That may be fine or not, so let us know if we should use
different values for any of these settings at the Mesa or i965 levels.

The i965 implementation gets SSBO reads/writes to unaligned offsets
right, even in the cases where there is expression-based indexing into
arrays. Notice that this does not currently work with UBOs, so maybe we
want to extend the implementation based on untyped read messages to UBOs
as well so we can fix this, merging the ssbo_load and ubo_load code
paths in the visitor code. That would make UBO loads go through the data
cache instead of using sampler messages, but I guess that should not be
a problem.

We did not include GL_MAX_TESS_CONTROL_SHADER_STORAGE_BLOCKS and
GL_MAX_TESS_EVALUATION_SHADER_STORAGE_BLOCKS because Mesa does not
support tessellation shaders yet.

We did not add support for glShaderStorageBlockBinding in display lists
even though glUniformBlockBinding is supported. This is because
ARB_uniform_buffer_object explicitly mentions this for
glUniformBlockBinding, but ARB_shader_storage_buffer_object has no
equivalent mention for glShaderStorageBlockBinding.

Since memory qualifiers were introduced with ARB_shader_image_load_store,
qualifier fields in some structures are prefixed with 'image_'. We did
not change that here, but we probably want to rename them now that SSBOs
can also use them. I think we can do this with a later patch.

=== Piglit ===

There are no piglit regressions except for one introduced with patch
61 that is actually a bug fix for incorrect behavior in the current UBO
implementation.  The regressed piglit test expects the old (incorrect)
behavior. There is a patch [2] in the piglit mailing list to fix that
test.

=== Quick patch reference ===

Patches 01-13: Extension bringup, compiler bits for SSBOs and buffer
               variables (mesa)
Patches 14-19: GL_SHADER_STORAGE_BUFFER target (mesa)
Patches 20-25: edit a few optimization passes to play well with buffer
               variables (mesa)
Patches 26-28: add the lowering of ssbo writes to ir_ssbo_store (mesa)
Patches 29-34: driver implementation bits for SSBO buffers and buffer
               variables (i965)
Patches 35-38: support for SSBO unsized arrays (mesa, i965)
Patches 39-40: buffer-related bugfixes for UBOs and SSBOs (i965)
Patches 41-44: add std430 layout mode for SSBOs (mesa)
Patches 45-46: shader storage buffer object resource usage limit checks
Patches 47-52: implement SSBO reads and writes (i965)
Patches 53-58: implement SSBO atomics (mesa, i965)
Patch   59:    add glShaderStorageBlockBinding (mesa)
Patches 60-61: buffer queries for SSBOs and fix for the same queries on
               UBOs (mesa)
Patches 62-66: memory qualifiers with SSBOs (mesa)
Patch   67:    interaction with ARB_program_interface_query
Patch   68:    test tokens for ARB_shader_storage_buffer_object
Patch   69:    GLAPI for ARB_shader_storage_buffer_object
Patch   70:    getters for ARB_shader_storage_buffer_object max
               constants (mesa)
Patch   71:    enable ARB_shader_storage_buffer_object for gen7+
Patch   72:    Mark ARB_shader_storage_buffer_object as done for i965
Patch   73:    Fix instance blocks with inactive elements (general
               bugfix)
Patch   74:    Skip dependency control for opcodes emitting multiple
               instructions (i965/vec4, optional)

Patch 73 is not necessary, but fixes a bug that exists in master with
instance blocks and UBOs that we also hit while testing instance blocks
with SSBOs. It was part of one of our dEQP batches but was never
reviewed.

Patch 74 is not necessary either. At some point during development I
needed it, but the final implementation does not require it any more.
Considering that we have that same patch on the FS and that the problem
it fixes can be hit in vec4 for the same reasons, I felt like it could
make sense to merge it anyway.

[1] https://www.opengl.org/registry/specs/ARB/shader_storage_buffer_object.txt
[2] http://lists.freedesktop.org/archives/piglit/2015-May/015972.html

Antia Puentes (1):
  glsl: Consider active all elements of a shared/std140 block array

Iago Toral Quiroga (41):
  glsl: Identify active uniform blocks that are buffer blocks as such.
  mesa: Add shader storage buffer support to struct gl_context
  mesa: Initialize and free shader storage buffers
  mesa: Implement _mesa_DeleteBuffers for target
    GL_SHADER_STORAGE_BUFFER
  mesa: Implement _mesa_BindBuffersBase for target
    GL_SHADER_STORAGE_BUFFER
  mesa: Implement _mesa_BindBuffersRange for target
    GL_SHADER_STORAGE_BUFFER
  mesa: Implement _mesa_BindBufferBase for target
    GL_SHADER_STORAGE_BUFFER
  mesa: Implement _mesa_BindBufferRange for target
    GL_SHADER_STORAGE_BUFFER
  glsl: Don't do tree grafting on buffer variables
  glsl: Do not kill dead assignments to buffer variables or SSBO
    declarations.
  glsl: Do not do CSE for expressions involving SSBO loads
  glsl: Don't do constant propagation on buffer variables
  glsl: Don't do constant variable on buffer variables
  glsl: Don't do copy propagation on buffer variables
  mesa: Add new IR node ir_ssbo_store
  glsl: Lower shader storage buffer object writes to ir_ssbo_store
  glsl: Do constant folding on ir_ssbo_store
  i965: Use 16-byte offset alignment for shader storage buffers
  i965: Implement DriverFlags.NewShaderStorageBuffer
  i965: Set MaxShaderStorageBuffers for compute shaders
  i965: Upload Shader Storage Buffer Object surfaces
  i965: handle visiting of ir_var_buffer variables
  i965/fs: Do not split buffer variables
  i965/fs: Implement SSBO writes
  i965/fs: Implement SSBO reads
  i965/fs: Do not include the header with a pixel mask in untyped read
    messages
  i965/vec4: Implement SSBO writes
  i965/vec4: Implement SSBO reads
  glsl: Rename atomic counter functions
  glsl: Add atomic functions from ARB_shader_storage_buffer_object
  i965/vec4: Implement shader storage buffer object atomic intrinsics
  i965/fs: Implement shader storage buffer object atomic intrinsics
  glsl: First argument to atomic functions must be a buffer variable
  mesa: Add queries for GL_SHADER_STORAGE_BUFFER
  glsl: Allow use of memory qualifiers with
    ARB_shader_storage_buffer_object.
  glsl: Apply memory qualifiers to buffer variables
  glsl: Allow memory layout qualifiers on shader storage buffer objects
  glsl: Do not allow assignments to read-only variables
  glsl: Do not allow reads from write-only variables
  docs: Mark ARB_shader_storage_buffer_object as done for i965.
  i965/vec4: Skip dependency control for opcodes emitting multiple
    instructions

Kristian Høgsberg (7):
  glsl: Add ir_var_buffer
  glsl: Implement parser support for 'buffer' qualifier
  glsl: link buffer variables and shader storage buffer interface blocks
  glsl: Add ir_binop_ssbo_load expression operation.
  glsl: lower SSBO reads to ir_binop_ssbo_load expressions
  i965: do not emit_bool_to_cond_code with ssbo load expressions
  glsl: atomic counters can be declared as buffer-qualified variables

Samuel Iglesias Gonsalvez (25):
  mesa: define ARB_shader_storage_buffer_object extension
  mesa: add MaxShaderStorageBlocks to struct gl_program_constants
  glsl: enable binding layout qualifier usage for shader storage buffer
    objects
  glsl: shader buffer variables cannot have initializers
  glsl: buffer variables cannot be defined outside interface blocks
  glsl: fix error messages in invalid declarations of shader storage
    blocks
  glsl: add support for unsized arrays in shader storage blocks
  glsl: Add parser/compiler support for unsized array's length()
  i965/vec4: Implement unsized array's length calculation
  i965/fs: Implement unsized array's length calculation
  i965/wm: emit null buffer surfaces when null buffers are attached
  i965/wm: surfaces should have the API buffer size, not the drm buffer
    size
  glsl: Add parser/compiler support for std430 interface packing
    qualifier
  glsl: propagate interface packing information to arrays of scalars,
    vectors.
  glsl: propagate std430 packing qualifier to struct's members and array
    of structs
  glsl: add std430 interface packing support to ssbo writes and unsized
    array length
  glsl: a shader storage buffer must be smaller than the maximum size
    allowed
  glsl: number of active shader storage blocks must be within allowed
    limits
  mesa: add glShaderStorageBlockBinding()
  glsl: fix UNIFORM_BUFFER_START or UNIFORM_BUFFER_SIZE query when no
    buffer object is bound
  main: Add SHADER_STORAGE_BLOCK and BUFFER_VARIABLE support for
    ARB_program_interface_query
  main/tests: add ARB_shader_storage_buffer_object tokens to
    enum_strings
  glapi: add ARB_shader_storage_block_buffer_object
  mesa: Add getters for the GL_ARB_shader_storage_buffer_object max
    constants
  i965: Enable ARB_shader_storage_buffer_object extension for gen7+

 docs/GL3.txt                                       |   2 +-
 src/glsl/ast.h                                     |  12 +
 src/glsl/ast_array_index.cpp                       |   6 +-
 src/glsl/ast_function.cpp                          |  37 ++
 src/glsl/ast_to_hir.cpp                            | 362 ++++++++++--
 src/glsl/ast_type.cpp                              |   4 +-
 src/glsl/builtin_functions.cpp                     | 215 ++++++-
 src/glsl/builtin_types.cpp                         |   3 +-
 src/glsl/builtin_variables.cpp                     |   5 +-
 src/glsl/glcpp/glcpp-parse.y                       |   3 +
 src/glsl/glsl_lexer.ll                             |  11 +-
 src/glsl/glsl_parser.yy                            | 110 +++-
 src/glsl/glsl_parser_extras.cpp                    |  65 ++-
 src/glsl/glsl_parser_extras.h                      |   7 +
 src/glsl/glsl_symbol_table.cpp                     |  16 +-
 src/glsl/glsl_types.cpp                            | 203 +++++--
 src/glsl/glsl_types.h                              |  48 +-
 src/glsl/hir_field_selection.cpp                   |  15 +-
 src/glsl/ir.cpp                                    |  14 +
 src/glsl/ir.h                                      |  82 ++-
 src/glsl/ir_function.cpp                           |   1 +
 src/glsl/ir_hierarchical_visitor.cpp               |  18 +
 src/glsl/ir_hierarchical_visitor.h                 |   2 +
 src/glsl/ir_hv_accept.cpp                          |  23 +
 src/glsl/ir_print_visitor.cpp                      |  15 +-
 src/glsl/ir_print_visitor.h                        |   1 +
 src/glsl/ir_reader.cpp                             |   2 +
 src/glsl/ir_rvalue_visitor.cpp                     |  21 +
 src/glsl/ir_rvalue_visitor.h                       |   3 +
 src/glsl/ir_uniform.h                              |   5 +
 src/glsl/ir_validate.cpp                           |  18 +
 src/glsl/ir_visitor.h                              |   2 +
 src/glsl/link_interface_blocks.cpp                 |  15 +-
 src/glsl/link_uniform_block_active_visitor.cpp     |  24 +
 src/glsl/link_uniform_block_active_visitor.h       |   1 +
 src/glsl/link_uniform_blocks.cpp                   |  36 +-
 src/glsl/link_uniform_initializers.cpp             |   3 +-
 src/glsl/link_uniforms.cpp                         |  32 +-
 src/glsl/linker.cpp                                | 166 ++++--
 src/glsl/linker.h                                  |   1 +
 src/glsl/loop_unroll.cpp                           |   1 +
 src/glsl/lower_named_interface_blocks.cpp          |   5 +-
 src/glsl/lower_ubo_reference.cpp                   | 633 ++++++++++++++++++---
 src/glsl/lower_variable_index_to_cond_assign.cpp   |   1 +
 src/glsl/nir/glsl_to_nir.cpp                       |   7 +
 src/glsl/opt_constant_folding.cpp                  |  16 +
 src/glsl/opt_constant_propagation.cpp              |   8 +
 src/glsl/opt_constant_variable.cpp                 |   7 +
 src/glsl/opt_copy_propagation.cpp                  |   2 +-
 src/glsl/opt_cse.cpp                               |  33 +-
 src/glsl/opt_dead_code.cpp                         |   9 +-
 src/glsl/opt_structure_splitting.cpp               |   5 +-
 src/glsl/opt_tree_grafting.cpp                     |   9 +-
 .../glapi/gen/ARB_shader_storage_buffer_object.xml |  36 ++
 src/mapi/glapi/gen/GL4x.xml                        |  18 +-
 src/mapi/glapi/gen/Makefile.am                     |   1 +
 src/mapi/glapi/gen/gl_API.xml                      |   6 +-
 src/mesa/drivers/dri/i965/brw_context.c            |   2 +
 src/mesa/drivers/dri/i965/brw_context.h            |   6 +
 src/mesa/drivers/dri/i965/brw_defines.h            |   4 +
 src/mesa/drivers/dri/i965/brw_eu_emit.c            |   4 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp               |   1 +
 src/mesa/drivers/dri/i965/brw_fs.h                 |   5 +
 .../dri/i965/brw_fs_channel_expressions.cpp        |   3 +
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp     |  47 ++
 .../drivers/dri/i965/brw_fs_vector_splitting.cpp   |   1 +
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp       | 349 ++++++++++--
 src/mesa/drivers/dri/i965/brw_shader.cpp           |   6 +
 src/mesa/drivers/dri/i965/brw_state_upload.c       |   1 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp             |   1 +
 src/mesa/drivers/dri/i965/brw_vec4.h               |   8 +
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   |  35 ++
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp     | 361 +++++++++++-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c   |  70 ++-
 src/mesa/drivers/dri/i965/intel_buffer_objects.c   |   2 +
 src/mesa/drivers/dri/i965/intel_extensions.c       |   1 +
 src/mesa/main/bufferobj.c                          | 380 +++++++++++++
 src/mesa/main/config.h                             |   2 +
 src/mesa/main/context.c                            |   8 +
 src/mesa/main/extensions.c                         |   1 +
 src/mesa/main/get.c                                |  38 +-
 src/mesa/main/get_hash_params.py                   |  12 +
 src/mesa/main/mtypes.h                             |  57 +-
 src/mesa/main/program_resource.c                   |   7 +-
 src/mesa/main/shader_query.cpp                     | 265 ++++++++-
 src/mesa/main/tests/enum_strings.cpp               |  15 +
 src/mesa/main/uniforms.c                           |  52 ++
 src/mesa/main/uniforms.h                           |   4 +
 src/mesa/program/ir_to_mesa.cpp                    |  10 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp         |  16 +
 90 files changed, 3800 insertions(+), 380 deletions(-)
 create mode 100644 src/mapi/glapi/gen/ARB_shader_storage_buffer_object.xml

-- 
1.9.1


