[Mesa-dev] [PATCH v2 23/82] glsl: Do not do CSE for expressions involving SSBO loads

Iago Toral itoral at igalia.com
Wed Jun 24 07:36:24 PDT 2015


On Wed, 2015-06-24 at 15:43 +0300, Francisco Jerez wrote:
> Iago Toral <itoral at igalia.com> writes:
> 
> > On Wed, 2015-06-17 at 17:20 -0700, Jordan Justen wrote:
> >> I wanted to question whether this was required, based on this text
> >> from the extension spec:
> >> 
> >> "The ability to write to buffer objects creates the potential for
> >>  multiple independent shader invocations to read and write the same
> >>  underlying memory. The same issue exists with the
> >>  ARB_shader_image_load_store extension provided in OpenGL 4.2, which
> >>  can write to texture objects and buffers. In both cases, the
> >>  specification makes few guarantees related to the relative order of
> >>  memory reads and writes performed by the shader invocations."
> >> 
> >> But I'm not sure if we can reconcile CSE with 'memoryBarrier' and
> >> 'barrier'. curro, any thoughts from image load/store?
> >
> > I think the problem is within the same thread, that text above talks
> > about multiple invocations reading from and writing to the same
> > location, but within the same invocation, the order of reads and writes
> > must be preserved:
> >
> > "Buffer variable memory reads and writes within a single shader
> > invocation are processed in order.  However, the order of reads and
> > writes performed in one invocation relative to those performed by
> > another invocation is largely undefined."
> >
> > For example, if X is a shader storage buffer variable and we have code
> > like this with just one invocation:
> >
> > ssbo_store(X, 1);
> > a = ssbo_load(X) + 1  // a = 2
> > ssbo_store(X, 2);
> > b = ssbo_load(X) + 1; // b = 3
> >
> > CSE could mess it up like this:
> >
> > ssbo_store(X, 1);
> > tmp = ssbo_load(X) + 1  // tmp = 2
> > a = tmp;
> > ssbo_store(X, 2);
> > b = tmp;
> >
> > which would be incorrect. I think I wrote this patch after seeing
> > something like this happening. The CSE pass clearly states that it does
> > not support write variables after all.
> >
> > Also, notice the same would apply if there are multiple invocations but
> > the shader code used something like gl_VertexID or gl_FragCoord to make
> > each invocation read from/write to a different address within the SSBO
> > buffer (I imagine this is the usual way to operate with SSBOs). In these
> > cases, even if we have multiple invocations, keeping the relative order
> > of reads and writes within each one is necessary.
> >
> 
> AFAICT the reason why this (and many of the other changes in GLSL
> optimization passes) is needed is because SSBO loads have been
> implemented as ir_expression nodes instead of being lowered into
> intrinsics (as other side-effectful operations do like
> ARB_shader_image_load_store and ARB_shader_atomic_counters).  This
> surely broke the assumption of a number of optimization passes that
> ir_expression nodes behave as pure functions.  I guess the reason why
> you've done it this way is because UBO loads were already being
> represented as expressions, so I see why you may have wanted to use the
> same approach for SSBOs even though there is a fundamental difference
> between the two: UBO loads have no side effects and are constant for a
> given set of arguments and a given shader execution, SSBO loads and
> stores are not.  SSBO stores couldn't be accommodated into the same
> framework so easily, and you decided to create a separate ir node for
> them, what seems inconsistent with loads.  Intrinsics would probably
> have been a good fit for both loads and stores, and would have made all
> these optimization changes unnecessary...
> 
> P.S.: Sorry for the late reply, I was on vacation when I was CC'ed.

Right, your assessment about the reasons behind the current
implementation is correct. I did not realize of these issues when I
decided to go with the current implementation, now it does look like
going with GLSL intrinsics would have made things a bit easier. I
suppose it would make sense to revisit the implementation in the near
future taking your work on arb_shader_image_load_store as a reference.

Iago

> > Iago
> >
> >> -Jordan
> >> 
> >> On 2015-06-03 00:01:13, Iago Toral Quiroga wrote:
> >> > SSBOs are read/write and this CSE pass only handles read-only variables.
> >> > ---
> >> >  src/glsl/opt_cse.cpp | 33 ++++++++++++++++++++++++++++++++-
> >> >  1 file changed, 32 insertions(+), 1 deletion(-)
> >> > 
> >> > diff --git a/src/glsl/opt_cse.cpp b/src/glsl/opt_cse.cpp
> >> > index 4b8e9a0..a05ab46 100644
> >> > --- a/src/glsl/opt_cse.cpp
> >> > +++ b/src/glsl/opt_cse.cpp
> >> > @@ -245,6 +245,28 @@ contains_rvalue(ir_rvalue *haystack, ir_rvalue *needle)
> >> >  }
> >> >  
> >> >  static bool
> >> > +expression_contains_ssbo_load(ir_expression *expr)
> >> > +{
> >> > +   if (expr->operation == ir_binop_ssbo_load)
> >> > +      return true;
> >> > +
> >> > +   for (unsigned i = 0; i < expr->get_num_operands(); i++) {
> >> > +      ir_rvalue *op = expr->operands[i];
> >> > +      if (op->ir_type == ir_type_expression &&
> >> > +          expression_contains_ssbo_load(op->as_expression())) {
> >> > +         return true;
> >> > +      } else if (op->ir_type == ir_type_swizzle) {
> >> > +         ir_swizzle *swizzle = op->as_swizzle();
> >> > +         ir_expression *val = swizzle->val->as_expression();
> >> > +         if (val && expression_contains_ssbo_load(val))
> >> > +            return true;
> >> > +      }
> >> > +   }
> >> > +
> >> > +   return false;
> >> > +}
> >> > +
> >> > +static bool
> >> >  is_cse_candidate(ir_rvalue *ir)
> >> >  {
> >> >     /* Our temporary variable assignment generation isn't ready to handle
> >> > @@ -260,7 +282,16 @@ is_cse_candidate(ir_rvalue *ir)
> >> >      * to variable-index array dereferences at some point.
> >> >      */
> >> >     switch (ir->ir_type) {
> >> > -   case ir_type_expression:
> >> > +   case ir_type_expression: {
> >> > +         /* Skip expressions involving SSBO loads, since these operate on
> >> > +          * read-write variables, meaning that the same ssbo_load expression
> >> > +          * may return a different value if the underlying buffer storage
> >> > +          * is written in between.
> >> > +          */
> >> > +         if (expression_contains_ssbo_load(ir->as_expression()))
> >> > +            return false;
> >> > +      }
> >> > +      break;
> >> >     case ir_type_texture:
> >> >        break;
> >> >     default:
> >> > -- 
> >> > 1.9.1
> >> > 
> >> > _______________________________________________
> >> > mesa-dev mailing list
> >> > mesa-dev at lists.freedesktop.org
> >> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >> 




More information about the mesa-dev mailing list