[Mesa-dev] [PATCH v2 23/82] glsl: Do not do CSE for expressions involving SSBO loads

Wed Jun 24 05:43:12 PDT 2015

Iago Toral <itoral at igalia.com> writes:

> On Wed, 2015-06-17 at 17:20 -0700, Jordan Justen wrote:
>> I wanted to question whether this was required, based on this text
>> from the extension spec:
>> 
>> "The ability to write to buffer objects creates the potential for
>>  multiple independent shader invocations to read and write the same
>>  underlying memory. The same issue exists with the
>>  ARB_shader_image_load_store extension provided in OpenGL 4.2, which
>>  can write to texture objects and buffers. In both cases, the
>>  specification makes few guarantees related to the relative order of
>>  memory reads and writes performed by the shader invocations."
>> 
>> But I'm not sure if we can reconcile CSE with 'memoryBarrier' and
>> 'barrier'. curro, any thoughts from image load/store?
>
> I think the problem is within the same thread, that text above talks
> about multiple invocations reading from and writing to the same
> location, but within the same invocation, the order of reads and writes
> must be preserved:
>
> "Buffer variable memory reads and writes within a single shader
> invocation are processed in order.  However, the order of reads and
> writes performed in one invocation relative to those performed by
> another invocation is largely undefined."
>
> For example, if X is a shader storage buffer variable and we have code
> like this with just one invocation:
>
> ssbo_store(X, 1);
> a = ssbo_load(X) + 1  // a = 2
> ssbo_store(X, 2);
> b = ssbo_load(X) + 1; // b = 3
>
> CSE could mess it up like this:
>
> ssbo_store(X, 1);
> tmp = ssbo_load(X) + 1  // tmp = 2
> a = tmp;
> ssbo_store(X, 2);
> b = tmp;
>
> which would be incorrect. I think I wrote this patch after seeing
> something like this happening. The CSE pass clearly states that it does
> not support write variables after all.
>
> Also, notice the same would apply if there are multiple invocations but
> the shader code used something like gl_VertexID or gl_FragCoord to make
> each invocation read from/write to a different address within the SSBO
> buffer (I imagine this is the usual way to operate with SSBOs). In these
> cases, even if we have multiple invocations, keeping the relative order
> of reads and writes within each one is necessary.
>

AFAICT the reason why this (and many of the other changes in GLSL
optimization passes) is needed is because SSBO loads have been
implemented as ir_expression nodes instead of being lowered into
intrinsics (as other side-effectful operations do like
ARB_shader_image_load_store and ARB_shader_atomic_counters).  This
surely broke the assumption of a number of optimization passes that
ir_expression nodes behave as pure functions.  I guess the reason why
you've done it this way is because UBO loads were already being
represented as expressions, so I see why you may have wanted to use the
same approach for SSBOs even though there is a fundamental difference
between the two: UBO loads have no side effects and are constant for a
given set of arguments and a given shader execution, SSBO loads and
stores are not.  SSBO stores couldn't be accommodated into the same
framework so easily, and you decided to create a separate ir node for
them, what seems inconsistent with loads.  Intrinsics would probably
have been a good fit for both loads and stores, and would have made all
these optimization changes unnecessary...

P.S.: Sorry for the late reply, I was on vacation when I was CC'ed.

> Iago
>
>> -Jordan
>> 
>> On 2015-06-03 00:01:13, Iago Toral Quiroga wrote:
>> > SSBOs are read/write and this CSE pass only handles read-only variables.
>> > ---
>> >  src/glsl/opt_cse.cpp | 33 ++++++++++++++++++++++++++++++++-
>> >  1 file changed, 32 insertions(+), 1 deletion(-)
>> > 
>> > diff --git a/src/glsl/opt_cse.cpp b/src/glsl/opt_cse.cpp
>> > index 4b8e9a0..a05ab46 100644
>> > --- a/src/glsl/opt_cse.cpp
>> > +++ b/src/glsl/opt_cse.cpp
>> > @@ -245,6 +245,28 @@ contains_rvalue(ir_rvalue *haystack, ir_rvalue *needle)
>> >  }
>> >  
>> >  static bool
>> > +expression_contains_ssbo_load(ir_expression *expr)
>> > +{
>> > +   if (expr->operation == ir_binop_ssbo_load)
>> > +      return true;
>> > +
>> > +   for (unsigned i = 0; i < expr->get_num_operands(); i++) {
>> > +      ir_rvalue *op = expr->operands[i];
>> > +      if (op->ir_type == ir_type_expression &&
>> > +          expression_contains_ssbo_load(op->as_expression())) {
>> > +         return true;
>> > +      } else if (op->ir_type == ir_type_swizzle) {
>> > +         ir_swizzle *swizzle = op->as_swizzle();
>> > +         ir_expression *val = swizzle->val->as_expression();
>> > +         if (val && expression_contains_ssbo_load(val))
>> > +            return true;
>> > +      }
>> > +   }
>> > +
>> > +   return false;
>> > +}
>> > +
>> > +static bool
>> >  is_cse_candidate(ir_rvalue *ir)
>> >  {
>> >     /* Our temporary variable assignment generation isn't ready to handle
>> > @@ -260,7 +282,16 @@ is_cse_candidate(ir_rvalue *ir)
>> >      * to variable-index array dereferences at some point.
>> >      */
>> >     switch (ir->ir_type) {
>> > -   case ir_type_expression:
>> > +   case ir_type_expression: {
>> > +         /* Skip expressions involving SSBO loads, since these operate on
>> > +          * read-write variables, meaning that the same ssbo_load expression
>> > +          * may return a different value if the underlying buffer storage
>> > +          * is written in between.
>> > +          */
>> > +         if (expression_contains_ssbo_load(ir->as_expression()))
>> > +            return false;
>> > +      }
>> > +      break;
>> >     case ir_type_texture:
>> >        break;
>> >     default:
>> > -- 
>> > 1.9.1
>> > 
>> > _______________________________________________
>> > mesa-dev mailing list
>> > mesa-dev at lists.freedesktop.org
>> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150624/4e553bb8/attachment-0001.sig>