[Mesa-dev] [PATCH v5 12/70] glsl: implement unsized array length

Mon Dec 14 23:39:25 PST 2015

On Mon, 2015-12-14 at 08:00 +0100, Iago Toral wrote:
> On Sun, 2015-12-13 at 23:10 -0500, Ilia Mirkin wrote:
> > On Thu, Sep 10, 2015 at 9:35 AM, Iago Toral Quiroga <itoral at igalia.com> wrote:
> > > +ir_expression *
> > > +lower_ubo_reference_visitor::process_ssbo_unsized_array_length(ir_rvalue **rvalue,
> > > +                                                               ir_dereference *deref,
> > > +                                                               ir_variable *var)
> > > +{
> > > +   mem_ctx = ralloc_parent(*rvalue);
> > > +
> > > +   ir_rvalue *base_offset = NULL;
> > > +   unsigned const_offset;
> > > +   bool row_major;
> > > +   int matrix_columns;
> > > +   int unsized_array_stride = calculate_unsized_array_stride(deref);
> > > +
> > > +   /* Compute the offset to the start of the dereference as well as other
> > > +    * information we need to calculate the length.
> > > +    */
> > > +   setup_for_load_or_store(var, deref,
> > > +                           &base_offset, &const_offset,
> > > +                           &row_major, &matrix_columns);
> > > +   /* array.length() =
> > > +    *  max((buffer_object_size - offset_of_array) / stride_of_array, 0)
> > > +    */
> > > +   ir_expression *buffer_size = emit_ssbo_get_buffer_size();
> > > +
> > > +   ir_expression *offset_of_array = new(mem_ctx)
> > > +      ir_expression(ir_binop_add, base_offset,
> > > +                    new(mem_ctx) ir_constant(const_offset));
> > > +   ir_expression *offset_of_array_int = new(mem_ctx)
> > > +      ir_expression(ir_unop_u2i, offset_of_array);
> > > +
> > > +   ir_expression *sub = new(mem_ctx)
> > > +      ir_expression(ir_binop_sub, buffer_size, offset_of_array_int);
> > > +   ir_expression *div = new(mem_ctx)
> > > +      ir_expression(ir_binop_div, sub,
> > > +                    new(mem_ctx) ir_constant(unsized_array_stride));
> > > +   ir_expression *max = new(mem_ctx)
> > > +      ir_expression(ir_binop_max, div, new(mem_ctx) ir_constant(0));
> > > +
> > > +   return max;
> > > +}
> > 
> > Hi Iago,
> > 
> > I noticed that this comes out as a signed division. Is there any way
> > to make it into an unsigned division? That way we can e.g. optimize a
> > power-of-two division into a shift, and it's a few instructions fewer
> > to emulate when there's no built-in integer division instruction
> > (which I think is most GPUs). It seems that you went to some trouble
> > to do all this with signed integers, but I can't quite figure out why.
> 
> Hi Ilia,
> 
> I agree, I don't see why we would do the extra work to make this
> signed... Samuel wrote this code though, so I'll let him confirm.
> 
> Iago
> 
> 

The formula is:

array.length() =
    max((buffer_object_size - offset_of_array) / stride_of_array, 0)

I used a signed division because the buffer size could be smaller than
the offset of the unsized array (i.e., the application set it up
incorrectly), in which case the subtraction yields a negative value.
The signed division keeps that value negative or zero, so max() clamps
the result to 0.
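
As a rough illustration of the signed formula in plain C++ (not the
GLSL IR from the patch; the function and parameter names here are
made up for the example):

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-alone version of
//   max((buffer_object_size - offset_of_array) / stride_of_array, 0)
// using signed arithmetic throughout, as the patch does.
int32_t unsized_array_length_signed(int32_t buffer_size,
                                    uint32_t offset_of_array,
                                    int32_t stride_of_array)
{
    // Mirrors the ir_unop_u2i conversion of the offset before subtracting.
    int32_t remaining = buffer_size - static_cast<int32_t>(offset_of_array);

    // A misconfigured buffer gives remaining < 0; the signed quotient is
    // then <= 0 and max() clamps it to 0.
    return std::max(remaining / stride_of_array, 0);
}
```

For example, a 256-byte buffer with the array at offset 64 and a stride
of 16 gives (256 - 64) / 16 = 12 elements, while a 32-byte buffer with
the same offset gives -32 / 16 = -2, which max() turns into 0.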

If you want to use an unsigned division, it would be something like:

* Execute max((buffer_object_size - offset_of_array), 0).
* If the subtraction result is positive, max() returns a non-zero
value. We can detect that case (with a CMP, for example), convert the
value to unsigned, do the unsigned division and return its result.
* If the subtraction result is negative or zero, max() returns 0. We
can detect that case too (it is the other branch of the previous CMP)
and return 0 directly.
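
The steps above could look roughly like this in plain C++ (again only a
sketch with made-up names, not the IR the pass would actually emit):

```cpp
#include <cstdint>

// Hypothetical unsigned-division variant: compare first, then divide
// only when the remaining size is known to be positive.
uint32_t unsized_array_length_unsigned(int32_t buffer_size,
                                       uint32_t offset_of_array,
                                       uint32_t stride_of_array)
{
    int32_t remaining = buffer_size - static_cast<int32_t>(offset_of_array);

    // The CMP: zero or negative means the array has no room at all.
    if (remaining <= 0)
        return 0u;

    // remaining > 0 here, so the conversion to unsigned is lossless and
    // the division can be an unsigned one.
    return static_cast<uint32_t>(remaining) / stride_of_array;
}
```

With an unsigned dividend, a power-of-two stride lets the division be
reduced to a shift, which is the optimization Ilia is asking about.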

Sam