[Mesa-dev] [PATCH v5 12/70] glsl: implement unsized array length

Sun Dec 13 20:10:45 PST 2015

On Thu, Sep 10, 2015 at 9:35 AM, Iago Toral Quiroga <itoral at igalia.com> wrote:
> +ir_expression *
> +lower_ubo_reference_visitor::process_ssbo_unsized_array_length(ir_rvalue **rvalue,
> +                                                               ir_dereference *deref,
> +                                                               ir_variable *var)
> +{
> +   mem_ctx = ralloc_parent(*rvalue);
> +
> +   ir_rvalue *base_offset = NULL;
> +   unsigned const_offset;
> +   bool row_major;
> +   int matrix_columns;
> +   int unsized_array_stride = calculate_unsized_array_stride(deref);
> +
> +   /* Compute the offset to the start of the dereference as well as other
> +    * information we need to calculate the length.
> +    */
> +   setup_for_load_or_store(var, deref,
> +                           &base_offset, &const_offset,
> +                           &row_major, &matrix_columns);
> +   /* array.length() =
> +    *  max((buffer_object_size - offset_of_array) / stride_of_array, 0)
> +    */
> +   ir_expression *buffer_size = emit_ssbo_get_buffer_size();
> +
> +   ir_expression *offset_of_array = new(mem_ctx)
> +      ir_expression(ir_binop_add, base_offset,
> +                    new(mem_ctx) ir_constant(const_offset));
> +   ir_expression *offset_of_array_int = new(mem_ctx)
> +      ir_expression(ir_unop_u2i, offset_of_array);
> +
> +   ir_expression *sub = new(mem_ctx)
> +      ir_expression(ir_binop_sub, buffer_size, offset_of_array_int);
> +   ir_expression *div =  new(mem_ctx)
> +      ir_expression(ir_binop_div, sub,
> +                    new(mem_ctx) ir_constant(unsized_array_stride));
> +   ir_expression *max = new(mem_ctx)
> +      ir_expression(ir_binop_max, div, new(mem_ctx) ir_constant(0));
> +
> +   return max;
> +}

Hi Iago,

I noticed that this comes out as a signed division. Is there any way
to make it into an unsigned division? That way we can e.g. optimize a
power-of-two division into a shift, and it's a few instructions fewer
to emulate when there's no built-in integer division instruction
(which I think is most GPUs). It seems that you went to some trouble
to do all this with signed integers, but I can't quite figure out why.
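
To illustrate the cost difference, here is a minimal standalone sketch in
plain C++ (not Mesa IR). The helper names udiv_pow2 and sdiv_pow2 are
hypothetical, and the signed version assumes an arithmetic right shift on
negative values (guaranteed only since C++20) and k < 31. It shows that an
unsigned power-of-two division is a single shift, while the signed one needs
an extra bias to round toward zero:

```cpp
#include <cstdint>

// Unsigned division by 2^k is exactly one logical right shift.
constexpr uint32_t udiv_pow2(uint32_t x, unsigned k)
{
   return x >> k;
}

// Signed division truncates toward zero, but an arithmetic right shift
// rounds toward negative infinity. For negative x we therefore have to add
// a bias of (2^k - 1) before shifting, which costs extra instructions.
// Assumption: >> on a negative int32_t is an arithmetic shift, and k < 31.
constexpr int32_t sdiv_pow2(int32_t x, unsigned k)
{
   const int32_t bias = x < 0 ? static_cast<int32_t>((1u << k) - 1u) : 0;
   return (x + bias) >> k;
}
```

For example, udiv_pow2(40, 3) matches 40u / 8u, and sdiv_pow2(-7, 2) matches
-7 / 4 only because of the bias; without it the shift alone would give -2.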

Cheers,

  -ilia