[Mesa-dev] [PATCH v4 (part2) 29/59] nir: Implement __intrinsic_store_ssbo

Connor Abbott cwabbott0 at gmail.com
Fri Aug 7 10:01:05 PDT 2015


On Fri, Aug 7, 2015 at 1:15 AM, Iago Toral <itoral at igalia.com> wrote:
> On Fri, 2015-08-07 at 07:43 +0200, Iago Toral wrote:
>> On Thu, 2015-08-06 at 11:06 -0700, Connor Abbott wrote:
>> > On Thu, Aug 6, 2015 at 12:30 AM, Iago Toral <itoral at igalia.com> wrote:
>> > > On Wed, 2015-08-05 at 12:17 -0700, Connor Abbott wrote:
>> > >> On Wed, Aug 5, 2015 at 1:30 AM, Iago Toral Quiroga <itoral at igalia.com> wrote:
>> > >> > ---
>> > >> >  src/glsl/nir/glsl_to_nir.cpp  | 36 ++++++++++++++++++++++++++++++++++++
>> > >> >  src/glsl/nir/nir_intrinsics.h | 12 ++++++------
>> > >> >  2 files changed, 42 insertions(+), 6 deletions(-)
>> > >> >
>> > >> > diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
>> > >> > index 642affd..cbec2df 100644
>> > >> > --- a/src/glsl/nir/glsl_to_nir.cpp
>> > >> > +++ b/src/glsl/nir/glsl_to_nir.cpp
>> > >> > @@ -641,6 +641,8 @@ nir_visitor::visit(ir_call *ir)
>> > >> >           op = nir_intrinsic_image_atomic_comp_swap;
>> > >> >        } else if (strcmp(ir->callee_name(), "__intrinsic_memory_barrier") == 0) {
>> > >> >           op = nir_intrinsic_memory_barrier;
>> > >> > +      } else if (strcmp(ir->callee_name(), "__intrinsic_store_ssbo") == 0) {
>> > >> > +         op = nir_intrinsic_store_ssbo;
>> > >> >        } else {
>> > >> >           unreachable("not reached");
>> > >> >        }
>> > >> > @@ -730,6 +732,40 @@ nir_visitor::visit(ir_call *ir)
>> > >> >        }
>> > >> >        case nir_intrinsic_memory_barrier:
>> > >> >           break;
>> > >> > +      case nir_intrinsic_store_ssbo: {
>> > >> > +         exec_node *param = ir->actual_parameters.get_head();
>> > >> > +         ir_rvalue *block = ((ir_instruction *)param)->as_rvalue();
>> > >> > +
>> > >> > +         param = param->get_next();
>> > >> > +         ir_rvalue *offset = ((ir_instruction *)param)->as_rvalue();
>> > >> > +
>> > >> > +         param = param->get_next();
>> > >> > +         ir_rvalue *val = ((ir_instruction *)param)->as_rvalue();
>> > >> > +
>> > >> > +         param = param->get_next();
>> > >> > +         ir_constant *write_mask = ((ir_instruction *)param)->as_constant();
>> > >> > +         assert(write_mask);
>> > >> > +
>> > >> > +         /* Check if we need the indirect version */
>> > >> > +         ir_constant *const_offset = offset->as_constant();
>> > >> > +         if (!const_offset) {
>> > >> > +            op = nir_intrinsic_store_ssbo_indirect;
>> > >> > +            ralloc_free(instr);
>> > >> > +            instr = nir_intrinsic_instr_create(shader, op);
>> > >> > +            instr->src[2] = evaluate_rvalue(offset);
>> > >> > +            instr->const_index[0] = 0;
>> > >> > +         } else {
>> > >> > +            instr->const_index[0] = const_offset->value.u[0];
>> > >> > +         }
>> > >> > +
>> > >> > +         instr->const_index[1] = write_mask->value.u[0];
>> > >> > +
>> > >> > +         instr->src[0] = evaluate_rvalue(val);
>> > >> > +         instr->num_components = val->type->vector_elements;
>> > >> > +
>> > >> > +         instr->src[1] = evaluate_rvalue(block);
>> > >> > +         break;
>> > >> > +      }
>> > >> >        default:
>> > >> >           unreachable("not reached");
>> > >> >        }
>> > >> > diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
>> > >> > index f264f55..83eeecd 100644
>> > >> > --- a/src/glsl/nir/nir_intrinsics.h
>> > >> > +++ b/src/glsl/nir/nir_intrinsics.h
>> > >> > @@ -176,12 +176,12 @@ LOAD(input, 0, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
>> > >> >   * offset.
>> > >> >   */
>> > >> >
>> > >> > -#define STORE(name, num_indices, flags) \
>> > >> > -   INTRINSIC(store_##name, 1, ARR(0), false, 0, 0, num_indices, flags) \
>> > >> > -   INTRINSIC(store_##name##_indirect, 2, ARR(0, 1), false, 0, 0, \
>> > >> > +#define STORE(name, extra_srcs, num_indices, flags) \
>> > >> > +   INTRINSIC(store_##name, extra_srcs, ARR(0, 1), false, 0, 0, num_indices, flags) \
>> > >> > +   INTRINSIC(store_##name##_indirect, extra_srcs + 1, ARR(0, 1, 1), false, 0, 0, \
>> > >> >               num_indices, flags) \
>> > >> >
>> > >> > -STORE(output, 1, 0)
>> > >> > -/* STORE(ssbo, 2, 0) */
>> > >> > +STORE(output, 1, 2, 0)
>> > >> > +STORE(ssbo, 2, 2, 0)
>> > >>
>> > >> I don't think outputs should have any extra sources, since they only
>> > >> take a constant index, plus possibly an indirect source that's already
>> > >> covered by the STORE macro. SSBO stores should only have one extra
>> > >> source for the block index. Also, we should update the comment above
>> > >> to explain this similarly to the paragraph above the loads.
>> > >
>> > > SSBO stores need an extra source for the block index and an extra index
>> > > for a writemask.
>> > >
>> > > I'll leave the STORE() macro as it was and just define SSBO stores using
>> > > INTRINSIC() directly then.
>> >
>> > Ok, I see. I don't think you need a separate INTRINSIC(), but right
>> > now calling the parameter you added "extra_srcs" is confusing, since
>> > you're counting the value to be stored, which isn't really "extra" at
>> > all -- every store should have one! How about instead, we change the
>> > STORE macro to have:
>> >
>> > - An "extra_srcs" parameter that contains only sources that are
>> > actually extra, not counting the value to be stored -- direct stores
>> > have "extra_srcs + 1" sources, and indirect sources have "extra_srcs +
>> > 2" sources
>> > - An "extra_indices" parameter that contains the extra indices, and
>> > replace "num_indices" with "extra_indices + 1"
>> >
>> > Then normal stores have both set to 0, and SSBO stores have both set
>> > to 1 to indicate the extra block index and writemask.
>>
>> Sure, sounds good to me.
>
> I think I spoke too fast... the problem with this is that we also need
> to inform about the size of the extra srcs (for ssbos, the extra srcs
> should have a size of 1), so we would need another parameter for that,
> and then the problem is that we need to join/expand that array with the
> number of components for the non-extra srcs. I don't think there is a
> way to do that with a macro, but even if there is, I wonder if it is
> worth the extra complexity.... calling INTRINSIC directly is easy enough
> I think.
>
> The alternative would be to pass the number of components of all the
> srcs (not just the extra srcs) in that extra argument, but that would
> defeat the point of having a macro and careing only about the "extra"
> stuff anyway...

I think that for now, we can just do what you're already doing and add
an extra size in the macro that's unused for store_output. Long-term,
we want to move intrinsics to the same Python system that ALU opcodes
use, which will make problems like these go away.

Connor

>
> Iago
>
>>
>> > >
>> > >> >
>> > >> > -LAST_INTRINSIC(store_output_indirect)
>> > >> > +LAST_INTRINSIC(store_ssbo_indirect)
>> > >> > --
>> > >> > 1.9.1
>> > >> >
>> > >> > _______________________________________________
>> > >> > mesa-dev mailing list
>> > >> > mesa-dev at lists.freedesktop.org
>> > >> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> > >>
>> > >
>> > >
>> >
>>
>
>


More information about the mesa-dev mailing list