[Mesa-dev] [PATCH 16/78] i965/nir/vec4: Implement store_output intrinsic

Jason Ekstrand jason at jlekstrand.net
Mon Jul 13 04:57:52 PDT 2015

On Wed, Jul 8, 2015 at 11:54 PM, Eduardo Lima Mitev <elima at igalia.com> wrote:
> On 06/30/2015 06:51 PM, Jason Ekstrand wrote:
>> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev <elima at igalia.com> wrote:
>>> The index into the output_reg array where to store the destination register is
>>> fetched from the nir_outputs map built during nir_setup_outputs stage.
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 17 +++++++++++++++--
>>>  1 file changed, 15 insertions(+), 2 deletions(-)
>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> index 8a2d335..55d4490 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> @@ -520,10 +520,23 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
>>>     }
>>>     case nir_intrinsic_store_output_indirect:
>>> +      has_indirect = true;
>>>        /* fallthrough */
>>> -   case nir_intrinsic_store_output:
>>> -      /* @TODO: Not yet implemented */
>>> +   case nir_intrinsic_store_output: {
>>> +      int offset = instr->const_index[0];
>>> +      int output = nir_outputs[offset];
>>> +
>>> +      src = get_nir_src(instr->src[0], nir_output_types[offset]);
>>> +      dest = dst_reg(src);
>>> +
>>> +      dest.writemask = brw_writemask_for_size(instr->num_components);
>>> +
>>> +      if (has_indirect)
>>> +         dest.reladdr = new(mem_ctx) src_reg(get_nir_src(instr->src[1]));
>>> +
>>> +      output_reg[output] = dest;
>> I'm very confused about the amount of indirection going on here.  It
>> seems to me that we should be setting these outputs up in
>> setup_outputs() rather than storring off a map from ints to other ints
>> and setting it up here.  I didn't make this comment on the patch for
>> setup_outputs() because I wanted to wait to see it used before I
>> commented on it.
>> I'm guessing you did it this way because the nir_assign_var_locations
>> is giving you bogus values.  If so, then it might be better to just
>> assign variable locations in setup_outputs() rather than having a
>> remap table.  The whole point of nir_lower_io is to make IO easy for
>> the back-end.  If you need a re-map table, then it's no longer making
>> it easy and we need to think more about what's going on.
>> --Jason
> That double indirection felt bad since the beginning, but it was needed
> to store the original variable's location (var->data.location). Let me
> explain:
> We are (re)using the plumbering in vec4_visitor to setup URB, so the
> only thing we need to do is to store the out register in "output_reg"
> map at the correct location. And that location is given by the original
> location in the shader (var->data.location).
> So, in this case, "nir_assign_var_locations" pass, which constructs
> var->data.driver_location, is not useful to us, except to give us
> consecutive indexes to construct the other map we have, the type map,
> which is needed to carry the correct type from the original variable to
> the output register.

If nir_assign_var_locations isn't doing anything for you, don't call
it.  You'll need to do something with var->data.driver_location.  If
what you really want is var->data.location, then just copy that to
var->data.driver_location when you do nir_setup_outputs.  Or
(depending on how the URB setup works, I don't actually know), put the
actual URB location in var->data.driver_location when you walk the

>From there, you have two options.  One would be to setup output_reg at
the same time with the correct types right away and emit a MOV when
you get a store_output.  (Copy propagation should clean up the MOV.)
For what it's worth, I don't think the type matters; a URB write just
writes data to something so as long as you don't have a type mismatch
in a MOV, the hardware won't care.

The other option, would be to directly emit the URB write in
store_output.  At the moment, it may be better to take the first
option since that better matches what the FS does right now.  But both
should work fine.

> So, before knowing that I could modify nir_lower_io, my best shot at
> transferring the original variable location was to create this
> nir_outputs map. Now, what I have done is to put that value in
> const_index[1] of the intrinsic instruction, which was previously
> unused. What do you think?
> That removes the offset to offset map, but we still need the type map.
> About your comment on initializing the register during setup stage, I'm
> a bit confused: the register that we need to store is not available
> during setup stage, because we still don't have local registers allocated.

What do you mean?  Because you don't have the destination of the
output_write intrinsic allocated?  Even if the register has a file of
BAD_FILE, you could still store the type there.  Also, as I said
above, the hardware shouldn't care about the types of data.  As long
as the URB write code doesn't accidentally do a float -> int
conversion or something, we should be fine.

>>>        break;
>>> +   }
>>>     case nir_intrinsic_load_vertex_id:
>>>        unreachable("should be lowered by lower_vertex_id()");
>>> --

More information about the mesa-dev mailing list