[Mesa-dev] [HACK] i965/fs: Fix ordering of src0 alpha and oMask in the framebuffer write payload.

Jason Ekstrand jason at jlekstrand.net
Fri Jul 10 12:24:23 PDT 2015


On Fri, Jul 10, 2015 at 9:53 AM, Francisco Jerez <currojerez at riseup.net> wrote:
> Jason Ekstrand <jason at jlekstrand.net> writes:
>
>> On Fri, Jul 10, 2015 at 5:25 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>>> Jason Ekstrand <jason at jlekstrand.net> writes:
>>>
>>>> On Jul 9, 2015 7:57 AM, "Francisco Jerez" <currojerez at riseup.net> wrote:
>>>>>
>>>>> We were passing src0 alpha and oMask in reverse order.  There seems to
>>>>> be no good way to pass them in the correct order to the new-style
>>>>> LOAD_PAYLOAD (how surprising) because src0 alpha is per-channel while
>>>>> oMask is not.  Just split src0 alpha in fixed-width registers and pass
>>>>> them to LOAD_PAYLOAD as if they were part of the header as work-around
>>>>> for now.
>>>>
>>>> Bah... I came across this when I did the LOAD_PAYLOAD rework but thought it
>>>> was only theoretical.  I wasn't very familiar with what omask actually did
>>>> and, since piglit didn't hit it, I wasn't sure if it was a real problem or
>>>> not.  I probably should have done more digging and written a piglit test at
>>>> the time. My bad.
>>>>
>>>> One solution that I proposed at the time was to turn header_size into
>>>> header_mask in the obvious way. We can still use 8 bits because we should
>>>> never have a header source higher than 8.
>>>>
>>>
>>> So your idea is to have one bit per source indicating whether it's
>>> header-like or per-channel?  I don't think that extends to instructions
>>> other than LOAD_PAYLOAD (e.g. FB_WRITE) where the same source is at the
>>> same time header and payload.
>>
>> You're right, it doesn't.  We really shouldn't be conflating them.  We
>> should have header_mask and header_present be different fields.  Maybe
>> use a union to save space, but they should have different semantic
>> meaning and different names.  We should probably also have a
>> compr4_mask and get rid of the hackery there.
>>
>>> One bit per 32B register would extend
>>> easily but it would be rather ugly to deal with if you want to keep your
>>> code SIMD width-invariant.
>>>
>>> I think if you go with the per-source flag you'll want it to be in its
>>> own subclass of fs_inst.  With its own subclass you could even have an
>>> array of per-source sizes determining the number of registers read for
>>> each source, which would be rather nice for the visitor (no need to
>>> split vectors into components while passing them to LOAD_PAYLOAD).
>>>
>>> Still I think the most elegant solution would be to simply get rid of
>>> the header/payload distinction by using force_writemask_all and, if it
>>> proves to be necessary, fix the optimizer to get rid of redundant
>>> force_writemask_all flags where it doesn't do it already.
>>
>> I really don't think that's a good long-term or short-term solution.
>>
>> How badly are you blocking on this?   I don't really have a lot of
>> extra time to work on this at the moment but can carve some out if
>> needed.
>
> I'm not blocking on this at all, feel free to fix it however you like,
> or just go with this hack for the moment if you have higher priority
> stuff to work on right now, I honestly don't care.

That's good to hear.  I'll try and take a look at this in a couple of
weeks.  Thanks for bringing it up and writing the piglit test!
--Jason

>> --jason
>>
>>>> Thoughts?
>>>> --Jason
>>>>
>>>>> I've written a piglit test that demonstrates the problem by using
>>>>> gl_SampleMask from a fragment shader with multiple color outputs [1].
>>>>>
>>>>> [1] http://lists.freedesktop.org/archives/piglit/2015-July/016499.html
>>>>> ---
>>>>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 26
>>>> +++++++++++++++++---------
>>>>>  1 file changed, 17 insertions(+), 9 deletions(-)
>>>>>
>>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>>> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>>>> index 94d6a58..304ae74 100644
>>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>>>> @@ -1535,6 +1535,19 @@ fs_visitor::emit_single_fb_write(const fs_builder
>>>> &bld,
>>>>>        length++;
>>>>>     }
>>>>>
>>>>> +   if (src0_alpha.file != BAD_FILE && color0.file != BAD_FILE) {
>>>>> +      /* Neat, we need to chop the src0 alpha component and pass it as
>>>> part of
>>>>> +       * the header even though it has per-channel semantics, because
>>>> the next
>>>>> +       * optional field is header-like and LOAD_PAYLOAD requires all such
>>>>> +       * fields to form a contiguous segment at the beginning of the
>>>> message.
>>>>> +       */
>>>>> +      for (unsigned i = 0; i < exec_size / 8; i++) {
>>>>> +         setup_color_payload(&sources[length], src0_alpha, 1, 8,
>>>>> +                             use_2nd_half || i == 1);
>>>>> +         length++;
>>>>> +      }
>>>>> +   }
>>>>> +
>>>>>     prog_data->uses_omask =
>>>>>        prog->OutputsWritten & BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK);
>>>>>     if (prog_data->uses_omask) {
>>>>> @@ -1561,19 +1574,14 @@ fs_visitor::emit_single_fb_write(const fs_builder
>>>> &bld,
>>>>>                               offset(this->outputs[0], bld, 3),
>>>>>                               1, exec_size, false);
>>>>>        length += 4;
>>>>> -   } else if (color1.file == BAD_FILE) {
>>>>> -      if (src0_alpha.file != BAD_FILE) {
>>>>> -         setup_color_payload(&sources[length], src0_alpha, 1, exec_size,
>>>> false);
>>>>> -         length++;
>>>>> -      }
>>>>> -
>>>>> -      setup_color_payload(&sources[length], color0, components,
>>>>> -                          exec_size, use_2nd_half);
>>>>> -      length += 4;
>>>>>     } else {
>>>>>        setup_color_payload(&sources[length], color0, components,
>>>>>                            exec_size, use_2nd_half);
>>>>>        length += 4;
>>>>> +
>>>>> +   }
>>>>> +
>>>>> +   if (color1.file != BAD_FILE) {
>>>>>        setup_color_payload(&sources[length], color1, components,
>>>>>                            exec_size, use_2nd_half);
>>>>>        length += 4;
>>>>> --
>>>>> 2.4.3
>>>>>


More information about the mesa-dev mailing list