[Mesa-dev] [HACK] i965/fs: Fix ordering of src0 alpha and oMask in the framebuffer write payload.
Jason Ekstrand
jason at jlekstrand.net
Fri Jul 10 12:24:23 PDT 2015
On Fri, Jul 10, 2015 at 9:53 AM, Francisco Jerez <currojerez at riseup.net> wrote:
> Jason Ekstrand <jason at jlekstrand.net> writes:
>
>> On Fri, Jul 10, 2015 at 5:25 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>>> Jason Ekstrand <jason at jlekstrand.net> writes:
>>>
>>>> On Jul 9, 2015 7:57 AM, "Francisco Jerez" <currojerez at riseup.net> wrote:
>>>>>
>>>>> We were passing src0 alpha and oMask in reverse order. There seems to
>>>>> be no good way to pass them in the correct order to the new-style
>>>>> LOAD_PAYLOAD (how surprising) because src0 alpha is per-channel while
>>>>> oMask is not. Just split src0 alpha in fixed-width registers and pass
>>>>> them to LOAD_PAYLOAD as if they were part of the header as work-around
>>>>> for now.
>>>>
>>>> Bah... I came across this when I did the LOAD_PAYLOAD rework but thought it
>>>> was only theoretical. I wasn't very familiar with what omask actually did
>>>> and, since piglit didn't hit it, I wasn't sure if it was a real problem or
>>>> not. I probably should have done more digging and written a piglit test at
>>>> the time. My bad.
>>>>
>>>> One solution that I proposed at the time was to turn header_size into
>>>> header_mask in the obvious way. We can still use 8 bits because we should
>>>> never have a header source higher than 8.
>>>>
>>>
>>> So your idea is to have one bit per source indicating whether it's
>>> header-like or per-channel? I don't think that extends to instructions
>>> other than LOAD_PAYLOAD (e.g. FB_WRITE) where the same source is at the
>>> same time header and payload.
>>
>> You're right, it doesn't. We really shouldn't be conflating them. We
>> should have header_mask and header_present be different fields. Maybe
>> use a union to save space, but they should have different semantic
>> meaning and different names. We should probably also have a
>> compr4_mask and get rid of the hackery there.
>>
>>> One bit per 32B register would extend
>>> easily but it would be rather ugly to deal with if you want to keep your
>>> code SIMD width-invariant.
>>>
>>> I think if you go with the per-source flag you'll want it to be in its
>>> own subclass of fs_inst. With its own subclass you could even have an
>>> array of per-source sizes determining the number of registers read for
>>> each source, which would be rather nice for the visitor (no need to
>>> split vectors into components while passing them to LOAD_PAYLOAD).
>>>
>>> Still I think the most elegant solution would be to simply get rid of
>>> the header/payload distinction by using force_writemask_all and, if it
>>> proves to be necessary, fix the optimizer to get rid of redundant
>>> force_writemask_all flags where it doesn't do it already.
>>
>> I really don't think that's a good long-term or short-term solution.
>>
>> How badly are you blocking on this? I don't really have a lot of
>> extra time to work on this at the moment but can carve some out if
>> needed.
>
> I'm not blocking on this at all, feel free to fix it however you like,
> or just go with this hack for the moment if you have higher priority
> stuff to work on right now, I honestly don't care.
That's good to hear. I'll try and take a look at this in a couple of
weeks. Thanks for bringing it up and writing the piglit test!
--Jason
>> --jason
>>
>>>> Thoughts?
>>>> --Jason
>>>>
>>>>> I've written a piglit test that demonstrates the problem by using
>>>>> gl_SampleMask from a fragment shader with multiple color outputs [1].
>>>>>
>>>>> [1] http://lists.freedesktop.org/archives/piglit/2015-July/016499.html
>>>>> ---
>>>>> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 26
>>>> +++++++++++++++++---------
>>>>> 1 file changed, 17 insertions(+), 9 deletions(-)
>>>>>
>>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>>> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>>>> index 94d6a58..304ae74 100644
>>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>>>> @@ -1535,6 +1535,19 @@ fs_visitor::emit_single_fb_write(const fs_builder
>>>> &bld,
>>>>> length++;
>>>>> }
>>>>>
>>>>> + if (src0_alpha.file != BAD_FILE && color0.file != BAD_FILE) {
>>>>> + /* Neat, we need to chop the src0 alpha component and pass it as
>>>> part of
>>>>> + * the header even though it has per-channel semantics, because
>>>> the next
>>>>> + * optional field is header-like and LOAD_PAYLOAD requires all such
>>>>> + * fields to form a contiguous segment at the beginning of the
>>>> message.
>>>>> + */
>>>>> + for (unsigned i = 0; i < exec_size / 8; i++) {
>>>>> + setup_color_payload(&sources[length], src0_alpha, 1, 8,
>>>>> + use_2nd_half || i == 1);
>>>>> + length++;
>>>>> + }
>>>>> + }
>>>>> +
>>>>> prog_data->uses_omask =
>>>>> prog->OutputsWritten & BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK);
>>>>> if (prog_data->uses_omask) {
>>>>> @@ -1561,19 +1574,14 @@ fs_visitor::emit_single_fb_write(const fs_builder
>>>> &bld,
>>>>> offset(this->outputs[0], bld, 3),
>>>>> 1, exec_size, false);
>>>>> length += 4;
>>>>> - } else if (color1.file == BAD_FILE) {
>>>>> - if (src0_alpha.file != BAD_FILE) {
>>>>> - setup_color_payload(&sources[length], src0_alpha, 1, exec_size,
>>>> false);
>>>>> - length++;
>>>>> - }
>>>>> -
>>>>> - setup_color_payload(&sources[length], color0, components,
>>>>> - exec_size, use_2nd_half);
>>>>> - length += 4;
>>>>> } else {
>>>>> setup_color_payload(&sources[length], color0, components,
>>>>> exec_size, use_2nd_half);
>>>>> length += 4;
>>>>> +
>>>>> + }
>>>>> +
>>>>> + if (color1.file != BAD_FILE) {
>>>>> setup_color_payload(&sources[length], color1, components,
>>>>> exec_size, use_2nd_half);
>>>>> length += 4;
>>>>> --
>>>>> 2.4.3
>>>>>
More information about the mesa-dev
mailing list