[Mesa-dev] [HACK] i965/fs: Fix ordering of src0 alpha and oMask in the framebuffer write payload.
Jason Ekstrand
jason at jlekstrand.net
Fri Jul 10 09:41:03 PDT 2015
On Fri, Jul 10, 2015 at 5:25 AM, Francisco Jerez <currojerez at riseup.net> wrote:
> Jason Ekstrand <jason at jlekstrand.net> writes:
>
>> On Jul 9, 2015 7:57 AM, "Francisco Jerez" <currojerez at riseup.net> wrote:
>>>
>>> We were passing src0 alpha and oMask in reverse order. There seems to
>>> be no good way to pass them in the correct order to the new-style
>>> LOAD_PAYLOAD (how surprising) because src0 alpha is per-channel while
>>> oMask is not. Just split src0 alpha in fixed-width registers and pass
>>> them to LOAD_PAYLOAD as if they were part of the header as work-around
>>> for now.
>>
>> Bah... I came across this when I did the LOAD_PAYLOAD rework but thought it
>> was only theoretical. I wasn't very familiar with what omask actually did
>> and, since piglit didn't hit it, I wasn't sure if it was a real problem or
>> not. I probably should have done more digging and written a piglit test at
>> the time. My bad.
>>
>> One solution that I proposed at the time was to turn header_size into
>> header_mask in the obvious way. We can still use 8 bits because we should
>> never have a header source higher than 8.
>>
>
> So your idea is to have one bit per source indicating whether it's
> header-like or per-channel? I don't think that extends to instructions
> other than LOAD_PAYLOAD (e.g. FB_WRITE) where the same source is at the
> same time header and payload.
You're right, it doesn't. We really shouldn't be conflating them. We
should have header_mask and header_present be different fields. Maybe
use a union to save space, but they should have different semantic
meaning and different names. We should probably also have a
compr4_mask and get rid of the hackery there.
> One bit per 32B register would extend
> easily but it would be rather ugly to deal with if you want to keep your
> code SIMD width-invariant.
>
> I think if you go with the per-source flag you'll want it to be in its
> own subclass of fs_inst. With its own subclass you could even have an
> array of per-source sizes determining the number of registers read for
> each source, which would be rather nice for the visitor (no need to
> split vectors into components while passing them to LOAD_PAYLOAD).
>
> Still I think the most elegant solution would be to simply get rid of
> the header/payload distinction by using force_writemask_all and, if it
> proves to be necessary, fix the optimizer to get rid of redundant
> force_writemask_all flags where it doesn't do it already.
I really don't think that's a good long-term or short-term solution.
How badly are you blocking on this? I don't really have a lot of
extra time to work on this at the moment but can carve some out if
needed.
--jason
>> Thoughts?
>> --Jason
>>
>>> I've written a piglit test that demonstrates the problem by using
>>> gl_SampleMask from a fragment shader with multiple color outputs [1].
>>>
>>> [1] http://lists.freedesktop.org/archives/piglit/2015-July/016499.html
>>> ---
>>> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 26
>> +++++++++++++++++---------
>>> 1 file changed, 17 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>> index 94d6a58..304ae74 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>>> @@ -1535,6 +1535,19 @@ fs_visitor::emit_single_fb_write(const fs_builder
>> &bld,
>>> length++;
>>> }
>>>
>>> + if (src0_alpha.file != BAD_FILE && color0.file != BAD_FILE) {
>>> + /* Neat, we need to chop the src0 alpha component and pass it as
>> part of
>>> + * the header even though it has per-channel semantics, because
>> the next
>>> + * optional field is header-like and LOAD_PAYLOAD requires all such
>>> + * fields to form a contiguous segment at the beginning of the
>> message.
>>> + */
>>> + for (unsigned i = 0; i < exec_size / 8; i++) {
>>> + setup_color_payload(&sources[length], src0_alpha, 1, 8,
>>> + use_2nd_half || i == 1);
>>> + length++;
>>> + }
>>> + }
>>> +
>>> prog_data->uses_omask =
>>> prog->OutputsWritten & BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK);
>>> if (prog_data->uses_omask) {
>>> @@ -1561,19 +1574,14 @@ fs_visitor::emit_single_fb_write(const fs_builder
>> &bld,
>>> offset(this->outputs[0], bld, 3),
>>> 1, exec_size, false);
>>> length += 4;
>>> - } else if (color1.file == BAD_FILE) {
>>> - if (src0_alpha.file != BAD_FILE) {
>>> - setup_color_payload(&sources[length], src0_alpha, 1, exec_size,
>> false);
>>> - length++;
>>> - }
>>> -
>>> - setup_color_payload(&sources[length], color0, components,
>>> - exec_size, use_2nd_half);
>>> - length += 4;
>>> } else {
>>> setup_color_payload(&sources[length], color0, components,
>>> exec_size, use_2nd_half);
>>> length += 4;
>>> +
>>> + }
>>> +
>>> + if (color1.file != BAD_FILE) {
>>> setup_color_payload(&sources[length], color1, components,
>>> exec_size, use_2nd_half);
>>> length += 4;
>>> --
>>> 2.4.3
>>>
More information about the mesa-dev
mailing list