[Mesa-dev] [PATCH] glsl: break the gl_FragData array into separate gl_FragData[i] variables
Eric Anholt
eric at anholt.net
Tue Oct 29 20:04:18 CET 2013
Paul Berry <stereotype441 at gmail.com> writes:
> On 29 October 2013 11:20, Eric Anholt <eric at anholt.net> wrote:
>
>> Paul Berry <stereotype441 at gmail.com> writes:
>>
>> > On 27 October 2013 12:58, Marek Olšák <maraeo at gmail.com> wrote:
>> >
>> >> On Sun, Oct 27, 2013 at 4:19 PM, Paul Berry <stereotype441 at gmail.com>
>> >> wrote:
>> >> > On 20 October 2013 09:20, Marek Olšák <maraeo at gmail.com> wrote:
>> >> >>
>> >> >> From: Marek Olšák <marek.olsak at amd.com>
>> >> >>
>> >> >> This avoids a defect in lower_output_reads.
>> >> >>
>> >> >> The problem is lower_output_reads treats the gl_FragData array as a
>> >> single
>> >> >> variable. It first redirects all output writes to a temporary
>> variable
>> >> >> (array)
>> >> >> and then writes the whole temporary variable to the output,
>> generating
>> >> >> assignments to all elements of gl_FragData.
>> >> >
>> >> >
>> >> > Can you go into more detail about why this is a problem? At first
>> >> glance it
>> >> > seems like it should be fine, because failing to assign to an element
>> of
>> >> > gl_FragData is supposed to result in undefined data going to that
>> >> fragment
>> >> > output. So lowering to a shader that does assign to that element of
>> >> > gl_FragData seems like it should be harmless. What am I missing here?
>> >>
>> >> Thanks for the review. The problem is drivers cannot eliminate useless
>> >> writes to gl_FragData, and each enabled gl_FragData output decreases
>> >> performance. The GLSL compiler cannot eliminate the writes either,
>> >> because gl_FragData is an array.
>> >>
>> >> Maybe the GLSL compiler should resize arrays based on which elements
>> >> are written, so that unused elements are not declared, but this is not
>> >> enough for gl_FragData, where the i-th output can be written, but
>> >> (i-1)-th output doesn't have to be.
>> >>
>> >
>> > Ah, ok. When I saw the word "defect", I misunderstood you to be fixing a
>> > correctness-of-rendering bug. As a performance optimization, I get it
>> now.
>> >
>> > For driver back-ends that don't need lower_output_reads (such as i915 and
>> > i965), this optimization isn't needed. Would you mind adding a flag to
>> > ShaderCompilerOptions so that tgsi-based drivers can opt in to this new
>> > optimization? I want to make sure that the code generation of i965 and
>> > i915 isn't affected.
>>
>> Actually, if Marek has identified that applications not setting all
>> their gl_FragData[] is a performance issue, I want to see this applied
>> to i965, too. This optimization would remove entire FB_WRITE opcodes
>> for us, which could be pretty significant if applications hit this case.
>>
>
> Yes, but Marek's patch won't benefit i965 even in that case. The only
> effect of Marek's patch is to prevent lower_output_reads (which i965
> doesn't use) from converting a shader that writes to only some elements of
> gl_FragData into a shader that writes to all elements of gl_FragData.
>
> To optimize i965 for shaders that don't write to all elements of
> gl_FragData we need to do back-end work that's unrelated to Marek's patch.
Oh, I thought Marek's patch was going to take a program that did:
out vec4 gl_FragData[4];
gl_FragData[0] = vec4(whatever);
gl_FragData[3] = vec4(whatever);
and turn it into:
out vec4 gl_FragData_0;
out vec4 gl_FragData_3;
gl_FragData_0 = vec4(whatever);
gl_FragData_3 = vec4(whatever);
which would have had an effect.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 835 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20131029/2b81a197/attachment.pgp>
More information about the mesa-dev
mailing list