[Mesa-dev] [PATCH] glsl: break the gl_FragData array into separate gl_FragData[i] variables

Eric Anholt eric at anholt.net
Tue Oct 29 19:20:40 CET 2013


Paul Berry <stereotype441 at gmail.com> writes:

> On 27 October 2013 12:58, Marek Olšák <maraeo at gmail.com> wrote:
>
>> On Sun, Oct 27, 2013 at 4:19 PM, Paul Berry <stereotype441 at gmail.com>
>> wrote:
>> > On 20 October 2013 09:20, Marek Olšák <maraeo at gmail.com> wrote:
>> >>
>> >> From: Marek Olšák <marek.olsak at amd.com>
>> >>
>> >> This avoids a defect in lower_output_reads.
>> >>
>> >> The problem is lower_output_reads treats the gl_FragData array as a
>> single
>> >> variable. It first redirects all output writes to a temporary variable
>> >> (array)
>> >> and then writes the whole temporary variable to the output, generating
>> >> assignments to all elements of gl_FragData.
>> >
>> >
>> > Can you go into more detail about why this is a problem?  At first
>> glance it
>> > seems like it should be fine, because failing to assign to an element of
>> > gl_FragData is supposed to result in undefined data going to that
>> fragment
>> > output.  So lowering to a shader that does assign to that element of
>> > gl_FragData seems like it should be harmless.  What am I missing here?
>>
>> Thanks for the review. The problem is drivers cannot eliminate useless
>> writes to gl_FragData, and each enabled gl_FragData output decreases
>> performance. The GLSL compiler cannot eliminate the writes either,
>> because gl_FragData is an array.
>>
>> Maybe the GLSL compiler should resize arrays based on which elements
>> are written, so that unused elements are not declared, but this is not
>> enough for gl_FragData, where the i-th output can be written, but
>> (i-1)-th output doesn't have to be.
>>
>
> Ah, ok.  When I saw the word "defect", I misunderstood you to be fixing a
> correctness-of-rendering bug.  As a performance optimization, I get it now.
>
> For driver back-ends that don't need lower_output_reads (such as i915 and
> i965), this optimization isn't needed.  Would you mind adding a flag to
> ShaderCompilerOptions so that tgsi-based drivers can opt in to this new
> optimization?  I want to make sure that the code generation of i965 and
> i915 isn't affected.

Actually, if Marek has identified that applications not setting all
their gl_FragData[] is a performance issue, I want to see this applied
to i965, too.  This optimization would remove entire FB_WRITE opcodes
for us, which could be pretty significant if applications hit this case.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 835 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20131029/5203904b/attachment-0001.pgp>


More information about the mesa-dev mailing list