[Mesa-dev] [PATCH] glsl: break the gl_FragData array into separate gl_FragData[i] variables
stereotype441 at gmail.com
Tue Oct 29 19:26:34 CET 2013
On 29 October 2013 11:20, Eric Anholt <eric at anholt.net> wrote:
> Paul Berry <stereotype441 at gmail.com> writes:
> > On 27 October 2013 12:58, Marek Olšák <maraeo at gmail.com> wrote:
> >> On Sun, Oct 27, 2013 at 4:19 PM, Paul Berry <stereotype441 at gmail.com>
> >> wrote:
> >> > On 20 October 2013 09:20, Marek Olšák <maraeo at gmail.com> wrote:
> >> >>
> >> >> From: Marek Olšák <marek.olsak at amd.com>
> >> >>
> >> >> This avoids a defect in lower_output_reads.
> >> >>
> >> >> The problem is lower_output_reads treats the gl_FragData array as a
> >> single
> >> >> variable. It first redirects all output writes to a temporary
> >> >> (array)
> >> >> and then writes the whole temporary variable to the output,
> >> >> assignments to all elements of gl_FragData.
> >> >
> >> >
> >> > Can you go into more detail about why this is a problem? At first
> >> glance it
> >> > seems like it should be fine, because failing to assign to an element
> >> > gl_FragData is supposed to result in undefined data going to that
> >> fragment
> >> > output. So lowering to a shader that does assign to that element of
> >> > gl_FragData seems like it should be harmless. What am I missing here?
> >> Thanks for the review. The problem is drivers cannot eliminate useless
> >> writes to gl_FragData, and each enabled gl_FragData output decreases
> >> performance. The GLSL compiler cannot eliminate the writes either,
> >> because gl_FragData is an array.
> >> Maybe the GLSL compiler should resize arrays based on which elements
> >> are written, so that unused elements are not declared, but this is not
> >> enough for gl_FragData, where the i-th output can be written, but
> >> (i-1)-th output doesn't have to be.
> > Ah, ok. When I saw the word "defect", I misunderstood you to be fixing a
> > correctness-of-rendering bug. As a performance optimization, I get it
> > For driver back-ends that don't need lower_output_reads (such as i915 and
> > i965), this optimization isn't needed. Would you mind adding a flag to
> > ShaderCompilerOptions so that tgsi-based drivers can opt in to this new
> > optimization? I want to make sure that the code generation of i965 and
> > i915 isn't affected.
> Actually, if Marek has identified that applications not setting all
> their gl_FragData is a performance issue, I want to see this applied
> to i965, too. This optimization would remove entire FB_WRITE opcodes
> for us, which could be pretty significant if applications hit this case.
Yes, but Marek's patch won't benefit i965 even in that case. The only
effect of Marek's patch is to prevent lower_output_reads (which i965
doesn't use) from converting a shader that writes to only some elements of
gl_FragData into a shader that writes to all elements of gl_FragData.
To optimize i965 for shaders that don't write to all elements of
gl_FragData we need to do back-end work that's unrelated to Marek's patch.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the mesa-dev