[Mesa-dev] [PATCH v2 12/16] glsl: Add a lowering pass to handle advanced blending modes.

Tue Aug 23 08:12:01 UTC 2016

On Monday, August 22, 2016 5:50:49 PM PDT Francisco Jerez wrote:
> Kenneth Graunke <kenneth at whitecape.org> writes:
> 
> > Many GPUs cannot handle GL_KHR_blend_equation_advanced natively, and
> > need to emulate it in the pixel shader.  This lowering pass implements
> > all the necessary math for advanced blending.  It fetches the existing
> > framebuffer value using the MESA_shader_framebuffer_fetch built-in
> > variables, and the previous commit's state var uniform to select
> > which equation to use.
> >
> > This is done at the GLSL IR level to make it easy for all drivers to
> > implement the GL_KHR_blend_equation_advanced extension and share code.
> >
> > Drivers need to hook up MESA_shader_framebuffer_fetch functionality:
> > 1. Hook up the fb_fetch_output variable
> > 2. Implement BlendBarrier()
> >
> > Then to get KHR_blend_equation_advanced, they simply need to:
> > 3. Disable hardware blending based on ctx->Color._AdvancedBlendEnabled
> > 4. Call this lowering pass.
> >
> > Very little driver specific code should be required.
> >
> > v2: Handle multiple output variables per render target (which may exist
> >     due to ARB_enhanced_layouts), and array variables (even with one
> >     render target, we might have out vec4 color[1]), and non-vec4
> >     variables (it's easier than finding spec text to justify not
> >     handling it).  Thanks to Francisco Jerez for the feedback.
> >
> > Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> > ---
> >  src/compiler/Makefile.sources                      |   1 +
> >  src/compiler/glsl/ir_optimization.h                |   1 +
> >  .../glsl/lower_blend_equation_advanced.cpp         | 557 +++++++++++++++++++++
> >  3 files changed, 559 insertions(+)
> >  create mode 100644 src/compiler/glsl/lower_blend_equation_advanced.cpp
> >
> > diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
> > index 0ff9b23..fe26132 100644
> > --- a/src/compiler/Makefile.sources
> > +++ b/src/compiler/Makefile.sources
> > @@ -77,6 +77,7 @@ LIBGLSL_FILES = \
> >  	glsl/loop_analysis.h \
> >  	glsl/loop_controls.cpp \
> >  	glsl/loop_unroll.cpp \
> > +	glsl/lower_blend_equation_advanced.cpp \
> >  	glsl/lower_buffer_access.cpp \
> >  	glsl/lower_buffer_access.h \
> >  	glsl/lower_const_arrays_to_uniforms.cpp \
> > diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
> > index c29260a..3bd6928 100644
> > --- a/src/compiler/glsl/ir_optimization.h
> > +++ b/src/compiler/glsl/ir_optimization.h
> > @@ -151,6 +151,7 @@ void optimize_dead_builtin_variables(exec_list *instructions,
> >  bool lower_tess_level(gl_linked_shader *shader);
> >  
> >  bool lower_vertex_id(gl_linked_shader *shader);
> > +bool lower_blend_equation_advanced(gl_linked_shader *shader);
> >  
> >  bool lower_subroutine(exec_list *instructions, struct _mesa_glsl_parse_state *state);
> >  void propagate_invariance(exec_list *instructions);
> > diff --git a/src/compiler/glsl/lower_blend_equation_advanced.cpp b/src/compiler/glsl/lower_blend_equation_advanced.cpp
> > new file mode 100644
> > index 0000000..5632865
> > --- /dev/null
> > +++ b/src/compiler/glsl/lower_blend_equation_advanced.cpp
> > @@ -0,0 +1,557 @@
> > +/*
> > + * Copyright © 2016 Intel Corporation
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice (including the next
> > + * paragraph) shall be included in all copies or substantial portions of the
> > + * Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + */
> > +
> > +#include "ir.h"
> > +#include "ir_builder.h"
> > +#include "ir_optimization.h"
> > +#include "ir_hierarchical_visitor.h"
> > +#include "program/prog_instruction.h"
> > +#include "program/prog_statevars.h"
> > +#include "util/bitscan.h"
> > +
> > +using namespace ir_builder;
> > +
> > +#define imm1(x) new(mem_ctx) ir_constant((float) x, 1)
> > +#define imm3(x) new(mem_ctx) ir_constant((float) x, 3)
> > +
> 
> The x argument should probably be between parentheses to avoid surprises
> in the future -- Or even better, make them static inline functions
> instead of macros.

Good call, I'll add parenthesis.  I made them macros rather than static
inlines so you can do imm3(0.5) rather than imm3(mem_ctx, 0.5)...it's
shorter (which is the main point).

[snip]

> > +                                    blend_comps[0], blend_comps[1],
> > +                                    blend_comps[2], blend_comps[3]);
> > +   }
> > +
> > +   ir_function_signature *main = get_main(sh);
> > +   ir_factory f(&main->body, mem_ctx);
> > +
> > +   ir_variable *result_dest =
> > +      calc_blend_result(f, mode, fb, blend_source, sh->info.BlendSupport);
> > +
> > +   /* Copy the result back to the original values.  It would be simpler
> > +    * to demote the program's output variables, and create a new vec4
> > +    * output for our result, but this pass runs before we create the
> > +    * ARB_program_interface_query resource list.  So we have to leave
> > +    * the original outputs in place and use them.
> > +    */
> > +   for (int i = 0; i < 4; i++) {
> > +      if (!outputs[i])
> > +         continue;
> > +
> > +      f.emit(assign(deref_output(outputs[i]), swizzle(result_dest, i, 1),
> > +                    1 << i));
> > +   }
> > +
> 
> It looks like this relies on the main function having a single exit
> point, which AFAICT may not be the case if there are multiple return
> statements and the lowering pass is called before do_lower_jumps() (as
> is the case in your i965 code).  Several options occur to me:
> 
>  - Rename the shader's main function and generate a new main function
>    that calls into the original and implements the advanced blending
>    epilogue.  This has the advantage that it wouldn't force drivers to
>    lower returns if they can support them natively, and is likely to
>    result in better code generation than the next option.
> 
>  - Make the lowering pass re-emit the blending epilogue for each exit
>    point of the main function.
> 
>  - Call do_lower_jumps(lower_main_return=true) manually at the top of
>    lower_blend_equation_advanced() if support for any advanced blending
>    modes was requested in order to make sure the function has the right
>    form.  This seems a bit of a hack but I guess it would be the easiest
>    solution.

Ugh.  Good point.

Option #2 seems unfortunate - I'd hate to emit this multiple times.

In a world where all functions are inlined, options #1 and #3 seem
basically equivalent - function inlining lowers early returns.  So
the renamed main would have early returns lowered anyway.

I think I'll take option #3 and change the function to:

bool
lower_blend_equation_advanced(struct gl_linked_shader *sh)
{
   if (sh->info.BlendSupport == 0)
      return false;

   /* Lower early returns in main() so there's a single exit point
    * where we can insert our lowering code.
    */
   do_lower_jumps(sh->ir, false, false, true, false, false);

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: This is a digitally signed message part.
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160823/8cdda158/attachment.sig>