[RFC] Weston GL shader compilation re-design

Thu Jan 31 20:09:11 UTC 2019

Hi Harish,

On Wed, 23 Jan 2019 at 09:35, Pekka Paalanen <ppaalanen at gmail.com> wrote:
> On Wed, 23 Jan 2019 11:32:34 +0530 Harish Krupo <harish.krupo.kps at intel.com> wrote:
> > Proposal 1:
> > * Each of the shaders (gamma/degamma/main/tone mapping) would be
> >   independent strings.
> > * When the view needs to be rendered, renderer will generate a set of
> >   requirements like need degamma PQ curve, need tone mapping, need gamma
> >   SRGB curve and so on.
> > * These requirements (NEED_GAMMA_PQ...) would be bit fields and will be
> >   OR'd to create the total requirements.
> > * This total requirement would be used to stitch the required shaders to
> >   compile a program. The total requirements integer will also be used as
> >   a key to store that program in a hashmap for re-use.
>
> choosing between multipass rendering and more complicated shader
> building, I would definitely take the more complicated shader building
> first. If the use case is video playback, then the clients will be
> updating very often and the frames they produce will likely only ever
> be composited once. Using multipass with intermediate buffers would
> just blow up the memory bandwidth usage.
>
> Later, if someone wants to optimize things more for rarely updating
> surfaces, they could implement an image cache for some stages of the
> pixel conversions.
>
> Precision is another benefit of a complicated shader that does
> everything in one go: the intermediate values in the shader can be
> stored in GPU registers at high precision. Achieving the same precision
> with multipass would cost a huge amount of memory. Memory saving is a
> big reason why one almost always stores images with gamma applied, even
> in memory rather than just files.
>
> Btw. when enhancing the fragment shader mechanics, I think we could
> rely more on the GLSL compilers than we do today. What I mean by that
> is that we could have the shader's main call functions that do each of
> the conversion steps to the pixel values, and then when concatenating
> the shader string, we pick the appropriate function implementations into
> it. No-op or pass-through functions ideally shouldn't produce any worse
> code than manually not calling them. IOW, go with a more function-oriented
> than macro-oriented design in assembling the complete fragment shader string.

I can see compositor-specific effect shaders (e.g. desaturation or
blurring of inactive windows) maybe needing access to function-local
variables. What I had in mind for this was something similar to Cogl's
shader snippets:
https://developer.gnome.org/cogl-2.0-experimental/stable/cogl-2.0-experimental-Shader-snippets.html

I'd be happy with either approach to begin with though; maybe a more
limited approach is best right now.

You're right that unused functions (even unused statements or branches
mostly, these days) will get eliminated by the compiler and not make
it to final GPU bytecode.
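
To make the function-oriented idea concrete, here is a minimal sketch in C
of picking one implementation of a stage and concatenating it with a fixed
main(). Everything named here (degamma_kind, build_fragment_source, the
degamma() GLSL function) is hypothetical and not existing Weston code, and
the 2.2 power curve is only a rough stand-in for a real sRGB transfer
function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stage selector; not taken from Weston. */
enum degamma_kind { DEGAMMA_NONE, DEGAMMA_SRGB };

/* GLSL ES 1.00 fragment shaders need a default float precision. */
static const char fragment_prologue[] =
	"precision mediump float;\n";

/* Pass-through variant: the compiler should be able to inline this away. */
static const char degamma_none[] =
	"vec4 degamma(vec4 c) { return c; }\n";

/* Approximate variant: plain 2.2 power curve, not the exact sRGB EOTF. */
static const char degamma_srgb[] =
	"vec4 degamma(vec4 c) { return vec4(pow(c.rgb, vec3(2.2)), c.a); }\n";

/* main() always calls degamma(); only the implementation changes. */
static const char fragment_main[] =
	"uniform sampler2D tex;\n"
	"varying vec2 v_texcoord;\n"
	"void main() {\n"
	"    gl_FragColor = degamma(texture2D(tex, v_texcoord));\n"
	"}\n";

/* Writes the full fragment source into buf.
 * Returns 0 on success, -1 if buf is too small. */
static int
build_fragment_source(enum degamma_kind kind, char *buf, size_t len)
{
	const char *degamma;
	int n;

	assert(buf != NULL && len > 0);

	degamma = (kind == DEGAMMA_SRGB) ? degamma_srgb : degamma_none;
	n = snprintf(buf, len, "%s%s%s",
		     fragment_prologue, degamma, fragment_main);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Since glShaderSource() accepts an array of strings, the pieces could
probably be handed to GL without concatenating them into one buffer at all;
the single-buffer version above is just easier to inspect and log.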

> I think code maintainability and clarity should be the foremost goal,
> while avoiding any serious performance issues. Fine-tuning the shader
> performance, e.g. investigating a single "megashader" vs. multiple
> specific shaders can be left out for now. Unless someone has solid
> knowledge about what would be best already?

I don't think we really need a megashader. Switching between different
shaders is quite cheap (we already switch a great deal of state when
moving between views anyway), and there's no compilation overhead once
a program has been built and cached. I'd much rather do that for now.
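
As a rough illustration of why switching stays cheap, here is a sketch in C
of a program cache keyed by the OR'd requirement bits from Proposal 1, so
each combination is compiled at most once. The bit names, struct layout and
compile callback are all assumptions made for the example, and a small
linear-search array stands in for the hashmap the proposal mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical requirement bits, modelled on the NEED_GAMMA_PQ example. */
enum shader_requirement {
	NEED_DEGAMMA_PQ   = 1u << 0,
	NEED_TONE_MAPPING = 1u << 1,
	NEED_GAMMA_SRGB   = 1u << 2,
};

#define CACHE_SIZE 16

struct program_entry {
	uint32_t key;      /* OR'd requirement bits */
	unsigned program;  /* stand-in for a GL program name */
};

struct program_cache {
	struct program_entry entries[CACHE_SIZE];
	size_t count;
	unsigned compiles; /* number of times the compile callback ran */
};

/* Stand-in for stitching the source, compiling and linking. */
typedef unsigned (*compile_fn)(uint32_t key);

/* Returns the cached program for key, compiling it on first use only. */
static unsigned
program_cache_get(struct program_cache *cache, uint32_t key,
		  compile_fn compile)
{
	unsigned program;
	size_t i;

	for (i = 0; i < cache->count; i++)
		if (cache->entries[i].key == key)
			return cache->entries[i].program;

	/* The sketch has no eviction; a real cache would need a policy. */
	assert(cache->count < CACHE_SIZE);

	program = compile(key);
	cache->compiles++;
	cache->entries[cache->count].key = key;
	cache->entries[cache->count].program = program;
	cache->count++;

	return program;
}

/* Fake compiler for the sketch: derives a non-zero handle from the key. */
static unsigned
fake_compile(uint32_t key)
{
	return key + 100;
}
```

With three requirement bits there are at most eight distinct programs, so
even the linear search is unlikely to matter; a hashmap would only start to
pay off if the number of stages grows considerably.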

> Of course, we should stick to GL ES 2.0 capabilities if possible, but
> we could also consider requiring GL ES 3.0 if that would seem too
> useful to ignore.

If we discover that some parts of the pipeline need ESSL3.x to
function properly, we could just not advertise those bits unless we
have ES3.x. In any case, if we start requiring newer ESSL versions or
extensions, we'll likely need #version/#extension directives at the top
of the shader to enable them, which means we have to do a little bit
more than purely concatenating functions anyway.
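
A small sketch in C of what that extra step could look like: GLSL ES wants
#version first and any #extension lines before the first non-preprocessor
token, so the header has to be emitted ahead of the stitched functions. The
helper name and its parameters are hypothetical; GL_OES_EGL_image_external
is just one example of an extension that has to be enabled this way.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prefixes the stitched shader body with the
 * preprocessor header. extension may be NULL when none is needed.
 * Returns 0 on success, -1 if buf is too small. */
static int
build_shader_with_header(int use_es3, const char *extension,
			 const char *body, char *buf, size_t len)
{
	/* ESSL 1.00 for GL ES 2.0, ESSL 3.00 for GL ES 3.0. */
	const char *version = use_es3 ? "#version 300 es\n"
				      : "#version 100\n";
	int n;

	assert(body != NULL && buf != NULL);

	if (extension)
		n = snprintf(buf, len, "%s#extension %s : require\n%s",
			     version, extension, body);
	else
		n = snprintf(buf, len, "%s%s", version, body);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The same requirement bits that select the functions could presumably also
decide which header lines are needed, so the header would not have to be a
separate code path.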

Cheers,
Daniel