[RFC] Weston GL shader compilation re-design

Pekka Paalanen ppaalanen at gmail.com
Wed Jan 23 09:34:53 UTC 2019


On Wed, 23 Jan 2019 11:32:34 +0530
Harish Krupo <harish.krupo.kps at intel.com> wrote:

> Hi,
> 
> This continues the discussion in the mail thread here [1]. We
> are currently working on enabling HDR rendering support in the
> gl-renderer. HDR support requires color space conversion (709->2020 or
> the other way around) and tone mapping, for which we need to execute
> the following steps in the shader:
> * Degamma (to linearize the buffer)
> * Color space conversion (BT2020 -> BT709...)
> * Tone mapping (SDR -> HDR, HDR -> SDR, HDR -> HDR conversion)
> * Gamma: Apply gamma/transfer function for the display (PQ/HLG...)
> 
> We are currently targeting the DCI-P3, sRGB and 2084 color spaces. One
> way to implement it would be to create a separate set of shaders for
> each degamma/gamma combination, like the one done by Ville in his POC [2].
> That solution was a POC, and isn't scalable when we need to support
> multiple color spaces. I would like to propose 2 different solutions:
> 
> Proposal 1:
> * Each of the shaders (gamma/degamma/main/tone mapping) would be
>   independent strings.
> * When the view needs to be rendered, the renderer will generate a set
>   of requirements, like need degamma PQ curve, need tone mapping, need
>   gamma sRGB curve and so on.
> * These requirements (NEED_GAMMA_PQ...) would be bit fields and will be
>   OR'd to create the total requirements.
> * This total requirement would be used to stitch the required shaders to
>   compile a program. The total requirements integer will also be used as
>   a key to store that program in a hashmap for re-use.
> 
> Proposal 2:
> We could go for multi pass rendering, where each of the steps (gamma,
> csc, tone mapping, degamma) will be applied in different passes. Each
> pass would bind an fbo and call draw elements with the corresponding
> shader. This would not be as efficient but is another approach.
> 
> Please comment on the approaches and do suggest if there is a better
> one.

Hi Harish,

choosing between multipass rendering and more complicated shader
building, I would definitely take the more complicated shader building
first. If the use case is video playback, then the clients will be
updating very often and each frame they produce will likely be
composited only once. Using multipass with intermediate buffers would
just blow up the memory bandwidth usage.
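
To put a rough number on that, here is a back-of-the-envelope sketch in
C. The function name and the cost model are my own assumptions: every
intermediate buffer is written once by one pass and read once by the
next, and caches, tiling and framebuffer compression are ignored.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical estimate of the extra memory traffic caused by the
 * intermediate buffers of a multipass pipeline. Assumes each
 * intermediate buffer is written once and read once per frame.
 */
uint64_t
multipass_extra_bytes_per_sec(uint32_t width, uint32_t height,
			      uint32_t bytes_per_pixel,
			      uint32_t n_passes, uint32_t fps)
{
	uint64_t buffer_bytes;

	assert(n_passes >= 1);

	buffer_bytes = (uint64_t)width * height * bytes_per_pixel;

	/* N passes need N - 1 intermediate buffers; 2 = write + read. */
	return 2 * buffer_bytes * (n_passes - 1) * fps;
}
```

For a 1920x1080 surface, 4 passes, 8 bytes per pixel (16-bit float
RGBA) and 60 fps this gives 5971968000 bytes/s, roughly 6 GB/s on top
of what a single pass already needs.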

Later, if someone wants to optimize things more for rarely updating
surfaces, they could implement an image cache for some stages of the
pixel conversions.

Precision is another benefit of a complicated shader that does
everything in one go: the intermediate values in the shader can be
stored in GPU registers at high precision. Achieving the same precision
with multipass would need high-precision intermediate buffers, which
cost a huge amount of memory. Saving memory is a big reason why one
almost always stores images with gamma applied, even in memory and not
just in files.
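
A small integer-only sketch of why gamma-encoded storage saves bits.
It counts how many 8-bit code values land in the darkest part of the
linear range. The function name is made up, and a plain gamma 2.0
curve (linear = encoded^2) stands in for the real transfer functions
purely to keep the arithmetic exact.

```c
#include <assert.h>

/*
 * Count the 8-bit codes whose decoded linear value is at most
 * 'percent' % of full scale.
 *
 * use_gamma2 == 0: the code stores the linear value directly.
 * use_gamma2 != 0: the code stores the value through a gamma 2.0
 *                  curve, i.e. linear = (code / 255)^2.
 *                  This is an assumed stand-in, not sRGB or PQ.
 */
int
dark_codes(int use_gamma2, int percent)
{
	int code;
	int n = 0;

	assert(percent >= 0 && percent <= 100);

	for (code = 0; code <= 255; code++) {
		if (use_gamma2) {
			/* (code / 255)^2 <= percent / 100 */
			if (code * code * 100 <= 255 * 255 * percent)
				n++;
		} else {
			/* code / 255 <= percent / 100 */
			if (code * 100 <= 255 * percent)
				n++;
		}
	}

	return n;
}
```

With percent = 1, linear storage has only 3 codes for the darkest 1 %
of the range, while the gamma 2.0 encoding has 26. Getting similar
resolution in the darks from a linear buffer means more bits per
channel.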

Btw. when enhancing the fragment shader mechanics, I think we could
rely more on the GLSL compilers than we do today. What I mean by that
is that we could have the shader's main call functions that do each of
the conversion steps to the pixel values, and then when concatenating
the shader string, we pick the appropriate function implementations into
it. No-op or pass-through functions ideally shouldn't produce any worse
code than manually not calling them. IOW, go with a function-oriented
rather than macro-oriented design in assembling the complete fragment
shader string.
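
A rough sketch of what I mean, in C. Everything named here is made up
for illustration: the REQ_* bits (in the spirit of the requirement
bits in Proposal 1), the apply_*() GLSL functions, the v_texcoord and
tex names, and build_fragment_source(). The 2.2 power curves and the
Reinhard-style tone mapping are toy placeholders for the real curves,
and color space conversion is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical requirement bits, OR'd together per view. */
enum {
	REQ_DEGAMMA  = 1u << 0,
	REQ_TONE_MAP = 1u << 1,
	REQ_GAMMA    = 1u << 2,
};

static const char frag_head[] =
	"precision mediump float;\n"
	"varying vec2 v_texcoord;\n"
	"uniform sampler2D tex;\n";

/* Each stage has a pass-through and a "real" (here: toy) variant. */
static const char degamma_none[] =
	"vec3 apply_degamma(vec3 c) { return c; }\n";
static const char degamma_power[] =
	"vec3 apply_degamma(vec3 c) { return pow(c, vec3(2.2)); }\n";
static const char tone_map_none[] =
	"vec3 apply_tone_map(vec3 c) { return c; }\n";
static const char tone_map_toy[] =
	"vec3 apply_tone_map(vec3 c) { return c / (c + vec3(1.0)); }\n";
static const char gamma_none[] =
	"vec3 apply_gamma(vec3 c) { return c; }\n";
static const char gamma_power[] =
	"vec3 apply_gamma(vec3 c) { return pow(c, vec3(1.0 / 2.2)); }\n";

/* main() always calls every stage; the GLSL compiler drops no-ops. */
static const char frag_main[] =
	"void main()\n"
	"{\n"
	"\tvec4 color = texture2D(tex, v_texcoord);\n"
	"\tcolor.rgb = apply_gamma(apply_tone_map(apply_degamma(color.rgb)));\n"
	"\tgl_FragColor = color;\n"
	"}\n";

/*
 * Concatenate the shader pieces selected by 'reqs' into 'buf'.
 * Returns the source length, or -1 if 'buf' is too small.
 */
int
build_fragment_source(uint32_t reqs, char *buf, size_t size)
{
	const char *parts[] = {
		frag_head,
		(reqs & REQ_DEGAMMA) ? degamma_power : degamma_none,
		(reqs & REQ_TONE_MAP) ? tone_map_toy : tone_map_none,
		(reqs & REQ_GAMMA) ? gamma_power : gamma_none,
		frag_main,
	};
	size_t used = 0;
	size_t i;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof parts / sizeof parts[0]; i++) {
		size_t len = strlen(parts[i]);

		if (used + len + 1 > size)
			return -1;

		/* Copy the piece including its terminating NUL. */
		memcpy(buf + used, parts[i], len + 1);
		used += len;
	}

	return (int)used;
}
```

The same 'reqs' value that selects the pieces could also serve as the
lookup key for the compiled program, as Proposal 1 suggests.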

I think code maintainability and clarity should be the foremost goal,
as long as we avoid any serious performance issues. Fine-tuning the shader
performance, e.g. investigating a single "megashader" vs. multiple
specific shaders can be left out for now. Unless someone has solid
knowledge about what would be best already?

Of course, we should stick to GL ES 2.0 capabilities if possible, but
we could also consider requiring GL ES 3.0 if it turns out to be too
useful to ignore.


Thanks,
pq

> 
> [1] https://lists.freedesktop.org/archives/wayland-devel/2019-January/039809.html 
> [2] https://github.com/vsyrjala/weston/blob/hdr_poc/libweston/gl-renderer.c#L257