[RFC] Weston GL shader compilation re-design

Wed Jan 23 09:44:52 UTC 2019

Hi Pekka,

Thank you for the comments.
I wIll start implementing the shader building.

Pekka Paalanen <ppaalanen at gmail.com> writes:

> On Wed, 23 Jan 2019 11:32:34 +0530
> Harish Krupo <harish.krupo.kps at intel.com> wrote:
>
>> Hi,
>> 
>> This is in continuation to the discussion in the mail chain here [1]. We
>> are currently working on enabling HDR rendering support in the
>> gl-renderer. HDR support requires color space conversion (709->2020 or
>> other way around) and Tone Mapping, for which, we need to execute the
>> following steps in the shader:
>> * Degamma (to linearize the buffer)
>> * Color space conversion (BT2020 -> BT709...)
>> * Tone mapping (SDR -> HDR, HDR -> SDR, HDR -> HDR conversion)
>> * Gamma: Apply gamma/transfer function for the display (PQ/HLG...)
>> 
>> We are currently targeting the DCI-P3, SRGB and 2084 color spaces. One
>> way to implement it would be to create multiple sets of shaders for each
>> degamma/gamma combination. Like the one done by Ville in his POC [2].
>> This solution was a POC, and isn't scalable when we need to support
>> multiple color spaces. I would like to propose 2 different solutions:
>> 
>> Proposal 1:
>> * Each of the shaders (gamma/degamma/main/tone mapping) would be
>>   independent strings.
>> * When the view needs to be rendered, renderer will generate a set of
>>   requirements like need degamma PQ curve, need tone mapping, need gamma
>>   SRGB curve and so on.
>> * These requirements (NEED_GAMMA_PQ...) would be bit fields and will be
>>   OR'd to create the total requirements.
>> * This total requirement would be used to stitch the required shaders to
>>   compile a program. The total requirements integer will also be used as
>>   a key to store that program in a hashmap for re-use.
>> 
>> Proposal 2:
>> We could go for multi pass rendering, where each of the steps (gamma,
>> csc, tone mappping, degamma) will be applied in different passes. Each
>> pass would bind an fbo and call draw elements with the corresponding
>> shader. This would not be as efficient but is another approach.
>> 
>> Please comment on the approaches and do suggest if there is a better
>> one.
>
> Hi Harish,
>
> choosing between multipass rendering and more complicated shader
> building, I would definitely take the more complicated shader building
> first. If the use case is video playback, then the clients will be
> updating very often and the frames they produce will likely ever be
> composited just once. Using multipass with intermediate buffers would
> just blow up the memory bandwidth usage.
>
> Later, if someone wants to optimize things more for rarely updating
> surfaces, they could implement an image cache for some stages of the
> pixel conversions.
>
> Precision is another benefit of a complicated shader that does
> everything in one go: the intermediate values in the shader can be
> stored in GPU registers at high precision. Achieving the same precision
> with multipass would cost a huge amount of memory. Memory saving is a
> big reason why one almost always store images with gamma applied, even
> in memory rather than just files.
>
> Btw. when enhancing the fragment shader mechanics, I think we could
> rely more on the GLSL compilers than we do today. What I mean by that
> is that we could have the shader's main call functions that do each of
> the conversion steps to the pixel values, and then when concatenating
> the shader string, we pick the appropriate function implementations into
> it. No-op or pass-through functions ideally shouldn't produce any worse
> code than manually not calling them. IOW, go with a more function than
> macro oriented design in assembling the complete frament shader string.
>
> I think code maintainability and clarity should be the foremost goal,
> but avoiding any serious performance issues. Fine-tuning the shader
> performance, e.g. investigating a single "megashader" vs. multiple
> specific shaders can be left out for now. Unless someone has solid
> knowledge about what would be best already?
>
> Of course, we should stick to GL ES 2.0 capabilities if possible, but
> we could also consider requiring GL ES 3.0 if that would seem too
> useful to ignore.
>
>
> Thanks,
> pq
>
>> 
>> [1] https://lists.freedesktop.org/archives/wayland-devel/2019-January/039809.html 
>> [2] https://github.com/vsyrjala/weston/blob/hdr_poc/libweston/gl-renderer.c#L257

Regards
Harish Krupo