16 bpc fixed point (RGBA16) framebuffer support for core and AMD.

Fri Apr 16 16:29:41 UTC 2021

Friendly ping to the AMD people. Nicholas, Harry, Alex, any feedback?
It would be great to get this in sooner rather than later.

Thanks and have a nice weekend,
-mario

On Fri, Mar 19, 2021 at 10:03 PM Mario Kleiner
<mario.kleiner.de at gmail.com> wrote:
>
> Hi,
>
> this patch series adds the fourccs for 16 bit fixed point unorm
> framebuffers to the core, and then an implementation for AMD GPUs
> with DisplayCore.
>
> This is intended to allow for pageflipping to, and direct scanout of,
> Vulkan swapchain images in the format VK_FORMAT_R16G16B16A16_UNORM.
> I have patched AMD's GPUOpen amdvlk OSS driver to enable this format
> for swapchains, mapping to DRM_FORMAT_XBGR16161616:
> Link: https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
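
For readers not familiar with how a DRM fourcc is built: the sketch below
packs four ASCII characters into a little-endian 32-bit code, the way the
fourcc_code() macro in drm_fourcc.h does. The character tuple 'X','B','4','8'
for DRM_FORMAT_XBGR16161616 is quoted from memory and should be checked
against the header; the Python helper names are made up for illustration.

```python
def fourcc_code(a: str, b: str, c: str, d: str) -> int:
    """Pack four ASCII characters into a 32-bit code, first char in the low byte."""
    return ord(a) | (ord(b) << 8) | (ord(c) << 16) | (ord(d) << 24)


def fourcc_to_str(code: int) -> str:
    """Inverse of fourcc_code(): recover the four characters."""
    return code.to_bytes(4, "little").decode("ascii")


# Assumption: DRM_FORMAT_XBGR16161616 is fourcc_code('X', 'B', '4', '8').
XBGR16161616 = fourcc_code("X", "B", "4", "8")

print(hex(XBGR16161616), fourcc_to_str(XBGR16161616))
```
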
>
> My main motivation for this is squeezing every bit of precision
> out of the hardware for scientific and medical research applications.
> fp16 framebuffers are limited to ~11 bpc effective linear precision
> in the upper half [0.5;1.0] of the unorm range, although the
> hardware could do at least 12 bpc.
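
The ~11 bpc figure follows from the fp16 layout (10 stored mantissa bits plus
the implicit leading one). A minimal sketch to check it, using only the
Python standard library's half-float support in struct; the variable names
are illustrative and nothing here comes from the patches:

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value, returned as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fp16_from_bits(bits: int) -> float:
    """Interpret a 16-bit pattern as an IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


# Spacing between adjacent fp16 values just above 0.5 (bit pattern 0x3800).
ulp_upper_half = fp16_from_bits(0x3801) - fp16_from_bits(0x3800)

# All 12 bpc unorm codes whose value k/4095 lies in the upper half.
codes_12bpc = range(2048, 4096)
distinct_fp16 = {to_fp16(k / 4095) for k in codes_12bpc}

print("fp16 step in [0.5, 1.0):", ulp_upper_half)
print("12 bpc codes in upper half:", len(codes_12bpc))
print("distinct fp16 values they map to:", len(distinct_fp16))
```

The step of 2^-11 means fp16 can only separate about half of the 12 bpc
levels in [0.5;1.0], whereas a 16 bit unorm channel separates all of them.
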
>
> It has been successfully tested on AMD RavenRidge (DCN-1), and with
> Polaris11 (DCE-11.2). Up to two displays were active on RavenRidge
> (DP 2560x1440 at 144Hz + HDMI 2560x1440 at 120Hz), the maximum supported
> on my hw, both running at 10 bpc DP output depth.
>
> Up to three displays were active on the Polaris (DP 2560x1440 at 144Hz +
> 2560x1440 at 100Hz USB-C DP-altMode-to-HDMI converter + eDP 2880x1800 at 60Hz
> Apple Retina panel), all running at 10 bpc output depth.
>
> No malfunctions, visual artifacts or other oddities were observed
> (apart from an adventurous mess of cables and adapters on my desk),
> suggesting it works.
>
> I used my automatic photometer measurement procedure to verify the
> effective output precision of 10 bpc DP native signal + spatial
> dithering in the gpu as enabled by the amdgpu driver. Results show
> the expected 12 bpc precision i hoped for -- the current upper limit
> for AMD display hw afaik.
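
To illustrate why 10 bpc output plus spatial dithering can yield ~12 bpc
effective precision, here is a toy 2x2 ordered dither. This is only a sketch
of the general principle; the threshold pattern and function name are
assumptions for illustration and are not the algorithm used by the AMD
display hardware or the amdgpu driver.

```python
# Hypothetical 2x2 threshold pattern, one entry per pixel of the tile.
THRESHOLDS_2X2 = (0, 2, 3, 1)


def dither_12_to_10(v12: int) -> list[int]:
    """Map one 12-bit level to four 10-bit pixel values of a 2x2 tile.

    The two dropped low bits decide how many of the four pixels are bumped
    up by one 10-bit step, so the tile average preserves the 12-bit level.
    """
    base, frac = v12 >> 2, v12 & 3
    return [min(base + (1 if frac > t else 0), 1023) for t in THRESHOLDS_2X2]


tile = dither_12_to_10(2049)
print(tile, "sum =", sum(tile))  # the sum of the four pixels equals the input
```

Averaged over the tile (by the eye, or by a photometer integrating over many
pixels), the output takes four times as many levels as the 10 bpc signal
alone. The very top of the range (inputs above 4092) clips in this toy
version.
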
>
> So it seems to work in the way i hoped :).
>
> Some open questions wrt. AMD DC, to be addressed in this patch series,
> or in follow-up patches if necessary:
>
> - For the atomic check for plane scaling, the current patch will
> apply the same hw limits as for other rgb fixed point fb's, e.g.,
> for 8 bpc rgb8. Is this correct? Or would we need to use the fp16
> limits, because this is also a 64 bpp format? Or something new
> entirely?
>
> - I haven't added the new fourcc to the DCC tables yet. Should i?
>
> - I had to change an assert for DCE to allow 36bpp linebuffers (patch 4/5).
> It looks to me as if that assert was inconsistent with other places
> in the driver where COLOR_DEPTH121212 is supported, and looking at
> the code, the change seems harmless. At least on DCE-11.2 the change
> didn't cause any noticeable (by myself) or measurable (by my equipment)
> problems on any of the 3 connected displays.
>
> - Related to that change, while i needed to increase lb pixelsize to 36bpp
> to get > 10 bpc effective precision on DCN, i didn't need to do that
> on DCE. Also no change of lb pixelsize was needed on either DCN or DCE
> to get > 10 bpc precision for fp16 framebuffers, so something seems to
> behave differently for floating point 16 vs. fixed point 16. This all
> seems to suggest one could leave lb pixelsize at the old 30 bpp value
> on at least DCE-11.2 and still get the > 10 bpc precision if one wanted
> to avoid the changes of patch 4/5.
>
> Thanks,
> -mario
>
>