nouveau 30bpp / deep color status

Mon Mar 5 07:25:43 UTC 2018

On 02/05/2018 12:50 AM, Ilia Mirkin wrote:
> In case anyone's curious about 30bpp framebuffer support, here's the
> current status:
> 
> Kernel:
> 
> Ben and I have switched the code to using a 256-based LUT for Kepler+,
> and I've also written a patch to cause the addfb ioctl to use the
> proper format. You can pick this up at:
> 
> https://github.com/skeggsb/linux/commits/linux-4.16 (note the branch!)
> https://patchwork.freedesktop.org/patch/202322/
> 
> With these two, you should be able to use "X -depth 30" again on any
> G80+ GPU to bring up a screen (as you could in kernel 4.9 and
> earlier). However this still has some deficiencies, some of which I've
> addressed:
> 
> xf86-video-nouveau:
> 
> DRI3 was broken, and Xv was broken. Patches available at:
> 
> https://github.com/imirkin/xf86-video-nouveau/commits/master
> 
> mesa:
> 
> The NVIDIA hardware (pre-Kepler) can only do XBGR scanout. Further the
> nouveau KMS doesn't add XRGB scanout for Kepler+ (although it could).
> Mesa was only enabled for XRGB, so I've piped XBGR through all the
> same places:
> 
> https://github.com/imirkin/mesa/commits/30bpp
> 

Wrt. mesa, those patches are now in master, and I think we have a bit
of a problem under X11+GLX:

https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/state_trackers/dri/dri_screen.c#n108

dri_fill_in_modes() lists MESA_FORMAT_R10G10B10A2_UNORM and
MESA_FORMAT_R10G10B10X2_UNORM at the top, in between the BGRX/A
formats, ignoring the comment there that says:
"/* The 32-bit RGBA format must not precede the 32-bit BGRA format.
   * Likewise for RGBX and BGRX.  Otherwise, the GLX client and the GLX
   * server may disagree on which format the GLXFBConfig represents,
   * resulting in swapped color channels."

RGBA/X formats should only be exposed
if (dri_loader_get_cap(screen, DRI_LOADER_CAP_RGBA_ORDERING))

and that is only the case for the Android loader.

The GLX code doesn't use the red/green/blueChannelMasks for proper 
matching of formats, and the server doesn't even transmit those masks to 
the client in the case of GLX. So whatever 10 bit format comes first 
will win when building the assignment to GLXFBConfigs.
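
To illustrate the failure mode, here is a toy Python model (not Mesa or
X server code). It assumes the matching only compares per-channel bit
sizes, as described above. The Format type, the match_by_sizes() helper
and the list contents are hypothetical names made up for this sketch,
and the red masks reflect my reading of the two packed layouts.

```python
from typing import NamedTuple, Optional, Sequence


class Format(NamedTuple):
    name: str
    sizes: tuple   # (red, green, blue, alpha) bit counts
    red_mask: int  # position of red within the 32-bit pixel


# Two 10-bit formats with identical channel sizes but opposite ordering.
BGR10 = Format("B10G10R10X2", (10, 10, 10, 0), 0x3FF00000)
RGB10 = Format("R10G10B10X2", (10, 10, 10, 0), 0x000003FF)


def match_by_sizes(wanted: Format,
                   exposed: Sequence[Format]) -> Optional[Format]:
    """Return the first exposed format whose channel sizes match.

    Channel masks are deliberately ignored, mimicking a matcher that
    never gets to see them.
    """
    for fmt in exposed:
        if fmt.sizes == wanted.sizes:
            return fmt
    return None


# The server means BGR10, but the driver happens to list RGB10 first.
picked = match_by_sizes(BGR10, [RGB10, BGR10])
print(picked.name)                        # R10G10B10X2
print(picked.red_mask == BGR10.red_mask)  # False
```

Both sides believe they agreed on a 10-10-10 config, yet red and blue
end up swapped. Which format wins depends purely on list order.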

I looked at the code and how it behaves. In practice Intel gfx works
because it's a classic DRI driver with its own method of building the
DRIconfigs, and it only exposes the BGR101010 formats, so there is no
danger of mixups. AMD's gallium drivers expose both BGR and RGB ordered
10 bit formats, but due to the ordering, the matching ends up assigning
only the desired BGR formats that are good for AMD hw, discarding the
RGB formats. nouveau works because it only exposes the desired RGB
format for the hw. But with other gallium drivers for some SoCs, or
with future gallium drivers, it is not so clear if the right thing will
happen. E.g., freedreno seems to support both BGR and RGB 10 bit
formats as PIPE_BIND_DISPLAY_TARGET afaics, so I don't know whether the
right thing would happen there, if only by luck.

Afaics EGL does the right thing wrt. channelmask matching of EGLConfigs 
to DRIconfigs, so we could probably implement dri_loader_get_cap(screen, 
DRI_LOADER_CAP_RGBA_ORDERING) == TRUE for the EGL loaders.
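
As a rough sketch of what mask-based matching buys us, here is a toy
Python model. It is not the actual EGL loader code: the Format type and
match_by_masks() are hypothetical names, and the mask values reflect my
reading of the two packed 10 bit layouts.

```python
from typing import NamedTuple, Optional, Sequence


class Format(NamedTuple):
    name: str
    red_mask: int
    green_mask: int
    blue_mask: int


BGR10 = Format("B10G10R10X2", 0x3FF00000, 0x000FFC00, 0x000003FF)
RGB10 = Format("R10G10B10X2", 0x000003FF, 0x000FFC00, 0x3FF00000)


def match_by_masks(wanted: Format,
                   exposed: Sequence[Format]) -> Optional[Format]:
    """Return the exposed format whose channel masks all match."""
    wanted_masks = (wanted.red_mask, wanted.green_mask, wanted.blue_mask)
    for fmt in exposed:
        if (fmt.red_mask, fmt.green_mask, fmt.blue_mask) == wanted_masks:
            return fmt
    return None


# List order no longer matters: RGB10 comes first but is skipped.
print(match_by_masks(BGR10, [RGB10, BGR10]).name)  # B10G10R10X2
# If only the wrong ordering is exposed, there is no match at all.
print(match_by_masks(BGR10, [RGB10]))              # None
```

With masks in the comparison, a driver can expose both orderings in any
order and a mismatch shows up as a missing config instead of swapped
color channels.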

But for GLX it is not so easy or quick. I looked at whether I could
make the server's GLX send proper channel mask attributes and have Mesa
parse them, but there aren't any GLX tags defined for channel masks,
and all other tags come from official GLX extension headers. I'm not
sure what the proper procedure for defining new tags is. Do we have to
define a new GLX extension for that, get it into the Khronos registry,
and then bring it back into the server/mesa code-base?

The current patches in mesa for XBGR also lack enablement pieces for 
EGL, Wayland and X11 compositing, but that's a different problem.

-mario

> libdrm:
> 
> For testing, I added a modetest gradient pattern split horizontally.
> Top half is 10bpc, bottom half is 8bpc. This is useful for seeing
> whether you're really getting 10bpc, or if things are getting
> truncated along the way. Definitely hacky, but ... wasn't intending on
> upstreaming it anyways:
> 
> https://github.com/imirkin/drm/commit/9b8776f58448b5745675c3a7f5eb2735e3989441
> 
> -------------------------------------
> 
> Results with the patches (tested on a GK208B and a "deep color" TV over HDMI):
>   - modetest with a 10bpc gradient shows up smoother than an 8bpc
> gradient. However it's still dithered to 8bpc, not "real" 10bpc.
>   - things generally work in X -- dri2 and dri3, xv, and obviously
> regular X rendering / acceleration
>   - lots of X software can't handle 30bpp modes (mplayer hates it for
> xv and x11 rendering, aterm bails on shading the root pixmap, probably
> others)
> 
> I'm also told that with DP, it should actually send the higher-bpc
> data over the wire. With HDMI, we're still stuck at 24bpp for now
> (although the hardware can do 36bpp as well). This is why my gradient
> result above was still dithered.
> 
> Things to do - mostly nouveau specific, but probably some general
> infra needed too:
>   - Figure out how to properly expose the 1024-sized LUT
>   - Add fp16 scanout
>   - Stop relying on the max bpc of the monitor/connector and make
> decisions based on the "effective" bpc (e.g. based on the
> currently-set fb format, take hdmi/dp into account, etc). This will
> also affect the max clock somehow. Perhaps there should be a way to
> force a connector to a certain bpc.
>   - Add higher-bpc HDMI support
>   - Add 10bpc dithering (only makes sense if >= 10bpc output is
> *actually* enabled first)
>   - Investigate YUV HDMI modes (esp since they can enable 4K at 60 on HDMI
> 1.4 hardware)
>   - Test out Wayland compositors
>   - Teach xf86-video-modesetting about addfb2 or that nouveau's
> ordering is different.
> 
> I don't necessarily plan on working further on this, so if there are
> interested parties, they should definitely try to pick it up. I'll try
> to upstream all my changes though.
> 
> Cheers,
> 
>    -ilia
>