WebKit failing to find GLXFBConfig, confusion around fbconfigs + swrast

Jasper St. Pierre jstpierre at mecheye.net
Thu Sep 6 23:05:09 UTC 2018


On Mon, Aug 27, 2018 at 1:07 AM Daniel Drake <drake at endlessm.com> wrote:

> Hi,
>
> I'm looking at a strange issue which has taken me across WebKit,
> glvnd, mesa and X, and has left me somewhat confused regarding if I've
> found any real bugs here, or just expected behaviour (my graphics
> knowledge doesn't go far beyond the basics).
>
> The issue:
> Under xserver-1.18 + mesa-18.1, on Intel GeminiLake, the
> Webkit-powered GNOME online accounts UI shows a blank window (instead
> of the web service login UI). The logs show a webkit crash at the same
> time, because it doesn't handle a GLXBadFBConfig X error.
>
> On the Webkit side, it is failing to find an appropriate GLXFBConfig
> that corresponds to the X visual of the window, which is using a depth
> 32 RGBA8888 visual. It then ends up passing a NULL config to
> glXCreateContextAttribsARB() which results in an error.
>
> Inspecting the available visuals and GLXFBConfigs with glxinfo, I
> observe that there is only one visual with depth 32 (the one being
> used here), but there isn't even a single GLXFBConfig with depth 32.
>
> Looking on the X server side, I observe the active code that first
> deals with the fbconfigs list is glxdriswrast.c __glXDRIscreenProbe,
> which is calling into mesa's driSWRastCreateNewScreen() and getting
> the available fbconfigs from there.
>
> I then spotted a log message:
>   (EE) modeset(0): [DRI2] No driver mapping found for PCI device 0x8086 / 0x3184
>
> and then I find hw/xfree86/dri2/pci_ids/i965_pci_ids.h, which (on this
> old X) is missing GeminiLake PCI IDs, so I add it there. Now I have my
> depth 32 fbconfig with the right visual assigned and webkit works.
>
>
> Questions:
>
> 1. What should webkit be doing in the event of it not being able to find a
> GLXFBConfig that corresponds to the X visual of its window?
>
>
> 2. Why is swrast coming into the picture? Is swrast being used for
> rendering?
>
> I was surprised to see that appear in the traces. I had assumed that
> with a new enough mesa, I would be avoiding software rendering
> codepaths.
>
> I don't think it's using swrast for rendering because I feel like I
> would have noticed corresponding slow performance, also even before my
> changes glxinfo says:
>
>   direct rendering: Yes
>   Extended renderer info (GLX_MESA_query_renderer):
>     Vendor: Intel Open Source Technology Center (0x8086)
>     Device: Mesa DRI Intel(R) UHD Graphics 605 (Geminilake)  (0x3184)
>     Version: 18.1.6
>     Accelerated: yes
>
> If swrast is not being used for rendering, why is it being used to
> determine what the available fbconfigs are? Is that a bug?
>
>
> 3. Should swrast offer a depth 32 GLXFBConfig?
>
> If I were on a setup that really uses swrast for rendering (e.g. if
> mesa doesn't provide an accelerated graphics driver), I assume this
> webkit crash would be hit there too, due to not having a depth 32
> fbconfig.
>
> Should it have one?
>
> I didn't investigate in detail, but it looks like mesa's
> dri_fill_in_modes() (perhaps via its calls down to
> llvmpipe_is_format_supported()) declares that depth 32 is not
> supported in the swrast codepath.
>
>
> 4. Why is there still a list of PCI IDs in the X server?
>
> I was under the impression that these days, rendering stuff has been
> handed off to mesa, and display stuff has been handed off to KMS. Both
> the kernel and mesa have corresponding drivers for those functions
> (and their own lists of PCI IDs).
>
> I was then surprised to see the X server also maintaining a list of
> PCI IDs and it having a significant effect on which codepaths are
> followed.
>
>
> Thanks for any clarifications!
>

So this is a fun question and took me a day or two of random spelunking.
Let's start with the last question, since it gives us a good starting
point: why are the PCI IDs necessary?

The answer is "DRI2 needs to figure out the driver to load if the user
doesn't pass it into DRI2Connect".
https://gitlab.freedesktop.org/xorg/xserver/blob/master/hw/xfree86/dri2/dri2.c#L1440
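
For illustration, here is a minimal C sketch of the kind of table-driven
lookup that implies. The names are hypothetical; the real table lives under
hw/xfree86/dri2/pci_ids/ and the lookup in dri2.c:

  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical sketch of a PCI-ID -> DRI driver-name table, in the spirit
   * of hw/xfree86/dri2/pci_ids/i965_pci_ids.h. Real entries differ. */
  struct pci_id_match {
      uint16_t vendor;
      uint16_t device;
      const char *driver;
  };

  static const struct pci_id_match driver_map[] = {
      { 0x8086, 0x3184, "i965" },  /* GeminiLake, the ID Daniel had to add */
      /* ... hundreds more entries ... */
  };

  /* Return the DRI driver name for a PCI device, or NULL if unknown; the
   * NULL case is what produces "[DRI2] No driver mapping found". */
  static const char *
  driver_for_pci_id(uint16_t vendor, uint16_t device)
  {
      for (size_t i = 0; i < sizeof(driver_map) / sizeof(driver_map[0]); i++) {
          if (driver_map[i].vendor == vendor && driver_map[i].device == device)
              return driver_map[i].driver;
      }
      return NULL;
  }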

Let's now ask and answer three more follow-up questions: 1. Why is the server
using DRI2, 2. Why does the server need the driver name, and 3. Why doesn't
mesa pass the driver name along?

My best guess for why DRI2 is being used is that xf86-video-intel turns DRI3
off by default, because ickle didn't like the implicit synchronization that
DRI3 had and refused to fix some bugs in it. So if you load xf86-video-intel,
unless you configure it to turn DRI3 on, you get DRI2. Yay.
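
As an aside, if you do want DRI3 with xf86-video-intel, my understanding is
that you can opt in via xorg.conf, something like this (check intel(4) for
your exact version before trusting it):

  Section "Device"
      Identifier "Intel Graphics"
      Driver     "intel"
      Option     "DRI" "3"
  EndSection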

As for why mesa doesn't pass the driver name along, the answer is simply that
it doesn't. Maybe it should?
https://github.com/mesa3d/mesa/blob/bd963f84302adb563136712c371023f15dadbea7/src/glx/dri2_glx.c#L1196
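
To make the direction of that flow concrete: in DRI2 it's the *server* that
tells the *client* which driver to load. From the client's side it looks
roughly like this (a sketch using xcb-dri2; error handling omitted):

  #include <stdio.h>
  #include <stdlib.h>
  #include <xcb/xcb.h>
  #include <xcb/dri2.h>

  /* Ask the X server, via DRI2Connect, which DRI driver this client should
   * load. This is the reply that comes back useless when the server's
   * PCI-ID table has no entry for the GPU. */
  int main(void)
  {
      xcb_connection_t *c = xcb_connect(NULL, NULL);
      if (xcb_connection_has_error(c))
          return 1;

      xcb_window_t root = xcb_setup_roots_iterator(xcb_get_setup(c)).data->root;

      xcb_dri2_connect_cookie_t cookie =
          xcb_dri2_connect(c, root, XCB_DRI2_DRIVER_TYPE_DRI);
      xcb_dri2_connect_reply_t *reply = xcb_dri2_connect_reply(c, cookie, NULL);

      if (reply) {
          printf("server says driver: %.*s\n",
                 xcb_dri2_connect_driver_name_length(reply),
                 xcb_dri2_connect_driver_name(reply));
          free(reply);
      }

      xcb_disconnect(c);
      return 0;
  }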

DRI3 works a bit differently -- an FD is passed to the X server by mesa,
and the DDX figures out how to interpret that FD. The full flow in rootless
X is that logind picks an FD, passes that to the X server, and then the DDX
driver (likely -modesetting) calls drmGetDeviceNameFromFd2, and all the
logic is encapsulated in libdrm and mesa. But generic DRI2 doesn't really
have an FD to work from, so the server has to identify the driver some other
way, hence the PCI ID table.
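
Going back to the DRI3 side for a second, a rough sketch of that "the fd is
the identity" idea (libdrm; the fd here is assumed to be a DRM device fd you
already have, e.g. from DRI3Open):

  #include <stdio.h>
  #include <stdlib.h>
  #include <xf86drm.h>

  /* Given a DRM fd, however it was obtained, libdrm can report which device
   * node it refers to; no PCI-ID table needed on the X server side. */
  static void print_drm_device_name(int drm_fd)
  {
      char *name = drmGetDeviceNameFromFd2(drm_fd);
      if (name) {
          printf("DRM device node: %s\n", name);  /* e.g. /dev/dri/renderD128 */
          free(name);  /* the returned string is ours to free */
      }
  }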

Let's answer the other questions now:

1. What should webkit be doing in the event of it not being able to find a
GLXFBConfig that corresponds to the X visual of its window?

Crash. No, really. Crash. That's indicative of a system misconfiguration.
There should always be a GLXFBConfig that matches the X visual, unless
something has gone horribly wrong.
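
For completeness, the usual way an application finds that config is to walk
glXGetFBConfigs() and match on GLX_VISUAL_ID; a minimal sketch (error
handling and the visual lookup itself elided):

  #include <X11/Xlib.h>
  #include <GL/glx.h>

  /* Find the GLXFBConfig whose GLX_VISUAL_ID matches the visual of the
   * window we intend to render into. If this returns NULL, creating a
   * context for that window cannot work; that is the case WebKit hit. */
  static GLXFBConfig fbconfig_for_visual(Display *dpy, int screen, VisualID vid)
  {
      int n = 0;
      GLXFBConfig *configs = glXGetFBConfigs(dpy, screen, &n);
      GLXFBConfig match = NULL;

      for (int i = 0; i < n; i++) {
          int config_vid = 0;
          glXGetFBConfigAttrib(dpy, configs[i], GLX_VISUAL_ID, &config_vid);
          if ((VisualID)config_vid == vid) {
              match = configs[i];
              break;
          }
      }

      XFree(configs);
      return match;
  }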

2. Why is swrast coming into the picture? Is swrast being used for
rendering?

Short answer: "No".

Long answer: This is where I admit to probably not being fully correct: it's
a complex weave of mesa, drm, and the X server, I've forgotten a lot of it
over time, and glxvnd changed a bit of it as well. Corrections welcome.

GLX has two main parts to it: it specifies how you can render GL to an X11
window / pixmap, and it also provides "indirect rendering", which is a
protocol that lets you send GL commands to the X server and the X server
will render them for you. Just so we're on the same page: "Direct
rendering" means that the application itself renders the frame into a
buffer, and "Indirect rendering" means that the application sends the X
server a series of GL protocol commands, and the X server figures out how
to render them.
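
If you ever want to check which of the two a given context ended up being,
glXIsDirect() will tell you:

  #include <stdio.h>
  #include <GL/glx.h>

  /* After creating a GLX context, ask whether it is direct (the client
   * renders into its own buffers) or indirect (GL commands are shipped
   * over the GLX protocol for the server to execute). */
  static void report_rendering_mode(Display *dpy, GLXContext ctx)
  {
      if (glXIsDirect(dpy, ctx))
          printf("direct rendering\n");
      else
          printf("indirect rendering\n");
  }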

This is *unrelated* to how the application renders. The application could
render to its buffer in hardware or do it in software. That's up to the
application, from the point of view of the X server. It doesn't care how
those buffers got there, it just knows they exist. In indirect rendering,
the server has a similar choice: it can use the GPU or software or ship it
across the country to have it artisanally hand-painted. The application
doesn't care about buffers, it just told the server it wanted these
graphics commands executed.

In the open-source mesa world, we want to use the same GL driver stack in
both the X server for the indirect case, and in the application for the
direct case. So we load the same GL driver. But there's a bit of trickiness
there: if libGL.so tries to call DRI2GetBuffers from inside the server, it
will get stuck trying to make a synchronous call to its own process. So
mesa drivers have a "loader" interface, which is the abstraction in play
here.

For an application going through libGLX direct rendering, it uses a
specific loader, which has implementations of getBuffers / flushBuffers /
getBuffersWithFormat that talk DRI2 to the X server:
https://github.com/mesa3d/mesa/blob/bd963f84302adb563136712c371023f15dadbea7/src/glx/dri2_glx.c#L974-L980

For an X server going through indirect rendering, it uses this loader,
which has different implementations of getBuffers / flushBuffers /
getBuffersWithFormat that poke the internal data structures rather than
make X11 requests:
https://gitlab.freedesktop.org/xorg/xserver/blob/master/glx/glxdri2.c#L763
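
Purely as an illustration of the idea (this is a simplified, made-up version,
not mesa's actual __DRIdri2LoaderExtension definition), a loader is just a
table of callbacks the host environment hands to the driver, so the same
driver code can ask "give me buffers" without knowing whether it lives inside
an application or inside the X server:

  #include <stdint.h>

  /* Simplified, hypothetical loader vtable. The real interface is in mesa's
   * dri_interface.h; names and signatures here are illustrative only. */
  struct example_buffer {
      uint32_t name;     /* buffer handle */
      int width, height;
      int pitch, cpp;
  };

  struct example_loader {
      /* Ask the environment for the buffers backing a drawable. In the
       * client this becomes a DRI2GetBuffers request to the X server;
       * inside the server it pokes the server's own data structures. */
      struct example_buffer *(*get_buffers)(void *drawable, int *count,
                                            void *loader_private);

      /* Tell the environment that front-buffer rendering happened and
       * needs to be made visible. */
      void (*flush_front_buffer)(void *drawable, void *loader_private);
  };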

If the X server can't load the right mesa driver to do hardware-accelerated
X11, it will fall back to DRIswrast for indirect rendering.

Now, the one big thing I'm unsure about is how much of this stack is used
in direct rendering. I believe most of it gets stomped on by the DDX
driver, where, since you're using a compositor, swapping buffers does
nothing other than swap the backing storage of the composite named
pixmap, and the compositor's own swap will turn into a DDX page flip. But
this is where I run out of time, and someone like Adam would be able to
answer a lot more authoritatively anyway. Hopefully this helps.

3. Should swrast offer a depth 32 GLXFBConfig?

Maybe.

> Daniel



-- 
  Jasper