Future direction of the Mesa Vulkan runtime (or "should we build a new gallium?")

Fri Jan 19 21:46:25 UTC 2024

On one hand I think it's a great idea. Moving code out of drivers to common
means fixing bugs helps everyone, and implementing new features is the same.

On the other hand, everyone's already got code that works, which means both
a lot of work to switch that code over to common and then the usual cycle
of fixing regressions.

Gallium is generally pretty great now, so my gut says a 'Zirkonium' common
layer is eventually gonna be pretty good too, assuming it can provide the
same sorts of efficiency gains; vulkan is a lot thinner than GL, which
means CPU utilization becomes more noticable very easily.

I'm not saying I'll dive in head first tomorrow, but generally speaking I
think 10 years from now it'll be a nice thing to have.

Mike

On Fri, Jan 19, 2024, 4:02 PM Faith Ekstrand <faith at gfxstrand.net> wrote:

> Yeah, this one's gonna hit Phoronix...
>
> When we started writing Vulkan drivers back in the day, there was this
> notion that Vulkan was a low-level API that directly targets hardware.
> Vulkan drivers were these super thin things that just blasted packets
> straight into the hardware. What little code was common was small and
> pretty easy to just copy+paste around. It was a nice thought...
>
> What's happened in the intervening 8 years is that Vulkan has grown. A lot.
>
> We already have several places where we're doing significant layering.
> It started with sharing the WSI code and some Python for generating
> dispatch tables. Later we added common synchronization code and a few
> vkFoo2 wrappers. Then render passes and...
>
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024
>
> That's been my project the last couple weeks: A common VkPipeline
> implementation built on top of an ESO-like interface. The big
> deviation this MR makes from prior art is that I make no attempt at
> pretending it's a layered implementation. The vtable for shader
> objects looks like ESO but takes its own path when it's useful to do
> so. For instance, shader creation always consumes NIR and a handful of
> lowering passes are run for you. It's no st_glsl_to_nir but it is a
> bit opinionated. Also, a few of the bits that are missing from ESO
> such as robustness have been added to the interface.
>
> In my mind, this marks a pretty fundamental shift in how the Vulkan
> runtime works, at least in my mind. Previously, everything was
> designed to be a toolbox where you can kind of pick and choose what
> you want to use. Also, everything at least tried to act like a layer
> where you still implemented Vulkan but you could leave out bits like
> render passes if you implemented the new thing and were okay with the
> layer. With the ESO code, you implement something that isn't Vulkan
> entrypoints and the actual entrypoints live in the runtime. This lets
> us expand and adjust the interface as needed for our purposes as well
> as sanitize certain things even in the modern API.
>
> The result is that NVK is starting to feel like a gallium driver. 🙃
>
> So here's the question: do we like this? Do we want to push in this
> direction? Should we start making more things work more this way? I'm
> not looking for MRs just yet nor do I have more reworks directly
> planned. I'm more looking for thoughts and opinions as to how the
> various Vulkan driver teams feel about this. We'll leave the detailed
> planning for the Mesa issue tracker.
>
> It's worth noting that, even though I said we've tried to keep things
> layerish, there are other parts of the runtime that look like this.
> The synchronization code is a good example. The vk_sync interface is
> pretty significantly different from the Vulkan objects it's used to
> implement. That's worked out pretty well, IMO. With as complicated as
> something like pipelines or synchronization are, trying to keep the
> illusion of a layer just isn't practical.
>
> So, do we like this? Should we be pushing more towards drivers being a
> backed of the runtime instead of a user of it?
>
> Now, before anyone asks, no, I don't really want to build a multi-API
> abstraction with a Vulkan state tracker. If we were doing this 5 years
> ago and Zink didn't already exist, one might be able to make an
> argument for pushing in that direction. However, that would add a huge
> amount of weight to the project and make it even harder to develop the
> runtime than it already is and for little benefit at this point.
>
> Here's a few other constraints on what I'm thinking:
>
> 1. I want it to still be possible for drivers to implement an
> extension without piles of runtime plumbing or even bypass the runtime
> on occasion as needed.
>
> 2. I don't want to recreate the gallium cap disaster drivers should
> know exactly what they're advertising. We may want to have some
> internal features or properties that are used by the runtime to make
> decisions but they'll be in addition to the features and properties in
> Vulkan.
>
> 3. We've got some meta stuff already but we probably want more.
> However, I don't want to force meta on folks who don't want it.
>
> The big thing here is that if we do this, I'm going to need help. I'm
> happy to do a lot of the architectural work but drivers are going to
> have to keep up with the changes and I can't take on the burden of
> moving 8 different drivers forward. I can answer questions and maybe
> help out a bit but the refactoring is going to be too much for one
> person, even if that person is me.
>
> Thoughts?
>
> ~Faith
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20240119/9746b4c9/attachment.htm>