Future direction of the Mesa Vulkan runtime (or "should we build a new gallium?")

Faith Ekstrand faith at gfxstrand.net
Wed Jan 24 18:39:36 UTC 2024


On Wed, Jan 24, 2024 at 12:26 PM Zack Rusin <zack.rusin at broadcom.com> wrote:
>
> On Wed, Jan 24, 2024 at 10:27 AM Faith Ekstrand <faith at gfxstrand.net> wrote:
> >
> > Jose,
> >
> > Thanks for your thoughts!
> >
> > On Wed, Jan 24, 2024 at 4:30 AM Jose Fonseca <jose.fonseca at broadcom.com> wrote:
> > >
> > > I don't know enough about the current Vulkan driver internals to
> > > have or provide an informed opinion on the path forward, but I'd
> > > like to share my backwards-looking perspective.
> > >
> > > Looking back, Gallium was two things effectively:
> > > (1) an abstraction layer that's watertight (as in upper layers
> > > shouldn't reach through to lower layers)
> > > (2) an ecosystem of reusable components (draw, util, tgsi, etc.)
> > >
> > > (1) was of course important -- and the discipline it imposed is
> > > what enabled great simplifications -- but it also became a
> > > straitjacket, as GPUs didn't stand still, and sooner or later the
> > > see-every-hardware-as-the-same lenses stopped reflecting reality.
> > >
> > > If I had to pick one, I'd say that (2) is far more useful and
> > > practical. Take components like gallium's draw and other util
> > > modules. A driver can choose to use them or not. One could fork
> > > them within the Mesa source tree, and only the drivers that opt in
> > > to the fork would need to be tested/adapted/etc.
> > >
> > > On the flip side, the Vulkan API is already a pretty low-level HW
> > > abstraction. It's also very flexible and extensible, so it's hard
> > > to provide a watertight abstraction underneath it without either
> > > taking the lowest common denominator or having lots of optional
> > > bits of functionality governed by a myriad of caps like you
> > > alluded to.
> >
> > There is a third thing that isn't really recognized in your description:
> >
> > (3) A common "language" to talk about GPUs and data structures that
> > represent that language
> >
> > This is precisely what the Vulkan runtime today doesn't have. Classic
> > meta sucked because we were trying to implement GL in GL. u_blitter,
> > on the other hand, is pretty fantastic because Gallium provides a much
> > more sane interface to write those common components in terms of.
> >
> > So far, we've been trying to build those components in terms of the
> > Vulkan API itself with calls jumping back into the dispatch table to
> > try and get inside the driver. This is working but it's getting more
> > and more fragile the more tools we add to that box. A lot of what I
> > want to do with gallium2 or whatever we're calling it is to fix our
> > layering problems so that calls go in one direction and we can
> > untangle the jumble. I'm still not sure what I want that to look like
> > but I think I want it to look a lot like Vulkan, just with a handier
> > interface.
>
> Yes, that makes sense. When we were writing the initial components for
> gallium (draw and cso) I really liked the general concept and thought
> about trying to reuse them in the old, non-gallium Mesa drivers but
> the obstacle was that there was no common interface to lay them on.
> Using GL to implement GL was silly and using Vulkan to implement
> Vulkan is not much better.
>
> Having said that my general thoughts on GPU abstractions largely match
> what Jose has said. To me it's a question of whether a clean
> abstraction:
> - on top of which you can build an entire GPU driver toolkit (i.e. all
> the components and helpers)
> - that makes it trivial to figure out what needs to be done to write a
> new driver and makes bootstrapping a new driver a lot simpler
> - that makes it easier to reason about cross hardware concepts (it's a
> lot easier to understand the entirety of the ecosystem if every driver
> is not doing something unique to implement similar functionality)
> is worth more than almost exponentially increasing the difficulty of:
> - advancing the ecosystem (i.e. it might be easier to understand but
> it's way harder to create clean abstractions across such different
> hardware).
> - driver maintenance (i.e. there will be a constant stream of
> regressions hitting your driver as a result of other people working on
> their drivers)
> - general development (i.e. bug fixes/new features being held back
> because they break some other driver)
>
> Some of those can certainly be tilted one way or the other, e.g. the
> driver maintenance can be somewhat eased by requiring that every
> driver working on top of the new abstraction has to have a stable
> Mesa-CI setup (be it lava or ci-tron, or whatever) but all of those
> things need to be reasoned about. In my experience abstractions never
> have uniform support because some people will value cons of them more
> than they value the pros. So the entire process requires some very
> steadfast individuals to keep going despite hearing that the effort is
> dumb, at least until the benefits of the new approach are impossible
> to deny. So you know... "how much do you believe in this approach
> because some days will suck and you can't give up" ;) is probably the
> question.

Well, I've built my entire career out of doing things that others said were
a terrible idea until after I'd done them and proved they were actually a
good idea, so... Not too worried about that one. 😉

~Faith