[Intel-gfx] [RFC PATCH 04/20] drm/sched: Convert drm scheduler to use a work queue rather than kthread
Jason Ekstrand
jason at jlekstrand.net
Wed Jan 11 22:18:01 UTC 2023
On Wed, Jan 11, 2023 at 2:50 AM Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com> wrote:
>
> On 10/01/2023 14:08, Jason Ekstrand wrote:
> > On Tue, Jan 10, 2023 at 5:28 AM Tvrtko Ursulin
> > <tvrtko.ursulin at linux.intel.com> wrote:
> >
> > On 09/01/2023 17:27, Jason Ekstrand wrote:
> >
> > [snip]
> >
> > > >>> AFAICT it proposes to have 1:1 between *userspace* created
> > > >>> contexts (per context _and_ engine) and drm_sched. I am not sure
> > > >>> avoiding invasive changes to the shared code is in the spirit of
> > > >>> the overall idea, and instead the opportunity should be used to
> > > >>> look at ways to refactor/improve drm_sched.
> > >
> > >
> > > Maybe? I'm not convinced that what Xe is doing is an abuse at all
> > > or really needs to drive a re-factor. (More on that later.)
> > > There's only one real issue which is that it fires off potentially
> > > a lot of kthreads. Even that's not that bad given that kthreads are
> > > pretty light and you're not likely to have more kthreads than
> > > userspace threads which are much heavier. Not ideal, but not the
> > > end of the world either. Definitely something we can/should
> > > optimize but if we went through with Xe without this patch, it
> > > would probably be mostly ok.
> > >
> > > >> Yes, it is 1:1 *userspace* engines and drm_sched.
> > > >>
> > > >> I'm not really prepared to make large changes to the DRM scheduler
> > > >> at the moment for Xe as they are not really required, nor does it
> > > >> seem they will be required for Boris' work either. I am interested
> > > >> to see what Boris comes up with.
> > > >>
> > > >>> Even on the low level, the idea to replace drm_sched threads
> > > >>> with workers has a few problems.
> > > >>>
> > > >>> To start with, the pattern of:
> > > >>>
> > > >>>    while (not_stopped) {
> > > >>>      keep picking jobs
> > > >>>    }
> > > >>>
> > > >>> feels fundamentally in disagreement with workers (while it
> > > >>> obviously fits perfectly with the current kthread design).
> > > >>
> > > >> The while loop breaks and the worker exits if no jobs are ready.
> > >
> > >
> > > I'm not very familiar with workqueues. What are you saying would
> > > fit better? One scheduling job per work item rather than one big
> > > work item which handles all available jobs?
> >
> > Yes and no. It indeed IMO does not fit to have a work item which is
> > potentially unbounded in runtime. But that conceptual mismatch is a
> > bit moot, because it is a worst case / theoretical concern, and
> > because of the more fundamental concerns below.
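> >
> > To illustrate what "one job per work item" would mean in practice,
> > roughly (just a sketch - the helper and field names here are made up,
> > this is not actual scheduler code):
> >
> >   static void sched_run_job_work(struct work_struct *w)
> >   {
> >           struct drm_gpu_scheduler *sched =
> >                   container_of(w, struct drm_gpu_scheduler, work_run_job);
> >           struct drm_sched_job *job = pick_one_ready_job(sched);
> >
> >           if (!job)
> >                   return;
> >
> >           run_one_job(sched, job);
> >
> >           /* do one quantum of work, then re-queue instead of looping */
> >           if (have_more_ready_jobs(sched))
> >                   queue_work(sched->submit_wq, &sched->work_run_job);
> >   }
> >
> > Each invocation handles a bounded amount of work, which is what the
> > workqueue machinery is designed around.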
> >
> > If we have to go back to the low level side of things, I've picked
> > this random spot to consolidate what I have already mentioned and
> > perhaps expand.
> >
> > To start with, let me pull out some thoughts from workqueue.rst:
> >
> > """
> > Generally, work items are not expected to hog a CPU and consume many
> > cycles. That means maintaining just enough concurrency to prevent
> work
> > processing from stalling should be optimal.
> > """
> >
> > For unbound queues:
> > """
> > The responsibility of regulating concurrency level is on the users.
> > """
> >
> > Given the unbound queues will be spawned on demand to service all
> > queued work items (more interesting when mixing up with the
> > system_unbound_wq), in the proposed design the number of instantiated
> > worker threads does not correspond to the number of user threads (as
> > you have elsewhere stated), but pessimistically to the number of
> > active user contexts.
> >
> >
> > Those are pretty much the same in practice. Rather, user threads is
> > typically an upper bound on the number of contexts. Yes, a single
> > user thread could have a bunch of contexts but basically nothing does
> > that except IGT. In real-world usage, it's at most one context per
> > user thread.
>
> Typically is the key here. But I am not sure it is good enough.
> Consider this example - Intel Flex 170:
>
> * Delivers up to 36 streams 1080p60 transcode throughput per card.
> * When scaled to 10 cards in a 4U server configuration, it can support
> up to 360 streams of HEVC/HEVC 1080p60 transcode throughput.
>
I had a feeling it was going to be media.... 😅
> One transcode stream from my experience typically is 3-4 GPU contexts
> (buffer travels from vcs -> rcs -> vcs, maybe vecs) used from a single
> CPU thread. 4 contexts * 36 streams = 144 active contexts. Multiply by
> 60fps = 8640 jobs submitted and completed per second.
>
> 144 active contexts in the proposed scheme possibly means 144 kernel
> worker threads spawned (driven by 36 transcode CPU threads). (I don't
> think the pools would scale down given all are constantly pinged at
> 60fps.)
>
> And then each of 144 threads goes to grab the single GuC CT mutex. First
> threads are being made schedulable, then put to sleep as mutex
> contention is hit, then woken again as mutexes are getting released,
> rinse, repeat.
>
Why is every submission grabbing the GuC CT mutex? I've not read the GuC
back-end yet but I was under the impression that most run_job() would be
just shoving another packet into a ring buffer. If we have to send the GuC
a message on the control ring every single time we submit a job, that's
pretty horrible.
--Jason
> (And yes this backend contention is there regardless of 1:1:1, it would
> require a different re-design to solve that. But it is just a question
> whether there are 144 contending threads, or just 6 with the thread per
> engine class scheme.)
>
> Then multiply all by 10 for a 4U server use case and you get 1440 worker
> kthreads, yes 10 more CT locks, but contending on how many CPU cores?
> Just so they can grab a timeslice and maybe contend on a mutex as the
> next step.
>
> This example is where it would hurt on large systems. Imagine an even
> wider media transcode card...
>
> Second example is only a single engine class used (3d desktop?) but with
> a bunch of not-runnable jobs queued and waiting on a fence to signal.
> Implicit or explicit dependencies - it doesn't matter. Then the fence
> signals and callbacks run. N work items get scheduled, but they all
> submit to the same HW engine. So we end up with:
>
> /-- wi1 --\
> / .. .. \
> cb --+--- wi.. ---+-- rq1 -- .. -- rqN
> \ .. .. /
> \-- wiN --/
>
>
> All that we have achieved is waking up N CPUs to contend on the same
> lock and effectively insert the job into the same single HW queue. I
> don't see any positives there.
>
> This example I think can particularly hurt small / low power devices
> because of needless waking up of many cores for no benefit. Granted, I
> don't have a good feel on how common this pattern is in practice.
>
> >
> > That is the number which drives the maximum number of not-runnable
> > jobs that can become runnable at once, and hence spawn that many work
> > items, and in turn unbound worker threads.
> >
> > Several problems there.
> >
> > It is fundamentally pointless to have potentially that many more
> > threads than the number of CPU cores - it simply creates a scheduling
> > storm.
> >
> > Unbound workers have no CPU / cache locality either and no connection
> > with the CPU scheduler to optimize scheduling patterns. This may
> > matter either on large systems or on small ones. Whereas the current
> > design allows the scheduler to notice a userspace CPU thread keeps
> > waking up the same drm scheduler kernel thread, and so it can keep
> > them on the same CPU, the unbound workers lose that ability and so a
> > 2nd CPU might be getting woken up from low sleep for every submission.
> >
> > Hence, apart from being a bit of an impedance mismatch, the proposal
> > has the potential to change performance and power patterns on both
> > large and small machines.
> >
> >
> > Ok, thanks for explaining the issue you're seeing in more detail. Yes,
> > deferred kwork does appear to mismatch somewhat with what the scheduler
> > needs or at least how it's worked in the past. How much impact will
> > that mismatch have? Unclear.
> >
> > > >>> Secondly, it probably demands separate workers (not optional),
> > > >>> otherwise behaviour of shared workqueues has either the potential
> > > >>> to explode the number of kernel threads anyway, or to add latency.
> > > >>>
> > > >>
> > > >> Right now the system_unbound_wq is used which does have a limit
> > > >> on the number of threads, right? I do have a FIXME to allow a
> > > >> worker to be passed in, similar to TDR.
> > > >>
> > > >> WRT latency, the 1:1 ratio could actually have lower latency as 2
> > > >> GPU schedulers can be pushing jobs into the backend / cleaning up
> > > >> jobs in parallel.
> > > >>
> > > >
> > > > Thought of one more point here on why in Xe we absolutely want a
> > > > 1 to 1 ratio between entity and scheduler - the way we implement
> > > > timeslicing for preempt fences.
> > > >
> > > > Let me try to explain.
> > > >
> > > > Preempt fences are implemented via the generic messaging
> > > > interface [1] with suspend / resume messages. If a suspend message
> > > > is received too soon after calling resume (this is per entity) we
> > > > simply sleep in the suspend call, thus giving the entity a
> > > > timeslice. This completely falls apart with a many to 1
> > > > relationship as now an entity waiting for a timeslice blocks the
> > > > other entities. Could we work around this? Sure, but it's just
> > > > another bunch of code we'd have to add in Xe. Being able to freely
> > > > sleep in the backend without affecting other entities is really,
> > > > really nice IMO and I bet Xe isn't the only driver that is going
> > > > to feel this way.
> > > >
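> > > > Roughly, the timeslicing trick is (sketch only, assuming a
> > > > per-entity resume timestamp is kept; all names are made up):
> > > >
> > > >   static void engine_suspend(struct fw_engine *e)
> > > >   {
> > > >           s64 since_resume_ms =
> > > >                   ktime_ms_delta(ktime_get(), e->resume_time);
> > > >
> > > >           /* Suspend arrived too soon after resume - sleep here to
> > > >            * give the entity its timeslice. Only safe because this
> > > >            * backend path serves exactly one entity (1:1). */
> > > >           if (since_resume_ms < MIN_TIMESLICE_MS)
> > > >                   msleep(MIN_TIMESLICE_MS - since_resume_ms);
> > > >
> > > >           send_suspend_to_fw(e);
> > > >   }
> > > >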
> > > > Last thing I'll say: regardless of how anyone feels about Xe
> > > > using a 1 to 1 relationship, this patch IMO makes sense, as I
> > > > hope we can all agree a workqueue scales better than kthreads.
> > >
> > > I don't know for sure what will scale better and for what use case -
> > > combination of CPU cores vs number of GPU engines to keep busy vs
> > > other system activity. But I wager someone is bound to ask for some
> > > numbers to make sure the proposal is not negatively affecting any
> > > other drivers.
> > >
> > >
> > > Then let them ask. Waving your hands vaguely in the direction of
> > > the rest of DRM and saying "Uh, someone (not me) might object" is
> > > profoundly unhelpful. Sure, someone might. That's why it's on
> > > dri-devel. If you think there's someone in particular who might
> > > have a useful opinion on this, throw them in the CC so they don't
> > > miss the e-mail thread.
> > >
> > > Or are you asking for numbers? If so, what numbers are you asking
> > > for?
> >
> > It was a heads up to the Xe team in case people weren't appreciating
> > how the proposed change has the potential to influence power and
> > performance across the board. And nothing in the follow up discussion
> > made me think it was considered, so I don't think it was redundant to
> > raise it.
> >
> > In my experience it is typical that such core changes come with some
> > numbers. Which in the case of the drm scheduler is tricky and probably
> > requires explicitly asking everyone to test (rather than counting on
> > "don't miss the email thread"). Real products can fail to ship due to
> > ten mW here or there. Like suddenly an extra core prevented from
> > getting into deep sleep.
> >
> > If that was "profoundly unhelpful" so be it.
> >
> >
> > With your above explanation, it makes more sense what you're asking.
> > It's still not something Matt is likely to be able to provide on his
> > own. We need to tag some other folks and ask them to test it out. We
> > could play around a bit with it on Xe but it's not exactly production
> > grade yet and is going to hit this differently from most. Likely
> > candidates are probably AMD and Freedreno.
>
> Whoever is set up to check out power and performance would be good to
> give it a spin, yes.
>
> PS. I don't think I was asking Matt to test with other devices. To start
> with I think Xe is a team effort. I was asking for more background on
> the design decision since patch 4/20 does not say anything on that
> angle, nor was it IMO sufficiently addressed later in the thread.
>
> > > Also, if we're talking about a design that might paint us into an
> > > Intel-HW-specific hole, that would be one thing. But we're not.
> > > We're talking about switching which kernel threading/task mechanism
> > > to use for what's really a very generic problem. The core Xe design
> > > works without this patch (just with more kthreads). If we land this
> > > patch or something like it and get it wrong and it causes a
> > > performance problem for someone down the line, we can revisit it.
> >
> > For some definition of "it works" - I really wouldn't suggest
> > shipping a kthread per user context at any point.
> >
> >
> > You have yet to elaborate on why. What resources is it consuming that's
> > going to be a problem? Are you anticipating CPU affinity problems? Or
> > does it just seem wasteful?
>
> Well I don't know, commit message says the approach does not scale. :)
>
> > I think I largely agree that it's probably unnecessary/wasteful but
> > reducing the number of kthreads seems like a tractable problem to solve
> > regardless of where we put the gpu_scheduler object. Is this the right
> > solution? Maybe not. It was also proposed at one point that we could
> > split the scheduler into two pieces: A scheduler which owns the kthread,
> > and a back-end which targets some HW ring thing where you can have
> > multiple back-ends per scheduler. That's certainly more invasive from a
> > DRM scheduler internal API PoV but would solve the kthread problem in a
> > way that's more similar to what we have now.
> >
> > > In any case that's a low level question caused by the high level
> > > design decision. So I'd think first focus on the high level - which
> > > is the 1:1 mapping of entity to scheduler instance proposal.
> > >
> > > Fundamentally it will be up to the DRM maintainers and the community
> > > to bless your approach. And it is important to stress 1:1 is about
> > > userspace contexts, so I believe unlike any other current scheduler
> > > user. And also important to stress this effectively does not make Xe
> > > _really_ use the scheduler that much.
> > >
> > >
> > > I don't think this makes Xe nearly as much of a one-off as you think
> > > it does. I've already told the Asahi team working on Apple M1/2
> > > hardware to do it this way and it seems to be a pretty good mapping
> > > for them. I believe this is roughly the plan for nouveau as well.
> > > It's not the way it currently works for anyone because most other
> > > groups aren't doing FW scheduling yet. In the world of FW scheduling
> > > and hardware designed to support userspace direct-to-FW submit, I
> > > think the design makes perfect sense (see below) and I expect we'll
> > > see more drivers move in this direction as those drivers evolve.
> > > (AMD is doing some customish thing with gpu_scheduler on the
> > > front-end somehow. I've not dug into those details.)
> > >
> > > I can only offer my opinion, which is that the two options mentioned
> > > in this thread (either improve drm scheduler to cope with what is
> > > required, or split up the code so you can use just the parts of
> > > drm_sched which you want - which is frontend dependency tracking)
> > > shouldn't be so readily dismissed, given how I think the idea was
> > > for the new driver to work less in a silo and more in the community
> > > (not do kludges to work around stuff because it is thought to be too
> > > hard to improve common code). But fundamentally, "goto previous
> > > paragraph" for what concerns me.
> > >
> > >
> > > Meta comment: It appears as if you're falling into the standard i915
> > > team trap of having an internal discussion about what the community
> > > discussion might look like instead of actually having the community
> > > discussion. If you are seriously concerned about interactions with
> > > other drivers or with setting common direction, the right way to do
> > > that is to break a patch or two out into a separate RFC series and
> > > tag a handful of driver maintainers. Trying to predict the questions
> > > other people might ask is pointless. Cc them and ask for their input
> > > instead.
> >
> > I don't follow you here. It's not an internal discussion - I am
> > raising my concerns on the design publicly. Am I supposed to write a
> > patch to show something, or am I not allowed to comment on an RFC
> > series?
> >
> >
> > I may have misread your tone a bit. It felt a bit like too many
> > discussions I've had in the past where people are trying to predict what
> > others will say instead of just asking them. Reading it again, I was
> > probably jumping to conclusions a bit. Sorry about that.
>
> Okay no problem, thanks. In any case we don't have to keep discussing
> it; as I wrote one or two emails ago, it is fundamentally on the
> maintainers and community to ack the approach. I only felt the RFC did
> not explain the potential downsides sufficiently, so I wanted to probe
> that area a bit.
>
> > It is "drm/sched: Convert drm scheduler to use a work queue rather
> than
> > kthread" which should have Cc-ed _everyone_ who use drm scheduler.
> >
> >
> > Yeah, it probably should have. I think that's mostly what I've been
> > trying to say.
> >
> > >
> > > Regards,
> > >
> > > Tvrtko
> > >
> > > P.S. And as a related side note, there are more areas where
> > > drm_sched could be improved, like for instance priority handling.
> > > Take a look at msm_submitqueue_create / msm_gpu_convert_priority /
> > > get_sched_entity to see how msm works around the drm_sched hardcoded
> > > limit of available priority levels, in order to avoid having to
> > > leave a hw capability unused. I suspect msm would be happier if they
> > > could have all priority levels equal in terms of whether they apply
> > > only at the frontend level or completely throughout the pipeline.
> > >
> > > > [1] https://patchwork.freedesktop.org/patch/515857/?series=112189&rev=1
> > > >
> > > >>> What would be interesting to learn is whether the option of
> > > >>> refactoring drm_sched to deal with out of order completion was
> > > >>> considered and what were the conclusions.
> > > >>>
> > > >>
> > > >> I coded this up a while back when trying to convert the i915 to
> > > >> the DRM scheduler; it isn't all that hard either. The free flow
> > > >> control on the ring (e.g. set job limit == SIZE OF RING / MAX JOB
> > > >> SIZE) is really what sold me on this design.
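> > > >>
> > > >> In other words (sketch only - sizes illustrative and the
> > > >> drm_sched_init() argument list abbreviated):
> > > >>
> > > >>   /* each job reserves a worst-case slot in the ring, so the
> > > >>    * scheduler's hw_submission limit doubles as ring flow control */
> > > >>   #define MAX_JOB_SIZE_DW  256
> > > >>   #define RING_SIZE_DW     (16 * 1024)
> > > >>   #define JOB_LIMIT        (RING_SIZE_DW / MAX_JOB_SIZE_DW)
> > > >>
> > > >>   drm_sched_init(&engine->sched, &xe_sched_ops, JOB_LIMIT, ...);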
> > >
> > >
> > > You're not the only one to suggest supporting out-of-order
> > > completion. However, it's tricky and breaks a lot of internal
> > > assumptions of the scheduler. It also reduces functionality a bit
> > > because it can no longer automatically rate-limit HW/FW queues which
> > > are often fixed-size. (Ok, yes, it probably could but it becomes a
> > > substantially harder problem.)
> > >
> > > It also seems like a worse mapping to me. The goal here is to turn
> > > submissions on a userspace-facing engine/queue into submissions to a
> > > FW queue, sorting out any dma_fence dependencies. Matt's description
> > > of saying this is a 1:1 mapping between sched/entity doesn't tell the
> > > whole story. It's a 1:1:1 mapping between xe_engine, gpu_scheduler,
> > > and GuC FW engine. Why make it a 1:something:1 mapping? Why is that
> > > better?
> >
> > As I have stated before, what I think would fit well for Xe is one
> > drm_scheduler per engine class. In specific terms on our current
> > hardware, one drm scheduler instance for render, compute, blitter,
> > video and video enhance. Userspace contexts remain scheduler entities.
> >
> >
> > And this is where we fairly strongly disagree. More in a bit.
> >
> > That way you avoid the whole kthread/kworker story and you have it
> > actually use the entity picking code in the scheduler, which may be
> > useful when the backend is congested.
> >
> >
> > What back-end congestion are you referring to here? Running out of FW
> > queue IDs? Something else?
>
> CT channel, number of context ids.
>
> >
> > Yes, you have to solve the out of order problem, so in my mind that is
> > something to discuss. What the problem actually is (just TDR?), how
> > tricky and why, etc.
> >
> > And yes, you lose the handy LRCA ring buffer size management, so you'd
> > have to make those entities not runnable in some other way.
> >
> > Regarding the argument you raise below - would any of that make the
> > frontend / backend separation worse, and why? Do you think it is less
> > natural? If neither is true then all that remains is that it appears
> > the extra work to support out of order completion of entities has been
> > discounted in favour of an easy but IMO inelegant option.
> >
> >
> > Broadly speaking, the kernel needs to stop thinking about GPU scheduling
> > in terms of scheduling jobs and start thinking in terms of scheduling
> > contexts/engines. There is still some need for scheduling individual
> > jobs but that is only for the purpose of delaying them as needed to
> > resolve dma_fence dependencies. Once dependencies are resolved, they
> > get shoved onto the context/engine queue and from there the kernel only
> > really manages whole contexts/engines. This is a major architectural
> > shift, entirely different from the way i915 scheduling works. It's also
> > different from the historical usage of DRM scheduler which I think is
> > why this all looks a bit funny.
> >
> > To justify this architectural shift, let's look at where we're headed.
> > In the glorious future...
> >
> > 1. Userspace submits directly to firmware queues. The kernel has no
> > visibility whatsoever into individual jobs. At most it can pause/resume
> > FW contexts as needed to handle eviction and memory management.
> >
> > 2. Because of 1, apart from handing out the FW queue IDs at the
> > beginning, the kernel can't really juggle them that much. Depending on
> > FW design, it may be able to pause a client, give its IDs to another,
> > and then resume it later when IDs free up. What it's not doing is
> > juggling IDs on a job-by-job basis like i915 currently is.
> >
> > 3. Long-running compute jobs may not complete for days. This means
> > that memory management needs to happen in terms of pause/resume of
> > entire contexts/engines using the memory rather than based on waiting
> > for individual jobs to complete or pausing individual jobs until the
> > memory is available.
> >
> > 4. Synchronization happens via userspace memory fences (UMF) and the
> > kernel is mostly unaware of most dependencies and when a context/engine
> > is or is not runnable. Instead, it keeps as many of them minimally
> > active (memory is available, even if it's in system RAM) as possible and
> > lets the FW sort out dependencies. (There may need to be some facility
> > for sleeping a context until a memory change similar to futex() or
> > poll() for userspace threads. There are some details TBD.)
> >
> > Are there potential problems that will need to be solved here? Yes. Is
> > it a good design? Well, Microsoft has been living in this future for
> > half a decade or better and it's working quite well for them. It's also
> > the way all modern game consoles work. It really is just Linux that's
> > stuck with the same old job model we've had since the monumental shift
> > to DRI2.
> >
> > To that end, one of the core goals of the Xe project was to make the
> > driver internally behave as close to the above model as possible while
> > keeping the old-school job model as a very thin layer on top. As the
> > broader ecosystem problems (window-system support for UMF, for instance)
> > are solved, that layer can be peeled back. The core driver will already
> > be ready for it.
> >
> > To that end, the point of the DRM scheduler in Xe isn't to schedule
> > jobs. It's to resolve syncobj and dma-buf implicit sync dependencies
> > and stuff jobs into their respective context/engine queue once they're
> > ready. All the actual scheduling happens in firmware and any scheduling
> > the kernel does to deal with contention, oversubscriptions, too many
> > contexts, etc. is between contexts/engines, not individual jobs. Sure,
> > the individual job visibility is nice, but if we design around it, we'll
> > never get to the glorious future.
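> >
> > In code terms, the kernel's remaining per-job role reduces to roughly
> > this (pure illustration - the xe_* helpers are made up):
> >
> >   /* By the time run_job() is called, the DRM scheduler has already
> >    * resolved the job's dma_fence dependencies; all that is left is
> >    * to put it on its engine's FW-managed ring. Scheduling *between*
> >    * contexts/engines then happens entirely in firmware. */
> >   static struct dma_fence *xe_sched_run_job(struct drm_sched_job *job)
> >   {
> >           struct xe_engine *e = to_xe_engine(job->entity);
> >
> >           return xe_ring_emit_job(e, job);
> >   }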
> >
> > I really need to turn the above (with a bit more detail) into a blog
> > post.... Maybe I'll do that this week.
> >
> > In any case, I hope that provides more insight into why Xe is designed
> > the way it is and why I'm pushing back so hard on trying to make it more
> > of a "classic" driver as far as scheduling is concerned. Are there
> > potential problems here? Yes, that's why Xe has been labeled a
> > prototype. Are such radical changes necessary to get to said glorious
> > future? Yes, I think they are. Will it be worth it? I believe so.
>
> Right, that's all solid I think. My takeaway is that frontend priority
> sorting and that stuff isn't needed and that is okay. And that there are
> multiple options to maybe improve drm scheduler, like the aforementioned
> making it deal with out of order completion, or splitting it into
> functional components, or the frontend/backend split you suggested. For
> most of them the cost vs benefit is more or less not completely clear,
> nor is it clear how much effort was invested to look into them.
>
> One thing I missed from this explanation is how drm_scheduler per engine
> class interferes with the high level concepts. And I did not manage to
> pick up on what exactly is the TDR problem in that case. Maybe the two
> are one and the same.
>
> Bottom line is I still have the concern that conversion to kworkers has
> an opportunity to regress. Possibly more opportunity for some Xe use
> cases than to affect other vendors, since they would still be using per
> physical engine / queue scheduler instances.
>
> And to put my money where my mouth is, I will try to get testing Xe
> inside the full blown ChromeOS environment into my team's plans. It
> would probably also be beneficial if the Xe team could take a look at
> real world behaviour of the extreme transcode use cases too, if the
> stack is ready for that and all. It would be better to know earlier
> rather than later if there is a fundamental issue.
>
> For the patch at hand, and the cover letter, it certainly feels it would
> benefit to record the past design discussion had with AMD folks, to
> explicitly copy other drivers, and to record the theoretical pros and
> cons of threads vs unbound workers as I have tried to highlight them.
>
> Regards,
>
> Tvrtko
>