[PATCH V5 00/11] Add disable kernel queue support

Rodrigo Siqueira siqueira at igalia.com
Tue Mar 25 17:32:20 UTC 2025


On 03/18, Alex Deucher wrote:
> On Tue, Mar 18, 2025 at 1:46 PM Rodrigo Siqueira <siqueira at igalia.com> wrote:
> >
> > On 03/13, Alex Deucher wrote:
> > > On Thu, Mar 13, 2025 at 6:21 PM Rodrigo Siqueira <siqueira at igalia.com> wrote:
> > > >
> > > > On 03/13, Alex Deucher wrote:
> > > > > To better evaluate user queues, add a module parameter
> > > > > to disable kernel queues.  With this set, kernel queues
> > > > > are disabled and only user queues are available.  This
> > > > > frees up hardware resources that would otherwise be used
> > > > > by kernel queues for use by user queues, and provides a
> > > > > way to validate user queues without the presence of
> > > > > kernel queues.
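
To make this concrete for myself, here is a sketch of how such a
tri-state module parameter is typically declared in amdgpu_drv.c; the
variable name, default, and description below are my guesses, not
necessarily what patch 1 uses:

  /* -1 = auto (default), 0 = enable KQ, 1 = disable KQ -- guessed semantics */
  int amdgpu_disable_kq = -1;
  module_param_named(disable_kq, amdgpu_disable_kq, int, 0444);
  MODULE_PARM_DESC(disable_kq,
                   "Disable kernel queues (-1 = auto, 0 = enable, 1 = disable)");
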
> > > >
> > > > Hi Alex,
> > > >
> > > > I'm trying to understand how GFX and MES deal with different queues, and
> > > > I used this patchset to guide me through that. In this sense, could you
> > > > help me with the following points?
> > > >
> > > > FWIU, the GFX has what are called pipes, which in turn have hardware
> > > > queues associated with them. For example, a GFX can have 2 pipes, and
> > > > each pipe could have 2 hardware queues; or it could have 1 pipe and 8
> > > > queues. Is this correct?
> > >
> >
> > Hi Alex, first of all, thanks a lot for your detailed explanation.
> >
> > I still have some other questions; see them inline.
> >
> > > Right.  For gfx, compute, and SDMA you have pipes (called instances on
> > > SDMA) and queues.  A pipe can only execute one queue at a time.  The
> >
> > What is the difference between GFX and Compute? Tbh, I thought they were
> > the same component.
> 
> They both share access to the shader cores, but they have different
> front ends.  GFX has a bunch of fixed function blocks used by draws
> while compute dispatches directly to the shaders.  There are separate
> pipes for each.  You can send dispatch packets to GFX, but you can't
> send draw packets to compute.
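
That clears it up. For concreteness: in the PM4 headers, draws and
dispatches are distinct type-3 opcodes, and only the dispatch ones are
meaningful on a compute pipe. Quoting two opcodes from the amdgpu
headers from memory (illustrative, worth double-checking):

  #define PACKET3_DISPATCH_DIRECT  0x15  /* accepted by GFX and compute pipes */
  #define PACKET3_DRAW_INDEX_AUTO  0x2D  /* GFX only; compute has no draw front end */
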
> 
> >
> > I was also thinking about the concept of a pipe, and I'm trying to
> > define what a pipe is in this context (the word pipe is one of those
> > words with many meanings in computing). Is the definition below accurate
> > enough?
> >
> >  Pipe, in the context of GFX, Compute, and SDMA, is a mechanism for
> >  running threads.
> 
> Yes. It's the hardware that actually processes the packets in a queue.
> You have multiple HQDs associated with a pipe; only one will be
> processed by the pipe at a time.
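
Noted. For reference, the driver names each HQD slot with a
(me, pipe, queue) triple; a sketch of walking the compute slots
(field names as in amdgpu_gfx.h, the loop shape is mine):

  /* Every (mec, pipe, queue) triple names one HQD; a pipe
   * time-slices among the HQDs mapped on it. */
  for (pipe = 0; pipe < adev->gfx.mec.num_pipe_per_mec; pipe++) {
          for (queue = 0; queue < adev->gfx.mec.num_queue_per_pipe; queue++) {
                  /* one HQD slot */
          }
  }
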
> 
> >
> > > pipe will switch between all of the mapped queues.  You have storage
> >
> > Above, you said that each pipe will switch between queues, and a little
> > bit below, in your explanation about MES, you said:
> >
> >  [..] If there are more MQDs than HQDs, the MES firmware will preempt
> >  other user queues to make sure each queue gets a time slice.
> >
> > Does it mean that the GFX pipe has the mechanics of switching queues
> > while MES has the scheduling logic?
> 
> The pipes have hardware logic to switch between the HQD slots.  MES is
> a separate microcontroller which handles the mapping and unmapping of
> MQDs into HQDs.  It handles priorities and oversubscription (more MQDs
> than HQDs).
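
If I read amdgpu_mes.c right, the driver's side of that handshake
looks roughly like the sketch below (struct and field names
paraphrased from the MES API headers -- a sketch, not the exact ABI):

  /* Ask MES to map a queue: the driver describes where the MQD
   * and write pointer live, MES programs the HQD. */
  struct mes_map_legacy_queue_input input = {
          .queue_type      = queue_type,
          .doorbell_offset = ring->doorbell_index,
          .pipe_id         = ring->pipe,
          .queue_id        = ring->queue,
          .mqd_addr        = amdgpu_bo_gpu_offset(ring->mqd_obj),
          .wptr_addr       = ring->wptr_gpu_addr,
  };
  r = adev->mes.funcs->map_legacy_queue(&adev->mes, &input);
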
> 
> >
> > Does the below example and explanation make sense?
> >
> > Suppose the following scenario:
> >  - One pipe (pipe0) and two queues (queue[0], and queue[1]).
> >  - 3 MQDs (mqd[0], mqd[1], and mqd[2]).
> >  - pipe0 is running a user queue in queue[1].
> >  - pipe0 is running a kernel queue in queue[0].
> 
> Yes.  A pipe can only execute one queue at a time; it will dynamically
> switch between the active HQDs.
> 
> >
> > Fwiu, a pipe can change the queue currently in execution, but it does
> > not do so by itself. In other words, it has no scheduling logic; it only has
> > the mechanics of switching queues inside it. When the pipe switches
> > between queues, it uses Mid Command Buffer Preemption (MCBP), which
> > saves some very basic information but has no register state; in other
> > words, those registers must be stored in memory (does MES handle this?).
> 
> More or less.  A pipe will switch between queues on a command stream
> or on a packet by packet basis, depending on the engine.  You can
> preempt a queue if you want.  In general the driver will ask MES to do
> this if it needs to preempt a queue.  The MES will also do this
> internally for scheduling reasons.  MES firmware handles the saving of
> state to the MQD.
> 
> >
> > In turn, MES has access to all MQDs handed over to it, which means that
> > MES has all the queue states available for scheduling and
> > communication with the GFX pipe. Suppose that the GFX pipe is running
> > mqd[2] in the queue[1], and now MES wants to replace it with mqd[0]. The
> > communication will be something like the following:
> >
> > 1. MES to GFX pipe0: Replace(mqd[2], in pipe0, queue[1]) with mqd[0].
> > 2. GFX pipe0: Just stop the current pipe, and start mqd[0].
> >
> > Does it look correct to you?
> 
> MES would talk to the hardware to unmap queue[1] and save its state to
> mqd[2].  It would then talk to the hardware to map the state from
> mqd[0] into queue[1].
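
Got it. So in pseudo-C the replacement sequence is roughly (the
function names below are mine, not the firmware interface):

  /* Hypothetical sketch: replace mqd[2] with mqd[0] on queue[1]. */
  mes_unmap_queue(pipe0, queue1);    /* queue drains, HQD_ACTIVE -> 0  */
  save_hqd_state_to(mqd[2]);         /* firmware writes regs to MQD    */
  load_hqd_state_from(mqd[0]);       /* restore saved regs into HQD    */
  mes_map_queue(pipe0, queue1);      /* HQD_ACTIVE -> 1, queue runs    */
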
> 
> >
> > > in memory (called an MQD -- Memory Queue Descriptor) which defines the
> > > state of the queue (GPU virtual addresses of the queue itself, save
> > > areas, doorbell, etc.).  The queues that the pipe switches between are
> > > defined by HQDs (Hardware Queue Descriptors).  These are basically
> > > register based memory for the queues that the pipe can switch between.
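
For anyone following along, the MQD layout lives in headers like
v11_structs.h; an abridged excerpt (field names quoted from memory,
so treat as illustrative):

  struct v11_gfx_mqd {
          /* ... */
          uint32_t cp_mqd_base_addr;    /* where this MQD lives in memory */
          uint32_t cp_gfx_hqd_rptr;     /* saved ring read pointer        */
          uint32_t cp_gfx_hqd_active;   /* mirrors the HQD_ACTIVE state   */
          /* ... */
  };
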
> >
> > I was thinking about this register-based memory part. Does it mean that
> > switching between them is just a matter of updating one of those LOW and
> > HIGH registers?
> 
> Not exactly, but close.  The HQD registers are saved in/out of the MQD
> and the MQD also has pointers to other buffers which store other
> things like pipeline state, etc.  Firmware basically tells the hw to
> preempt or unmap the queues, waits for that to complete (waits for
> HQD_ACTIVE bit for the queue to go low), then saves the state to the
> MQD.  For resuming or mapping a queue, the opposite happens, firmware
> copies the state out of the MQD into the HQD registers and loads any
> additional state.  Setting the HQD_ACTIVE bit for the queue is what
> ultimately enables it.
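
That matches the pattern the driver itself uses when it has to tear a
queue down by hand; roughly (register name as in the gfx code, loop
shape illustrative):

  /* Wait for the queue to go idle before its state is saved. */
  for (i = 0; i < adev->usec_timeout; i++) {
          if (!(RREG32_SOC15(GC, 0, mmCP_HQD_ACTIVE) & 1))
                  break;          /* HQD_ACTIVE low: safe to save state */
          udelay(1);
  }
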
> 
> >
> > > The driver sets up an MQD for each queue that it creates.  The MQDs
> > > are then handed to the MES firmware for mapping.  The MES firmware can
> > > map a queue as a legacy queue (i.e. a kernel queue) or a user queue.
> > > The difference is that a legacy queue is statically mapped to a HQD
> > > and is never preempted.  User queues are dynamically mapped to the
> > > HQDs by the MES firmware.  If there are more MQDs than HQDs, the MES
> > > firmware will preempt other user queues to make sure each queue gets a
> > > time slice.
> > >
> > > >
> > > > (for this next part, suppose 1 pipe 2 hardware queues)
> > > > By default, one of the hardware queues is reserved for the Kernel Queue,
> > > > and user space could use the other. GFX has the MES block "connected"
> > > > to all pipe queues, and MES is responsible for scheduling different ring
> > > > buffers (in memory) in the pipe's hardware queue (effectively making the
> > > > ring active). However, since the kernel queue is always present, MES
> > > > only performs scheduling in one of the hardware queues. This scheduling
> > > > occurs by the MES mapping and unmapping the rings available in memory
> > > > to the hardware queue.
> > > >
> > > > Does the above description sound correct to you?  How about the below
> > > > diagram? Does it look correct to you?
> > >
> > > More or less.  The MES handles all of the queues (kernel or user).
> > > The only real difference is that kernel queues are statically mapped
> > > to an HQD while user queues are dynamically scheduled in the available
> > > HQDs based on level of over-subscription.  E.g., if you have hardware
> > > with 1 pipe and 2 HQDs you could have a kernel queue on 1 HQD and the
> > > MES would schedule all of the user queues on the remaining 1 HQD.  If
> > > you don't enable any kernel queues, then you have 2 HQDs that the MES
> > > can use for scheduling user queues.
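
I assume this is what the "centralize gfx_hqd mask management" and
"update hqd masks" patches adjust; mes_v11_0.c reports the usable
slots to MES with something like the below (values illustrative, and
the disable_kq flag is my placeholder):

  /* Bit i set = HQD i on this pipe is available to MES.
   * 0xfffffffe holds queue 0 back for the kernel queue; with
   * disable_kq=1 that bit can be handed to MES as well. */
  mes_set_hw_res_pkt.gfx_hqd_mask[pipe] =
          adev->gfx.disable_kq ? 0xffffffff : 0xfffffffe;
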
> > >
> > > >
> > > > (I hope the diagram looks fine in your email client; if not, I can
> > > > attach a picture of it.)
> > > >
> > > > +-------------------------------------------------------------------------------------------------------------------------------------------+
> > > > |                                                           GFX                                                                             |
> > > > |                                                                                                                                           |
> > > > |                                                                               +-----------------------------+                             |
> > > > |           +---------------------------------------------+ (Hw Queue 0)        | Kernel Queue (No eviction)  +------- No MES Scheduling    |
> > > > |           |        (Hardware Queue 0)                   | ------------------->|                             |               |             |
> > > > |PIPE 0     |   -------------------------------------     |                     +-----------------------------+               X             |
> > > > |           |        (Hardware Queue 1)                   |                                                        +----------+---------+   |
> > > > |           |   -------------------------------------     |--+                                                     |                    |   |
> > > > |           |                                             |  |                  +----------------------------+     |                    |   |
> > > > |           +---------------------------------------------+  | (Hw Queue 1)     |                            |     |   MES Schedules    |   |
> > > > |                                                            +----------------> |  User Queue                +-----+                    |   |
> > > > |                                                                               |                            |     |                    |   |
> > > > |                                                                               +----------------------------+     |                    |   |
> > > > |                                                                                                                  +--------------------+   |
> > > > |                                                                                                                            |              |
> > > > |                                                                                      +-------------------------------------+              |
> > > > |                                                                                      |Un/Map Ring                                         |
> > > > |                                                                                      |                                                    |
> > > > +-------------------------------------------------------------------------------------------------------------------------------------------+
> > > >                                                                                        |
> > > >                                                                  +---------------------+--------------------------------------------+
> > > >                                                                  |   MEMORY            v                                            |
> > > >                                                                  |                                                                  |
> > > >                                                                  |                                                                  |
> > > >                                                                  |  +----------+                                                    |
> > > >                                                                  |  |          |  +---------+         +--------+                    |
> > > >                                                                  |  |    Ring 0|  | Ring 1  |  ...    | Ring N |                    |
> > > >                                                                  |  |          |  |         |         |        |                    |
> > > >                                                                  |  +----------+  +---------+         +--------+                    |
> > > >                                                                  |                                                                  |
> > > >                                                                  |                                                                  |
> > > >                                                                  +------------------------------------------------------------------+
> > > >
> > > > Is the idea in this series to experiment with making the kernel queue
> > > > not fully occupy one of the hardware queues? By making the kernel queue
> > > > able to be scheduled, this would provide one extra queue to be used for
> > > > other things. Is this correct?
> > >
> > > Right.  This series paves the way for getting rid of kernel queues all
> > > together.  Having no kernel queues leaves all of the resources
> > > available to user queues.
> >
> > Another question: I guess kernel queues use VMID 0, and all of the other
> > user queues will use a different VMID, right? Does the VMID matter for
> > this transition to make the kernel queue legacy?
> 
> vmid 0 is the GPU virtual address space used for all kernel driver
> operations.  For kernel queues, the queue itself operates in the vmid
> 0 address space, but each command buffer (Indirect Buffer -- IB)
> operates in a driver assigned non-0 vmid address space.  For kernel
> queues, the driver manages the vmids.  For user queues, the queue and
> IBs both operate in the user's non-0 vmid address space.  The MES
> manages the vmid assignments for user queues.  The driver provides a
> pointer to the user's GPU VM page tables and MES assigns a vmid when
> it maps the queue.  The driver provides a mask of which vmids the MES
> can use so that there are no conflicts when mixing kernel and user queues.
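
And I assume the "make more vmids available when disable_kq=1" patch
adjusts exactly this mask; its shape in amdgpu_mes.c is something like
the below (values illustrative):

  /* Each set bit is a vmid MES may assign to a user queue; the
   * low vmids stay reserved for kernel-driver use.  With
   * disable_kq=1, fewer bits need to be held back. */
  adev->mes.vmid_mask_gfxhub = 0xffffff00;  /* vmids 8+ go to MES */
  adev->mes.vmid_mask_mmhub  = 0xffffff00;
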
> 
> Alex

Hi Alex,

Thanks a lot for all the detailed explanations and patience. I tried to
condense all the knowledge that you shared here and in other places in a
patchset available at:

https://lore.kernel.org/amd-gfx/20250325172623.225901-1-siqueira@igalia.com/T/#t

Thanks again!

> 
> >
> > Thanks
> >
> > >
> > > >
> > > > I'm unsure if I fully understand this series's idea; please correct me
> > > > if I'm wrong.
> > > >
> > > > Also, please elaborate more on the type of tasks that the kernel queue
> > > > handles. Tbh, I did not fully understand the idea behind it.
> > >
> > > In the future of user queues, kernel queues would not be created or
> > > used at all.  Today, on most existing hardware, kernel queues are all
> > > that is available.  Today, when an application submits work to the
> > > kernel driver, the kernel driver submits all of the application
> > > command buffers to kernel queues.  E.g., in most cases there is a
> > > single kernel GFX queue and all applications which want to use the GFX
> > > engine funnel into that queue.  The CS IOCTL basically takes the
> > > command buffers from the applications and schedules them on the kernel
> > > queue.  With user queues, each application will create its own user
> > > queues and will submit work directly to its user queues.  No need for
> > > an IOCTL for each submission, no need to share a single kernel queue,
> > > etc.
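
So the userspace fast path reduces to a doorbell write; conceptually
(pseudocode, not a real UAPI):

  /* Conceptual user-queue submit: no ioctl per submission.
   * The app appends packets to its own ring, publishes the new
   * write pointer, and rings its doorbell. */
  memcpy(ring_buf + wptr_off, packets, size);  /* queue the work    */
  *wptr_cpu_addr = new_wptr;                   /* publish new wptr  */
  *doorbell      = new_wptr;                   /* kick the hardware */
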
> > >
> > > Alex
> > >
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > v2: use num_gfx_rings and num_compute_rings per
> > > > >     Felix suggestion
> > > > > v3: include num_gfx_rings fix in amdgpu_gfx.c
> > > > > v4: additional fixes
> > > > > v5: MEC EOP interrupt handling fix (Sunil)
> > > > >
> > > > > Alex Deucher (11):
> > > > >   drm/amdgpu: add parameter to disable kernel queues
> > > > >   drm/amdgpu: add ring flag for no user submissions
> > > > >   drm/amdgpu/gfx: add generic handling for disable_kq
> > > > >   drm/amdgpu/mes: centralize gfx_hqd mask management
> > > > >   drm/amdgpu/mes: update hqd masks when disable_kq is set
> > > > >   drm/amdgpu/mes: make more vmids available when disable_kq=1
> > > > >   drm/amdgpu/gfx11: add support for disable_kq
> > > > >   drm/amdgpu/gfx12: add support for disable_kq
> > > > >   drm/amdgpu/sdma: add flag for tracking disable_kq
> > > > >   drm/amdgpu/sdma6: add support for disable_kq
> > > > >   drm/amdgpu/sdma7: add support for disable_kq
> > > > >
> > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu.h      |   1 +
> > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |   4 +
> > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |   9 ++
> > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c  |   8 +-
> > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  |   2 +
> > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c  |  30 ++--
> > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c  |  26 ++-
> > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |   2 +-
> > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h |   1 +
> > > > >  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c   | 191 ++++++++++++++++-------
> > > > >  drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c   | 183 +++++++++++++++-------
> > > > >  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c   |   2 +-
> > > > >  drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c   |   2 +-
> > > > >  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c   |  16 +-
> > > > >  drivers/gpu/drm/amd/amdgpu/mes_v12_0.c   |  15 +-
> > > > >  drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c   |   4 +
> > > > >  drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c   |   4 +
> > > > >  17 files changed, 345 insertions(+), 155 deletions(-)
> > > > >
> > > > > --
> > > > > 2.48.1
> > > > >
> > > >
> > > > --
> > > > Rodrigo Siqueira
> >
> > --
> > Rodrigo Siqueira

-- 
Rodrigo Siqueira

