[PATCH 0/8] AMDGPU usermode queues

Dave Airlie airlied at gmail.com
Mon Feb 6 00:52:26 UTC 2023


On Sat, 4 Feb 2023 at 07:54, Shashank Sharma <shashank.sharma at amd.com> wrote:
>
> From: Shashank Sharma <contactshashanksharma at gmail.com>
>
> This patch series introduces AMDGPU usermode graphics queues.
> User queues are a method of GPU workload submission into the graphics
> hardware without any interaction with kernel/DRM schedulers. In this
> method, a userspace graphics application can create its own workqueue
> and submit it directly to the GPU HW.
>
> The general idea of how this is supposed to work:
> - The application creates the following GPU objects:
>   - A queue object to hold the workload packets.
>   - A read pointer object.
>   - A write pointer object.
>   - A doorbell page.
> - Kernel picks any 32-bit offset in the doorbell page for this queue.
> - The application uses the usermode_queue_create IOCTL introduced in
>   this patch, passing the GPU addresses of these objects (read
>   ptr, write ptr, queue base address and doorbell address).
> - The kernel creates the queue and maps it in the HW.
> - The application can start submitting the data in the queue as soon as
>   the kernel IOCTL returns.
> - Once the data is filled in the queue, the app must write the number of
>   dwords in the doorbell offset, and the GPU will start fetching the data.
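
For readers following along, the submission step in the list above can be sketched roughly as below. This is only an illustration under stated assumptions: `struct user_queue`, `uq_submit` and the field names are hypothetical and not taken from the patch series, the pointers are plain memory rather than real GPU mappings, and the doorbell value follows the cover letter's wording (the number of dwords written).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace view of one user queue. In a real driver these
 * pointers would be CPU mappings of GPU objects; here they are plain
 * memory so the sketch can be compiled and run anywhere. */
struct user_queue {
    uint32_t *ring;               /* queue object holding the packets */
    uint32_t ring_dwords;         /* ring size in dwords */
    volatile uint64_t *rptr;      /* read pointer object (advanced by the GPU) */
    volatile uint64_t *wptr;      /* write pointer object (advanced by the app) */
    volatile uint32_t *doorbell;  /* 32-bit doorbell slot picked by the kernel */
};

/* Copy n dwords into the ring, advance the write pointer and ring the
 * doorbell. Returns 0 on success, -1 if the ring lacks free space. */
static int uq_submit(struct user_queue *q, const uint32_t *pkt, uint32_t n)
{
    uint64_t w = *q->wptr;
    uint64_t used = w - *q->rptr;

    assert(used <= q->ring_dwords);
    if (n > q->ring_dwords - used)
        return -1;

    for (uint32_t i = 0; i < n; i++)
        q->ring[(w + i) % q->ring_dwords] = pkt[i];

    *q->wptr = w + n;
    /* Real hardware would need a memory barrier here so the packet
     * contents are visible before the doorbell write (assumption). */
    *q->doorbell = n;  /* assumed: "number of dwords", per the cover letter */
    return 0;
}
```

Note that nothing in this path enters the kernel, which is why the question below about losing the queue mapping matters.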

So I just have one question about forward progress here, let's call it
the 51% of VRAM problem.

You have two apps, and both have working sets that allocate > 51% of VRAM.

Application (a) has the VRAM and mapping for the user queues and is
submitting work.
Application (b) wants to submit work, but it has no queue mapping as it
was previously evicted. Does (b) have to call an ioctl to get its
mapping back?
When (b) calls the ioctl, (a) loses its mapping. Control returns to (b),
but before it submits any work on the ring mapping it has, (a) gets
control and notices it has no queues, so it calls the ioctl, and (b)
loses its mapping, and around and around they go, never making forward
progress.
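
To make the ping-pong concrete, here is a toy model of it. Everything in it is an assumption for illustration: `remap_ioctl` is a hypothetical stand-in for whatever call restores a queue mapping, VRAM is counted in percent, and the scheduler is the worst case where the other app always runs right after an ioctl returns.

```c
#include <stdbool.h>

#define VRAM_TOTAL 100  /* VRAM size in percent */
#define NUM_APPS   2

struct app {
    int  working_set;  /* percent of VRAM the app needs resident */
    bool mapped;       /* is its queue mapping currently resident? */
    int  submitted;    /* batches of work that reached the GPU */
};

/* Hypothetical "give me my mapping back" ioctl: evict other apps until
 * the caller's working set fits, then map the caller. */
static void remap_ioctl(struct app *apps, int caller)
{
    int used = 0;

    for (int i = 0; i < NUM_APPS; i++)
        if (i != caller && apps[i].mapped)
            used += apps[i].working_set;

    for (int i = 0; i < NUM_APPS; i++) {
        if (used + apps[caller].working_set <= VRAM_TOTAL)
            break;
        if (i != caller && apps[i].mapped) {
            apps[i].mapped = false;
            used -= apps[i].working_set;
        }
    }
    apps[caller].mapped = true;
}

/* Each turn an app either submits one batch (if mapped) or calls the
 * ioctl and is preempted as soon as it returns. Returns the total
 * number of batches submitted by both apps. */
static int simulate_pingpong(int working_set, int rounds)
{
    struct app apps[NUM_APPS] = {
        { working_set, true,  0 },  /* (a): currently owns the VRAM */
        { working_set, false, 0 },  /* (b): previously evicted */
    };
    int current = 1;  /* (b) runs first */

    for (int r = 0; r < rounds; r++) {
        struct app *me = &apps[current];

        if (me->mapped)
            me->submitted++;
        else
            remap_ioctl(apps, current);
        current = 1 - current;  /* the other app gets control */
    }
    return apps[0].submitted + apps[1].submitted;
}
```

In this model, `simulate_pingpong(51, 1000)` submits nothing, because each app finds itself evicted every time it runs, while `simulate_pingpong(50, 1000)` makes progress since both working sets fit at once.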

What's the exit strategy for something like that? Fall back to kernel
submit so you can get memory objects validated and submit some work?

Dave.

