<div dir="ltr"><div>Userspace needs a way to query whether a queue IP type is supported. "available_rings" is used for that right now, but if that's 0, something else must indicate IP support.</div><div><br></div><div>amd_ip_info::num_queues should be non-zero even when user queues are supported. The exact number doesn't matter with user queues.</div><div><br></div><div>Marek</div></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Mon, Mar 17, 2025 at 3:09 AM Liang, Prike <<a href="mailto:Prike.Liang@amd.com">Prike.Liang@amd.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">[Public]<br>
<br>
We might still need to export each ring count correctly; otherwise, the Mesa driver will conclude that the driver supports no available rings and then assert before submitting the user queue.<br>
<br>
If we want to keep the reported ring count at zero, the Mesa driver may need an accompanying change to allow command submission with a zero ring count when the user queue is enabled.<br>
<br>
Hi @Olsak, Marek, do you think it's fine to have the attached patch for the userq support? Apart from such changes, maybe we also need to clean up the IB-related part.<br>
<br>
Regards,<br>
      Prike<br>
<br>
> -----Original Message-----<br>
> From: amd-gfx <<a href="mailto:amd-gfx-bounces@lists.freedesktop.org" target="_blank">amd-gfx-bounces@lists.freedesktop.org</a>> On Behalf Of Alex<br>
> Deucher<br>
> Sent: Thursday, March 13, 2025 10:41 PM<br>
> To: <a href="mailto:amd-gfx@lists.freedesktop.org" target="_blank">amd-gfx@lists.freedesktop.org</a><br>
> Cc: Deucher, Alexander <<a href="mailto:Alexander.Deucher@amd.com" target="_blank">Alexander.Deucher@amd.com</a>>; Khatri, Sunil<br>
> <<a href="mailto:Sunil.Khatri@amd.com" target="_blank">Sunil.Khatri@amd.com</a>><br>
> Subject: [PATCH 02/11] drm/amdgpu: add ring flag for no user submissions<br>
><br>
> This would be set by IPs which only accept submissions from the kernel, not<br>
> userspace, such as when kernel queues are disabled. Don't expose the rings to<br>
> userspace and reject any submissions in the CS IOCTL.<br>
><br>
> Reviewed-by: Sunil Khatri <<a href="mailto:sunil.khatri@amd.com" target="_blank">sunil.khatri@amd.com</a>><br>
> Signed-off-by: Alex Deucher <<a href="mailto:alexander.deucher@amd.com" target="_blank">alexander.deucher@amd.com</a>><br>
> ---<br>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  4 ++++<br>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c  | 30 ++++++++++++++++--------<br>
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  2 +-<br>
>  3 files changed, 25 insertions(+), 11 deletions(-)<br>
><br>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c<br>
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c<br>
> index 5df21529b3b13..5cc18034b75df 100644<br>
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c<br>
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c<br>
> @@ -349,6 +349,10 @@ static int amdgpu_cs_p2_ib(struct amdgpu_cs_parser<br>
> *p,<br>
>       ring = amdgpu_job_ring(job);<br>
>       ib = &job->ibs[job->num_ibs++];<br>
><br>
> +     /* submissions to kernel queues are disabled */<br>
> +     if (ring->no_user_submission)<br>
> +             return -EINVAL;<br>
> +<br>
>       /* MM engine doesn't support user fences */<br>
>       if (p->uf_bo && ring->funcs->no_user_fence)<br>
>               return -EINVAL;<br>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c<br>
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c<br>
> index cd6eb7a3bc58a..3b7dfd56ccd0e 100644<br>
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c<br>
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c<br>
> @@ -408,7 +408,8 @@ static int amdgpu_hw_ip_info(struct amdgpu_device<br>
> *adev,<br>
>       case AMDGPU_HW_IP_GFX:<br>
>               type = AMD_IP_BLOCK_TYPE_GFX;<br>
>               for (i = 0; i < adev->gfx.num_gfx_rings; i++)<br>
> -                     if (adev->gfx.gfx_ring[i].sched.ready)<br>
> +                     if (adev->gfx.gfx_ring[i].sched.ready &&<br>
> +                         !adev->gfx.gfx_ring[i].no_user_submission)<br>
>                               ++num_rings;<br>
>               ib_start_alignment = 32;<br>
>               ib_size_alignment = 32;<br>
> @@ -416,7 +417,8 @@ static int amdgpu_hw_ip_info(struct amdgpu_device<br>
> *adev,<br>
>       case AMDGPU_HW_IP_COMPUTE:<br>
>               type = AMD_IP_BLOCK_TYPE_GFX;<br>
>               for (i = 0; i < adev->gfx.num_compute_rings; i++)<br>
> -                     if (adev->gfx.compute_ring[i].sched.ready)<br>
> +                     if (adev->gfx.compute_ring[i].sched.ready &&<br>
> +                         !adev->gfx.compute_ring[i].no_user_submission)<br>
>                               ++num_rings;<br>
>               ib_start_alignment = 32;<br>
>               ib_size_alignment = 32;<br>
> @@ -424,7 +426,8 @@ static int amdgpu_hw_ip_info(struct amdgpu_device<br>
> *adev,<br>
>       case AMDGPU_HW_IP_DMA:<br>
>               type = AMD_IP_BLOCK_TYPE_SDMA;<br>
>               for (i = 0; i < adev->sdma.num_instances; i++)<br>
> -                     if (adev->sdma.instance[i].ring.sched.ready)<br>
> +                     if (adev->sdma.instance[i].ring.sched.ready &&<br>
> +                         !adev->sdma.instance[i].ring.no_user_submission)<br>
>                               ++num_rings;<br>
>               ib_start_alignment = 256;<br>
>               ib_size_alignment = 4;<br>
> @@ -435,7 +438,8 @@ static int amdgpu_hw_ip_info(struct amdgpu_device<br>
> *adev,<br>
>                       if (adev->uvd.harvest_config & (1 << i))<br>
>                               continue;<br>
><br>
> -                     if (adev->uvd.inst[i].ring.sched.ready)<br>
> +                     if (adev->uvd.inst[i].ring.sched.ready &&<br>
> +                         !adev->uvd.inst[i].ring.no_user_submission)<br>
>                               ++num_rings;<br>
>               }<br>
>               ib_start_alignment = 256;<br>
> @@ -444,7 +448,8 @@ static int amdgpu_hw_ip_info(struct amdgpu_device<br>
> *adev,<br>
>       case AMDGPU_HW_IP_VCE:<br>
>               type = AMD_IP_BLOCK_TYPE_VCE;<br>
>               for (i = 0; i < adev->vce.num_rings; i++)<br>
> -                     if (adev->vce.ring[i].sched.ready)<br>
> +                     if (adev->vce.ring[i].sched.ready &&<br>
> +                         !adev->vce.ring[i].no_user_submission)<br>
>                               ++num_rings;<br>
>               ib_start_alignment = 256;<br>
>               ib_size_alignment = 4;<br>
> @@ -456,7 +461,8 @@ static int amdgpu_hw_ip_info(struct amdgpu_device<br>
> *adev,<br>
>                               continue;<br>
><br>
>                       for (j = 0; j < adev->uvd.num_enc_rings; j++)<br>
> -                             if (adev->uvd.inst[i].ring_enc[j].sched.ready)<br>
> +                             if (adev->uvd.inst[i].ring_enc[j].sched.ready &&<br>
> +                                 !adev->uvd.inst[i].ring_enc[j].no_user_submission)<br>
>                                       ++num_rings;<br>
>               }<br>
>               ib_start_alignment = 256;<br>
> @@ -468,7 +474,8 @@ static int amdgpu_hw_ip_info(struct amdgpu_device<br>
> *adev,<br>
>                       if (adev->vcn.harvest_config & (1 << i))<br>
>                               continue;<br>
><br>
> -                     if (adev->vcn.inst[i].ring_dec.sched.ready)<br>
> +                     if (adev->vcn.inst[i].ring_dec.sched.ready &&<br>
> +                         !adev->vcn.inst[i].ring_dec.no_user_submission)<br>
>                               ++num_rings;<br>
>               }<br>
>               ib_start_alignment = 256;<br>
> @@ -481,7 +488,8 @@ static int amdgpu_hw_ip_info(struct amdgpu_device<br>
> *adev,<br>
>                               continue;<br>
><br>
>                       for (j = 0; j < adev->vcn.inst[i].num_enc_rings; j++)<br>
> -                             if (adev->vcn.inst[i].ring_enc[j].sched.ready)<br>
> +                             if (adev->vcn.inst[i].ring_enc[j].sched.ready &&<br>
> +                                 !adev->vcn.inst[i].ring_enc[j].no_user_submission)<br>
>                                       ++num_rings;<br>
>               }<br>
>               ib_start_alignment = 256;<br>
> @@ -496,7 +504,8 @@ static int amdgpu_hw_ip_info(struct amdgpu_device<br>
> *adev,<br>
>                               continue;<br>
><br>
>                       for (j = 0; j < adev->jpeg.num_jpeg_rings; j++)<br>
> -                             if (adev->jpeg.inst[i].ring_dec[j].sched.ready)<br>
> +                             if (adev->jpeg.inst[i].ring_dec[j].sched.ready &&<br>
> +                                 !adev->jpeg.inst[i].ring_dec[j].no_user_submission)<br>
>                                       ++num_rings;<br>
>               }<br>
>               ib_start_alignment = 256;<br>
> @@ -504,7 +513,8 @@ static int amdgpu_hw_ip_info(struct amdgpu_device<br>
> *adev,<br>
>               break;<br>
>       case AMDGPU_HW_IP_VPE:<br>
>               type = AMD_IP_BLOCK_TYPE_VPE;<br>
> -             if (adev->vpe.ring.sched.ready)<br>
> +             if (adev->vpe.ring.sched.ready &&<br>
> +                 !adev->vpe.ring.no_user_submission)<br>
>                       ++num_rings;<br>
>               ib_start_alignment = 256;<br>
>               ib_size_alignment = 4;<br>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h<br>
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h<br>
> index b4fd1e17205e9..4a97afcb38b78 100644<br>
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h<br>
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h<br>
> @@ -297,6 +297,7 @@ struct amdgpu_ring {<br>
>       struct dma_fence        *vmid_wait;<br>
>       bool                    has_compute_vm_bug;<br>
>       bool                    no_scheduler;<br>
> +     bool                    no_user_submission;<br>
>       int                     hw_prio;<br>
>       unsigned                num_hw_submission;<br>
>       atomic_t                *sched_score;<br>
> @@ -310,7 +311,6 @@ struct amdgpu_ring {<br>
>       unsigned int    entry_index;<br>
>       /* store the cached rptr to restore after reset */<br>
>       uint64_t cached_rptr;<br>
> -<br>
>  };<br>
><br>
>  #define amdgpu_ring_parse_cs(r, p, job, ib) ((r)->funcs->parse_cs((p), (job), (ib)))<br>
> --<br>
> 2.48.1<br>
<br>
</blockquote></div>
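The userspace-side check Marek describes can be sketched as follows. This is a minimal editorial sketch, not Mesa's actual code: the struct below is a simplified stand-in combining the kernel's available_rings report with a Mesa-style amd_ip_info::num_queues field, and the function name is hypothetical.

```c
/* Sketch of the userspace support check discussed above: an IP type counts
 * as supported either when the kernel exposes rings for it
 * (available_rings != 0), or when user queues are enabled and the kernel
 * still advertises the IP through a non-zero queue count. The struct layout
 * is a simplified stand-in, not Mesa's actual amd_ip_info definition. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct amd_ip_info {
    uint32_t available_rings; /* rings accepting kernel-queue CS submissions */
    uint32_t num_queues;      /* must stay non-zero even with user queues */
};

bool ip_type_supported(const struct amd_ip_info *ip, bool user_queues_enabled)
{
    if (ip->available_rings)
        return true; /* kernel queues exposed: classic CS path works */

    /* No rings exposed: only user-queue submission can work, and the
     * kernel must still indicate IP support via a non-zero queue count. */
    return user_queues_enabled && ip->num_queues > 0;
}
```

With this shape, the kernel can keep available_rings at zero for rings flagged no_user_submission without Mesa mistaking the IP for absent, which is the point of keeping num_queues non-zero.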