[PATCH v3 7/9] drm/amdgpu: map usermode queue into MES

Tue Apr 4 20:58:17 UTC 2023

On 2023-04-04 12:36, Shashank Sharma wrote:
> 
> On 04/04/2023 18:30, Luben Tuikov wrote:
>> On 2023-03-29 12:04, Shashank Sharma wrote:
>>> From: Shashank Sharma <contactshashanksharma at gmail.com>
>>>
>>> This patch adds new functions to map/unmap a usermode queue into
>>> the FW, using the MES ring. As soon as this mapping is done, the
>>> queue would  be considered ready to accept the workload.
>>>
>>> V1: Addressed review comments from Alex on the RFC patch series
>>>      - Map/Unmap should be IP specific.
>>> V2:
>>>      Addressed review comments from Christian:
>>>      - Fix the wptr_mc_addr calculation (moved into another patch)
>>>      Addressed review comments from Alex:
>>>      - Do not add fptrs for map/unmap
>>>
>>> V3: Integration with doorbell manager
>>>
>>> Cc: Alex Deucher <alexander.deucher at amd.com>
>>> Cc: Christian Koenig <christian.koenig at amd.com>
>>> Signed-off-by: Shashank Sharma <shashank.sharma at amd.com>
>>> ---
>> Just add all your Cc right here, and let git-send-email figure it out.
>> Between the Cc tags and the SMTP CC list, Felix is the only one missing.
> 
> No, that's not how it is.
> 
> You keep people cc'ed in the cover letter so that they get informed 
> every time this patch is pushed/used on any opensource branch.

The cover letter is optional, and you can add Cc tags
into the cover letter and then git-send-email would extract those
(any and all) tags from the cover letter too.

When you pick and choose whom to add in the Cc tags, and whom to
add to the SMTP CC list, it creates division.

> People who are added manually in cc are required for this code review 
> session.

No such rule exists. It is best to add all the Cc into the commit message,
so that it is preserved in Git history.

For instance, I just randomly did "git log drivers/gpu/drm/*.[hc]" in
amd-staging-drm-next, and this is the first commit which came up,

commit 097ee58f3ddf7d
Author: Harry Wentland <harry.wentland at amd.com>
Date:   Fri Jan 13 11:24:09 2023 -0500

    drm/connector: print max_requested_bpc in state debugfs

    This is useful to understand the bpc defaults and
    support of a driver.

    Signed-off-by: Harry Wentland <harry.wentland at amd.com>
    Cc: Pekka Paalanen <ppaalanen at gmail.com>
    Cc: Sebastian Wick <sebastian.wick at redhat.com>
    Cc: Vitaly.Prosyak at amd.com
    Cc: Uma Shankar <uma.shankar at intel.com>
    Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
    Cc: Joshua Ashton <joshua at froggi.es>
    Cc: Jani Nikula <jani.nikula at linux.intel.com>
    Cc: dri-devel at lists.freedesktop.org
    Cc: amd-gfx at lists.freedesktop.org
    Reviewed-By: Joshua Ashton <joshua at froggi.es>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230113162428.33874-3-harry.wentland@amd.com
    Signed-off-by: Alex Deucher <alexander.deucher at amd.com>

As you can see the whole Cc list and the MLs are part of the Cc tags.
And the rest of the commits are also good examples of how to do it.
(Don't worry about the Link tag--it is automatically added by tools
maintainers use, although some use Lore.)
This preserves things in Git history, and it's a good thing if we need
to data mine and brainstorm later on on patches, design, and so on.

A good tool to use is "scripts/get_maintainer.pl" which works
on a file, directory and even patch files.

I usually include everyone get_maintainer.pl prints, and on subsequent patch
revisions, also people who have previously commented on the patchset, as they
might be interested to follow up. It's a good thing to do.

Here are a couple of resources, in addition to DRM commits in the tree,
https://www.kernel.org/doc/html/v4.12/process/5.Posting.html#patch-formatting-and-changelogs
https://www.kernel.org/doc/html/v4.12/process/submitting-patches.html#the-canonical-patch-format
(The second link includes links to more resources on good patch writing.)

Hope this helps.

Regards,
Luben

> 
> - Shashank
> 
>> Regards,
>> Luben
>>
>>>   .../drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c | 70 +++++++++++++++++++
>>>   1 file changed, 70 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c
>>> index 39e90ea32fcb..1627641a4a4e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c
>>> @@ -23,12 +23,73 @@
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_userqueue.h"
>>>   #include "v11_structs.h"
>>> +#include "amdgpu_mes.h"
>>>   
>>>   #define AMDGPU_USERQ_PROC_CTX_SZ PAGE_SIZE
>>>   #define AMDGPU_USERQ_GANG_CTX_SZ PAGE_SIZE
>>>   #define AMDGPU_USERQ_FW_CTX_SZ PAGE_SIZE
>>>   #define AMDGPU_USERQ_GDS_CTX_SZ PAGE_SIZE
>>>   
>>> +static int
>>> +amdgpu_userq_gfx_v11_map(struct amdgpu_userq_mgr *uq_mgr,
>>> +                         struct amdgpu_usermode_queue *queue)
>>> +{
>>> +    struct amdgpu_device *adev = uq_mgr->adev;
>>> +    struct mes_add_queue_input queue_input;
>>> +    int r;
>>> +
>>> +    memset(&queue_input, 0x0, sizeof(struct mes_add_queue_input));
>>> +
>>> +    queue_input.process_va_start = 0;
>>> +    queue_input.process_va_end = (adev->vm_manager.max_pfn - 1) << AMDGPU_GPU_PAGE_SHIFT;
>>> +    queue_input.process_quantum = 100000; /* 10ms */
>>> +    queue_input.gang_quantum = 10000; /* 1ms */
>>> +    queue_input.paging = false;
>>> +
>>> +    queue_input.gang_context_addr = queue->gang_ctx_gpu_addr;
>>> +    queue_input.process_context_addr = queue->proc_ctx_gpu_addr;
>>> +    queue_input.inprocess_gang_priority = AMDGPU_MES_PRIORITY_LEVEL_NORMAL;
>>> +    queue_input.gang_global_priority_level = AMDGPU_MES_PRIORITY_LEVEL_NORMAL;
>>> +
>>> +    queue_input.process_id = queue->vm->pasid;
>>> +    queue_input.queue_type = queue->queue_type;
>>> +    queue_input.mqd_addr = queue->mqd.gpu_addr;
>>> +    queue_input.wptr_addr = queue->userq_prop.wptr_gpu_addr;
>>> +    queue_input.queue_size = queue->userq_prop.queue_size >> 2;
>>> +    queue_input.doorbell_offset = queue->userq_prop.doorbell_index;
>>> +    queue_input.page_table_base_addr = amdgpu_gmc_pd_addr(queue->vm->root.bo);
>>> +
>>> +    amdgpu_mes_lock(&adev->mes);
>>> +    r = adev->mes.funcs->add_hw_queue(&adev->mes, &queue_input);
>>> +    amdgpu_mes_unlock(&adev->mes);
>>> +    if (r) {
>>> +        DRM_ERROR("Failed to map queue in HW, err (%d)\n", r);
>>> +        return r;
>>> +    }
>>> +
>>> +    DRM_DEBUG_DRIVER("Queue %d mapped successfully\n", queue->queue_id);
>>> +    return 0;
>>> +}
>>> +
>>> +static void
>>> +amdgpu_userq_gfx_v11_unmap(struct amdgpu_userq_mgr *uq_mgr,
>>> +                           struct amdgpu_usermode_queue *queue)
>>> +{
>>> +    struct amdgpu_device *adev = uq_mgr->adev;
>>> +    struct mes_remove_queue_input queue_input;
>>> +    int r;
>>> +
>>> +    memset(&queue_input, 0x0, sizeof(struct mes_remove_queue_input));
>>> +    queue_input.doorbell_offset = queue->userq_prop.doorbell_index;
>>> +    queue_input.gang_context_addr = queue->gang_ctx_gpu_addr;
>>> +
>>> +    amdgpu_mes_lock(&adev->mes);
>>> +    r = adev->mes.funcs->remove_hw_queue(&adev->mes, &queue_input);
>>> +    amdgpu_mes_unlock(&adev->mes);
>>> +    if (r)
>>> +        DRM_ERROR("Failed to unmap queue in HW, err (%d)\n", r);
>>> +}
>>> +
>>>   static int amdgpu_userq_gfx_v11_create_ctx_space(struct amdgpu_userq_mgr *uq_mgr,
>>>                                                    struct amdgpu_usermode_queue *queue)
>>>   {
>>> @@ -129,6 +190,14 @@ amdgpu_userq_gfx_v11_mqd_create(struct amdgpu_userq_mgr *uq_mgr, struct amdgpu_u
>>>   
>>>       amdgpu_userq_set_ctx_space(uq_mgr, queue);
>>>       amdgpu_bo_unreserve(mqd->obj);
>>> +
>>> +    /* Map the queue in HW using MES ring */
>>> +    r = amdgpu_userq_gfx_v11_map(uq_mgr, queue);
>>> +    if (r) {
>>> +        DRM_ERROR("Failed to map userqueue (%d)\n", r);
>>> +        goto free_ctx;
>>> +    }
>>> +
>>>       DRM_DEBUG_DRIVER("MQD for queue %d created\n", queue->queue_id);
>>>       return 0;
>>>   
>>> @@ -147,6 +216,7 @@ amdgpu_userq_gfx_v11_mqd_destroy(struct amdgpu_userq_mgr *uq_mgr, struct amdgpu_
>>>   {
>>>       struct amdgpu_userq_ctx_space *mqd = &queue->mqd;
>>>   
>>> +    amdgpu_userq_gfx_v11_unmap(uq_mgr, queue);
>>>       amdgpu_userq_gfx_v11_destroy_ctx_space(uq_mgr, queue);
>>>       amdgpu_bo_free_kernel(&mqd->obj,
>>>   			   &mqd->gpu_addr,