[PATCH v3] drm/amdgpu: skip xcp drm device allocation when out of drm resource

Mon Aug 14 14:14:16 UTC 2023

On 8/14/2023 7:28 PM, James Zhu wrote:
> 
> On 2023-08-12 13:00, Lazar, Lijo wrote:
>>
>>
>> On 8/12/2023 6:14 PM, James Zhu wrote:
>>>
>>> On 2023-08-11 21:39, Lazar, Lijo wrote:
>>>>
>>>> [AMD Official Use Only - General]
>>>>
>>>>
>>>> A dynamic partition switch could happen later.  The switch could 
>>>> still be successful in terms of hardware,
>>> [JZ] Only ignore render node assignment, and remove visibility in 
>>> user space, xcp continues to be generated as usual. so switch should 
>>> work as usual
>>
>> Switch is not useful for the user unless the apps can make use of the 
>> render nodes. A 'success' from hardware perspective doesn't turn out 
>> to be a 'success' for users eventually to make use of the extra 
>> partition.
> [JZ] Yes, After switch, app can use the render nodes (no more than 64 
> nodes in one system), Like 8P MI300X CPX mode,  with no external VGA 
> device,all 64 nodes can be used. with one  external VGA device, 63 nodes 
> can be used.

Understood. What I meant is we should be having the 'warn message' with 
partition switch (when user really wants to make use of partitions) 
rather than on driver load (as with this patch). Technically, some of 
those nodes are usable only after a partition switch.

>>
>>>> and hence gives a false feeling of success even if there are no 
>>>> render nodes available for any app to make use of the partition.
>>> [JZ] from driver prospective, the switch is real success, treat the 
>>> last one harvested in user space.. there is warning in kernel log, 
>>> and final solution for more than 64 nodes is on-going
>>
>> The render nodes are allocated during driver load and the message will 
>> go unnoticed. We could still allow the switch, but the message should 
>> be there during a partition switch like 'only x/y (x out of y nodes) 
>> are usable'. The worst case is - only 1 out of N meaning no benefit - 
>> and in that case user may switch back to normal mode to make use of 
>> full compute power.
>>
>>>>
>>>> Also, a kfd node is not expected to have a valid xcp pointer on 
>>>> devices without partition.
>>> [JZ] won't affect xcp pointer, only ddev.
>>>> This access could break then gpu->xcp->ddev.
>>> [JZ] added skip when ddev==NULL
>>
>> What I meant is xcp in kfd node could be NULL on SOCs like NV series. 
>> There should be a check for xcp before accessing ddev -
>> https://elixir.bootlin.com/linux/v6.5-rc5/source/drivers/gpu/drm/amd/amdkfd/kfd_device.c#L794 
>>
> [JZ] So it is potential bug before this patch. then we need review 
> current code to add all necessary xcp check.

We already have checks - 
https://elixir.bootlin.com/linux/v6.5-rc5/source/drivers/gpu/drm/amd/amdkfd/kfd_topology.c#L2027

Thanks,
Lijo

>>
>> Thanks,
>> Lijo
>>
>>>>
>>>> Thanks,
>>>> Lijo
>>>> ------------------------------------------------------------------------ 
>>>>
>>>> *From:* amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of 
>>>> James Zhu <James.Zhu at amd.com>
>>>> *Sent:* Saturday, August 12, 2023 2:36:27 AM
>>>> *To:* amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>
>>>> *Cc:* Lin, Amber <Amber.Lin at amd.com>; Zhu, James 
>>>> <James.Zhu at amd.com>; Kasiviswanathan, Harish 
>>>> <Harish.Kasiviswanathan at amd.com>; Koenig, Christian 
>>>> <Christian.Koenig at amd.com>
>>>> *Subject:* [PATCH v3] drm/amdgpu: skip xcp drm device allocation 
>>>> when out of drm resource
>>>> Return 0 when drm device alloc failed with -ENOSPC in
>>>> order to  allow amdgpu drive loading. But the xcp without
>>>> drm device node assigned won't be visiable in user space.
>>>> This helps amdgpu driver loading on system which has more
>>>> than 64 nodes, the current limitation.
>>>>
>>>> The proposal to add more drm nodes is discussed in public,
>>>> which will support up to 2^20 nodes totally.
>>>> kernel drm:
>>>> https://lore.kernel.org/lkml/20230724211428.3831636-1-michal.winiarski@intel.com/T/ 
>>>>
>>>> libdrm:
>>>> https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305
>>>>
>>>> Signed-off-by: James Zhu <James.Zhu at amd.com>
>>>> Acked-by: Christian König <christian.koenig at amd.com>
>>>>
>>>> -v2: added warning message
>>>> -v3: use dev_warn
>>>> ---
>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c   | 13 ++++++++++++-
>>>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 10 +++++++++-
>>>>  2 files changed, 21 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c
>>>> index 9c9cca129498..565a1fa436d4 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c
>>>> @@ -239,8 +239,13 @@ static int amdgpu_xcp_dev_alloc(struct 
>>>> amdgpu_device *adev)
>>>>
>>>>          for (i = 1; i < MAX_XCP; i++) {
>>>>                  ret = amdgpu_xcp_drm_dev_alloc(&p_ddev);
>>>> -               if (ret)
>>>> +               if (ret == -ENOSPC) {
>>>> +                       dev_warn(adev->dev,
>>>> +                       "Skip xcp node #%d when out of drm node 
>>>> resource.", i);
>>>> +                       return 0;
>>>> +               } else if (ret) {
>>>>                          return ret;
>>>> +               }
>>>>
>>>>                  /* Redirect all IOCTLs to the primary device */
>>>>                  adev->xcp_mgr->xcp[i].rdev = p_ddev->render->dev;
>>>> @@ -328,6 +333,9 @@ int amdgpu_xcp_dev_register(struct amdgpu_device 
>>>> *adev,
>>>>                  return 0;
>>>>
>>>>          for (i = 1; i < MAX_XCP; i++) {
>>>> +               if (!adev->xcp_mgr->xcp[i].ddev)
>>>> +                       break;
>>>> +
>>>>                  ret = drm_dev_register(adev->xcp_mgr->xcp[i].ddev, 
>>>> ent->driver_data);
>>>>                  if (ret)
>>>>                          return ret;
>>>> @@ -345,6 +353,9 @@ void amdgpu_xcp_dev_unplug(struct amdgpu_device 
>>>> *adev)
>>>>                  return;
>>>>
>>>>          for (i = 1; i < MAX_XCP; i++) {
>>>> +               if (!adev->xcp_mgr->xcp[i].ddev)
>>>> +                       break;
>>>> +
>>>>                  p_ddev = adev->xcp_mgr->xcp[i].ddev;
>>>>                  drm_dev_unplug(p_ddev);
>>>>                  p_ddev->render->dev = adev->xcp_mgr->xcp[i].rdev;
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c 
>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>>> index 3b0749390388..310df98ba46a 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>>> @@ -1969,8 +1969,16 @@ int kfd_topology_add_device(struct kfd_node 
>>>> *gpu)
>>>>          int i;
>>>>          const char *asic_name = 
>>>> amdgpu_asic_name[gpu->adev->asic_type];
>>>>
>>>> +
>>>>          gpu_id = kfd_generate_gpu_id(gpu);
>>>> -       pr_debug("Adding new GPU (ID: 0x%x) to topology\n", gpu_id);
>>>> +       if (!gpu->xcp->ddev) {
>>>> +               dev_warn(gpu->adev->dev,
>>>> +               "Won't add GPU (ID: 0x%x) to topology since it has 
>>>> no drm node assigned.",
>>>> +               gpu_id);
>>>> +               return 0;
>>>> +       } else {
>>>> +               pr_debug("Adding new GPU (ID: 0x%x) to topology\n", 
>>>> gpu_id);
>>>> +       }
>>>>
>>>>          /* Check to see if this gpu device exists in the 
>>>> topology_device_list.
>>>>           * If so, assign the gpu to that device,
>>>> -- 
>>>> 2.34.1
>>>>