[PATCH v2] drm/amdgpu: change vm->task_info handling
Felix Kuehling
felix.kuehling at amd.com
Wed Jan 24 16:38:53 UTC 2024
On 2024-01-24 9:32, Shashank Sharma wrote:
>
> On 19/01/2024 21:23, Felix Kuehling wrote:
>>
>> On 2024-01-18 14:21, Shashank Sharma wrote:
>>> This patch changes the handling and lifecycle of vm->task_info object.
>>> The major changes are:
>>> - vm->task_info is a dynamically allocated ptr now, and its uasge is
>>> reference counted.
>>> - introducing two new helper funcs for task_info lifecycle management
>>> - amdgpu_vm_get_task_info: reference counts up task_info before
>>> returning this info
>>> - amdgpu_vm_put_task_info: reference counts down task_info
>>> - last put to task_info() frees task_info from the vm.
>>>
>>> This patch also does logistical changes required for existing usage
>>> of vm->task_info.
>>>
>>> V2: Do not block all the prints when task_info not found (Felix)
>>>
>>> Cc: Christian Koenig <christian.koenig at amd.com>
>>> Cc: Alex Deucher <alexander.deucher at amd.com>
>>> Cc: Felix Kuehling <Felix.Kuehling at amd.com>
>>> Signed-off-by: Shashank Sharma <shashank.sharma at amd.com>
>>
>> Nit-picks inline.
[snip]
>>> +/**
>>> + * amdgpu_vm_put_task_info_pasid - reference down the vm task_info ptr
>>> + * frees the vm task_info ptr at the last put
>>> + *
>>> + * @adev: drm device pointer
>>> + * @task_info: task_info struct under discussion.
>>> + * @pasid: pasid of the VM which contains task_info
>>> + */
>>> +void amdgpu_vm_put_task_info_pasid(struct amdgpu_device *adev,
>>> + struct amdgpu_task_info *task_info,
>>> + u32 pasid)
>>> +{
>>> + int ret;
>>> +
>>> + ret = kref_put(&task_info->refcount, amdgpu_vm_destroy_task_info);
>>> +
>>> + /* Clean up if object was removed in the last put */
>>> + if (ret == 1) {
>>> + struct amdgpu_vm *vm;
>>> +
>>> + vm = amdgpu_vm_get_vm_from_pasid(adev, pasid);
>>> + if (!vm) {
>>> + WARN(1, "Invalid PASID %u to put task info\n", pasid);
>>> + return;
>>> + }
>>> +
>>> + vm->task_info = NULL;
>>
>> This doesn't make sense. If there is a VM pointing to the task_info,
>> then the ref count should not go to 0. Therefore this whole "look up
>> the VM from PASID and clear vm->task_info" seams bogus.
>
> Actually, (ret == 1) above indicates that cleanup function has been
> called and task_info has been freed, and the vm->task_info is a
> dangling ptr.
>
> The current design is
>
> - first open per process/fd creates vm->task_info
>
> - last close per process/fd frees vm->task_info
>
>>
>> I'd expect the last put_task_info call to come from amdgpu_vm_fini.
>> At that point setting task_info to NULL is probably unnecessary. But
>> if we wanted that, we could set it to NULL in the caller.
>>
> Even this can be done, I can call the first get_vm_info() from vm_init
> and last put from vm_fini.
Well, actually it doesn't matter where the last put comes from. The main
point is, that vm->task_info is a counted reference. As long as that
reference exists, the refcount should not become 0. If it does, that's a
bug somewhere. The vm->task_info reference is only dropped in
amdgpu_vm_fini, after which the whole vm structure is freed. So there
should never be a need to set vm->task_info to NULL in
amdgpu_vm_put_task_info_*.
[snip]
>>> +static int amdgpu_vm_create_task_info(struct amdgpu_vm *vm)
>>> +{
>>> + vm->task_info = kzalloc(sizeof(struct amdgpu_task_info),
>>> GFP_KERNEL);
>>> + if (!vm->task_info) {
>>> + DRM_ERROR("OOM while creating task_info space\n");
>>
>> Printing OOM error messages is usually redundant. The allocators are
>> already very noisy when OOM happens. I think checkpatch.pl also warns
>> about that. Maybe it doesn't catch DRM_ERROR but only printk and
>> friends.
>>
> Agree, I will make this debug instead of error.
Even a debug message is not needed. See https://lkml.org/lkml/2014/6/10/382.
[snip]
>> @@ -157,18 +157,26 @@ static int gmc_v10_0_process_interrupt(struct
>> amdgpu_device *adev,
>>> if (!printk_ratelimit())
>>> return 0;
>>> - memset(&task_info, 0, sizeof(struct amdgpu_task_info));
>>> - amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);
>>> + task_info = amdgpu_vm_get_task_info_pasid(adev, entry->pasid);
>>> + if (task_info) {
>>> + dev_err(adev->dev,
>>> + "[%s] page fault (src_id:%u ring:%u vmid:%u pasid:%u,
>>> for process %s pid %d thread %s pid %d)\n",
>>> + entry->vmid_src ? "mmhub" : "gfxhub",
>>> + entry->src_id, entry->ring_id, entry->vmid,
>>> + entry->pasid, task_info->process_name, task_info->tgid,
>>> + task_info->task_name, task_info->pid);
>>> + amdgpu_vm_put_task_info_pasid(adev, task_info, entry->pasid);
>>> + } else {
>>> + dev_err(adev->dev,
>>> + "[%s] page fault (src_id:%u ring:%u vmid:%u pasid:%u,
>>> no process info)\n",
>>> + entry->vmid_src ? "mmhub" : "gfxhub",
>>> + entry->src_id, entry->ring_id, entry->vmid,
>>> + entry->pasid);
>>
>> Can we avoid the duplication here? It's too easy for them to get out
>> of sync. I think it's OK to change the message format so that the
>> process into is printed on a separate line. E.g.:
>>
> That's exactly I thought, but then I was afraid of breaking any
> existing scripts grepping for this exact text. I guess I can do this
> change and see if anyone complaints :).
I don't think there are any ABI guarantees for log messages.
Regards,
Felix
>
> - Shashank
>
>> dev_err(adev->dev,
>> - "[%s] page fault (src_id:%u ring:%u vmid:%u pasid:%u, for
>> process %s pid %d thread %s pid %d)\n",
>> + "[%s] page fault (src_id:%u ring:%u vmid:%u pasid:%u)\n",
>> entry->vmid_src ? "mmhub" : "gfxhub",
>> entry->src_id, entry->ring_id, entry->vmid,
>> - entry->pasid, task_info.process_name, task_info.tgid,
>> - task_info.task_name, task_info.pid);
>> + entry->pasid);
>> + if (task_info) {
>> + dev_err(adev->dev, " in process %s pid %d thread %s pid %d\n",
>> + task_info.process_name, task_info.tgid,
>> + task_info.task_name, task_info.pid);
>> + }
>>
>>
>>> + }
>>> - dev_err(adev->dev,
>>> - "[%s] page fault (src_id:%u ring:%u vmid:%u pasid:%u, for
>>> process %s pid %d thread %s pid %d)\n",
>>> - entry->vmid_src ? "mmhub" : "gfxhub",
>>> - entry->src_id, entry->ring_id, entry->vmid,
>>> - entry->pasid, task_info.process_name, task_info.tgid,
>>> - task_info.task_name, task_info.pid);
>>> dev_err(adev->dev, " in page starting at address 0x%016llx
>>> from client 0x%x (%s)\n",
>>> - addr, entry->client_id,
>>> - soc15_ih_clientid_name[entry->client_id]);
>>> + addr, entry->client_id,
>>> + soc15_ih_clientid_name[entry->client_id]);
>>> if (!amdgpu_sriov_vf(adev))
>>> hub->vmhub_funcs->print_l2_protection_fault_status(adev,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>> index 23d7b548d13f..3dda6c310729 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>> @@ -126,19 +126,28 @@ static int gmc_v11_0_process_interrupt(struct
>>> amdgpu_device *adev,
>>> }
>>> if (printk_ratelimit()) {
>>> - struct amdgpu_task_info task_info;
>>> -
>>> - memset(&task_info, 0, sizeof(struct amdgpu_task_info));
>>> - amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);
>>> + struct amdgpu_task_info *task_info;
>>> +
>>> + task_info = amdgpu_vm_get_task_info_pasid(adev, entry->pasid);
>>> + if (task_info) {
>>> + dev_err(adev->dev,
>>> + "[%s] page fault (src_id:%u ring:%u vmid:%u
>>> pasid:%u, for process %s pid %d thread %s pid %d)\n",
>>> + entry->vmid_src ? "mmhub" : "gfxhub",
>>> + entry->src_id, entry->ring_id, entry->vmid,
>>> + entry->pasid, task_info->process_name,
>>> task_info->tgid,
>>> + task_info->task_name, task_info->pid);
>>> + amdgpu_vm_put_task_info_pasid(adev, task_info,
>>> entry->pasid);
>>> + } else {
>>> + dev_err(adev->dev,
>>> + "[%s] page fault (src_id:%u ring:%u vmid:%u
>>> pasid:%u, no process info)\n",
>>> + entry->vmid_src ? "mmhub" : "gfxhub",
>>> + entry->src_id, entry->ring_id, entry->vmid,
>>> + entry->pasid);
>>> + }
>>
>> Same as above.
>>
>>
>>> - dev_err(adev->dev,
>>> - "[%s] page fault (src_id:%u ring:%u vmid:%u pasid:%u,
>>> for process %s pid %d thread %s pid %d)\n",
>>> - entry->vmid_src ? "mmhub" : "gfxhub",
>>> - entry->src_id, entry->ring_id, entry->vmid,
>>> - entry->pasid, task_info.process_name, task_info.tgid,
>>> - task_info.task_name, task_info.pid);
>>> dev_err(adev->dev, " in page starting at address
>>> 0x%016llx from client %d\n",
>>> - addr, entry->client_id);
>>> + addr, entry->client_id);
>>> +
>>> if (!amdgpu_sriov_vf(adev))
>>> hub->vmhub_funcs->print_l2_protection_fault_status(adev, status);
>>> }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>> index ff4ae73d27ec..aa3887c3bb35 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>> @@ -1444,18 +1444,24 @@ static int gmc_v8_0_process_interrupt(struct
>>> amdgpu_device *adev,
>>> gmc_v8_0_set_fault_enable_default(adev, false);
>>> if (printk_ratelimit()) {
>>> - struct amdgpu_task_info task_info;
>>> -
>>> - memset(&task_info, 0, sizeof(struct amdgpu_task_info));
>>> - amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);
>>> + struct amdgpu_task_info *task_info;
>>> +
>>> + task_info = amdgpu_vm_get_task_info_pasid(adev, entry->pasid);
>>> + if (task_info) {
>>> + dev_err(adev->dev, "GPU fault detected: %d 0x%08x for
>>> process %s pid %d thread %s pid %d\n",
>>> + entry->src_id, entry->src_data[0],
>>> task_info->process_name,
>>> + task_info->tgid, task_info->task_name,
>>> task_info->pid);
>>> + amdgpu_vm_put_task_info_pasid(adev, task_info,
>>> entry->pasid);
>>> + } else {
>>> + dev_err(adev->dev, "GPU fault detected: %d 0x%08x (no
>>> process info)\n",
>>> + entry->src_id, entry->src_data[0]);
>>> + }
>> Same as above.
>>
>>
>>> - dev_err(adev->dev, "GPU fault detected: %d 0x%08x for
>>> process %s pid %d thread %s pid %d\n",
>>> - entry->src_id, entry->src_data[0], task_info.process_name,
>>> - task_info.tgid, task_info.task_name, task_info.pid);
>>> dev_err(adev->dev, " VM_CONTEXT1_PROTECTION_FAULT_ADDR
>>> 0x%08X\n",
>>> - addr);
>>> + addr);
>>> dev_err(adev->dev, " VM_CONTEXT1_PROTECTION_FAULT_STATUS
>>> 0x%08X\n",
>>> status);
>>> +
>>> gmc_v8_0_vm_decode_fault(adev, status, addr, mc_client,
>>> entry->pasid);
>>> }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> index 2ac5820e9c92..075d633109e3 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> @@ -549,7 +549,7 @@ static int gmc_v9_0_process_interrupt(struct
>>> amdgpu_device *adev,
>>> bool retry_fault = !!(entry->src_data[1] & 0x80);
>>> bool write_fault = !!(entry->src_data[1] & 0x20);
>>> uint32_t status = 0, cid = 0, rw = 0;
>>> - struct amdgpu_task_info task_info;
>>> + struct amdgpu_task_info *task_info;
>>> struct amdgpu_vmhub *hub;
>>> const char *mmhub_cid;
>>> const char *hub_name;
>>> @@ -626,15 +626,23 @@ static int gmc_v9_0_process_interrupt(struct
>>> amdgpu_device *adev,
>>> if (!printk_ratelimit())
>>> return 0;
>>> - memset(&task_info, 0, sizeof(struct amdgpu_task_info));
>>> - amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);
>>> + task_info = amdgpu_vm_get_task_info_pasid(adev, entry->pasid);
>>> + if (task_info) {
>>> + dev_err(adev->dev,
>>> + "[%s] %s page fault (src_id:%u ring:%u vmid:%u
>>> pasid:%u, for process %s pid %d thread %s pid %d)\n",
>>> + hub_name, retry_fault ? "retry" : "no-retry",
>>> + entry->src_id, entry->ring_id, entry->vmid,
>>> + entry->pasid, task_info->process_name, task_info->tgid,
>>> + task_info->task_name, task_info->pid);
>>> + amdgpu_vm_put_task_info_pasid(adev, task_info, entry->pasid);
>>> + } else {
>>> + dev_err(adev->dev,
>>> + "[%s] %s page fault (src_id:%u ring:%u vmid:%u
>>> pasid:%u, no process info)\n",
>>> + hub_name, retry_fault ? "retry" : "no-retry",
>>> + entry->src_id, entry->ring_id, entry->vmid,
>>> + entry->pasid);
>>> + }
>> Same as above.
>>
>>
>>> - dev_err(adev->dev,
>>> - "[%s] %s page fault (src_id:%u ring:%u vmid:%u pasid:%u,
>>> for process %s pid %d thread %s pid %d)\n",
>>> - hub_name, retry_fault ? "retry" : "no-retry",
>>> - entry->src_id, entry->ring_id, entry->vmid,
>>> - entry->pasid, task_info.process_name, task_info.tgid,
>>> - task_info.task_name, task_info.pid);
>>> dev_err(adev->dev, " in page starting at address 0x%016llx
>>> from IH client 0x%x (%s)\n",
>>> addr, entry->client_id,
>>> soc15_ih_clientid_name[entry->client_id]);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>>> index 3d68dd5523c6..515d1a1ff141 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>>> @@ -2104,7 +2104,7 @@ static int sdma_v4_0_print_iv_entry(struct
>>> amdgpu_device *adev,
>>> struct amdgpu_iv_entry *entry)
>>> {
>>> int instance;
>>> - struct amdgpu_task_info task_info;
>>> + struct amdgpu_task_info *task_info;
>>> u64 addr;
>>> instance = sdma_v4_0_irq_id_to_seq(entry->client_id);
>>> @@ -2116,15 +2116,23 @@ static int sdma_v4_0_print_iv_entry(struct
>>> amdgpu_device *adev,
>>> addr = (u64)entry->src_data[0] << 12;
>>> addr |= ((u64)entry->src_data[1] & 0xf) << 44;
>>> - memset(&task_info, 0, sizeof(struct amdgpu_task_info));
>>> - amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);
>>> + task_info = amdgpu_vm_get_task_info_pasid(adev, entry->pasid);
>>> + if (task_info) {
>>> + dev_dbg_ratelimited(adev->dev,
>>> + "[sdma%d] address:0x%016llx src_id:%u ring:%u vmid:%u "
>>> + "pasid:%u, for process %s pid %d thread %s pid %d\n",
>>> + instance, addr, entry->src_id, entry->ring_id,
>>> entry->vmid,
>>> + entry->pasid, task_info->process_name, task_info->tgid,
>>> + task_info->task_name, task_info->pid);
>>> + amdgpu_vm_put_task_info_pasid(adev, task_info, entry->pasid);
>>> + } else {
>>> + dev_dbg_ratelimited(adev->dev,
>>> + "[sdma%d] address:0x%016llx src_id:%u ring:%u vmid:%u "
>>> + "pasid:%u, no process info\n",
>>> + instance, addr, entry->src_id, entry->ring_id,
>>> entry->vmid,
>>> + entry->pasid);
>>> + }
>> Same as above.
>>
>>
>>> - dev_dbg_ratelimited(adev->dev,
>>> - "[sdma%d] address:0x%016llx src_id:%u ring:%u vmid:%u "
>>> - "pasid:%u, for process %s pid %d thread %s pid %d\n",
>>> - instance, addr, entry->src_id, entry->ring_id, entry->vmid,
>>> - entry->pasid, task_info.process_name, task_info.tgid,
>>> - task_info.task_name, task_info.pid);
>>> return 0;
>>> }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>>> b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>>> index 0f24af6f2810..d7e23bd90f7f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>>> @@ -1642,7 +1642,7 @@ static int sdma_v4_4_2_print_iv_entry(struct
>>> amdgpu_device *adev,
>>> struct amdgpu_iv_entry *entry)
>>> {
>>> int instance;
>>> - struct amdgpu_task_info task_info;
>>> + struct amdgpu_task_info *task_info;
>>> u64 addr;
>>> instance = sdma_v4_4_2_irq_id_to_seq(entry->client_id);
>>> @@ -1654,15 +1654,23 @@ static int sdma_v4_4_2_print_iv_entry(struct
>>> amdgpu_device *adev,
>>> addr = (u64)entry->src_data[0] << 12;
>>> addr |= ((u64)entry->src_data[1] & 0xf) << 44;
>>> - memset(&task_info, 0, sizeof(struct amdgpu_task_info));
>>> - amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);
>>> + task_info = amdgpu_vm_get_task_info_pasid(adev, entry->pasid);
>>> + if (task_info) {
>>> + dev_dbg_ratelimited(adev->dev,
>>> + "[sdma%d] address:0x%016llx src_id:%u ring:%u vmid:%u "
>>> + "pasid:%u, for process %s pid %d thread %s pid %d\n",
>>> + instance, addr, entry->src_id, entry->ring_id,
>>> entry->vmid,
>>> + entry->pasid, task_info->process_name, task_info->tgid,
>>> + task_info->task_name, task_info->pid);
>>> + amdgpu_vm_put_task_info_pasid(adev, task_info, entry->pasid);
>>> + } else {
>>> + dev_dbg_ratelimited(adev->dev,
>>> + "[sdma%d] address:0x%016llx src_id:%u ring:%u vmid:%u "
>>> + "pasid:%u (no process info)\n",
>>> + instance, addr, entry->src_id, entry->ring_id,
>>> entry->vmid,
>>> + entry->pasid);
>>> + }
>> Same as above.
>>
>> Regards,
>> Felix
>>
>>
>>> - dev_dbg_ratelimited(adev->dev,
>>> - "[sdma%d] address:0x%016llx src_id:%u ring:%u vmid:%u "
>>> - "pasid:%u, for process %s pid %d thread %s pid %d\n",
>>> - instance, addr, entry->src_id, entry->ring_id, entry->vmid,
>>> - entry->pasid, task_info.process_name, task_info.tgid,
>>> - task_info.task_name, task_info.pid);
>>> return 0;
>>> }
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
>>> index d9953c2b2661..0dfe4b3bd18a 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
>>> @@ -238,16 +238,16 @@ void
>>> kfd_smi_event_update_thermal_throttling(struct kfd_node *dev,
>>> void kfd_smi_event_update_vmfault(struct kfd_node *dev, uint16_t
>>> pasid)
>>> {
>>> - struct amdgpu_task_info task_info;
>>> -
>>> - memset(&task_info, 0, sizeof(struct amdgpu_task_info));
>>> - amdgpu_vm_get_task_info(dev->adev, pasid, &task_info);
>>> - /* Report VM faults from user applications, not retry from
>>> kernel */
>>> - if (!task_info.pid)
>>> - return;
>>> -
>>> - kfd_smi_event_add(0, dev, KFD_SMI_EVENT_VMFAULT, "%x:%s\n",
>>> - task_info.pid, task_info.task_name);
>>> + struct amdgpu_task_info *task_info;
>>> +
>>> + task_info = amdgpu_vm_get_task_info_pasid(dev->adev, pasid);
>>> + if (task_info) {
>>> + /* Report VM faults from user applications, not retry from
>>> kernel */
>>> + if (task_info->pid)
>>> + kfd_smi_event_add(0, dev, KFD_SMI_EVENT_VMFAULT,
>>> "%x:%s\n",
>>> + task_info->pid, task_info->task_name);
>>> + amdgpu_vm_put_task_info_pasid(dev->adev, task_info, pasid);
>>> + }
>>> }
>>> void kfd_smi_event_page_fault_start(struct kfd_node *node, pid_t
>>> pid,
More information about the amd-gfx
mailing list