[PATCH v3 2/2] drm/admgpu: Present amdgpu_task_info in VM_FAULTS.
Zhang, Jerry (Junwei)
Jerry.Zhang at amd.com
Mon Jul 9 05:13:18 UTC 2018
On 07/06/2018 03:27 AM, Andrey Grodzovsky wrote:
> Extract and present the reposnsible process and thread when
> VM_FAULT happens.
>
> v2: Use getter and setter functions.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 ++++
> drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 10 +++++++---
> drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 9 +++++++--
> 3 files changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 7a625f3..609c8f5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -187,6 +187,10 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, void *data)
> if (p->uf_entry.robj)
> p->job->uf_addr = uf_offset;
> kfree(chunk_array);
> +
> + /* Use this opportunity to fill in task info for the vm */
> + amdgpu_vm_set_task_info(vm);
> +
Shall we set the task info when vm init?
> return 0;
>
> free_all_kdata:
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> index 08753e7..75f3ffb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> @@ -46,7 +46,6 @@
>
> #include "ivsrcid/ivsrcid_vislands30.h"
>
> -
> static void gmc_v8_0_set_gmc_funcs(struct amdgpu_device *adev);
> static void gmc_v8_0_set_irq_funcs(struct amdgpu_device *adev);
> static int gmc_v8_0_wait_for_idle(void *handle);
> @@ -1449,8 +1448,13 @@ static int gmc_v8_0_process_interrupt(struct amdgpu_device *adev,
> gmc_v8_0_set_fault_enable_default(adev, false);
>
> if (printk_ratelimit()) {
> - dev_err(adev->dev, "GPU fault detected: %d 0x%08x\n",
> - entry->src_id, entry->src_data[0]);
> + struct amdgpu_task_info task_info = { 0 };
> +
> + amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);
Shall we find vm and get the vm->task_info directly instead of filling local variable task_info?
(current style is also OK for me, just for more info)
And we may also check the return value for "NULL" case, otherwise it may cause access errorin dev_err(),
if failed to find vm (although, it's most unlikely to happen).
Jerry
> +
> + dev_err(adev->dev, "GPU fault detected: %d 0x%08x for process %s pid %d thread %s pid %d\n",
> + entry->src_id, entry->src_data[0], task_info.process_name,
> + task_info.tgid, task_info.task_name, task_info.pid);
> dev_err(adev->dev, " VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x%08X\n",
> addr);
> dev_err(adev->dev, " VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x%08X\n",
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 691a659..9df94b4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -259,11 +259,16 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev,
> }
>
> if (printk_ratelimit()) {
> + struct amdgpu_task_info task_info = { 0 };
> +
> + amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);
> +
> dev_err(adev->dev,
> - "[%s] VMC page fault (src_id:%u ring:%u vmid:%u pasid:%u)\n",
> + "[%s] VMC page fault (src_id:%u ring:%u vmid:%u pasid:%u, for process %s pid %d thread %s pid %d\n)\n",
> entry->vmid_src ? "mmhub" : "gfxhub",
> entry->src_id, entry->ring_id, entry->vmid,
> - entry->pasid);
> + entry->pasid, task_info.process_name, task_info.tgid,
> + task_info.task_name, task_info.pid);
> dev_err(adev->dev, " at page 0x%016llx from %d\n",
> addr, entry->client_id);
> if (!amdgpu_sriov_vf(adev))
>
More information about the amd-gfx
mailing list