[PATCH 1/2] drm/amdgpu: fix amdgpu_irq_put call trace in jpeg_v4_0_hw_fini
Zhang, Hawking
Hawking.Zhang at amd.com
Thu May 11 02:28:27 UTC 2023
[AMD Official Use Only - General]
Please register a dedicated ras_irq src and funcs for UVD_POISON, which should allow you to create vcn ras sw calls like the gfx/sdma IP blocks.
Regards,
Hawking
-----Original Message-----
From: Zhang, Horatio <Hongkun.Zhang at amd.com>
Sent: Wednesday, May 10, 2023 18:55
To: Zhang, Hawking <Hawking.Zhang at amd.com>; Zhou1, Tao <Tao.Zhou1 at amd.com>; amd-gfx at lists.freedesktop.org
Cc: Xu, Feifei <Feifei.Xu at amd.com>; Liu, Leo <Leo.Liu at amd.com>; Jiang, Sonny <Sonny.Jiang at amd.com>; Limonciello, Mario <Mario.Limonciello at amd.com>; Liu, HaoPing (Alan) <HaoPing.Liu at amd.com>; Zhou, Bob <Bob.Zhou at amd.com>
Subject: RE: [PATCH 1/2] drm/amdgpu: fix amdgpu_irq_put call trace in jpeg_v4_0_hw_fini
Hi Hawking,
During modprobe, the jpeg/vcn interrupt is already enabled in amdgpu_fence_driver_hw_init(). If amdgpu_irq_get is also called in amdgpu_xxx_ras_late_init/xxx_v4_0_late_init, the instance interrupt will be enabled twice.
My previous proposal also had this issue. Perhaps we should instead remove the amdgpu_irq_put call from jpeg/vcn_v4_0_hw_fini.
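
To make the get/put imbalance concrete, here is a minimal userspace sketch of a reference-counted interrupt source. It is only a model under stated assumptions: the names model_irq_src, model_irq_get, model_irq_put and the two scenario functions are hypothetical and are not the amdgpu implementation, and the call sequences are a simplified reading of the commit message and of the reply above.

```c
#include <stdio.h>

/* Hypothetical model of one reference-counted interrupt source.
 * Illustrative only; this is not the amdgpu data structure. */
struct model_irq_src {
	int enabled_refs;	/* outstanding get() calls */
	int warnings;		/* put() calls that had no matching get() */
};

/* Take one reference on the interrupt source. */
static int model_irq_get(struct model_irq_src *src)
{
	src->enabled_refs++;
	return 0;
}

/* Drop one reference; a put with no reference held is reported, not applied. */
static int model_irq_put(struct model_irq_src *src)
{
	if (src->enabled_refs == 0) {
		src->warnings++;
		fprintf(stderr, "model: unbalanced irq put\n");
		return -1;
	}
	src->enabled_refs--;
	return 0;
}

/* Assumed sequence from the commit message: hw_fini puts on every
 * suspend, but nothing takes the reference again on resume. */
int scenario_put_without_get(void)
{
	struct model_irq_src src = { 0, 0 };

	model_irq_get(&src);	/* enabled once at load time */
	model_irq_put(&src);	/* first suspend: balanced */
	/* resume: no get here */
	model_irq_put(&src);	/* next suspend (e.g. GPU reset): unbalanced */
	return src.warnings;
}

/* Assumed sequence from the reply: an extra get in late_init on top of
 * the existing enable, followed by a single put in hw_fini. */
int scenario_double_get(void)
{
	struct model_irq_src src = { 0, 0 };

	model_irq_get(&src);	/* existing enable path */
	model_irq_get(&src);	/* additional get in late_init */
	model_irq_put(&src);	/* hw_fini */
	return src.enabled_refs;
}
```

In this model the first scenario records one unbalanced put, and the second leaves one reference outstanding after hw_fini, so the source stays enabled. Both point at the same requirement: every put needs exactly one matching get on the same path.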
Regards,
Horatio
-----Original Message-----
From: Zhang, Hawking <Hawking.Zhang at amd.com>
Sent: Monday, May 8, 2023 8:32 PM
To: Zhou1, Tao <Tao.Zhou1 at amd.com>; Zhang, Horatio <Hongkun.Zhang at amd.com>; amd-gfx at lists.freedesktop.org
Cc: Xu, Feifei <Feifei.Xu at amd.com>; Liu, Leo <Leo.Liu at amd.com>; Jiang, Sonny <Sonny.Jiang at amd.com>; Limonciello, Mario <Mario.Limonciello at amd.com>; Liu, HaoPing (Alan) <HaoPing.Liu at amd.com>; Zhang, Horatio <Hongkun.Zhang at amd.com>
Subject: RE: [PATCH 1/2] drm/amdgpu: fix amdgpu_irq_put call trace in jpeg_v4_0_hw_fini
Shall we consider creating amdgpu_vcn_ras_late_init as a common helper for interrupt enablement, like other IP blocks? This would also reduce further effort when the RAS feature is introduced in new versions of vcn/jpeg.
Regards,
Hawking
-----Original Message-----
From: Zhou1, Tao <Tao.Zhou1 at amd.com>
Sent: Monday, May 8, 2023 19:06
To: Zhang, Horatio <Hongkun.Zhang at amd.com>; amd-gfx at lists.freedesktop.org
Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Xu, Feifei <Feifei.Xu at amd.com>; Liu, Leo <Leo.Liu at amd.com>; Jiang, Sonny <Sonny.Jiang at amd.com>; Limonciello, Mario <Mario.Limonciello at amd.com>; Liu, HaoPing (Alan) <HaoPing.Liu at amd.com>; Zhang, Horatio <Hongkun.Zhang at amd.com>
Subject: RE: [PATCH 1/2] drm/amdgpu: fix amdgpu_irq_put call trace in jpeg_v4_0_hw_fini
The series is:
Reviewed-by: Tao Zhou <tao.zhou1 at amd.com>
> -----Original Message-----
> From: Horatio Zhang <Hongkun.Zhang at amd.com>
> Sent: Monday, May 8, 2023 6:20 PM
> To: amd-gfx at lists.freedesktop.org
> Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Zhou1, Tao
> <Tao.Zhou1 at amd.com>; Xu, Feifei <Feifei.Xu at amd.com>; Liu, Leo
> <Leo.Liu at amd.com>; Jiang, Sonny <Sonny.Jiang at amd.com>; Limonciello,
> Mario <Mario.Limonciello at amd.com>; Liu, HaoPing (Alan)
> <HaoPing.Liu at amd.com>; Zhang, Horatio <Hongkun.Zhang at amd.com>
> Subject: [PATCH 1/2] drm/amdgpu: fix amdgpu_irq_put call trace in
> jpeg_v4_0_hw_fini
>
> During suspend, the jpeg_v4_0_hw_fini function uses amdgpu_irq_put to
> disable the irq of jpeg.inst, but the irq is not enabled again during
> resume, which results in a call trace during the GPU reset process.
>
> [ 50.497562] RIP: 0010:amdgpu_irq_put+0xa4/0xc0 [amdgpu]
> [ 50.497619] RSP: 0018:ffffaa2400fcfcb0 EFLAGS: 00010246
> [ 50.497620] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
> [ 50.497621] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> [ 50.497621] RBP: ffffaa2400fcfcd0 R08: 0000000000000000 R09: 0000000000000000
> [ 50.497622] R10: 0000000000000000 R11: 0000000000000000 R12: ffff99b2105242d8
> [ 50.497622] R13: 0000000000000000 R14: ffff99b210500000 R15: ffff99b210500000
> [ 50.497623] FS: 0000000000000000(0000) GS:ffff99b518480000(0000) knlGS:0000000000000000
> [ 50.497623] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 50.497624] CR2: 00007f9d32aa91e8 CR3: 00000001ba210000 CR4: 0000000000750ee0
> [ 50.497624] PKRU: 55555554
> [ 50.497625] Call Trace:
> [ 50.497625] <TASK>
> [ 50.497627] jpeg_v4_0_hw_fini+0x43/0xc0 [amdgpu]
> [ 50.497693] jpeg_v4_0_suspend+0x13/0x30 [amdgpu]
> [ 50.497751] amdgpu_device_ip_suspend_phase2+0x240/0x470 [amdgpu]
> [ 50.497802] amdgpu_device_ip_suspend+0x41/0x80 [amdgpu]
> [ 50.497854] amdgpu_device_pre_asic_reset+0xd9/0x4a0 [amdgpu]
> [ 50.497905] amdgpu_device_gpu_recover.cold+0x548/0xcf1 [amdgpu]
> [ 50.498005] amdgpu_debugfs_reset_work+0x4c/0x80 [amdgpu]
> [ 50.498060] process_one_work+0x21f/0x400
> [ 50.498063] worker_thread+0x200/0x3f0
> [ 50.498064] ? process_one_work+0x400/0x400
> [ 50.498065] kthread+0xee/0x120
> [ 50.498067] ? kthread_complete_and_exit+0x20/0x20
> [ 50.498068] ret_from_fork+0x22/0x30
>
> Fixes: 86e8255f941e ("drm/amdgpu: add JPEG 4.0 RAS poison consumption handling")
> Signed-off-by: Horatio Zhang <Hongkun.Zhang at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c
> index 77e1e64aa1d1..b5c14a166063 100644
> --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c
> @@ -66,6 +66,13 @@ static int jpeg_v4_0_early_init(void *handle)
> return 0;
> }
>
> +static int jpeg_v4_0_late_init(void *handle)
> +{
> + struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> +
> + return amdgpu_irq_get(adev, &adev->jpeg.inst->irq, 0);
> +}
> +
> /**
> * jpeg_v4_0_sw_init - sw init for JPEG block
> *
> @@ -696,7 +703,7 @@ static int jpeg_v4_0_process_interrupt(struct amdgpu_device *adev,
> static const struct amd_ip_funcs jpeg_v4_0_ip_funcs = {
> .name = "jpeg_v4_0",
> .early_init = jpeg_v4_0_early_init,
> - .late_init = NULL,
> + .late_init = jpeg_v4_0_late_init,
> .sw_init = jpeg_v4_0_sw_init,
> .sw_fini = jpeg_v4_0_sw_fini,
> .hw_init = jpeg_v4_0_hw_init,
> --
> 2.34.1