Crash on resume from S3

Andrey Grodzovsky andrey.grodzovsky at amd.com
Tue Jul 26 17:10:53 UTC 2022


The stack trace is an expected part of the reset procedure, so that is
OK. The issue you are having is a hang in one of the GPU jobs during
resume, which triggers a GPU reset attempt.

You can open a ticket for this issue at
https://gitlab.freedesktop.org/drm/amd/-/issues; please attach the full
dmesg log.
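
As a rough sketch of how the full dmesg log could be captured for the
ticket, something like the following Python might help. The helper
names, the output file name and the filter pattern are assumptions made
for illustration only; the complete, unfiltered file is what should be
attached, and the filter is just a convenience for skimming the
amdgpu/drm lines locally.

```python
import re
import subprocess

# Assumed pattern: treat lines mentioning "amdgpu", "[drm]" or "[drm:" as GPU related.
GPU_LINE = re.compile(r"amdgpu|\[drm[\]:]")


def gpu_lines(log_text: str) -> list[str]:
    """Return the lines of a kernel log that mention amdgpu or drm."""
    return [line for line in log_text.splitlines() if GPU_LINE.search(line)]


def save_full_dmesg(path: str = "dmesg-full.txt") -> str:
    """Write the complete kernel log to `path` (hypothetical file name) and return it.

    Reading the kernel log with `dmesg` may require root privileges.
    """
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    with open(path, "w") as out:
        out.write(result.stdout)
    return result.stdout
```

Calling `save_full_dmesg()` right after a failed resume produces the file
to attach, and `gpu_lines(...)` on its return value gives a quick view of
the reset sequence.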

Andrey

On 2022-07-26 05:06, Tom Cook wrote:
> I have a Ryzen 7 3700U in an HP laptop.  lspci describes the GPU in this way:
>
> 04:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> [AMD/ATI] Picasso/Raven 2 [Radeon Vega Series / Radeon Vega Mobile
> Series] (rev c1)
>
> This laptop has never successfully resumed from suspend (I have tried
> every 5.x kernel).  Currently on 5.18.0, the system appears to be okay
> after resume apart from the gpu which is usually giving a blank
> screen, occasionally a scrambled output.  After rebooting, I see this
> in syslog:
>
> Jul 25 11:02:18 frog kernel: [240782.968674] amdgpu 0000:04:00.0:
> amdgpu: GPU reset begin!
> Jul 25 11:02:19 frog kernel: [240783.974891] amdgpu 0000:04:00.0:
> [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test
> failed (-110)
> Jul 25 11:02:19 frog kernel: [240783.988650] [drm] free PSP TMR buffer
> Jul 25 11:02:19 frog kernel: [240784.019057] CPU: 4 PID: 305612 Comm:
> kworker/u32:17 Not tainted 5.18.0 #1
> Jul 25 11:02:19 frog kernel: [240784.019063] Hardware name: HP HP ENVY
> x360 Convertible 15-ds0xxx/85DD, BIOS F.20 05/28/2020
> Jul 25 11:02:19 frog kernel: [240784.019067] Workqueue:
> amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
> Jul 25 11:02:19 frog kernel: [240784.019079] Call Trace:
> Jul 25 11:02:19 frog kernel: [240784.019082]  <TASK>
> Jul 25 11:02:19 frog kernel: [240784.019085]  dump_stack_lvl+0x49/0x5f
> Jul 25 11:02:19 frog kernel: [240784.019095]  dump_stack+0x10/0x12
> Jul 25 11:02:19 frog kernel: [240784.019099]
> amdgpu_do_asic_reset+0x2f/0x4e0 [amdgpu]
> Jul 25 11:02:19 frog kernel: [240784.019278]
> amdgpu_device_gpu_recover_imp+0x41e/0xb50 [amdgpu]
> Jul 25 11:02:19 frog kernel: [240784.019452]
> amdgpu_job_timedout+0x155/0x1b0 [amdgpu]
> Jul 25 11:02:19 frog kernel: [240784.019674]
> drm_sched_job_timedout+0x74/0xf0 [gpu_sched]
> Jul 25 11:02:19 frog kernel: [240784.019681]  ?
> amdgpu_cgs_destroy_device+0x10/0x10 [amdgpu]
> Jul 25 11:02:19 frog kernel: [240784.019896]  ?
> drm_sched_job_timedout+0x74/0xf0 [gpu_sched]
> Jul 25 11:02:19 frog kernel: [240784.019903]  process_one_work+0x227/0x440
> Jul 25 11:02:19 frog kernel: [240784.019908]  worker_thread+0x31/0x3d0
> Jul 25 11:02:19 frog kernel: [240784.019912]  ? process_one_work+0x440/0x440
> Jul 25 11:02:19 frog kernel: [240784.019914]  kthread+0xfe/0x130
> Jul 25 11:02:19 frog kernel: [240784.019918]  ?
> kthread_complete_and_exit+0x20/0x20
> Jul 25 11:02:19 frog kernel: [240784.019923]  ret_from_fork+0x22/0x30
> Jul 25 11:02:19 frog kernel: [240784.019930]  </TASK>
> Jul 25 11:02:19 frog kernel: [240784.019934] amdgpu 0000:04:00.0:
> amdgpu: MODE2 reset
> Jul 25 11:02:19 frog kernel: [240784.020178] amdgpu 0000:04:00.0:
> amdgpu: GPU reset succeeded, trying to resume
> Jul 25 11:02:19 frog kernel: [240784.020552] [drm] PCIE GART of 1024M enabled.
> Jul 25 11:02:19 frog kernel: [240784.020555] [drm] PTB located at
> 0x000000F400900000
> Jul 25 11:02:19 frog kernel: [240784.020577] [drm] VRAM is lost due to
> GPU reset!
> Jul 25 11:02:19 frog kernel: [240784.020579] [drm] PSP is resuming...
> Jul 25 11:02:19 frog kernel: [240784.040465] [drm] reserve 0x400000
> from 0xf47fc00000 for PSP TMR
>
> I'm running the latest BIOS from HP.  Is there anything I can do to
> work around this?  Or anything I can do to help debug it?
>
> Regards,
> Tom Cook

