amdgpu vs kexec

Christian König christian.koenig at amd.com
Wed Jun 18 09:05:44 UTC 2025


On 6/18/25 10:51, Peter Zijlstra wrote:
> On Tue, Jun 17, 2025 at 09:12:12PM -0500, Mario Limonciello wrote:
> 
>> How about if we reset before the kexec?  There is a symbol for drivers to
>> use to know they're about to go through kexec to do $THINGS.
>>
>> Something like this:
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index 0fc0eeedc6461..2b1216b14d618 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -34,6 +34,7 @@
>>
>>  #include <linux/cc_platform.h>
>>  #include <linux/dynamic_debug.h>
>> +#include <linux/kexec.h>
>>  #include <linux/module.h>
>>  #include <linux/mmu_notifier.h>
>>  #include <linux/pm_runtime.h>
>> @@ -2544,6 +2545,9 @@ amdgpu_pci_shutdown(struct pci_dev *pdev)
>>                 adev->mp1_state = PP_MP1_STATE_UNLOAD;
>>         amdgpu_device_ip_suspend(adev);
>>         adev->mp1_state = PP_MP1_STATE_NONE;
>> +
>> +       if (kexec_in_progress)
>> +               amdgpu_asic_reset(adev);
>>  }
>>
>>  static int amdgpu_pmops_prepare(struct device *dev)
> 
> I will throw this in the dev kernel... I'll let you know.

Hmm, if the drivers are informed about the kexec, then IIRC we could also send just the unload/reset packet to the PSP.

That might have a better chance of succeeding than a full ASIC reset.

Lijo should know more about that.
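
To make the ordering concrete, here is a rough user-space sketch of what I have in mind. It is not driver code: `struct mock_gpu`, `mock_psp_unload()`, `mock_full_reset()` and `mock_shutdown_tail()` are hypothetical stand-ins, not amdgpu functions, and falling back to a full reset when the PSP packet fails is only an assumption for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device; NOT struct amdgpu_device. */
struct mock_gpu {
	bool psp_unload_works;	/* simulates whether the PSP accepts the packet */
	bool psp_unloaded;	/* set once the PSP-only unload went through */
	bool asic_reset_done;	/* set once the full reset was performed */
};

/* Hypothetical: pretend to send only the unload/reset packet to the PSP. */
static int mock_psp_unload(struct mock_gpu *gpu)
{
	if (!gpu->psp_unload_works)
		return -1;
	gpu->psp_unloaded = true;
	return 0;
}

/* Hypothetical: pretend to perform a full ASIC reset. */
static int mock_full_reset(struct mock_gpu *gpu)
{
	gpu->asic_reset_done = true;
	return 0;
}

/*
 * Tail of the shutdown path: do nothing extra on a normal shutdown,
 * try the lighter PSP-only unload when a kexec is pending.
 */
static int mock_shutdown_tail(struct mock_gpu *gpu, bool kexec_pending)
{
	assert(gpu != NULL);

	if (!kexec_pending)
		return 0;

	if (mock_psp_unload(gpu) == 0)
		return 0;

	/* Assumption: fall back to the heavier full reset on failure. */
	return mock_full_reset(gpu);
}
```

In the real driver the `kexec_pending` argument would correspond to the `kexec_in_progress` check from Mario's patch; which PSP call is the right one is exactly the part I am unsure about.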

Regards,
Christian.

