amdgpu vs kexec

Mario Limonciello superm1 at kernel.org
Thu Jun 19 13:32:58 UTC 2025


On 6/18/2025 6:55 PM, Baoquan He wrote:
> On 06/18/25 at 11:12am, Peter Zijlstra wrote:
>> On Wed, Jun 18, 2025 at 10:51:23AM +0200, Peter Zijlstra wrote:
>>> On Tue, Jun 17, 2025 at 09:12:12PM -0500, Mario Limonciello wrote:
>>>
>>>> How about if we reset before the kexec?  There is a symbol for drivers to
>>>> use to know they're about to go through kexec to do $THINGS.
>>>>
>>>> Something like this:
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> index 0fc0eeedc6461..2b1216b14d618 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> @@ -34,6 +34,7 @@
>>>>
>>>>   #include <linux/cc_platform.h>
>>>>   #include <linux/dynamic_debug.h>
>>>> +#include <linux/kexec.h>
>>>>   #include <linux/module.h>
>>>>   #include <linux/mmu_notifier.h>
>>>>   #include <linux/pm_runtime.h>
>>>> @@ -2544,6 +2545,9 @@ amdgpu_pci_shutdown(struct pci_dev *pdev)
>>>>                  adev->mp1_state = PP_MP1_STATE_UNLOAD;
>>>>          amdgpu_device_ip_suspend(adev);
>>>>          adev->mp1_state = PP_MP1_STATE_NONE;
>>>> +
>>>> +       if (kexec_in_progress)
>>>> +               amdgpu_asic_reset(adev);
>>>>   }
>>>>
>>>>   static int amdgpu_pmops_prepare(struct device *dev)
>>>
>>> I will throw this in the dev kernel... I'll let you know.
>>
>> First hurdle appears to be that this symbol is not exported. I fixed
>> that, but perhaps the kexec folks don't like drivers to use this?
> 
> I can't find the original mail of this thread, but we don't have a
> known restriction about that, afaik.
> 

FYI here's the whole thread:

https://lore.kernel.org/amd-gfx/423aec58-0ab2-4471-b986-dfb955e63ca8@kernel.org/T/#m68bea029aac9b7ec015a26a8dfb8268ffb007125
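
For anyone reading this without the rest of the thread, the ordering the
quoted diff proposes is: suspend the IP blocks as before, then reset the
ASIC only when a kexec is pending. Below is a rough userspace sketch of
that control flow. It is not kernel code, and every mock_* name is a
hypothetical stand-in invented for illustration (mock_kexec_in_progress
for kexec_in_progress, mock_ip_suspend for amdgpu_device_ip_suspend,
mock_asic_reset for amdgpu_asic_reset).

```c
/* Userspace mock of the proposed shutdown ordering; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's global kexec_in_progress flag. */
static bool mock_kexec_in_progress;

/* Hypothetical, heavily reduced stand-in for struct amdgpu_device. */
struct mock_adev {
	int suspend_calls;           /* times the IP blocks were suspended */
	int reset_calls;             /* times the ASIC was reset */
	bool suspended_before_reset; /* records the ordering at reset time */
};

/* Stand-in for amdgpu_device_ip_suspend(): only counts the call. */
static void mock_ip_suspend(struct mock_adev *adev)
{
	adev->suspend_calls++;
}

/* Stand-in for amdgpu_asic_reset(): counts the call and notes ordering. */
static void mock_asic_reset(struct mock_adev *adev)
{
	adev->suspended_before_reset = adev->suspend_calls > 0;
	adev->reset_calls++;
}

/* Mirrors the shape of the patched shutdown hook from the quoted diff. */
static void mock_pci_shutdown(struct mock_adev *adev)
{
	assert(adev != NULL);

	mock_ip_suspend(adev);

	/* Only pay for a full reset when the next kernel comes via kexec. */
	if (mock_kexec_in_progress)
		mock_asic_reset(adev);
}
```

In this sketch a normal shutdown or reboot leaves reset_calls at zero,
so only the kexec path takes the extra reset.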



