[PATCH 3/3] drm/xe/pf: Don't resume device from restart worker
Michal Wajdeczko
michal.wajdeczko at intel.com
Wed Jul 30 21:48:01 UTC 2025
On 7/30/2025 11:10 PM, Cavitt, Jonathan wrote:
> -----Original Message-----
> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of Michal Wajdeczko
> Sent: Wednesday, July 30, 2025 10:49 AM
> To: intel-xe at lists.freedesktop.org
> Cc: Wajdeczko, Michal <Michal.Wajdeczko at intel.com>
> Subject: [PATCH 3/3] drm/xe/pf: Don't resume device from restart worker
>>
>> The PF's restart worker shouldn't attempt to resume the device on
>> its own, since its goal is to finish PF and VFs reprovisioning on
>> the recently reset GuC. Take extra RPM reference while scheduling
>> a work and release it from the worker or when we cancel a work.
>>
>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 23 +++++++++++++++++++----
>> 1 file changed, 19 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
>> index 8bc7d7f9f47a..0c9012fb625d 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
>> @@ -53,7 +53,11 @@ static void pf_init_workers(struct xe_gt *gt)
>>
>> static void pf_fini_workers(struct xe_gt *gt)
>> {
>> - disable_work_sync(>->sriov.pf.workers.restart);
>> + if (disable_work_sync(>->sriov.pf.workers.restart)) {
>> + xe_gt_sriov_dbg_verbose(gt, "pending restart disabled!\n");
>> + /* release a rpm reference taken on the worker behalf */
>> + xe_pm_runtime_put(gt_to_xe(gt));
>> + }
>> }
>>
>> /**
>> @@ -205,8 +209,11 @@ static void pf_cancel_restart(struct xe_gt *gt)
>> {
>> xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
>>
>> - if (cancel_work_sync(>->sriov.pf.workers.restart))
>> + if (cancel_work_sync(>->sriov.pf.workers.restart)) {
>> xe_gt_sriov_dbg_verbose(gt, "pending restart canceled!\n");
>> + /* release a rpm reference taken on the worker behalf */
>> + xe_pm_runtime_put(gt_to_xe(gt));
>> + }
>> }
>>
>> /**
>> @@ -224,9 +231,12 @@ static void pf_restart(struct xe_gt *gt)
>> {
>> struct xe_device *xe = gt_to_xe(gt);
>>
>> - xe_pm_runtime_get(xe);
>> + xe_gt_assert(gt, !xe_pm_runtime_suspended(xe));
>> +
>> xe_gt_sriov_pf_config_restart(gt);
>> xe_gt_sriov_pf_control_restart(gt);
>> +
>> + /* release a rpm reference taken on our behalf */
>
> NIT:
> For consistency with the other two comments, maybe:
> s/our/the worker
> Or is the pm reference taken in this instance different from the pm reference
> taken in pf_cancel_restart and pf_fini_workers?
this is the worker context, hence 'our'
>
> There're also some other minor grammar things ("s/a rpm/an rpm" and
> "s/worker behalf/worker's behalf", for example) that can be applied more
> generally to the whole patch.
>
> I'm not going to block on minor grammatical fixups, though, so:
> Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> -Jonathan Cavitt
>
>> xe_pm_runtime_put(xe);
>>
>> xe_gt_sriov_dbg(gt, "restart completed\n");
>> @@ -245,8 +255,13 @@ static void pf_queue_restart(struct xe_gt *gt)
>>
>> xe_gt_assert(gt, IS_SRIOV_PF(xe));
>>
>> - if (!queue_work(xe->sriov.wq, >->sriov.pf.workers.restart))
>> + /* take a rpm reference on behalf of the worker */
>> + xe_pm_runtime_get_noresume(xe);
>> +
>> + if (!queue_work(xe->sriov.wq, >->sriov.pf.workers.restart)) {
>> xe_gt_sriov_dbg(gt, "restart already in queue!\n");
>> + xe_pm_runtime_put(xe);
>> + }
>> }
>>
>> /**
>> --
>> 2.47.1
>>
>>
More information about the Intel-xe
mailing list