[PATCH 2/2] drm/xe/vf: Don't update GuC reset policy when changing wedged mode
Laguna, Lukasz
lukasz.laguna at intel.com
Thu Apr 3 15:17:04 UTC 2025
On 4/3/2025 13:05, Michal Wajdeczko wrote:
>
> On 03.04.2025 11:41, Lukasz Laguna wrote:
>> Prevent the VF from attempting to update the GuC reset policy when
>> changing the wedged mode, as this operation is not supported for VFs.
>>
>> Log a message to indicate that GuC may still cause engine reset even
>> with wedged_mode=2.
>>
>> Signed-off-by: Lukasz Laguna <lukasz.laguna at intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_debugfs.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
>> index d0503959a8ed..062668d02365 100644
>> --- a/drivers/gpu/drm/xe/xe_debugfs.c
>> +++ b/drivers/gpu/drm/xe/xe_debugfs.c
>> @@ -171,6 +171,11 @@ static ssize_t wedged_mode_set(struct file *f, const char __user *ubuf,
>>
>> xe->wedged.mode = wedged_mode;
>>
>> + if (IS_SRIOV_VF(xe)) {
>> + drm_info_once(&xe->drm, "VF can't change GuC's engine reset policy. GuC may still cause engine reset even with wedged_mode=2\n");
> never use drm_info_once() logs for places where multiple different
> devices can be reached, as then doing something similar on next device
> there will be no trace at all
ok
>
> also should we change xe->wedged.mode if it is N/A for VF?
We can't change engine reset policy, but I wouldn't say that wedged.mode
is N/A for VFs.
We can still disable it (mode=0), or use mode=2 to easily wedge device
with simple exec timeout in order to e.g. validate whether driver
behavior in case of wedged device is correct (all IOCTLs are blocked
until rebind).
> btw, what is our approach if someone on the PF already set some policy
> on GuC will not do engine reset? or it is n/a after a VF switch?
It's global policy - engine resets will be disabled for PF and VFs, and
that's something I would expect.
>> + return size;
> shouldn't we return -EPERM instead?
I returned size on purpose, as it's expected that VF is not able to
change reset policy. Still, informed user that GuC may cause engine
resets even in mode=2. I can add extra info, that if needed, engine
resets needs to be disabled by PF.
>
>> + }
>> +
>> xe_pm_runtime_get(xe);
>> for_each_gt(gt, xe, id) {
>> ret = xe_guc_ads_scheduler_policy_toggle_reset(>->uc.guc.ads);
More information about the Intel-xe
mailing list