[PATCH 2/2] drm/xe/vf: Don't update GuC reset policy when changing wedged mode

Laguna, Lukasz lukasz.laguna at intel.com
Thu Apr 3 15:17:04 UTC 2025


On 4/3/2025 13:05, Michal Wajdeczko wrote:
>
> On 03.04.2025 11:41, Lukasz Laguna wrote:
>> Prevent the VF from attempting to update the GuC reset policy when
>> changing the wedged mode, as this operation is not supported for VFs.
>>
>> Log a message to indicate that GuC may still cause engine reset even
>> with wedged_mode=2.
>>
>> Signed-off-by: Lukasz Laguna <lukasz.laguna at intel.com>
>> ---
>>   drivers/gpu/drm/xe/xe_debugfs.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
>> index d0503959a8ed..062668d02365 100644
>> --- a/drivers/gpu/drm/xe/xe_debugfs.c
>> +++ b/drivers/gpu/drm/xe/xe_debugfs.c
>> @@ -171,6 +171,11 @@ static ssize_t wedged_mode_set(struct file *f, const char __user *ubuf,
>>   
>>   	xe->wedged.mode = wedged_mode;
>>   
>> +	if (IS_SRIOV_VF(xe)) {
>> +		drm_info_once(&xe->drm, "VF can't change GuC's engine reset policy. GuC may still cause engine reset even with wedged_mode=2\n");
> never use drm_info_once() logs for places where multiple different
> devices can be reached, as then doing something similar on next device
> there will be no trace at all

ok
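
To illustrate the point about the _once helpers, here is a minimal
user-space sketch. It assumes the "printed" flag is one static variable
per call site, shared by every device that reaches that line. The names
fake_dev, log_once and log_always are hypothetical stand-ins, not kernel
API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a device; not a kernel structure. */
struct fake_dev {
	const char *name;
	int messages_seen;	/* log lines attributed to this device */
};

/*
 * Models a *_once() helper: a single static flag guards the call site,
 * so only the first device to reach it produces a message.
 */
static void log_once(struct fake_dev *dev, const char *msg)
{
	static bool printed;

	assert(dev && msg);
	if (printed)
		return;
	printed = true;
	dev->messages_seen++;
	printf("%s: %s\n", dev->name, msg);
}

/* Models a plain helper: every call logs, so each device leaves a trace. */
static void log_always(struct fake_dev *dev, const char *msg)
{
	assert(dev && msg);
	dev->messages_seen++;
	printf("%s: %s\n", dev->name, msg);
}
```

With two devices, calling log_once() for each in turn leaves the second
one without any message, while log_always() logs for both.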

>
> also should we change xe->wedged.mode if it is N/A for VF?

A VF can't change the engine reset policy, but I wouldn't say that
wedged.mode is N/A for VFs.

We can still disable it (mode=0), or use mode=2 to easily wedge the
device with a simple exec timeout, e.g. to validate that the driver
behaves correctly once the device is wedged (all IOCTLs are blocked
until rebind).

> btw, what is our approach if someone on the PF already set some policy
> on GuC will not do engine reset? or it is n/a after a VF switch?

It's a global policy: engine resets will be disabled for the PF and the
VFs, which is what I would expect.

>> +		return size;
> shouldn't we return -EPERM instead?

I returned size on purpose, as it's expected that a VF is not able to
change the reset policy. The message still informs the user that GuC may
cause engine resets even in mode=2. I can add extra info that, if
needed, engine resets need to be disabled by the PF.
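
The two return-value options can be compared with a small user-space
model of the write handler. This is only a sketch under stated
assumptions: fake_xe, wedged_mode_set_model and the strict flag are
hypothetical, and the policy_updates counter merely stands in for the
per-GT GuC policy update done by the real code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state touched by the handler. */
struct fake_xe {
	bool is_vf;			/* models IS_SRIOV_VF(xe) */
	unsigned int wedged_mode;	/* models xe->wedged.mode */
	int policy_updates;		/* counts modelled GuC policy updates */
};

/*
 * Models the flow of the patch: the mode is always stored, and a VF
 * returns before the GuC reset policy would be updated.
 * strict == false: the VF write succeeds (returns size), as in the patch.
 * strict == true:  the VF write fails with -EPERM, as suggested in review.
 */
static long wedged_mode_set_model(struct fake_xe *xe, unsigned int mode,
				  size_t size, bool strict)
{
	assert(xe);

	xe->wedged_mode = mode;

	if (xe->is_vf)
		return strict ? -EPERM : (long)size;

	xe->policy_updates++;	/* PF/native path: policy would be updated */
	return (long)size;
}
```

In this model both variants store the new mode on a VF; they differ only
in whether the writer sees success or an error, which is the trade-off
discussed above.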

>
>> +	}
>> +
>>   	xe_pm_runtime_get(xe);
>>   	for_each_gt(gt, xe, id) {
>>   		ret = xe_guc_ads_scheduler_policy_toggle_reset(&gt->uc.guc.ads);

