[PATCH] drm/xe/debugfs: Make sysfs gt force reset synchronous
Poosa, Karthik
karthik.poosa at intel.com
Wed Dec 27 12:59:39 UTC 2023
Hi Anshuman,
1. Can't force_reset be synchronous?
2. Regarding adding a wait in the get-freq APIs:
there is already a flag 'pc->freq_ready' in the xe_guc_pc_get_xxx_freq APIs.
They return -EAGAIN if a reset is in progress instead of waiting, and that is
what causes the test failures.
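To illustrate point 2, here is a minimal userspace sketch of the fail-fast behaviour and the retry it pushes onto the caller. Only the 'freq_ready' flag and the -EAGAIN return come from the driver; struct fake_pc, fake_get_freq(), fake_get_freq_retry() and the polls_until_ready knob are hypothetical names made up for this model, not driver code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GuC PC state. Only freq_ready mirrors
 * the real flag; the other fields exist for this illustration. */
struct fake_pc {
	bool freq_ready;        /* false while a GT reset is in flight */
	unsigned int freq_mhz;  /* value reported once ready */
	int polls_until_ready;  /* model knob: failed polls before "reset done" */
};

/* Models the fail-fast getter: return -EAGAIN instead of waiting. */
static int fake_get_freq(struct fake_pc *pc, unsigned int *freq)
{
	if (!pc->freq_ready) {
		/* Pretend the reset finishes after a fixed number of polls. */
		if (pc->polls_until_ready > 0 && --pc->polls_until_ready == 0)
			pc->freq_ready = true;
		return -EAGAIN;
	}
	*freq = pc->freq_mhz;
	return 0;
}

/* Caller-side retry with a bounded attempt budget. A real caller would
 * sleep between attempts; that is omitted to keep the sketch small. */
static int fake_get_freq_retry(struct fake_pc *pc, unsigned int *freq,
			       int max_tries)
{
	int err = -EAGAIN;

	for (int i = 0; i < max_tries && err == -EAGAIN; i++)
		err = fake_get_freq(pc, freq);
	return err;
}
```

A reader that does a single call right after force_reset sees -EAGAIN, which matches the sporadic failure pattern; either the getter waits or every caller needs a loop like the one above.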
On 27-12-2023 16:06, Gupta, Anshuman wrote:
>
>> -----Original Message-----
>> From: Poosa, Karthik <karthik.poosa at intel.com>
>> Sent: Wednesday, December 27, 2023 2:02 PM
>> To: intel-xe at lists.freedesktop.org
>> Cc: Gupta, Anshuman <anshuman.gupta at intel.com>; Nilawar, Badal
>> <badal.nilawar at intel.com>; Brost, Matthew <matthew.brost at intel.com>;
>> Vivi, Rodrigo <rodrigo.vivi at intel.com>; Poosa, Karthik
>> <karthik.poosa at intel.com>
>> Subject: [PATCH] drm/xe/debugfs: Make sysfs gt force reset synchronous
>>
>> Wait for gt reset to complete before returning from force_reset sysfs call.
>> Without this igt test freq_reset_multiple fails sporadically in case xe_guc_pc is
>> not started.
>>
>> v2:
>> - Changed wait for completion to interruptible (Anshuman).
>> - Moved timeout to xe_gt.h (Anshuman).
>> - Created a debugfs for updating timeout (Rodrigo).
>>
>> Testcase: igt at xe_guc_pc@freq_reset_multiple
>> Signed-off-by: Karthik Poosa <karthik.poosa at intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_gt.c | 2 ++
>> drivers/gpu/drm/xe/xe_gt_debugfs.c | 12 ++++++++++++
>> drivers/gpu/drm/xe/xe_gt_types.h | 6 ++++++
>> 3 files changed, 20 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
>> index 3af2adec1295..47abb9336c58 100644
>> --- a/drivers/gpu/drm/xe/xe_gt.c
>> +++ b/drivers/gpu/drm/xe/xe_gt.c
>> @@ -65,6 +65,7 @@ struct xe_gt *xe_gt_alloc(struct xe_tile *tile)
>>
>> gt->tile = tile;
>> gt->ordered_wq = alloc_ordered_workqueue("gt-ordered-wq", 0);
>> + init_completion(&gt->reset_done);
>>
>> return gt;
>> }
>> @@ -633,6 +634,7 @@ static int gt_reset(struct xe_gt *gt)
>> xe_device_mem_access_put(gt_to_xe(gt));
>> XE_WARN_ON(err);
>>
>> + complete(&gt->reset_done);
>> xe_gt_info(gt, "reset done\n");
>>
>> return 0;
>> diff --git a/drivers/gpu/drm/xe/xe_gt_debugfs.c b/drivers/gpu/drm/xe/xe_gt_debugfs.c
>> index c4b67cf09f8f..fbda886c8a95 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_debugfs.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_debugfs.c
>> @@ -58,8 +58,16 @@ static int hw_engines(struct seq_file *m, void *data)
>> static int force_reset(struct seq_file *m, void *data)
>> {
>> struct xe_gt *gt = node_to_gt(m->private);
>> + struct xe_device *xe = gt_to_xe(gt);
>> + long ret;
>>
>> xe_gt_reset_async(gt);
>> + ret = wait_for_completion_interruptible_timeout(&gt->reset_done,
>> +
> This would defeat the purpose of xe_gt_reset_async(), as it makes force_reset
> synchronous. I think we need wait_for_completion_interruptible_timeout() in the
> xe_gt_freq sysfs path instead, before reading the GuC PC frequency.
> Something like below:
> guc_pc_freq_ready()
> {
> wait_for_completion_interruptible_timeout()
> }
>
> Thanks,
> Anshuman Gupta.
>> + msecs_to_jiffies(gt->reset_timeout_ms));
>> + if (ret <= 0) {
>> + drm_err(&xe->drm, "gt reset timed out/interrupted, ret %ld\n", ret);
>> + return -ETIMEDOUT;
>> + }
>>
>> return 0;
>> }
>> @@ -225,6 +233,10 @@ void xe_gt_debugfs_register(struct xe_gt *gt)
>> return;
>> }
>>
>> + /* set a default timeout */
>> + gt->reset_timeout_ms = 1000;
>> + debugfs_create_u32("gt_reset_timeout_ms", 0600, root,
>> + &gt->reset_timeout_ms);
>> /*
>> * Allocate local copy as we need to pass in the GT to the debugfs
>> * entry and drm_debugfs_create_files just references the drm_info_list
>> diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
>> index f74684660475..824cefde20d2 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_types.h
>> +++ b/drivers/gpu/drm/xe/xe_gt_types.h
>> @@ -358,6 +358,12 @@ struct xe_gt {
>> /** @oob: bitmap with active OOB workaroudns */
>> unsigned long *oob;
>> } wa_active;
>> +
>> + /** @reset_done : completion for GT reset */
>> + struct completion reset_done;
>> +
>> + /** @reset_timeout_ms: gt reset timeout in ms */
>> + u32 reset_timeout_ms;
>> };
>>
>> #endif
>> --
>> 2.25.1
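
On the 'if (ret <= 0)' check in force_reset: wait_for_completion_interruptible_timeout() returns 0 on timeout, -ERESTARTSYS if a signal interrupted the wait, and the remaining jiffies (at least 1) on completion, so folding both failure cases into -ETIMEDOUT loses the distinction. A minimal sketch of one way to keep them apart, written as plain userspace C so it can be compiled on its own. reset_wait_to_errno() is a hypothetical helper name, and the ERESTARTSYS define is only there because userspace <errno.h> does not provide it.

```c
#include <errno.h>

/* ERESTARTSYS is kernel-internal and absent from userspace <errno.h>;
 * defined here only so this sketch builds outside the kernel. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Hypothetical helper: map the return value of
 * wait_for_completion_interruptible_timeout() to an errno-style result. */
static int reset_wait_to_errno(long ret)
{
	if (ret == 0)
		return -ETIMEDOUT;  /* timeout elapsed, reset never completed */
	if (ret < 0)
		return (int)ret;    /* interrupted: propagate -ERESTARTSYS */
	return 0;                   /* completed with time to spare */
}
```

Whether propagating -ERESTARTSYS is the right choice for this debugfs entry is a separate question; the sketch only shows that the two cases can be told apart from the return value.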