[PATCH i-g-t v3] tests/intel/xe_exec_fault_mode: Don't return early
Nirmoy Das
nirmoy.das at linux.intel.com
Wed Aug 28 15:26:48 UTC 2024
On 8/28/2024 5:15 PM, Andrzej Hajda wrote:
>
>
> On 28.08.2024 11:55, Nirmoy Das wrote:
>> Tests that cause pagefaults should wait for the exec queue to be banned;
>> otherwise, pending engine resets triggered by ongoing pagefaults would
>> cause subsequent tests to fail.
>>
>> Set a larger 5 sec timeout; if tests still fail, we can blame the
>> driver in that case.
>
> I am trying to understand what causes such a big delay; any ideas? Btw,
> if the driver is to blame, maybe it should be fixed instead of
> increasing the timeout in the test.
From this IGT test's perspective, this subtest causes an engine reset and
an exec queue ban, so it should wait for the ban. Now, if that behavior
isn't met, then we need to fix the driver, but I think that is a
different topic.
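
To illustrate the wait described above, here is a minimal standalone sketch of polling a "banned" query until it succeeds or a deadline passes. It is not the patch code: fake_queue, fake_queue_is_banned(), monotonic_seconds() and wait_for_ban() are hypothetical names, and the fake query stands in for the real GET_PROPERTY ioctl so the sketch runs without a device. Unlike igt_set_timeout(), which fails the test when the timer fires, this version simply returns false on timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for an exec queue: reports "banned" after N polls. */
struct fake_queue {
	int polls_until_ban; /* polls left before the queue reports banned */
};

/* Hypothetical replacement for the ban-property query in the patch. */
static bool fake_queue_is_banned(void *ctx)
{
	struct fake_queue *q = ctx;

	if (q->polls_until_ban > 0) {
		q->polls_until_ban--;
		return false;
	}
	return true;
}

/* Current monotonic time in seconds, immune to wall-clock adjustments. */
static double monotonic_seconds(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Poll is_banned(ctx) until it returns true or timeout_s seconds elapse.
 * Returns true if the ban was observed, false if the deadline passed first.
 */
static bool wait_for_ban(bool (*is_banned)(void *), void *ctx, double timeout_s)
{
	double deadline;

	assert(is_banned != NULL);
	deadline = monotonic_seconds() + timeout_s;

	while (!is_banned(ctx)) {
		if (monotonic_seconds() >= deadline)
			return false;
		sched_yield(); /* let other threads run between polls */
	}
	return true;
}
```

A caller would invoke wait_for_ban() once per exec queue and treat a false return as "ban not observed in time", which is the case where the driver would be the suspect.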
>
> In v2 there was one failure on PVC:
> https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_11646/bat-pvc-2/igt@xe_exec_fault_mode@twice-invalid-userptr-fault.html
> This time it passed flawlessly (as well as in v1), but not due to the
> increased time limit (at least dmesg shows the test took much less
> than 1 second).
Yes, I saw that; it just means the ctx wasn't banned, which is strange.
There is not enough info to debug.
> Let's wait for xeFULL pass, maybe it will show some interesting results.
Regards,
Nirmoy
>
> Regards
> Andrzej
>>
>> v2: specify timeout reason and iterate over exec_queues (Andrzej)
>> v3: increase timeout
>>
>> Cc: Andrzej Hajda <andrzej.hajda at intel.com>
>> Cc: Kamil Konieczny <kamil.konieczny at linux.intel.com>
>> Cc: Matthew Brost <matthew.brost at intel.com>
>> Cc: Tejas Upadhyay <tejas.upadhyay at intel.com>
>> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1630
>> Reviewed-by: Matthew Brost <matthew.brost at intel.com> #v1
>> Signed-off-by: Nirmoy Das <nirmoy.das at intel.com>
>> ---
>> tests/intel/xe_exec_fault_mode.c | 25 +++++++++++++++++++++++++
>> 1 file changed, 25 insertions(+)
>>
>> diff --git a/tests/intel/xe_exec_fault_mode.c b/tests/intel/xe_exec_fault_mode.c
>> index 1f1f1e50b..e3e6047e7 100644
>> --- a/tests/intel/xe_exec_fault_mode.c
>> +++ b/tests/intel/xe_exec_fault_mode.c
>> @@ -36,6 +36,22 @@
>> #define INVALID_VA (0x1 << 8)
>> #define ENABLE_SCRATCH (0x1 << 9)
>> +static int get_ban_property(int xe, struct drm_xe_engine_class_instance *eci,
>> +			    uint32_t vm, uint32_t exec_queue)
>> +{
>> +	struct drm_xe_exec_queue_get_property args = {
>> +		.value = -1,
>> +		.reserved[0] = 0,
>> +		.property = DRM_XE_EXEC_QUEUE_GET_PROPERTY_BAN,
>> +	};
>> +
>> +	args.exec_queue_id = exec_queue;
>> +
>> +	do_ioctl(xe, DRM_IOCTL_XE_EXEC_QUEUE_GET_PROPERTY, &args);
>> +
>> +	return args.value;
>> +}
>> +
>> /**
>> * SUBTEST: invalid-va
>> * Description: Access invalid va and check for EIO through user fence.
>> @@ -324,6 +340,15 @@ test_exec(int fd, struct drm_xe_engine_class_instance *eci,
>>  	xe_wait_ufence(fd, &data[0].vm_sync, USER_FENCE_VALUE,
>>  		       bind_exec_queues[0], NSEC_PER_SEC);
>> +	if ((flags & INVALID_FAULT)) {
>> +		igt_set_timeout(5, "waiting for ban");
>> +		for (i = 0; i < n_exec_queues; i++) {
>> +			while (!get_ban_property(fd, eci, vm, exec_queues[i]))
>> +				sched_yield();
>> +		}
>> +		igt_reset_timeout();
>> +	}
>> +
>>  	if (!(flags & INVALID_FAULT) && !(flags & INVALID_VA)) {
>>  		for (i = j; i < n_execs; i++)
>>  			igt_assert_eq(data[i].data, 0xc0ffee);
>