[PATCH i-g-t] tests/intel/gem_watchdog: Reduced timeouts for worst case scenario

John Harrison john.c.harrison at intel.com
Fri Feb 16 01:33:37 UTC 2024


On 2/13/2024 01:34, Tvrtko Ursulin wrote:
> On 12/02/2024 21:23, John.C.Harrison at Intel.com wrote:
>> From: John Harrison <John.C.Harrison at Intel.com>
>>
>> The watchdog test reduces the watchdog timer from 20s to 1s and then
>> uses a 5s timeout waiting for the watchdog to do its stuff. This works
>> fine in general, but if an engine reset is required by a context that
>> is actually dead for real then a pre-emption timeout must be factored
>> in. For RCS/CCS engines, that timeout is 7.5 seconds by default. Thus,
>> the test timeout expires first and the test fails.
>>
>> Normally, the system is not so dead when running this test as to
>> require an engine reset. A simple pre-emption works fine for the
>> spinner contexts that it uses. However, there is a hardware workaround
>> coming which prevents context switches when both RCS and CCS are busy.
>>
>> So add an explicit override of the pre-emption timeout as well as the
>> watchdog timeout. That will allow the test to keep working after the
>> new w/a lands.
>>
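
The timing argument above can be checked with a little arithmetic. The sketch below is a simplified model, not code from the test: it assumes the worst case is simply the watchdog period followed by one full pre-emption timeout, and the function names are hypothetical. The 1s, 5s and 7.5s figures come from the commit message, and 640ms is the value the patch writes.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Assumed model: the watchdog fires first, then a pre-emption attempt
 * has to time out before the engine reset kills the context.
 */
unsigned int worst_case_kill_ms(unsigned int watchdog_ms,
                                unsigned int preempt_timeout_ms)
{
    return watchdog_ms + preempt_timeout_ms;
}

/* True if the context is expected to die before the test stops waiting. */
bool fits_in_test_timeout(unsigned int watchdog_ms,
                          unsigned int preempt_timeout_ms,
                          unsigned int test_timeout_ms)
{
    assert(test_timeout_ms > 0);
    return worst_case_kill_ms(watchdog_ms, preempt_timeout_ms) < test_timeout_ms;
}
```

Under this model, fits_in_test_timeout(1000, 7500, 5000) is false (8.5s against a 5s wait), while fits_in_test_timeout(1000, 640, 5000) is true.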
>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>> ---
>>   tests/intel/gem_watchdog.c | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/tests/intel/gem_watchdog.c b/tests/intel/gem_watchdog.c
>> index 1e4c350214c0..c9dd0deb51aa 100644
>> --- a/tests/intel/gem_watchdog.c
>> +++ b/tests/intel/gem_watchdog.c
>> @@ -577,6 +577,16 @@ igt_main
>>             i915 = drm_reopen_driver(i915); /* Apply modparam. */
>>           ctx = intel_ctx_create_all_physical(i915);
>> +
>> +        for_each_ctx_engine(i915, ctx, e) {
>> +            /*
>> +             * Context termination by watchdog may require an engine reset. That only
>> +             * occurs after a pre-emption attempt has expired. For RCS/CCS engines,
>> +             * the pre-emption timeout is longer than this test is wanting to wait.
>> +             * So reduce that timeout in addition to the watchdog timeout itself.
>> +             */
>> +            gem_engine_property_printf(i915, e->name, "preempt_timeout_ms", "%d", 640);
>> +        }
>
> Restore at test exit for subsequent tests to be in a known environment?
IGT actually does the reverse. Part of the framework initialisation is 
to forcibly reset all the sysfs parameters to the official defaults (as 
exposed via the .default sysfs files). So in general, the tests don't 
bother trying to preserve such values.
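
As a rough illustration of that reset-to-defaults idea, here is a minimal sketch. It is not the IGT implementation: copy_attr() and restore_engine_default() are hypothetical helpers, and the ".defaults" sub-directory name and the path layout are assumptions about how the per-engine sysfs files are arranged.

```c
#include <assert.h>
#include <stdio.h>

/* Copy the text in the file at src into the file at dst. 0 on success, -1 on error. */
int copy_attr(const char *src, const char *dst)
{
    char value[64];
    FILE *f;

    assert(src && dst);

    f = fopen(src, "r");
    if (!f)
        return -1;
    if (!fgets(value, sizeof(value), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    f = fopen(dst, "w");
    if (!f)
        return -1;
    if (fputs(value, f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/*
 * Reset one engine property to its default, assuming the default lives in
 * <engine_dir>/.defaults/<name> and the live value in <engine_dir>/<name>.
 */
int restore_engine_default(const char *engine_dir, const char *name)
{
    char src[512], dst[512];
    int n;

    n = snprintf(src, sizeof(src), "%s/.defaults/%s", engine_dir, name);
    if (n < 0 || (size_t)n >= sizeof(src))
        return -1;
    n = snprintf(dst, sizeof(dst), "%s/%s", engine_dir, name);
    if (n < 0 || (size_t)n >= sizeof(dst))
        return -1;

    return copy_attr(src, dst);
}
```

With something like this run for every engine and property at framework start-up, a value such as the 640 written by this test would be overwritten before the next test begins.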

John.

>
> Regards,
>
> Tvrtko
>
>>       }
>>         igt_subtest_group {