[PATCH i-g-t] tests/intel/gem_watchdog: Reduced timeouts for worst case scenario

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Mon Feb 19 09:53:57 UTC 2024


On 16/02/2024 01:33, John Harrison wrote:
> On 2/13/2024 01:34, Tvrtko Ursulin wrote:
>> On 12/02/2024 21:23, John.C.Harrison at Intel.com wrote:
>>> From: John Harrison <John.C.Harrison at Intel.com>
>>>
>>> The watchdog test reduces the watchdog timer from 20s to 1s and then
>>> uses a 5s timeout waiting for the watchdog to do its stuff. This works
>>> fine in general, but if an engine reset is required by a context that
>>> is actually dead for real then a pre-emption timeout must be factored
>>> in. For RCS/CCS engines, that timeout is 7.5 seconds by default. Thus,
>>> the test timeout expires first and the test fails.
>>>
>>> Normally, the system is not so dead when running this test as to
>>> require an engine reset. A simple pre-emption works fine for the
>>> spinner contexts that it uses. However, there is a hardware workaround
>>> coming which prevents context switches when both RCS and CCS are busy.
>>>
>>> So add an explicit override of the pre-emption timeout as well as the
>>> watchdog timeout. That will allow the test to keep working after the
>>> new w/a lands.
>>>
>>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>>> ---
>>>   tests/intel/gem_watchdog.c | 10 ++++++++++
>>>   1 file changed, 10 insertions(+)
>>>
>>> diff --git a/tests/intel/gem_watchdog.c b/tests/intel/gem_watchdog.c
>>> index 1e4c350214c0..c9dd0deb51aa 100644
>>> --- a/tests/intel/gem_watchdog.c
>>> +++ b/tests/intel/gem_watchdog.c
>>> @@ -577,6 +577,16 @@ igt_main
>>>             i915 = drm_reopen_driver(i915); /* Apply modparam. */
>>>           ctx = intel_ctx_create_all_physical(i915);
>>> +
>>> +        for_each_ctx_engine(i915, ctx, e) {
>>> +            /*
>>> +             * Context termination by watchdog may require an engine reset. That only
>>> +             * occurs after a pre-emption attempt has expired. For RCS/CCS engines,
>>> +             * the pre-emption timeout is longer than this test is wanting to wait.
>>> +             * So reduce that timeout in addition to the watchdog timeout itself.
>>> +             */
>>> +            gem_engine_property_printf(i915, e->name, "preempt_timeout_ms", "%d", 640);
>>> +        }
>>
>> Restore at test exit for subsequent tests to be in a known environment?
> IGT actually does the reverse. Part of the framework initialisation is 
> to forcibly reset all the sysfs parameters to the official defaults (as 
> exposed via the .default sysfs files). So in general, the tests don't 
> bother trying to preserve such values.

True, looks like I forgot about that.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

Regards,

Tvrtko

> 
> John.
> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>>       }
>>>         igt_subtest_group {
> 

