[igt-dev] [RFC] IGT GPU watchdog
Carlos Santa
carlos.santa at intel.com
Mon Apr 15 18:22:50 UTC 2019
Sharing this as an RFC to help expand test coverage for the GPU watchdog
and to get help debugging some of the issues I am seeing.
The latest patch series in the kernel: https://patchwork.kernel.org/patch/10866659/
Test Coverage:
1. A gem context is created with a long batch that runs to completion.
2. A gem context is created with a long batch that is canceled after
some time by the gpu watchdog timeout.
3. Two gem contexts are created; ctx2 executes and ctx1 is canceled after
some time by the gpu watchdog timeout.
4. The inverse of #3 above: ctx2 is canceled after some time by the gpu
watchdog timeout and ctx1 runs to completion.
Preemption handling:
1. Submit a long batch and, after half of its run time has elapsed,
submit a higher priority batch with half the duration. Verify that the
latter was executed.
2. Submit a low priority long batch without gpu watchdog, then
a higher priority batch with gpu watchdog, and verify whether the
higher priority batch was canceled before the low priority
one completed.
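
For the preemption cases above, it may help to rule out the driver's own
checks on a priority request before suspecting the test. The following is a
minimal sketch, assuming the i915 uAPI limits of -1023..1023 with a default
of 0, and assuming that raising a context above the default priority needs
CAP_SYS_NICE. The helper name check_ctx_priority() and the CTX_* macros are
hypothetical and exist only for illustration; they are not IGT or kernel
symbols, so please verify the values against your i915_drm.h.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumed mirrors of the I915_CONTEXT_*_PRIORITY limits; verify locally. */
#define CTX_MAX_USER_PRIORITY	1023
#define CTX_DEFAULT_PRIORITY	0
#define CTX_MIN_USER_PRIORITY	(-1023)

static_assert(CTX_MIN_USER_PRIORITY < CTX_DEFAULT_PRIORITY &&
	      CTX_DEFAULT_PRIORITY < CTX_MAX_USER_PRIORITY,
	      "default priority must sit inside the user range");

/*
 * Hypothetical helper: predict the errno-style result of a context
 * priority request. Returns 0 if the request looks acceptable.
 */
static int check_ctx_priority(long long prio, bool has_cap_sys_nice,
			      bool has_priority_scheduler)
{
	/* Without a priority-capable scheduler the request cannot work. */
	if (!has_priority_scheduler)
		return -ENODEV;

	/* Values outside the user range are rejected outright. */
	if (prio > CTX_MAX_USER_PRIORITY || prio < CTX_MIN_USER_PRIORITY)
		return -EINVAL;

	/* Assumption: going above the default needs CAP_SYS_NICE. */
	if (prio > CTX_DEFAULT_PRIORITY && !has_cap_sys_nice)
		return -EPERM;

	return 0;
}
```

If this model is right, a "high priority" context requested by an
unprivileged process would fail with -EPERM, while a low priority one
should be accepted, so running the test as root is worth checking first.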
Known issues:
1. The fence status EIO is not getting propagated in the kernel layer
after an engine reset using gpu watchdog; after each reset the fence
still returns -1.
2. The creation of a gem context with a low or high priority value
doesn't seem to work correctly. I need help with this to test preemption;
see the code below for reference.
3. TODO: the subtest "gpu-watchdog-long-batch-2-contexts" uses a dummy
sleep(6) for now but this needs to be changed. The contexts can't be
destroyed either until both threads are done executing, so the destroy
calls are commented out for now.
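
For known issue 1, it can help to make the test say explicitly which fence
status it saw. The following is a minimal sketch, assuming the sync_file
convention that a status of 0 means active, 1 means signaled, and a negative
value is an error code. The enum and both helpers are hypothetical names for
illustration and are not part of IGT. Note that -1 is numerically -EPERM on
Linux, so a bare -1 may also be a wrapper reporting a failed ioctl rather
than the fence's own error; that is only a guess.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* A bare -1 must never be mistaken for -EIO. */
static_assert(EIO != 1, "-1 would alias -EIO");

/* Hypothetical classification, not an IGT or kernel type. */
enum fence_state {
	FENCE_ACTIVE,		/* status == 0: not signaled yet */
	FENCE_SIGNALED,		/* status == 1: completed without error */
	FENCE_RESET_EIO,	/* status == -EIO: expected after a reset */
	FENCE_OTHER_ERROR,	/* any other negative value, e.g. -1 */
	FENCE_UNKNOWN		/* positive values other than 1 */
};

static enum fence_state classify_fence_status(int status)
{
	if (status == 0)
		return FENCE_ACTIVE;
	if (status == 1)
		return FENCE_SIGNALED;
	if (status == -EIO)
		return FENCE_RESET_EIO;
	if (status < 0)
		return FENCE_OTHER_ERROR;
	return FENCE_UNKNOWN;
}

/* Human-readable text for log messages in a test. */
static const char *describe_fence_status(int status)
{
	switch (classify_fence_status(status)) {
	case FENCE_ACTIVE:
		return "active";
	case FENCE_SIGNALED:
		return "signaled";
	case FENCE_RESET_EIO:
		return "error: -EIO";
	case FENCE_OTHER_ERROR:
		return "error: not -EIO";
	default:
		return "unknown";
	}
}
```

Logging describe_fence_status() next to the raw value after each reset
would show whether the -1 comes from the fence itself or from the call
used to query it.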
Carlos Santa (1):
tests/gem_watchdog: Initial set of tests for GPU watchdog
tests/Makefile.sources | 3 +
tests/i915/gem_watchdog.c | 439 ++++++++++++++++++++++++++++++++++++++++++++++
tests/meson.build | 1 +
3 files changed, 443 insertions(+)
create mode 100644 tests/i915/gem_watchdog.c
--
2.7.4