[PATCH v10 0/6] DRM scheduler kunit tests

Tvrtko Ursulin tvrtko.ursulin at igalia.com
Mon Mar 24 11:25:51 UTC 2025


On 24/03/2025 09:44, Philipp Stanner wrote:
> On Mon, 2025-03-24 at 09:26 +0000, Tvrtko Ursulin wrote:
>> There has repeatedly been quite a bit of apprehension when any change
>> to the DRM
>> scheduler is proposed, with two main reasons being code base is
>> considered
>> fragile, not well understood and not very well documented, and
>> secondly the lack
>> of systematic testing outside the vendor specific tests suites and/or
>> test
>> farms.
>>
>> This series is an attempt to dislodge this status quo by adding some
>> unit tests
>> using the kunit framework.
>>
>> General approach is that there is a mock "hardware" backend which can
>> be
>> controlled from tests, which in turn allows exercising various
>> scheduler code
>> paths.
>>
>> Only some simple basic tests get added in the series and hopefully it
>> is easy to
>> understand what tests are doing.
>>
>> An obligatory "screenshot" for reference:
>>
>> [14:09:05] ============ drm_sched_basic_tests (3 subtests)
>> ============
>> [14:09:06] [PASSED] drm_sched_basic_submit
>> [14:09:06] ================== drm_sched_basic_test
>> ===================
>> [14:09:06] [PASSED] A queue of jobs in a single entity
>> [14:09:06] [PASSED] A chain of dependent jobs across multiple
>> entities
>> [14:09:06] [PASSED] Multiple independent job queues
>> [14:09:06] [PASSED] Multiple inter-dependent job queues
>> [14:09:07] ============== [PASSED] drm_sched_basic_test
>> ===============
>> [14:09:07] [PASSED] drm_sched_basic_entity_cleanup
>> [14:09:07] ============== [PASSED] drm_sched_basic_tests
>> ==============
>> [14:09:07] ======== drm_sched_basic_timeout_tests (1 subtest)
>> =========
>> [14:09:08] [PASSED] drm_sched_basic_timeout
>> [14:09:08] ========== [PASSED] drm_sched_basic_timeout_tests
>> ==========
>> [14:09:08] ======= drm_sched_basic_priority_tests (2 subtests)
>> ========
>> [14:09:10] [PASSED] drm_sched_priorities
>> [14:09:10] [PASSED] drm_sched_change_priority
>> [14:09:10] ========= [PASSED] drm_sched_basic_priority_tests
>> ==========
>> [14:09:10] ====== drm_sched_basic_modify_sched_tests (1 subtest)
>> ======
>> [14:09:11] [PASSED] drm_sched_test_modify_sched
>> [14:09:11] ======= [PASSED] drm_sched_basic_modify_sched_tests
>> ========
>> [14:09:11] ======== drm_sched_basic_credits_tests (1 subtest)
>> =========
>> [14:09:12] [PASSED] drm_sched_test_credits
>> [14:09:12] ========== [PASSED] drm_sched_basic_credits_tests
>> ==========
>> [14:09:12]
>> ============================================================
>> [14:09:12] Testing complete. Ran 11 tests: passed: 11
>> [14:09:13] Elapsed time: 13.539s total, 0.001s configuring, 3.004s
>> building, 10.462s running
>>
>> v2:
>>   * Parameterize a bunch of similar tests.
>>   * Improve test commentary.
>>   * Rename TDR test to timeout. (Christian)
>>   * Improve quality and consistency of naming. (Philipp)
>>
>> RFC v2 -> series v1:
>>   * Rebased for drm_sched_init changes.
>>   * Fixed modular build.
>>   * Added some comments.
>>   * Filename renames. (Philipp)
>>
>> v2:
>>   * Dealt with a bunch of checkpatch warnings.
>>
>> v3:
>>   * Some mock API renames, kerneldoc grammar fixes and indentation
>> fixes.
>>
>> v4:
>>   * Fix use after free caused by relying on scheduler fence for
>> querying status.
>>   * Kerneldoc fixes.
>>
>> v5:
>>   * Cleanup in-flight jobs on scheduler shutdown.
>>   * Change hang_limit to 1.
>>
>> v6:
>>   * Use KUNIT_ASSERT_TRUE/FALSE.
>>   * Fixed patch titles.
>>   * Added credit_limit test.
>>   * Added CONFIG_DRM_SCHED_KUNIT_TEST_ASPIRATIONAL.
>>
>> v7:
>>   * v6 omitted to send the first patch by mistake.
>>
>> v8:
>>   * Removed CONFIG_DRM_SCHED_KUNIT_TEST_ASPIRATIONAL for now.
>>   * Added Christian's acks.
>>
>> v9:
>>   * Fixed a potential memory leak caused by a race condition on mock
>> scheduler
>>     shutdown. In order to reliably clean up everything, we have keep
>> track of
>>     jobs even past the signalling stage, all until either DRM sched
>> core managed
>>     to run the ->free_job() callback, or until mock scheduler teardown
>> from the
>>     test.
>>
>> v10:
>>   * Rebase for a merge conflict in Kconfig.
>>
>> Cc: Christian König <christian.koenig at amd.com>
>> Cc: Danilo Krummrich <dakr at kernel.org>
>> Cc: Matthew Brost <matthew.brost at intel.com>
>> Cc: Philipp Stanner <phasta at kernel.org>
>>
>> Tvrtko Ursulin (6):
>>    drm: Move some options to separate new Kconfig
>>    drm/sched: Add scheduler unit testing infrastructure and some basic
>>      tests
>>    drm/sched: Add a simple timeout test
>>    drm/sched: Add basic priority tests
>>    drm/sched: Add a basic test for modifying entities scheduler list
>>    drm/sched: Add a basic test for checking credit limit
>>
>>   drivers/gpu/drm/Kconfig                       | 110 +---
>>   drivers/gpu/drm/Kconfig.debug                 | 116 +++++
>>   drivers/gpu/drm/scheduler/.kunitconfig        |  12 +
>>   drivers/gpu/drm/scheduler/Makefile            |   2 +
>>   drivers/gpu/drm/scheduler/tests/Makefile      |   7 +
>>   .../gpu/drm/scheduler/tests/mock_scheduler.c  | 359 +++++++++++++
>>   drivers/gpu/drm/scheduler/tests/sched_tests.h | 226 +++++++++
>>   drivers/gpu/drm/scheduler/tests/tests_basic.c | 476
>> ++++++++++++++++++
>>   8 files changed, 1203 insertions(+), 105 deletions(-)
>>   create mode 100644 drivers/gpu/drm/Kconfig.debug
>>   create mode 100644 drivers/gpu/drm/scheduler/.kunitconfig
>>   create mode 100644 drivers/gpu/drm/scheduler/tests/Makefile
>>   create mode 100644 drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>>   create mode 100644 drivers/gpu/drm/scheduler/tests/sched_tests.h
>>   create mode 100644 drivers/gpu/drm/scheduler/tests/tests_basic.c
>>
> 
> 
> Applied to drm-misc-next
> 
> Thanks for your endurance!

Thank you for the much needed scrutiny and testing!

Regards,

Tvrtko



More information about the dri-devel mailing list