[PATCH 0/2] Delay disabling scheduling on a context
Alan Previn
alan.previn.teres.alexis at intel.com
Wed Sep 21 09:39:07 UTC 2022
This is a revival of the same series posted by Matthew Brost
back in October 2021 (https://patchwork.freedesktop.org/series/96167/).
Additional real-world measured metrics are included this time around
that have proven the effectiveness of this series.
This series adds a delay before disabling scheduling of the guc-context
when a context has become idle. The 2nd patch should explain it quite well.
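For readers without the 2nd patch at hand, the idea can be sketched as a
small userspace model. This is only an illustration under stated
assumptions, not the driver code: the struct and function names
(guc_model, ctx_model, ctx_idle, ctx_new_request, ctx_tick) are
hypothetical, and the bypass condition assumes the "remaining" balance
mentioned in the v2 notes refers to free guc-ids. Only the 34 ms default
and the existence of the delay and threshold knobs come from this letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Default delay taken from the v2 changelog entry. */
#define SCHED_DISABLE_DELAY_MS 34

/* Hypothetical model of the two knobs described in this series. */
struct guc_model {
	unsigned int delay_ms;      /* 0 means "never delay" (assumption) */
	unsigned int ids_threshold; /* bypass the delay below this many free ids */
	unsigned int free_ids;      /* assumed meaning of the "remaining" balance */
};

/* Hypothetical per-context state. */
struct ctx_model {
	bool sched_enabled;         /* scheduling currently enabled */
	bool disable_pending;       /* a delayed disable is queued */
	unsigned long deadline_ms;  /* when the queued disable fires */
};

/*
 * Context became idle. Returns true if scheduling was disabled
 * immediately, false if the disable was deferred or not needed.
 */
bool ctx_idle(const struct guc_model *guc, struct ctx_model *ctx,
	      unsigned long now_ms)
{
	assert(guc != NULL && ctx != NULL);

	if (!ctx->sched_enabled)
		return false;

	/* Under id pressure (or with no delay set), disable right away. */
	if (guc->delay_ms == 0 || guc->free_ids < guc->ids_threshold) {
		ctx->disable_pending = false;
		ctx->sched_enabled = false;
		return true;
	}

	/* Otherwise arm the delayed disable. */
	ctx->disable_pending = true;
	ctx->deadline_ms = now_ms + guc->delay_ms;
	return false;
}

/*
 * New request arrived. Returns true if a (costly) schedule-enable was
 * needed, false if the context was still enabled.
 */
bool ctx_new_request(struct ctx_model *ctx)
{
	assert(ctx != NULL);

	if (ctx->disable_pending) {
		/* Cancel the queued disable: no disable/enable round trip. */
		ctx->disable_pending = false;
		return false;
	}
	if (!ctx->sched_enabled) {
		ctx->sched_enabled = true;
		return true;
	}
	return false;
}

/* Time advanced: run the delayed disable if its deadline has passed. */
void ctx_tick(struct ctx_model *ctx, unsigned long now_ms)
{
	assert(ctx != NULL);

	if (ctx->disable_pending && now_ms >= ctx->deadline_ms) {
		ctx->disable_pending = false;
		ctx->sched_enabled = false;
	}
}
```

In this model, a context that goes idle and receives a new request
within the delay window never pays for a disable followed by a
re-enable, which is the hysteresis the series is after. When free ids
are scarce the delay is skipped so the id can be released sooner.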
This is the 9th rev of this series (counting from the first
version by Matt). Changes from prior revs:
v8: - refactoring of reset prep changes for cleaning out
contexts with pending delay-disable-worker not yet
executed.
- add comments based on review (Tvrtko, Daniele)
v7: - This series was merged and then reverted after invalid
CI runs unblocked and uncovered a deadlock. Fixed that
deadlock.
- Added a fix for a race condition between a new incoming
request and the delay-disable-schedule worker.
- Added a fix for GT reset where we move all contexts that
are pending delayed disable-sched directly into the
pending-disable state after cancelling the worker, despite
not having sent the G2H, since this is in preparation for a
reset and a flush of outstanding expected G2Hs would be
dropped anyway.
v6: - More cosmetics on comments for threshold and delay knobs.
(John Harrison).
v5: - Fixed cosmetic issues with the commit message and comments.
- Moved "SCHED_DISABLE_DELAY_MS" to the sole location used.
- Removed the tracing of intel_context_closed.
- Added the check to intel_guc_submission_is_used in the
debugfs that gets the current guc-id-threshold to match
the other debugfs functions added in this series.
- Changed __guc_get_sched_disable_gucid_threshold_default
to a macro.
- Added s-o-b to the first patch as well.
- (All above from John Harrison)
v4: Fix build error.
v3: Differentiate and appropriately name helper functions for getting
the 'default threshold of num-guc-ids' vs the 'max threshold of
num-guc-ids' for bypassing sched-disable and use the correct one
for the debugfs validation (John Harrison).
v2: Changed the default of the schedule-disable delay to 34 milliseconds
and added debugfs to control this timing knob. Also added a debugfs
to control the bypass for not delaying the schedule-disable if
we are under pressure with a very low balance of remaining
Alan Previn (1):
HAX wip debugging + messages for igt analysis
Matthew Brost (1):
drm/i915/guc: Delay disabling guc_id scheduling for better hysteresis
drivers/gpu/drm/i915/gem/i915_gem_context.c | 2 +-
drivers/gpu/drm/i915/gt/intel_context.h | 8 +
drivers/gpu/drm/i915/gt/intel_context_types.h | 7 +
drivers/gpu/drm/i915/gt/uc/intel_guc.h | 28 ++-
.../gpu/drm/i915/gt/uc/intel_guc_debugfs.c | 61 +++++
.../gpu/drm/i915/gt/uc/intel_guc_submission.c | 216 +++++++++++++++---
drivers/gpu/drm/i915/i915_selftest.h | 2 +
7 files changed, 296 insertions(+), 28 deletions(-)
base-commit: 3bde74f15d452bf788ecab8933ee802b2ee9e673
--
2.25.1