[PATCH v4 0/4] Improve on resume speed with VT-d enabled

Thomas Hellström thomas.hellstrom at linux.intel.com
Mon Mar 21 20:11:39 UTC 2022


When DMAR / VT-d is enabled, the display engine uses overfetching, presumably
to deal with the increased latency. To avoid referencing display engine
errors and DMAR faults, as a workaround the GGTT is populated with scatch
PTEs when VT-d is enabled. However starting with gen10, Write-combined
writing of scratch PTES is no longer possible and as a result, populating
the full GGTT with scratch PTEs like on resume becomes very slow as
uncached access is needed.

Therefore, replace filling the GGTT entirely with scratch pages with only
filling surrounding scanout vma with guard pages. This eliminates the 100+ms
delay in resume where we have to repopulate the GGTT with scratch.

While 100+ms might appear like a short time it's 10% to 20% of total resume
time and important in some applications.

Additional considerations:

Since GPUs where VT-d might be present should have at least 2GiB worth of
GGTT space, the extra guard pages should not really have a significant
impact on GGTT contention.

Neither should there be a problem with not populating GGTT with scratch
PTEs on unbind since that's not done when VT-d is not enabled either.

Finally, discrete GPUs should ideally not overfetch even with VT-d enabled,
but removing the workaround for discrete GPUs needs thorough testing and
will be done, if needed, as a follow up.


Patch 1 introduces accessors for vma.node.start and vma.node.size. While
this patch is the most invasive, the end result is actually something
we might want event without the introduced guard pages.

Patch 2 is intended to be squashed with patch 1 after verification that
wrapping is indeed correct.

Patch 3 introduces the concept of guard pages to i915_vma and wraps the
needed arithmetic in the accessors.

Patch 4 uses the guard pages to replace the old VT-d workaround.

Chris Wilson (3):
  drm/i915: Wrap all access to i915_vma.node.start|size
  drm/i915: Introduce guard pages to i915_vma
  drm/i915: Refine VT-d scanout workaround

Thomas Hellström (1):
  drm/i915: Wrap more accesses to i915.vma.node.start|size

 drivers/gpu/drm/i915/display/intel_fbdev.c    |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_domain.c    | 13 ++++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 35 ++++++-----
 drivers/gpu/drm/i915/gem/i915_gem_mman.c      |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_tiling.c    |  4 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   | 18 +++---
 .../i915/gem/selftests/i915_gem_client_blt.c  | 15 ++---
 .../drm/i915/gem/selftests/i915_gem_context.c | 15 +++--
 .../drm/i915/gem/selftests/i915_gem_mman.c    |  2 +-
 .../drm/i915/gem/selftests/igt_gem_utils.c    |  7 ++-
 drivers/gpu/drm/i915/gt/gen7_renderclear.c    |  2 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c          | 39 ++++--------
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c  |  3 +-
 drivers/gpu/drm/i915/gt/intel_renderstate.c   |  2 +-
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c  |  8 +--
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 18 +++---
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  | 15 ++---
 drivers/gpu/drm/i915/gt/selftest_lrc.c        | 16 ++---
 .../drm/i915/gt/selftest_ring_submission.c    |  2 +-
 drivers/gpu/drm/i915/gt/selftest_rps.c        | 12 ++--
 .../gpu/drm/i915/gt/selftest_workarounds.c    |  8 +--
 drivers/gpu/drm/i915/i915_cmd_parser.c        |  4 +-
 drivers/gpu/drm/i915/i915_debugfs.c           |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h           |  3 +-
 drivers/gpu/drm/i915/i915_perf.c              |  2 +-
 drivers/gpu/drm/i915/i915_trace.h             |  8 +--
 drivers/gpu/drm/i915/i915_vma.c               | 59 +++++++++++++------
 drivers/gpu/drm/i915/i915_vma.h               | 52 +++++++++++++++-
 drivers/gpu/drm/i915/i915_vma_resource.c      |  4 +-
 drivers/gpu/drm/i915/i915_vma_resource.h      | 17 ++++--
 drivers/gpu/drm/i915/i915_vma_types.h         |  3 +-
 drivers/gpu/drm/i915/selftests/i915_request.c | 20 +++----
 drivers/gpu/drm/i915/selftests/i915_vma.c     |  8 +--
 drivers/gpu/drm/i915/selftests/igt_spinner.c  |  8 +--
 36 files changed, 259 insertions(+), 173 deletions(-)

-- 
2.34.1



More information about the Intel-gfx-trybot mailing list