[Intel-gfx] [PATCH v6 00/19] 48-bit PPGTT

Michel Thierry michel.thierry at intel.com
Wed Jul 29 09:23:44 PDT 2015


This clean-up version delays the 48-bit work to later patches and includes
more review comments from Akash and Chris. The first 5 patches prepare the
dynamic page allocation code to handle independent pdps, but no specific
code for 48-bit mode is added before the 5th patch.

In order expand the GPU address space, a 4th level translation is added,
the Page Map Level 4 (PML4). This PML4 has 512 PML4 Entries (PML4E),
PML4[0-511], each pointing to a PDP. All the existing "dynamic alloc
ppgtt" functions are used, only adding the 4th level changes. I also
updated some remaining variables that were 32b only.

There are 2 hardware workarounds needed to allow correct operation with
48b addresses (Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset).
A flag (EXEC_OBJECT_SUPPORTS_48B_ADDRESS) will indicate if a given object can
be allocated outside the first 4 PDPs; if not, the end range is forced to 4GB.
Also, more objects now use the DRM_MM_CREATE_TOP flag. To maintain
compatibility, in libdrm I added a new drm_intel_bo_emit_reloc_48bit function
that will flag these objects, while the existing drm_intel_bo_emit_reloc
clears it.

Finally, this feature is only available in BDW and Gen9, requires LRC
submission mode (execlists) and it can be detected by i915.enable_ppgtt=3.

Also note that this expanded address space is only available for full
PPGTT, aliasing PPGTT and Global GTT remain 32-bit.

I'll resend the userland patches (libdrm/mesa) in a different patchset, there
haven't been changes on them, but they require a rebase. I will also expand the
ppgtt igt test per Chris suggestions.

Michel Thierry (19):
  drm/i915: Remove unnecessary gen8_clamp_pd
  drm/i915/gen8: Make pdp allocation more dynamic
  drm/i915/gen8: Abstract PDP usage
  drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
  drm/i915/gen8: Add dynamic page trace events
  drm/i915/gen8: Add PML4 structure
  drm/i915/gen8: implement alloc/free for 4lvl
  drm/i915/gen8: Add 4 level switching infrastructure and lrc support
  drm/i915/gen8: Pass sg_iter through pte inserts
  drm/i915/gen8: Add 4 level support in insert_entries and clear_range
  drm/i915/gen8: Initialize PDPs and PML4
  drm/i915: Expand error state's address width to 64b
  drm/i915/gen8: Add ppgtt info and debug_dump
  drm/i915: object size needs to be u64
  drm/i915: batch_obj vm offset must be u64
  drm/i915/userptr: Kill user_size limit check
  drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  drm/i915/gen8: Flip the 48b switch
  drm/i915: Save some page table setup on repeated binds

 drivers/gpu/drm/i915/i915_debugfs.c        |  18 +-
 drivers/gpu/drm/i915/i915_drv.h            |  11 +-
 drivers/gpu/drm/i915/i915_gem.c            |  30 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  13 +
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 665 ++++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_gem_gtt.h        |  64 ++-
 drivers/gpu/drm/i915/i915_gem_userptr.c    |   4 -
 drivers/gpu/drm/i915/i915_gpu_error.c      |  24 +-
 drivers/gpu/drm/i915/i915_params.c         |   2 +-
 drivers/gpu/drm/i915/i915_reg.h            |   1 +
 drivers/gpu/drm/i915/i915_trace.h          |  32 +-
 drivers/gpu/drm/i915/intel_lrc.c           |  60 ++-
 include/uapi/drm/i915_drm.h                |   3 +-
 13 files changed, 747 insertions(+), 180 deletions(-)

-- 
2.4.5



More information about the Intel-gfx mailing list