[Intel-xe] [PATCH 00/26] Separate GT and tile
Das, Nirmoy
nirmoy.das at linux.intel.com
Tue May 16 14:18:45 UTC 2023
On 5/11/2023 5:46 AM, Matt Roper wrote:
> A 'tile' is not the same thing as a 'GT.' For historical reasons, i915
> attempted to use a single 'struct intel_gt' to represent both concepts,
> although this design hasn't worked out terribly well. For Xe we have
> the opportunity to design the driver in a way that more accurately
> reflects the real hardware behavior.
>
> Different vendors use the term "tile" a bit differently, but in the
> Intel world, a 'tile' is pretty close to what most people would think of
> as being a complete GPU. When multiple GPUs are placed behind a single
> PCI device, that's what we refer to as a "multi-tile device." In such
> cases, pretty much all hardware is replicated per-tile, although certain
> responsibilities like PCI communication, reporting of interrupts to the
> OS, etc. are handled solely by the "root tile." A multi-tile platform
> takes care of tying the tiles together in a way such that interrupt
> notifications from remote tiles are forwarded to the root tile, the
> per-tile vram is combined into a single address space, etc.
>
> In contrast, a "GT" (which officially stands for "Graphics Technology")
> is the subset of a GPU/tile that is responsible for implementing
> graphics and/or media operations. The GT is where a lot of the driver
> implementation happens since it's where the hardware engines, the
> execution units, and the GuC all reside.
>
> Historically most Intel devices were single-tile devices that contained
> a single GT. PVC is currently the only released Intel platform built on
> a multi-tile design (i.e., multiple GPUs behind a single PCI device);
> each PVC tile only has a single GT. In contrast, platforms like MTL
> that have separate chips for render and media IP are still only a single
> logical GPU, but the graphics and media IP blocks are each exposed
> as a separate GT within that single GPU. This is important from
> a software perspective because multi-GT platforms like MTL only
> replicate a subset of the GPU hardware and behave differently than
> multi-tile platforms like PVC where nearly everything is replicated.
>
> This series separates tiles from GTs in a manner that more closely
> matches the hardware behavior. We now consider a PCI device (xe_device)
> to contain one or more tiles (struct xe_tile). Each tile will contain
> one or two GTs (struct xe_gt). Although we don't have any platforms yet
> that are multi-tile *and* contain more than one GT per tile, that may
> change in the future. This driver redesign splits functionality as
> follows:
>
> Per-tile functionality (shared by all GTs within the tile):
> - Complete 4MB MMIO space (containing SGunit/SoC registers, GT
> registers, display registers, etc.)
> - Global GTT
> - VRAM (if discrete)
> - Interrupt flows
> - Migration context
> - kernel batchbuffer pool
> - Primary GT
> - Media GT (if media version >= 13)
>
> Per-GT functionality:
> - GuC
> - Hardware engines
> - Programmable hardware units (subslices, EUs)
> - GSI subset of registers (multiple copies of these registers reside
> within the complete MMIO space provided by the tile, but at different
> offsets --- 0 for render, 0x380000 for media)
> - Multicast register steering
> - TLBs to cache page table translations
> - Reset capability
> - Low-level power management (e.g., C6)
> - Clock frequency
> - MOCS and PAT programming
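A minimal sketch of the containment described above (device -> tiles -> GTs), to make the split concrete. All sketch_* names and fields are hypothetical simplifications for illustration, not the definitions introduced by this series; only the 0x380000 media GSI offset is taken from the cover letter.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_TILES        2          /* arbitrary limit for this sketch */
#define SKETCH_MEDIA_GSI_OFFSET 0x380000u  /* media GSI offset quoted above */

struct sketch_tile;

/* Per-GT state: GuC, engines, TLBs, etc. would live here. */
struct sketch_gt {
	struct sketch_tile *tile;  /* backpointer to the owning tile */
	uint32_t gsi_offset;       /* start of this GT's GSI registers */
	int present;               /* 0 if the platform lacks this GT */
};

/* Per-tile state: MMIO space, GGTT, VRAM, etc. would live here. */
struct sketch_tile {
	uint8_t id;
	struct sketch_gt primary_gt;
	struct sketch_gt media_gt;  /* only present with standalone media */
};

/* One PCI device containing one or more tiles. */
struct sketch_device {
	uint8_t tile_count;
	struct sketch_tile tiles[SKETCH_MAX_TILES];
};

/* Populate the hierarchy; every tile gets a primary GT. */
static inline void sketch_device_init(struct sketch_device *dev,
				      uint8_t tile_count, int has_media_gt)
{
	memset(dev, 0, sizeof(*dev));
	if (tile_count > SKETCH_MAX_TILES)
		tile_count = SKETCH_MAX_TILES;
	dev->tile_count = tile_count;

	for (uint8_t i = 0; i < tile_count; i++) {
		struct sketch_tile *tile = &dev->tiles[i];

		tile->id = i;
		tile->primary_gt.tile = tile;
		tile->primary_gt.gsi_offset = 0;
		tile->primary_gt.present = 1;

		if (has_media_gt) {
			tile->media_gt.tile = tile;
			tile->media_gt.gsi_offset = SKETCH_MEDIA_GSI_OFFSET;
			tile->media_gt.present = 1;
		}
	}
}

/* Count every GT that exists across all tiles of the device. */
static inline unsigned int sketch_device_gt_count(const struct sketch_device *dev)
{
	unsigned int count = 0;

	for (uint8_t i = 0; i < dev->tile_count; i++) {
		count += dev->tiles[i].primary_gt.present ? 1 : 0;
		count += dev->tiles[i].media_gt.present ? 1 : 0;
	}
	return count;
}
```

With this model, a PVC-like layout (two tiles, no media GT) and an MTL-like layout (one tile, primary plus media GT) both report two GTs, but the GTs hang off different numbers of tiles, which is the distinction the series is drawing.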
>
> At the moment I've left USM / pagefault handling at the GT level,
> although I'm not familiar enough with that specific feature to know
> whether it's truly correct or not.
>
> The first patch in this series temporarily drops MTL media GT support.
> The driver doesn't load properly on MTL today, largely due to the
> mishandling of GT vs tile; dropping support completely allows us to more
> easily make the required driver redesign. The media GT is
> re-enabled (properly this time) near the end of the series and this
> allows the driver to load successfully without error on MTL for the
> first time. There are still issues when submitting workloads to MTL
> after driver load (e.g., CAT errors), but those seem to be separate
> platform-specific issues unrelated to the GT/tile work in this series
> that will need to be debugged and fixed separately.
>
>
> This series leaves a few open questions and FIXMEs:
> - Unlike i915, the Xe driver has chosen to expose GTs to userspace
> rather than keeping them a hidden implementation detail. With the
> separation of xe_tile and xe_gt, we need to decide whether we also
> want to expose tiles (in addition to GTs), whether we want to _only_
> expose tiles (and keep the primary vs media GT separation a hidden
> internal detail), or something else.
> - How should GTs be numbered? Today it's straightforward --- PVC
> assigns GT IDs 0 and 1 to the primary GT of each tile. MTL assigns
> GT IDs 0 and 1 to the primary and media GTs of its sole tile. But if
> we have a platform in the future that has multiple tiles _and_
> multiple GTs per tile, how should we handle the numbering in that
> case?
> - Xe (mis)design used xe_gt as the target of all MMIO operations (i.e.,
> xe_mmio_*()). This really doesn't make sense, especially since
> there are a lot of MMIO accesses that are completely unrelated to GT
> (e.g., sgunit registers, display registers, etc.). i915 used
> 'intel_uncore' as the MMIO target, although that wasn't really an
> accurate reflection of the hardware either. What we really want is
> something that combines the MMIO register space (stored in the tile)
> with the GSI offset (stored in the GT). My current plan is to
> introduce an "xe_mmio_view" (name may change) in a future series that
> will serve as a target for register operations. There will be
> sensible APIs to obtain an xe_mmio_view appropriate to the type of
> register access being performed (and that will also be able to do
> some range sanity checking in debug drivers to help catch misuse).
> That's a somewhat large/invasive change, so I'm saving that for a
> follow-up series after this one is completed.
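To illustrate the last point, here is a rough sketch of what a register-access target combining the tile's MMIO space with a GT's GSI offset could look like. Everything here is an assumption made for illustration: the sketch_* names, the struct layout, and the 0x40000 length of the GSI range are not taken from this series, and the eventual "xe_mmio_view" may look quite different. Only the 4MB space size and the 0x380000 media offset come from the cover letter.

```c
#include <stdint.h>

#define SKETCH_MMIO_SIZE        (4u * 1024u * 1024u)  /* 4MB per-tile space */
#define SKETCH_GSI_LENGTH       0x40000u   /* assumed size of the GSI range */
#define SKETCH_MEDIA_GSI_OFFSET 0x380000u  /* media offset from cover letter */

/* Hypothetical view: tile register space plus a per-GT adjustment. */
struct sketch_mmio_view {
	volatile uint32_t *regs;  /* base of the owning tile's MMIO space */
	uint32_t adj_limit;       /* addresses below this are GSI registers */
	uint32_t adj_offset;      /* added to GSI register addresses */
	uint32_t size;            /* size of the register space in bytes */
};

/* Translate a register address into its offset within the tile space. */
static inline uint32_t sketch_view_addr(const struct sketch_mmio_view *v,
					uint32_t reg)
{
	return reg < v->adj_limit ? reg + v->adj_offset : reg;
}

/* Range sanity check: 32-bit aligned and inside the register space. */
static inline int sketch_view_addr_ok(const struct sketch_mmio_view *v,
				      uint32_t addr)
{
	return (addr % 4u) == 0 && v->size >= 4u && addr <= v->size - 4u;
}

/* Read a register through the view; returns 0 on success, -1 on misuse. */
static inline int sketch_view_read32(const struct sketch_mmio_view *v,
				     uint32_t reg, uint32_t *val)
{
	uint32_t addr = sketch_view_addr(v, reg);

	if (!sketch_view_addr_ok(v, addr))
		return -1;
	*val = v->regs[addr / 4u];
	return 0;
}

/* Write a register through the view; returns 0 on success, -1 on misuse. */
static inline int sketch_view_write32(const struct sketch_mmio_view *v,
				      uint32_t reg, uint32_t val)
{
	uint32_t addr = sketch_view_addr(v, reg);

	if (!sketch_view_addr_ok(v, addr))
		return -1;
	v->regs[addr / 4u] = val;
	return 0;
}
```

In this sketch a primary-GT view would use adj_limit = 0 and adj_offset = 0, while a media-GT view would use adj_limit = SKETCH_GSI_LENGTH and adj_offset = SKETCH_MEDIA_GSI_OFFSET, so the same register definition resolves to the right copy depending on which view it is accessed through.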
>
>
> Cc: Matthew Brost <matthew.brost at intel.com>
> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> Cc: Michael J. Ruhl <michael.j.ruhl at intel.com>
> Cc: Nirmoy Das <nirmoy.das at intel.com>
Tested this on my MTL C0. You might be working on the next revision, but feel
free to add Tested-by: Nirmoy Das <nirmoy.das at intel.com> for the series.
>
>
> Matt Roper (26):
> drm/xe/mtl: Disable media GT
> drm/xe: Introduce xe_tile
> drm/xe: Add backpointer from gt to tile
> drm/xe: Add for_each_tile iterator
> drm/xe: Move register MMIO into xe_tile
> drm/xe: Move VRAM from GT to tile
> drm/xe: Memory allocations are tile-based, not GT-based
> drm/xe: Move migration from GT to tile
> drm/xe: Clarify 'gt' retrieval for primary tile
> drm/xe: Drop vram_id
> drm/xe: Drop extra_gts[] declarations and XE_GT_TYPE_REMOTE
> drm/xe: Allocate GT dynamically
> drm/xe: Add media GT to tile
> drm/xe: Move display IRQ postinstall out of GT function
> drm/xe: Interrupts are delivered per-tile, not per-GT
> drm/xe/irq: Handle ASLE backlight interrupts at same time as display
> drm/xe/irq: Actually call xe_irq_postinstall()
> drm/xe/irq: Ensure primary GuC won't clobber media GuC's interrupt mask
> drm/xe/irq: Untangle postinstall functions
> drm/xe: Replace xe_gt_irq_postinstall with xe_irq_enable_hwe
> drm/xe: Invalidate TLB on all affected GTs during GGTT updates
> drm/xe/tlb: Obtain forcewake when doing GGTT TLB invalidations
> drm/xe: Allow GT looping and lookup on standalone media
> drm/xe: Update query uapi to support standalone media
> drm/xe: Reinstate media GT support
> drm/xe: Clarify source of GT log messages
>
> drivers/gpu/drm/i915/display/intel_dsb.c | 5 +-
> drivers/gpu/drm/i915/display/intel_fbc.c | 3 +-
> drivers/gpu/drm/i915/display/intel_fbdev.c | 7 +-
> drivers/gpu/drm/xe/Makefile | 1 +
> .../drm/xe/compat-i915-headers/intel_uncore.h | 2 +-
> drivers/gpu/drm/xe/display/ext/i915_irq.c | 2 +-
> drivers/gpu/drm/xe/display/xe_fb_pin.c | 13 +-
> drivers/gpu/drm/xe/display/xe_plane_initial.c | 8 +-
> drivers/gpu/drm/xe/regs/xe_gt_regs.h | 8 +
> drivers/gpu/drm/xe/tests/xe_bo.c | 8 +-
> drivers/gpu/drm/xe/tests/xe_migrate.c | 15 +-
> drivers/gpu/drm/xe/xe_bb.c | 5 +-
> drivers/gpu/drm/xe/xe_bo.c | 104 ++---
> drivers/gpu/drm/xe/xe_bo.h | 20 +-
> drivers/gpu/drm/xe/xe_bo_evict.c | 22 +-
> drivers/gpu/drm/xe/xe_bo_types.h | 4 +-
> drivers/gpu/drm/xe/xe_device.c | 12 +-
> drivers/gpu/drm/xe/xe_device.h | 49 ++-
> drivers/gpu/drm/xe/xe_device_types.h | 107 ++++-
> drivers/gpu/drm/xe/xe_engine.c | 2 +-
> drivers/gpu/drm/xe/xe_ggtt.c | 45 +-
> drivers/gpu/drm/xe/xe_ggtt.h | 6 +-
> drivers/gpu/drm/xe/xe_ggtt_types.h | 2 +-
> drivers/gpu/drm/xe/xe_gt.c | 191 ++-------
> drivers/gpu/drm/xe/xe_gt.h | 8 +-
> drivers/gpu/drm/xe/xe_gt_debugfs.c | 8 +-
> drivers/gpu/drm/xe/xe_gt_mcr.c | 2 +-
> drivers/gpu/drm/xe/xe_gt_pagefault.c | 16 +-
> drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 4 +-
> drivers/gpu/drm/xe/xe_gt_types.h | 87 ++--
> drivers/gpu/drm/xe/xe_guc.c | 11 +-
> drivers/gpu/drm/xe/xe_guc_ads.c | 5 +-
> drivers/gpu/drm/xe/xe_guc_ct.c | 5 +-
> drivers/gpu/drm/xe/xe_guc_hwconfig.c | 5 +-
> drivers/gpu/drm/xe/xe_guc_log.c | 6 +-
> drivers/gpu/drm/xe/xe_guc_pc.c | 5 +-
> drivers/gpu/drm/xe/xe_hw_engine.c | 6 +-
> drivers/gpu/drm/xe/xe_irq.c | 393 +++++++++---------
> drivers/gpu/drm/xe/xe_irq.h | 3 +-
> drivers/gpu/drm/xe/xe_lrc.c | 13 +-
> drivers/gpu/drm/xe/xe_lrc_types.h | 4 +-
> drivers/gpu/drm/xe/xe_migrate.c | 76 ++--
> drivers/gpu/drm/xe/xe_migrate.h | 9 +-
> drivers/gpu/drm/xe/xe_mmio.c | 92 ++--
> drivers/gpu/drm/xe/xe_mmio.h | 21 +-
> drivers/gpu/drm/xe/xe_mocs.c | 14 +-
> drivers/gpu/drm/xe/xe_pci.c | 92 ++--
> drivers/gpu/drm/xe/xe_pt.c | 150 ++++---
> drivers/gpu/drm/xe/xe_pt.h | 14 +-
> drivers/gpu/drm/xe/xe_query.c | 32 +-
> drivers/gpu/drm/xe/xe_res_cursor.h | 2 +-
> drivers/gpu/drm/xe/xe_sa.c | 13 +-
> drivers/gpu/drm/xe/xe_sa.h | 4 +-
> drivers/gpu/drm/xe/xe_tile.c | 89 ++++
> drivers/gpu/drm/xe/xe_tile.h | 16 +
> drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c | 4 +-
> drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 16 +-
> drivers/gpu/drm/xe/xe_ttm_vram_mgr.h | 4 +-
> drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h | 6 +-
> drivers/gpu/drm/xe/xe_uc_fw.c | 5 +-
> drivers/gpu/drm/xe/xe_vm.c | 156 ++++---
> drivers/gpu/drm/xe/xe_vm.h | 2 +-
> drivers/gpu/drm/xe/xe_vm_types.h | 22 +-
> include/uapi/drm/xe_drm.h | 4 +-
> 64 files changed, 1108 insertions(+), 957 deletions(-)
> create mode 100644 drivers/gpu/drm/xe/xe_tile.c
> create mode 100644 drivers/gpu/drm/xe/xe_tile.h
>