[Intel-xe] [PATCH 00/26] Separate GT and tile
Thomas Hellström
thomas.hellstrom at linux.intel.com
Mon May 15 13:08:24 UTC 2023
Hi, Matt,
On Wed, 2023-05-10 at 20:46 -0700, Matt Roper wrote:
> A 'tile' is not the same thing as a 'GT.' For historical reasons, i915
> attempted to use a single 'struct intel_gt' to represent both concepts,
> although this design hasn't worked out terribly well. For Xe we have
> the opportunity to design the driver in a way that more accurately
> reflects the real hardware behavior.
>
> Different vendors use the term "tile" a bit differently, but in the
> Intel world, a 'tile' is pretty close to what most people would think
> of as being a complete GPU. When multiple GPUs are placed behind a
> single PCI device, that's what we refer to as a "multi-tile device."
> In such cases, pretty much all hardware is replicated per-tile,
> although certain responsibilities like PCI communication, reporting of
> interrupts to the OS, etc. are handled solely by the "root tile." A
> multi-tile platform takes care of tying the tiles together in a way
> such that interrupt notifications from remote tiles are forwarded to
> the root tile, the per-tile vram is combined into a single address
> space, etc.
>
> In contrast, a "GT" (which officially stands for "Graphics Technology")
> is the subset of a GPU/tile that is responsible for implementing
> graphics and/or media operations. The GT is where a lot of the driver
> implementation happens since it's where the hardware engines, the
> execution units, and the GuC all reside.
>
> Historically most Intel devices were single-tile devices that contained
> a single GT. PVC is currently the only released Intel platform built on
> a multi-tile design (i.e., multiple GPUs behind a single PCI device);
> each PVC tile only has a single GT. In contrast, platforms like MTL
> that have separate chips for render and media IP are still only a
> single logical GPU, but the graphics and media IP blocks are each
> exposed as a separate GT within that single GPU. This is important
> from a software perspective because multi-GT platforms like MTL only
> replicate a subset of the GPU hardware and behave differently than
> multi-tile platforms like PVC where nearly everything is replicated.
>
> This series separates tiles from GTs in a manner that more closely
> matches the hardware behavior. We now consider a PCI device (xe_device)
> to contain one or more tiles (struct xe_tile). Each tile will contain
> one or two GTs (struct xe_gt). Although we don't have any platforms yet
> that are multi-tile *and* contain more than one GT per tile, that may
> change in the future. This driver redesign splits functionality as
> follows:
>
> Per-tile functionality (shared by all GTs within the tile):
> - Complete 4MB MMIO space (containing SGunit/SoC registers, GT
>   registers, display registers, etc.)
> - Global GTT
> - VRAM (if discrete)
> - Interrupt flows
> - Migration context
> - kernel batchbuffer pool
> - Primary GT
> - Media GT (if media version >= 13)
>
> Per-GT functionality:
> - GuC
> - Hardware engines
> - Programmable hardware units (subslices, EUs)
> - GSI subset of registers (multiple copies of these registers reside
>   within the complete MMIO space provided by the tile, but at different
>   offsets --- 0 for render, 0x380000 for media)
> - Multicast register steering
> - TLBs to cache page table translations
> - Reset capability
> - Low-level power management (e.g., C6)
> - Clock frequency
> - MOCS and PAT programming
>
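As a rough illustration of the split quoted above, here is a minimal
sketch of the device -> tile -> GT hierarchy and of how a per-GT GSI
register offset would translate into an offset within the tile's MMIO
space. Only the two GSI base offsets (0 and 0x380000) come from the
cover letter; every sketch_* name, field, and limit is hypothetical and
not taken from the actual xe structures.

```c
#include <stdint.h>

#define SKETCH_MAX_TILES        2u         /* hypothetical limit */
#define SKETCH_MEDIA_GSI_OFFSET 0x380000u  /* media GSI base, per the cover letter */

struct sketch_tile;

/* Per-GT state: here only the back-pointer and the GSI base offset. */
struct sketch_gt {
	struct sketch_tile *tile;   /* tile this GT lives in */
	uint32_t gsi_offset;        /* 0 for render/primary, 0x380000 for media */
};

/* Per-tile state shared by all GTs in the tile (GGTT, VRAM, ... omitted). */
struct sketch_tile {
	uint8_t id;
	struct sketch_gt primary_gt;
	struct sketch_gt media_gt;
	int has_media_gt;           /* nonzero if the media GT is present */
};

/* One PCI device containing one or more tiles. */
struct sketch_device {
	unsigned int tile_count;
	struct sketch_tile tiles[SKETCH_MAX_TILES];
};

/* Translate a GT-relative GSI register offset into a tile MMIO offset. */
static uint32_t sketch_gt_reg_offset(const struct sketch_gt *gt, uint32_t gsi_reg)
{
	return gt->gsi_offset + gsi_reg;
}

/* Wire up both GTs of a tile and assign their GSI base offsets. */
static void sketch_tile_init(struct sketch_tile *tile, uint8_t id, int has_media)
{
	tile->id = id;
	tile->has_media_gt = has_media;
	tile->primary_gt.tile = tile;
	tile->primary_gt.gsi_offset = 0;
	tile->media_gt.tile = tile;
	tile->media_gt.gsi_offset = SKETCH_MEDIA_GSI_OFFSET;
}
```

With this layout the same GT-relative register offset resolves to two
different locations in the tile's MMIO space depending on which GT it
is accessed through.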
With that detailed cover-letter description, I think this makes sense.
I figure page tables will need to be per-tile with this split? What
about per-tile resources, like VRAM, that are accessible from all tiles
but with different throughput / latency depending on which tile they
are accessed from? Should those perhaps be per-device, with a per-tile
pointer to "preferred VRAM" and a map [tile][memory_type] of access
cost?
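
Roughly something like the sketch below. This is only meant to show
the shape of the idea: the sketch_* names, the memory-type enum, and
the notion of a unitless integer "cost" are all made up for
illustration and are not existing xe code.

```c
#define SKETCH_MAX_TILES 2u   /* hypothetical limit */

/* Hypothetical memory types: system memory plus one VRAM region per tile. */
enum sketch_mem_type {
	SKETCH_MEM_SYSTEM = 0,
	SKETCH_MEM_VRAM0,
	SKETCH_MEM_VRAM1,
	SKETCH_MEM_COUNT
};

/* A VRAM region owned by the device rather than by a tile. */
struct sketch_vram_region {
	enum sketch_mem_type type;
	unsigned int home_tile;   /* tile the memory is physically attached to */
};

struct sketch_device {
	/* All VRAM regions live at device level. */
	struct sketch_vram_region vram[SKETCH_MAX_TILES];
	/* Per-tile pointer to the VRAM region that tile should prefer. */
	struct sketch_vram_region *preferred_vram[SKETCH_MAX_TILES];
	/* access_cost[tile][memory_type]: lower is cheaper (unitless). */
	unsigned int access_cost[SKETCH_MAX_TILES][SKETCH_MEM_COUNT];
};

/* Point a tile's preferred_vram at the VRAM region that is cheapest for it. */
static void sketch_pick_preferred_vram(struct sketch_device *dev, unsigned int tile)
{
	struct sketch_vram_region *best = &dev->vram[0];
	unsigned int i;

	for (i = 1; i < SKETCH_MAX_TILES; i++) {
		struct sketch_vram_region *r = &dev->vram[i];

		if (dev->access_cost[tile][r->type] <
		    dev->access_cost[tile][best->type])
			best = r;
	}
	dev->preferred_vram[tile] = best;
}
```

The preferred pointer is then just a cached result of the cost map, so
an allocator could fall back to the next-cheapest region by consulting
the map directly.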
/Thomas