[RFC] Future TTM DMA direction
Thomas Hellstrom
thellstrom at vmware.com
Mon Jan 9 01:37:28 PST 2012
Hi!
When TTM was originally written, it was assumed that GPU apertures could
address pages directly, and that the CPU could access those pages
without explicit synchronization. The process of binding a page to a GPU
translation table was a simple one-step operation, and we needed to
worry about fragmentation in the GPU aperture only.
Now that we "sort of" support DMA memory, I think three things are
missing:
1) We can't gracefully handle coherent DMA OOMs, or coherent DMA
(including CMA) memory fragmentation that leads to failed allocations.
2) We can't handle dynamic mapping of pages into and out of DMA, and the
corresponding IOMMU space shortage or fragmentation, and CPU
synchronization.
3) We have no straightforward way of moving pages between devices.
I think a reasonable way to support this is to make binding to a
non-fixed (system page based) TTM memory type a two-step binding
process, so that a TTM placement consists of (DMA_TYPE, MEMORY_TYPE)
instead of only (MEMORY_TYPE).
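To make the idea concrete, here is a minimal user-space sketch of a
two-part placement and the two binding steps. Every name in it
(dma_type, mem_type, placement, bo_bind_dma, bo_bind_gpu, bo_validate)
is made up for illustration and is not existing TTM API; the real thing
would of course hook into the driver's bind callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is existing TTM API. */
enum dma_type { DMA_NONE, DMA_DYNAMIC, DMA_COHERENT, DMA_CMA };
enum mem_type { MEM_SYSTEM, MEM_TT };	/* system page based types only */

/* A placement becomes a (DMA_TYPE, MEMORY_TYPE) pair. */
struct placement {
	enum dma_type dma;
	enum mem_type mem;
};

struct bo {
	struct placement cur;
	bool dma_bound;
	bool gpu_bound;
};

/* Step 1: bind the bo to a DMA type (sync or allocate pages here). */
int bo_bind_dma(struct bo *bo, enum dma_type dma)
{
	assert(bo != NULL);
	bo->cur.dma = dma;
	bo->dma_bound = true;
	return 0;
}

/* Step 2: bind to the GPU; refused unless step 1 has been done. */
int bo_bind_gpu(struct bo *bo, enum mem_type mem)
{
	assert(bo != NULL);
	if (!bo->dma_bound)
		return -1;
	bo->cur.mem = mem;
	bo->gpu_bound = true;
	return 0;
}

/* Validate a bo into a requested placement by running both steps. */
int bo_validate(struct bo *bo, struct placement want)
{
	int ret = bo_bind_dma(bo, want.dma);

	if (ret)
		return ret;
	return bo_bind_gpu(bo, want.mem);
}
```

The only point of the sketch is the ordering: the GPU bind is refused
until a DMA type has been chosen.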
In step 1) the bo is bound to a specific DMA type. These could be, for
example, NONE, DYNAMIC, COHERENT or CMA; device-dependent types could be
allowed as well.
In this step, we either perform dma_sync_for_device or allocate
DMA-specific pages, maintaining LRU lists so that if a DMA memory
allocation hits an OOM, we can unbind bo:s bound to the same DMA type.
Standard graphics cards would then, for example, use the NONE DMA type
when run on bare metal or COHERENT when run on Xen. A "COHERENT" OOM
condition would then lead to eviction of another bo. (Note that DMA
eviction might involve data copies and be costly, but that is still
better than failing.)
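A rough sketch of the evict-on-OOM behaviour, again in plain user-space
C. The per-DMA-type pool with a page budget, the singly linked LRU and
all function names are assumptions made for the example; a real
implementation would use the kernel list helpers and the actual DMA
allocators, and eviction might have to copy data.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_PAGES 4	/* arbitrary page budget for the example */

struct bo {
	size_t pages;
	int dma_bound;
	struct bo *lru_next;	/* towards more recently used */
};

/* One pool (and one LRU list) per DMA type; hypothetical structure. */
struct dma_pool {
	size_t free_pages;
	struct bo *lru_head;	/* least recently used */
	struct bo *lru_tail;	/* most recently used */
};

void lru_add_tail(struct dma_pool *pool, struct bo *bo)
{
	bo->lru_next = NULL;
	if (pool->lru_tail)
		pool->lru_tail->lru_next = bo;
	else
		pool->lru_head = bo;
	pool->lru_tail = bo;
}

/* Unbind the least recently used bo. Returns -1 if the LRU is empty. */
int dma_evict_first(struct dma_pool *pool)
{
	struct bo *victim = pool->lru_head;

	if (!victim)
		return -1;
	pool->lru_head = victim->lru_next;
	if (!pool->lru_head)
		pool->lru_tail = NULL;
	victim->lru_next = NULL;
	victim->dma_bound = 0;		/* a real evict may copy data */
	pool->free_pages += victim->pages;
	return 0;
}

/* Bind a bo, evicting other bo:s of the same DMA type on "OOM". */
int dma_bind(struct dma_pool *pool, struct bo *bo)
{
	assert(!bo->dma_bound);
	while (pool->free_pages < bo->pages) {
		if (dma_evict_first(pool))
			return -1;	/* nothing left to evict */
	}
	pool->free_pages -= bo->pages;
	bo->dma_bound = 1;
	lru_add_tail(pool, bo);
	return 0;
}
```

Note that in this sketch a request larger than the whole pool still
fails, and only after everything else has been evicted; a real
implementation would probably want to check for that up front.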
Binding with the DYNAMIC DMA type would mean that CPU accesses are
disallowed, and that user-space CPU page mappings might need to be
killed, with a corresponding sync_for_cpu if they are faulted in again
(perhaps on a page-by-page basis). Any attempt to bo_kmap() a bo page
bound to the DYNAMIC DMA type should trigger a BUG.
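As an illustration of the CPU access rule, a small user-space sketch.
The names (bo_bind_dynamic, bo_kmap_sketch, user_mapped) are
hypothetical, and since user space cannot BUG, the kmap stand-in simply
returns NULL where the kernel version would trigger a BUG. The
fault-in and sync_for_cpu path is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; not existing TTM API. */
enum dma_type { DMA_NONE, DMA_DYNAMIC, DMA_COHERENT, DMA_CMA };

struct bo {
	enum dma_type dma;
	bool user_mapped;	/* user-space CPU mappings present */
	char backing[16];	/* stand-in for the bo's pages */
};

/* Binding to DYNAMIC kills user-space mappings before the device owns
 * the pages. */
void bo_bind_dynamic(struct bo *bo)
{
	assert(bo != NULL);
	bo->user_mapped = false;
	bo->dma = DMA_DYNAMIC;
}

/* Kernel CPU mapping. Returns NULL where the kernel would BUG. */
void *bo_kmap_sketch(struct bo *bo)
{
	assert(bo != NULL);
	if (bo->dma == DMA_DYNAMIC)
		return NULL;
	return bo->backing;
}
```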
In step 2) the bo is bound to the GPU in the same way it's done today.
Evicting from DMA will of course also trigger an evict from GPU, but an
evict from GPU will not trigger a DMA evict.
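The one-way dependency between the two evictions could be sketched like
this (hypothetical names, user-space C, with the actual unbinding
reduced to flag flips):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bo with one flag per binding step. */
struct bo {
	bool dma_bound;
	bool gpu_bound;
};

/* Evicting from the GPU leaves the DMA binding alone. */
void bo_evict_gpu(struct bo *bo)
{
	assert(bo != NULL);
	bo->gpu_bound = false;
}

/* Evicting from DMA must first take the bo off the GPU. */
void bo_evict_dma(struct bo *bo)
{
	assert(bo != NULL);
	if (bo->gpu_bound)
		bo_evict_gpu(bo);
	bo->dma_bound = false;
}
```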
Making a bo "anonymous" and thus moveable between devices would then
mean binding it to the "NONE" DMA type.
Comments, suggestions?
/Thomas