[Intel-gfx] [PATCH 0/7] Enable SVM for Intel VT-d

David Woodhouse dwmw2 at infradead.org
Sat Oct 10 06:17:55 PDT 2015


On Fri, 2015-10-09 at 00:50 +0100, David Woodhouse wrote:
> This patch set enables PASID support for the Intel IOMMU, along with
> page request support.
> 
> Like its AMD counterpart, it exposes an IOMMU-specific API. I believe
> we'll have a session at the Kernel Summit later this month in which we
> can work out a generic API which will cover the two (now) existing
> implementations as well as upcoming ARM (and other?) versions.
> 
> For the time being, however, exposing an Intel-specific API is good
> enough, especially as we don't have the required TLP prefix support on
> our PCIe root ports and we *can't* support discrete PCIe devices with
> PASID support. It's purely on-chip stuff right now, which is basically
> only Intel graphics.
> 
> The AMD implementation allows a per-device PASID space, and managing
> the PASID space is left entirely to the device driver. In contrast,
> this implementation maintains a per-IOMMU PASID space, and drivers
> calling intel_svm_bind_mm() will be *given* the PASID that they are to
> use. In general we seem to be converging on using a single PASID space
> across *all* IOMMUs in the system, and this will support that mode of
> operation.

The other noticeable difference is the lifetime management of the mm.
My code takes a reference on it, and will only do the mmput() when the
driver unbinds the PASID. So the mmu_notifier's .release() method won't
get called before that.
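
As a rough userspace illustration of that lifetime rule (not the code from the patch set), the pinning can be modelled as below. All toy_* names are hypothetical, and struct toy_mm is only a stand-in for the real mm with a plain counter in place of mm_users. The sketch assumes teardown, and hence .release(), happens only when the last user reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the process address space; NOT the kernel's mm_struct. */
struct toy_mm {
    int users;      /* models the user reference count on the mm */
    bool released;  /* set where the notifier's .release() would run */
};

/* Hypothetical record of one PASID bound to one address space. */
struct toy_bind {
    struct toy_mm *mm;
    int pasid;
};

static void toy_mmget(struct toy_mm *mm)
{
    mm->users++;
}

static void toy_mmput(struct toy_mm *mm)
{
    assert(mm->users > 0);
    if (--mm->users == 0)
        mm->released = true;  /* last reference gone: teardown happens here */
}

/* Bind: take a reference so the mm outlives the owning process if needed. */
static void toy_bind_mm(struct toy_bind *b, struct toy_mm *mm, int pasid)
{
    toy_mmget(mm);
    b->mm = mm;
    b->pasid = pasid;
}

/* Unbind: the only place the binding's reference is dropped. */
static void toy_unbind_mm(struct toy_bind *b)
{
    struct toy_mm *mm = b->mm;

    b->mm = NULL;
    toy_mmput(mm);
}
```

In this model a process exit (one toy_mmput() by the owner) leaves the mm alive while a binding exists, and the release step only runs after toy_unbind_mm().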

The AMD version doesn't take that refcount, and its .release() method
therefore needs to actually call back into the device driver and ensure
that all access to the mm, including pending page faults, is flushed.
The locking issues there scare me a little, especially if page faults
are currently outstanding.
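
For contrast, here is a similarly hedged userspace sketch of the no-refcount model, in which teardown itself has to call back into the driver. It is not taken from the AMD code; the toy_* names, the pending_faults counter and the quiesce callback are all assumptions made for illustration, and the real locking problem (faults arriving concurrently with release) is deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device-side context that may access the address space. */
struct toy_dev_ctx {
    int pending_faults;  /* page requests not yet serviced */
    bool has_mm_access;  /* device may still touch the address space */
};

/* Toy address space; the device context holds NO reference on it. */
struct toy_mm2 {
    int users;
    struct toy_dev_ctx *ctx;  /* registered for notification only */
    bool torn_down;
};

/* Driver callback: must stop the device and flush outstanding faults. */
static void toy_driver_quiesce(struct toy_dev_ctx *ctx)
{
    ctx->pending_faults = 0;   /* drain or abort everything in flight */
    ctx->has_mm_access = false;
}

/* Models the notifier's .release(): runs while the mm is going away. */
static void toy_release(struct toy_mm2 *mm)
{
    if (mm->ctx)
        toy_driver_quiesce(mm->ctx);
    mm->ctx = NULL;
}

/* Process exit drops the last reference, so release runs immediately. */
static void toy_process_exit(struct toy_mm2 *mm)
{
    assert(mm->users > 0);
    if (--mm->users == 0) {
        toy_release(mm);
        mm->torn_down = true;
    }
}
```

Here the device context cannot delay teardown, so everything it has in flight must be dealt with inside the release path.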

In the i915 case we have an open file descriptor associated with the
gfx context. When the process dies, the fd is closed and the driver can
go and clean up after it.

The amdkfd driver, on the other hand, keeps the device-side job running
even after the process has closed its file descriptor. So it *needs*
the .release() call to happen when the process exits, as it otherwise
doesn't know when to clean up.

I am somewhat dubious about that as a design decision. If we're moving
to more explicit management of off-CPU tasks with mm access, as is to
be discussed at the Kernel Summit, then hopefully we can fix that. It's
a *lot* simpler if we just pin the mm while the device context has
access to it.

-- 
dwmw2
