[Intel-gfx] [PATCH 0/7] Enable SVM for Intel VT-d

Tue Oct 13 05:06:54 PDT 2015

On Sat, Oct 10, 2015 at 02:17:55PM +0100, David Woodhouse wrote:
> On Fri, 2015-10-09 at 00:50 +0100, David Woodhouse wrote:
> > This patch set enables PASID support for the Intel IOMMU, along with
> > page request support.
> > 
> > Like its AMD counterpart, it exposes an IOMMU-specific API. I believe
> > we'll have a session at the Kernel Summit later this month in which we
> > can work out a generic API which will cover the two (now) existing
> > implementations as well as upcoming ARM (and other?) versions.
> > 
> > For the time being, however, exposing an Intel-specific API is good
> > enough, especially as we don't have the required TLP prefix support on
> > our PCIe root ports and we *can't* support discrete PCIe devices with
> > PASID support. It's purely on-chip stuff right now, which is basically
> > only Intel graphics.
> > 
> > The AMD implementation allows a per-device PASID space, and managing
> > the PASID space is left entirely to the device driver. In contrast,
> > this implementation maintains a per-IOMMU PASID space, and drivers
> > calling intel_svm_bind_mm() will be *given* the PASID that they are to
> > use. In general we seem to be converging on using a single PASID space
> > across *all* IOMMUs in the system, and this will support that mode of
> > operation.
> 
> The other noticeable difference is the lifetime management of the mm.
> My code takes a reference on it, and will only do the mmput() when the
> driver unbinds the PASID. So the mmu_notifier's .release() method won't
> get called before that.
> 
> The AMD version doesn't take that refcount, and its .release() method
> therefore needs to actually call back into the device driver and ensure
> that all access to the mm, including pending page faults, is flushed.
> The locking issues there scare me a little, especially if page faults
> are currently outstanding.
> 
> In the i915 case we have an open file descriptor associated with the
> gfx context. When the process dies, the fd is closed and the driver can
> go and clean up after it.
> 
> The amdkfd driver, on the other hand, keeps the device-side job running
> even after the process has closed its file descriptor. So it *needs*
> the .release() call to happen when the process exits, as it otherwise
> doesn't know when to clean up.
> 
> I am somewhat dubious about that as a design decision. If we're moving
> to a more explicit management of off-cpu tasks with mm access, as is to
> be discussed at the Kernel Summit, then hopefully we can fix that. It's
> a *lot* simpler if we just pin the mm while the device context has
> access to it.

I think acquiring a full reference on the mm makes sense. Conceptually an
svm context is just another compute thread, just not running on the cpu.
The other way round would mean that at mm exit we tear down these
additional threads, which seems a bit backwards.
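
To make the lifetime argument concrete, here is a minimal user-space sketch
in C of the "pin the mm while the device context is bound" model. Everything
in it (toy_mm, toy_svm_ctx, toy_svm_bind() and friends) is a hypothetical
stand-in invented for illustration, not the kernel's mm_struct or the actual
intel_svm_bind_mm() code. It only models the counting: a bind takes a user
reference, and the teardown that .release() stands for cannot run until the
unbind drops it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an address space; NOT the kernel's mm_struct. */
struct toy_mm {
	int users;      /* models the user count: CPU threads plus device bindings */
	bool released;  /* set once the modelled .release() teardown has run */
};

/* Toy stand-in for a device (svm) context bound to an mm via a PASID. */
struct toy_svm_ctx {
	struct toy_mm *mm;
	int pasid;
};

/* Take one user reference on the address space. */
static void toy_mm_get(struct toy_mm *mm)
{
	mm->users++;
}

/* Drop one user reference; the last put triggers the modelled teardown. */
static void toy_mm_put(struct toy_mm *mm)
{
	assert(mm->users > 0);
	if (--mm->users == 0)
		mm->released = true;
}

/* Bind: the device context pins the mm, like an extra thread would. */
static void toy_svm_bind(struct toy_svm_ctx *ctx, struct toy_mm *mm, int pasid)
{
	toy_mm_get(mm);
	ctx->mm = mm;
	ctx->pasid = pasid;
}

/* Unbind: only now may the final reference go away. */
static void toy_svm_unbind(struct toy_svm_ctx *ctx)
{
	struct toy_mm *mm = ctx->mm;

	ctx->mm = NULL;
	toy_mm_put(mm);
}
```

In this model, a process that exits while a context is still bound only
drops its own reference; the teardown is deferred until the driver calls the
unbind, which is the ordering described above for the fd-based cleanup.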

Of course that special thread is attached to an fd, which has a completely
separate lifetime from threads. There's also the fun that a different mm
could submit commands to a foreign svm context. So it's not a perfect fit,
but my hunch is that grabbing a full reference is still the cleaner design.
And it matches the refcounting we do for traditional gpu contexts on the
ppgtt address space they're using.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch