[Intel-gfx] [PATCH 6/9] drm/i915: driver based PASID handling
David Woodhouse
dwmw2 at infradead.org
Wed Oct 7 10:17:18 PDT 2015
On Wed, 2015-10-07 at 09:28 -0700, Jesse Barnes wrote:
> On 10/07/2015 09:14 AM, Daniel Vetter wrote:
> > On Wed, Oct 07, 2015 at 08:16:42AM -0700, Jesse Barnes wrote:
> > > On 10/07/2015 06:00 AM, David Woodhouse wrote:
> > > > On Fri, 2015-09-04 at 09:59 -0700, Jesse Barnes wrote:
> > > > > +
> > > > > + ret = handle_mm_fault(mm, vma, address,
> > > > > + desc.wr_req ? FAULT_FLAG_WRITE : 0);
> > > > > + if (ret & VM_FAULT_ERROR) {
> > > > > + gpu_mm_segv(tsk, address, SEGV_ACCERR); /* ? */
> > > > > + goto out_unlock;
> > > > > + }
> > > > > +
> > > >
> > > > Hm, do you need to force the SEGV there, in what ought to be generic
> > > > IOMMU code?
> > > >
> > > > Can you instead just let the fault handler return an appropriate
> > > > failure code to the IOMMU request queue and then deal with the
> > > > resulting error on the i915 device side?
> > >
> > > I'm not sure if we get enough info on the i915 side to handle it
> > > reasonably, we'll have to test that out.
> >
> > We do know precisely which context blew up, but without the TDR work we
> > can't yet just kill the offender selectively without affecting the other
> > active gpu contexts.
>
> How? The notification from the IOMMU queue is asynchronous...
The page request, and the response, include 'private data' which an
endpoint can use to carry that kind of information.
§7.5.1.1 of the VT-d specification tells us:
"Private Data: The Private Data field can be used by
Root-Complex integrated endpoints to uniquely identify
device-specific private information associated with an
individual page request.
"For Intel® Processor Graphics device, the Private Data field
specifies the identity of the GPU advanced-context (see
Section 3.10) sending the page request."
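To illustrate how that answers the "asynchronous" concern, here is a minimal sketch, assuming the private data is simply copied from the request into the response. The struct layouts and the names make_response() and handle_response() are hypothetical and are not the real VT-d descriptor formats or i915 code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified page request; not the real descriptor layout. */
struct page_req {
    uint32_t pasid;
    uint64_t address;
    uint32_t private_data;  /* endpoint-defined; here: the GPU context identity */
    int wr_req;
};

/* Hypothetical response; carries the same private data back to the endpoint. */
struct page_resp {
    uint32_t pasid;
    uint32_t private_data;
    int failed;             /* non-zero if the fault could not be serviced */
};

/* Hypothetical per-context state on the device driver side. */
struct gpu_context {
    uint32_t id;
    int faulted;
};

/* IOMMU side: build the response, echoing the request's private data. */
struct page_resp make_response(const struct page_req *req, int failed)
{
    assert(req != NULL);
    struct page_resp resp = {
        .pasid = req->pasid,
        .private_data = req->private_data,
        .failed = failed,
    };
    return resp;
}

/*
 * Device side: use the private data to find the context that faulted,
 * and mark it if the request failed. Returns NULL if no context matches.
 */
struct gpu_context *handle_response(struct gpu_context *ctxs, size_t n,
                                    const struct page_resp *resp)
{
    assert(ctxs != NULL && resp != NULL);
    for (size_t i = 0; i < n; i++) {
        if (ctxs[i].id == resp->private_data) {
            if (resp->failed)
                ctxs[i].faulted = 1;
            return &ctxs[i];
        }
    }
    return NULL;
}
```

Because the identity travels with the request and the response, the device side does not need to correlate a late notification with whatever happens to be running at the time.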
> > But besides that I really don't see a reason why we need to kill the
> > process if the gpu faults. After all, if a thread segfaults then the signal
> > goes to that thread and not some random one (or the one thread that forked
> > the thread that blew up). And we do have interfaces to tell userspace that
> > something bad happened with the gpu work it submitted.
I certainly don't want the core IOMMU code killing things. I really
want to just complete the page request with an appropriate failure
code, and let the endpoint device deal with it from there.
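Roughly what I have in mind, as a standalone sketch rather than a patch. Everything here is a stand-in: fake_handle_fault() plays the role of handle_mm_fault(), FAKE_FAULT_ERROR the role of VM_FAULT_ERROR, and the response code names and values are illustrative assumptions, not taken from the specification:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative response codes; names and values are assumptions. */
enum resp_code {
    RESP_SUCCESS = 0,
    RESP_INVALID = 1,
};

/* Stand-in for VM_FAULT_ERROR. */
#define FAKE_FAULT_ERROR 0x1u

/* Hypothetical, simplified page request. */
struct fake_req {
    uint64_t address;
    bool wr_req;
};

/* Stand-in for handle_mm_fault(): pretend address 0 is never mappable. */
unsigned int fake_handle_fault(uint64_t address, bool write)
{
    (void)write;
    return address == 0 ? FAKE_FAULT_ERROR : 0;
}

/*
 * Generic IOMMU side: service the request and report the outcome.
 * No signal is sent from here; the caller puts the returned code
 * into the page response and the endpoint decides what to do next.
 */
enum resp_code service_page_request(const struct fake_req *req)
{
    assert(req != NULL);
    unsigned int ret = fake_handle_fault(req->address, req->wr_req);

    if (ret & FAKE_FAULT_ERROR)
        return RESP_INVALID;
    return RESP_SUCCESS;
}
```

The point of the shape is that the policy decision (kill a context, report to userspace, or something else) lives entirely in the endpoint driver.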
--
David Woodhouse Open Source Technology Centre
David.Woodhouse at intel.com Intel Corporation