[Intel-gfx] [PATCH V6] drm/i915: Disable stolen memory when i915 runs in guest vm

Joonas Lahtinen joonas.lahtinen at linux.intel.com
Mon May 8 10:07:10 UTC 2017


On Sat, 2017-05-06 at 02:58 +0000, Zhang, Xiong Y wrote:
> > On Wed, 2017-05-03 at 09:22 +0000, Zhang, Xiong Y wrote:
> > > > > + David and Jon
> > > > > 
> > > > > On Tue, 2017-04-25 at 18:34 +0800, Xiong Zhang wrote:
> > > > > 
> > > > > The blocking issue I see is that bisecting is still not pointing at
> > > > > relevant commits. Both bisected commits from Bugzilla are not related
> > > > > to changes in stolen memory usage behavior. I'd assume a successful
> > > > > bisect to land at the patches where we start creating kernel internal
> > > > > objects from stolen memory. Otherwise we could be ignoring a bug
> > > > > elsewhere. If it consistently lands on those patches, then there might
> > > > > be something wrong with them, in addition to stolen memory problems.
> > > > [Zhang, Xiong Y] I only tried kernel 4.8 and 4.9 and above. As the
> > > > Bugzilla entry describes, the guest 4.8 kernel shows no GPU hang in
> > > > guest dmesg, while the 4.9 kernel does. From this point of view, we
> > > > could do a git bisect.
> > > > But tons of IOMMU DMA R/W exceptions on stolen memory exist in host
> > > > dmesg when the guest kernel is 4.8 or 4.9. This means the guest
> > > > domain IOMMU table has no mapping for stolen memory, and the IGD
> > > > fails to access stolen memory from guest kernels 4.8 and 4.9. From
> > > > this point of view, the issue isn't a regression and shouldn't be
> > > > git bisected. You can check this host error message in the Bugzilla
> > > > attachment. And this should be fixed first.
> > > > Anyway, I will try my best to find the ideal commit through git
> > > > bisect, but I'm afraid the result will be the same as before,
> > > > because we don't have a stable good point to start the git bisect
> > > > from.
> > > [Zhang, Xiong Y] Hi Joonas,
> > > As you said, the GPU hang exists because i915 creates the ring buffer
> > > from stolen memory.
> > > 
> > > I did git bisect again, and the following commit is the first bad commit:
> > > commit c58b735fc762e891481e92af7124b85cb0a51fce
> > > Author: Chris Wilson <chris at chris-wilson.co.uk>
> > > Date:   Thu Aug 18 17:16:57 2016 +0100
> > > 
> > >     drm/i915: Allocate rings from stolen
> > > 
> > >     If we have stolen available, make use of it for ringbuffer allocation.
> > >     Previously this was restricted to !llc platforms, as writing to stolen
> > >     requires a GGTT mapping - but now that we have partial mappable support,
> > >     the mappable aperture isn't quite so precious so we can use it more
> > >     freely and ringbuffers are a good user for the otherwise wasted stolen.
> > > 
> > > After reverting this patch from drm-intel-nightly, I didn't see a GPU hang
> > > during the guest boot process.
> > > 
> > > So what's our next step?
> > 
> > An appropriate next step would be to evaluate how much work it is to
> > support the RMRR passthrough David mentioned in his commit.
> [Zhang, Xiong Y] As Kevin explained, the KVM community found the disadvantages
> of RMRR and has decided not to support RMRR passthrough, so it is really hard
> for us to push such a solution, and it isn't related to the workload.
> Except for USB and graphics cards, no other devices with RMRRs can be passed
> through to a guest. But the drivers for USB and graphics cards can't access the
> RMRR in such an environment.
> https://access.redhat.com/sites/default/files/attachments/rmrr-wp1.pdf

Does this patch have the right Cc's from the KVM team? I'd like to hear
directly from them that even usage of RMRRs that follows the intention
of the VT-d spec is not going to be supported. That document predates
the patches to add the exclusion for graphics.

> > I'd also go talk with the IGD team, why they refuse to load the driver
> > when stolen memory is correctly reported as zero, and insist on being
> > lied to.
> [Zhang, Xiong Y] Thanks a lot for doing so.

I don't have the contacts, so I assume you will pursue that.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

