[Intel-gfx] [PATCH V6] drm/i915: Disable stolen memory when i915 runs in guest vm

Zhang, Xiong Y xiong.y.zhang at intel.com
Sat May 6 02:58:41 UTC 2017


> On Wed, 2017-05-03 at 09:22 +0000, Zhang, Xiong Y wrote:
> > >
> > > >
> > > > + David and Jon
> > > >
> > > > On Tue, 2017-04-25 at 18:34 +0800, Xiong Zhang wrote:
> > > >
> > > > The blocking issue I see is that bisecting is still not pointing at
> > > > relevant commits. Both bisected commits from Bugzilla are not related
> > > > to changes in stolen memory usage behavior. I'd assume a successful
> > > > bisect to land at the patches where we start creating kernel internal
> > > > objects from stolen memory. Otherwise we could be ignoring a bug
> > > > elsewhere. If it consistently lands on those patches, then there might
> > > > be something wrong with them, in addition to stolen memory problems.
> > > [Zhang, Xiong Y] I have only tried kernels 4.8 and 4.9 and above. As the
> > > Bugzilla entry describes, a 4.8 guest kernel shows no GPU hang in guest
> > > dmesg, while a 4.9 guest kernel does. From this point of view, we could
> > > do a git bisect.
> > > But tons of IOMMU DMA read/write exceptions on stolen memory appear in
> > > host dmesg with both the 4.8 and the 4.9 guest kernel. This means the
> > > guest domain's IOMMU table has no mapping for stolen memory, and the IGD
> > > fails to access stolen memory under both guest kernels. From this point
> > > of view, the issue isn't a regression and shouldn't be bisected. You can
> > > check this host error message in the Bugzilla attachment. And this
> > > should be fixed first.
> > > Anyway, I will try my best to find the right commit through git bisect,
> > > but I'm afraid the result will be the same as before, because we don't
> > > have a stable good point to start the bisect from.
> > [Zhang, Xiong Y] Hi Joonas,
> > As you said, the GPU hang exists because i915 creates the ring buffer from
> > stolen memory.
> > I did the git bisect again, and the following commit is the first bad commit:
> > commit c58b735fc762e891481e92af7124b85cb0a51fce
> > Author: Chris Wilson <chris at chris-wilson.co.uk>
> > Date:   Thu Aug 18 17:16:57 2016 +0100
> >
> >     drm/i915: Allocate rings from stolen
> >
> >     If we have stolen available, make use of it for ringbuffer allocation.
> >     Previously this was restricted to !llc platforms, as writing to stolen
> >     requires a GGTT mapping - but now that we have partial mappable support,
> >     the mappable aperture isn't quite so precious so we can use it more
> >     freely and ringbuffers are a good user for the otherwise wasted stolen.
> >
> > After reverting this patch from drm-intel-nightly, I didn't see a GPU hang
> > during the guest boot process.
> > So what's our next step?
> 
> An appropriate next step would be to evaluate how much work it is to
> support the RMRR passthrough David mentioned in his commit.
[Zhang, Xiong Y] As Kevin explained, the KVM community found drawbacks in RMRR
and has decided not to support RMRR passthrough, so it would be really hard for
us to push such a solution, and that isn't a question of workload.
Except for USB and graphics devices, no device with an RMRR can be passed
through to a guest. And even for USB and graphics devices, the driver can't
access the RMRR region in such an environment.
https://access.redhat.com/sites/default/files/attachments/rmrr-wp1.pdf

> I'd also go talk with the IGD team, why they refuse to load the driver
> when stolen memory is correctly reported as zero, and insist on being
> lied to.
[Zhang, Xiong Y] Thanks a lot for doing so.
> 
> While doing that, please update the freedesktop.org bugs.
[Zhang, Xiong Y] Sure, I will update Bugzilla once we have further
findings and reach a decision.
> 
> Regards, Joonas
> --
> Joonas Lahtinen
> Open Source Technology Center
> Intel Corporation

