[Intel-gfx] [PATCH v2] iommu/intel: Exclude devices using RMRRs from IOMMU API domains

Tue Jun 17 19:53:14 CEST 2014

On Tue, Jun 17, 2014 at 10:59:51AM -0600, Alex Williamson wrote:
> On Tue, 2014-06-17 at 18:45 +0200, Daniel Vetter wrote:
> > On Tue, Jun 17, 2014 at 08:15:47AM -0600, Alex Williamson wrote:
> > > On Tue, 2014-06-17 at 15:44 +0200, Daniel Vetter wrote:
> > > > On Tue, Jun 17, 2014 at 07:16:22AM -0600, Alex Williamson wrote:
> > > > > On Tue, 2014-06-17 at 13:41 +0100, David Woodhouse wrote:
> > > > > > On Tue, 2014-06-17 at 06:22 -0600, Alex Williamson wrote:
> > > > > > > On Tue, 2014-06-17 at 08:04 +0100, David Woodhouse wrote:
> > > > > > > > On Mon, 2014-06-16 at 23:35 -0600, Alex Williamson wrote:
> > > > > > > > > 
> > > > > > > > > Any idea what an off-the-shelf Asus motherboard would be doing with an
> > > > > > > > > RMRR on the Intel HD graphics?
> > > > > > > > > 
> > > > > > > > > dmar: RMRR base: 0x000000bb800000 end: 0x000000bf9fffff
> > > > > > > > > IOMMU: Setting identity map for device 0000:00:02.0 [0xbb800000 - 0xbf9fffff]
> > > > > > > > 
> > > > > > > > Hm, we should have thought of that sooner. That's quite normal — it's
> > > > > > > > for the 'stolen' memory used for the framebuffer. And maybe also the
> > > > > > > > GTT, and shadow GTT and other things; I forget precisely what, and it
> > > > > > > > varies from one setup to another.
> > > > > > > 
> > > > > > > Why exactly do these things need to be identity mapped through the
> > > > > > > IOMMU?  This sounds like something a normal device might do with a
> > > > > > > coherent mapping.
> > > > > > 
> > > > > > The BIOS (EFI or VESA) sets up a framebuffer in stolen main memory. It's
> > > > > > accessed by DMA, using the physical address. The RMRR exists because we
> > > > > > need it *not* to suddenly stop working the moment the OS turns on the
> > > > > > IOMMU.
> > > > > > 
> > > > > > The OS graphics driver, if any, is not loaded at this point.
> > > > > > 
> > > > > > And even later, the OS graphics driver may choose to make use of the
> > > > > > 'stolen' memory for various purposes. And since it was already stolen,
> > > > > > it doesn't go and set up *another* mapping for it; it knows that a
> > > > > > mapping already exists.
> > > > > > 
> > > > > > > > I'd expect fairly much all systems to have an RMRR for the integrated
> > > > > > > > graphics device if they have one, and your patch¹ is going to prevent
> > > > > > > > assignment of those to guests... as you've presumably noticed.
> > > > > > > > 
> > > > > > > > I'm not sure if the i915 driver is capable of fully reprogramming the
> > > > > > > > hardware to completely stop using that region, to allow assignment to a
> > > > > > > > guest with a 'pure' memory map and no stolen region. I suppose it must,
> > > > > > > > if assignment to guests was working correctly before?
> > > > > > > 
> > > > > > > IGD assignment has never worked with KVM.
> > > > > > 
> > > > > > Hm. It works with Xen though, doesn't it?
> > > > > 
> > > > > Apparently
> > > > > 
> > > > > > Are we content to say that it'll *never* work with KVM, and thus we can
> > > > > > live with the fact that your patch makes it harder to fix whatever was
> > > > > > wrong in the first place?
> > > > > 
> > > > > Probably not.  However, it seems like you're saying that this RMRR is
> > > > > used by and visible to OS level drivers, versus backchannel
> > > > > communication channels, invisible to the OS.  I think the latter is
> > > > > specifically what we want to prevent by excluding devices with RMRRs.
> > > > > This is a challenging use case, but it seems to be understood.  If, when
> > > > > IGD is bound to vfio-pci, we can be sure that access to the RMRR area
> > > > > ceases, then we can tear it down and re-establish it from
> > > > > userspace/QEMU, describe it to the guest in an e820 reserved region, and
> > > > > never consider hotplug of the device for guests.  If that's the case,
> > > > > maybe it's another exception, like USB.  I'll need to look through i915
> > > > > more to find how the region is discovered.  Thanks,
> > > > 
> > > > We have a bunch of registers in the mmio bar set up by the bios that tell
> > > > us the address and size of the stolen range we can use. The address we
> > > > need for programming ptes, the size to know how much there is. We also
> > > > have an early boot pci quirk in x86 nowadays to make sure the pci layer
> > > > doesn't put random stuff in that range.
> > > > 
> > > > See drivers/gpu/drm/i915/i915_gem_gtt.c (search for stolen size)
> > > > i915_gem_stolen.c (look at stolen_to_phys) and the early quirks in
> > > > arch/x86/kernel/early-quirks.c for copies of the same code.
> > > 
> > > Thanks for the tips.  If the purpose of the RMRR is to maintain
> > > consistency across the OS enabling VT-d, then there's really no reason
> > > for this to be identity mapped in a guest (where VT-d is not exposed), is
> > > there?  It may waste the memory that's already reserved on the platform
> > > to not set up an identity map, but I could back stolen memory by
> > > non-stolen user memory, couldn't I?  It might be nice to avoid adding an
> > > identity mapping interface to the IOMMU API, even if it costs some
> > > memory to do so.  Or maybe I could expose the RMRR area through the VFIO
> > > device file descriptor, allow it to be mmap'd there, then allow that
> > > mmap to be mapped through the IOMMU.  Thanks,
> > 
> > The stolen range is locked down at boot in the memory controller and at
> > least on some platforms not cpu accessible. Also our gpu is famous for
> > warts in the tlb and pte lookup hw, so I wouldn't be surprised at all if
> > the stolen range couldn't be backed by normal memory. Our driver otoh will
> > survive if you set the stolen size to 0 (with slight feature degradation).
> 
> Do you know if the same is true of the Windows driver for stolen size?
> We can easily set the guest physical address of stolen memory to match
> the physical hardware, which would hopefully keep the GPU happy, but if
> it's special at the memory controller level, it sounds like we'd really
> need to identity map it.  Thanks,

No idea what Windows does here, and the path between me and the Windows
team for such inquiries is extremely long :(
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
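
For reference, the RMRR range quoted at the top of the thread can be sized
and checked against a candidate address range with a short script. This is
only a minimal sketch: it assumes the "dmar: RMRR base: ... end: ..." log
format shown in the thread, and the names parse_rmrr, rmrr_size and overlaps
are hypothetical helpers, not part of any kernel or VFIO interface.

```python
import re

# Assumed log format, copied from the thread:
#   dmar: RMRR base: 0x000000bb800000 end: 0x000000bf9fffff
RMRR_LINE = re.compile(r"RMRR base: (0x[0-9a-fA-F]+) end: (0x[0-9a-fA-F]+)")


def parse_rmrr(line):
    """Return (base, end) as integers, or None if the line does not match."""
    match = RMRR_LINE.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def rmrr_size(base, end):
    """Size in bytes of the region; the end address is inclusive."""
    return end - base + 1


def overlaps(base, end, start, length):
    """True if [start, start + length) touches the inclusive range [base, end]."""
    if length <= 0:
        return False
    return start <= end and start + length - 1 >= base
```

With the line from the thread, parse_rmrr gives base 0xbb800000 and end
0xbf9fffff, and rmrr_size gives 0x4200000 bytes (66 MiB). A check like
overlaps is the kind of test a userspace tool might run before placing guest
memory at the same physical addresses as the stolen range.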