[Intel-gfx] Re: [Xen-devel] [v5][PATCH 0/5] xen: add Intel IGD passthrough support

Michael S. Tsirkin mst at redhat.com
Wed Jul 2 18:53:04 CEST 2014


On Wed, Jul 02, 2014 at 12:23:37PM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 02, 2014 at 04:50:15PM +0200, Paolo Bonzini wrote:
> > On 02/07/2014 16:00, Konrad Rzeszutek Wilk wrote:
> > >With this long thread I lost a bit of context about the challenges
> > >that exist. But let me try summarizing them here, which will hopefully
> > >get us some consensus.
> > >
> > >1). Fix IGD hardware to not use Southbridge magic addresses.
> > >    We can moan and moan but I doubt it is going to change.
> > 
> > There are two problems:
> > 
> > - Northbridge (i.e. MCH i.e. PCI host bridge) configuration space addresses
> 
> Right. So in drivers/gpu/drm/i915/i915_dma.c:
> #define MCHBAR_I915 0x44
> #define MCHBAR_I965 0x48
> [...]
>         int reg = INTEL_INFO(dev)->gen >= 4 ? MCHBAR_I965 : MCHBAR_I915;
> [...]
>         if (INTEL_INFO(dev)->gen >= 4)
>                 pci_read_config_dword(dev_priv->bridge_dev, reg + 4, &temp_hi);
>         pci_read_config_dword(dev_priv->bridge_dev, reg, &temp_lo);
>         mchbar_addr = ((u64)temp_hi << 32) | temp_lo;
> 
> and
> 
> #define DEVEN_REG 0x54
> [...]
>         int mchbar_reg = INTEL_INFO(dev)->gen >= 4 ? MCHBAR_I965 : MCHBAR_I915;
> [...]
>         if (IS_I915G(dev) || IS_I915GM(dev)) {
>                 pci_read_config_dword(dev_priv->bridge_dev, DEVEN_REG, &temp);
>                 enabled = !!(temp & DEVEN_MCHBAR_EN);
>         } else {
>                 pci_read_config_dword(dev_priv->bridge_dev, mchbar_reg, &temp);
>                 enabled = temp & 1;
>         }
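For anyone following along without the kernel tree at hand, the register arithmetic in those two excerpts can be sketched in isolation. This is only an illustration of the quoted logic, not driver code: the helper names are made up here, and the config-space reads are replaced by plain function arguments.

```c
#include <stdint.h>

/* Offsets as quoted from i915_dma.c above. */
#define MCHBAR_I915 0x44
#define MCHBAR_I965 0x48

/* Hypothetical helper: host-bridge config offset, chosen by GPU generation. */
static int mchbar_reg_for_gen(int gen)
{
    return gen >= 4 ? MCHBAR_I965 : MCHBAR_I915;
}

/* Hypothetical helper: join the two config dwords into one 64-bit value.
 * The quoted code only reads the high dword on gen >= 4, so pass hi = 0
 * for older parts. */
static uint64_t mchbar_combine(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: on everything except 915G/GM the quoted code takes
 * bit 0 of the MCHBAR register itself as the enable flag. */
static int mchbar_enabled(uint32_t lo)
{
    return (int)(lo & 1u);
}
```

The point for passthrough is that all three values come from the host bridge at 00:00.0, not from the graphics device, so the emulated northbridge has to return something sensible at these offsets.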
> > 
> > - Southbridge (i.e. PCH i.e. ISA bridge) vendor/device ID; some versions of
> > the driver identify it by class, some versions identify it by slot (1f.0).
> 
> Right. So in drivers/gpu/drm/i915/i915_drv.c there is the giant
> intel_detect_pch, which sets the pch_type based on:
> 
>                 if (pch->vendor == PCI_VENDOR_ID_INTEL) {
>                         unsigned short id = pch->device & INTEL_PCH_DEVICE_ID_MASK;
>                         dev_priv->pch_id = id;
>
>                         if (id == INTEL_PCH_IBX_DEVICE_ID_TYPE) {
> 
> It checks for 0x3b00, 0x1c00, 0x1e00, 0x8c00 and 0x9c00.
> The INTEL_PCH_DEVICE_ID_MASK is 0xff00
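A stand-alone sketch of that masked comparison, for illustration only. The five ID values and the mask are the ones listed above; the enum, the function name, and the family names attached to 0x1c00, 0x1e00 and 0x9c00 are assumptions of this sketch (only Ibex Peak and LynxPoint are named in this thread).

```c
#include <stdint.h>

#define PCI_VENDOR_ID_INTEL      0x8086
#define INTEL_PCH_DEVICE_ID_MASK 0xff00

/* Hypothetical enum for this sketch; the driver uses its own pch_type. */
enum pch_kind {
    PCH_KIND_UNKNOWN,
    PCH_KIND_IBEX_PEAK,      /* 0x3b00 */
    PCH_KIND_COUGAR_POINT,   /* 0x1c00, name assumed */
    PCH_KIND_PANTHER_POINT,  /* 0x1e00, name assumed */
    PCH_KIND_LYNX_POINT,     /* 0x8c00 */
    PCH_KIND_LYNX_POINT_LP   /* 0x9c00, name assumed */
};

/* Classify a bridge by vendor ID and the top byte of its device ID. */
static enum pch_kind classify_pch(uint16_t vendor, uint16_t device)
{
    if (vendor != PCI_VENDOR_ID_INTEL)
        return PCH_KIND_UNKNOWN;

    switch (device & INTEL_PCH_DEVICE_ID_MASK) {
    case 0x3b00: return PCH_KIND_IBEX_PEAK;
    case 0x1c00: return PCH_KIND_COUGAR_POINT;
    case 0x1e00: return PCH_KIND_PANTHER_POINT;
    case 0x8c00: return PCH_KIND_LYNX_POINT;
    case 0x9c00: return PCH_KIND_LYNX_POINT_LP;
    default:     return PCH_KIND_UNKNOWN;
    }
}
```

Because only the top byte of the device ID is compared, any device ID inside one of those five ranges is accepted, while the ISA bridge of an emulated PIIX chipset falls through to the unknown case.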
> > 
> > To solve the first, make a new machine type, PIIX4-based, and pass through
> > the registers you need.  The patch must document _exactly_ why the registers
> > are safe to pass.  If they are not reserved on PIIX4, the patch must
> > document what the same offsets mean on PIIX4, and why it's sensible to
> > assume that firmware for virtual machine will not read/write them.  Bonus
> > point for also documenting the same for Q35.
> 
> OK. They look to be related to setting up the MCHBAR, but I don't understand
> why it is needed. Hopefully some of the i915 folks CC'ed here can answer.
> 
> > 
> > Regarding the second, fixing IGD hardware to not rely on chipset magic is a
> > no-go, I agree.  I disagree that it's a no-go to define a "backdoor" that
> > lets a hypervisor pass the right information to the driver without hacking
> > the chipset device model.
> > 
> > The hardware folks would have to give us a place for a pair of registers
> > (something like data/address), and a bit somewhere else that would be always
> > 0 on hardware and always 1 if the hypervisor is implementing the pair of
> > registers.  This is similar to CPUID, which has the HYPERVISOR bit +
> > hypervisor-defined leaves at 0x40000000.
> > 
> > The data/address pair could be in a BAR, in configuration space, in the low
> > VGA ports at 0x3c0-0x3df, wherever.  The hypervisor bit can be in the same
> > place or somewhere else---again, whatever is convenient for the hardware
> > folks.  We just need *one bit* that is known-zero on all hardware, and 8
> > bytes in a reserved area.  I don't think it's too hard to find this space,
> > and I really, really would like Intel to follow up on a paravirtualized
> > backdoor.
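To make the shape of such a backdoor concrete, here is a minimal model of it in plain C. Everything in it is hypothetical: the struct, the register count and the helper names are invented for illustration, and no such interface is defined by any hardware or hypervisor today. It only shows the "one presence bit plus an address/data pair" idea described above.

```c
#include <stdint.h>

#define PV_NREGS 4  /* hypothetical number of hypervisor-defined registers */

/* Hypothetical device model: one presence bit plus an address/data pair
 * (two 32-bit registers, i.e. the 8 bytes mentioned above). */
struct pv_backdoor {
    uint32_t hv_present;      /* always 0 on hardware, 1 under a hypervisor */
    uint32_t addr;            /* address half of the pair */
    uint32_t regs[PV_NREGS];  /* values served through the data half */
};

/* Device side: latch the index written to the address register. */
static void pv_write_addr(struct pv_backdoor *bd, uint32_t index)
{
    bd->addr = index;
}

/* Device side: a data read returns the selected register, or 0 when the
 * backdoor is absent or the index is out of range. */
static uint32_t pv_read_data(const struct pv_backdoor *bd)
{
    if (!bd->hv_present || bd->addr >= PV_NREGS)
        return 0;
    return bd->regs[bd->addr];
}

/* Guest side: returns 1 and stores the value if the backdoor is present;
 * returns 0 and leaves *out alone otherwise, so the caller can fall back
 * to probing the chipset as it does today. */
static int pv_query(struct pv_backdoor *bd, uint32_t index, uint32_t *out)
{
    if (!bd->hv_present)
        return 0;
    pv_write_addr(bd, index);
    *out = pv_read_data(bd);
    return 1;
}
```

A driver using something like this would check the presence bit first, exactly as CPUID users check the HYPERVISOR bit before touching the 0x40000000 leaves, and would never need to look at the bridge devices when the bit is set.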
> > 
> > That said, we have the problem of existing guests, so I agree something else
> > is needed.
> > 
> > >     a) Two bridges - one 'passthrough' and the legacy ISA bridge
> > >        that QEMU emulates. Both Linux and Windows are OK with
> > >        two bridges (even though it is pretty weird).
> > 
> > This is pretty much the only solution for existing Linux guests that look up
> > the southbridge by class.
> 
> Right.
> > 
> > The proposed solution here is to define a new "pci stub" device in QEMU that
> > lets you define a do-nothing device with your desired vendor ID, device ID,
> > class and optionally subsystem IDs.
> 
> <nods>
> > 
> > The new machine type (the one that instantiates the special
> > IGD-passthrough-enabled northbridge) can then instantiate this stub device
> > at 1f.0 with the desired vendor ID, device ID and class ID.
> 
> Which is kind of neat, because you can use a different type of device ID
> (say, make it look like Ibex Peak) and pair it up with an IGD that is found
> only on LynxPoint. Oh fun!
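As a rough picture of how little such a stub has to carry, the sketch below fills a 64-byte configuration header with nothing but the identification fields. The struct and function names are invented for this example and are not QEMU's; the field offsets are those of the standard PCI type 0 header.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical description of the stub: just the IDs it should expose. */
struct stub_ids {
    uint16_t vendor;
    uint16_t device;
    uint16_t class_code;     /* (base class << 8) | subclass; 0x0601 = ISA bridge */
    uint16_t subsys_vendor;  /* 0 if unused */
    uint16_t subsys_id;      /* 0 if unused */
};

/* Config space is little-endian. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* Fill a type 0 header with the ID fields only; everything else stays 0,
 * which is what makes the device a do-nothing stub. */
static void stub_fill_config(uint8_t cfg[64], const struct stub_ids *ids)
{
    memset(cfg, 0, 64);
    put_le16(cfg + 0x00, ids->vendor);
    put_le16(cfg + 0x02, ids->device);
    cfg[0x0a] = (uint8_t)(ids->class_code & 0xff);  /* subclass */
    cfg[0x0b] = (uint8_t)(ids->class_code >> 8);    /* base class */
    put_le16(cfg + 0x2c, ids->subsys_vendor);
    put_le16(cfg + 0x2e, ids->subsys_id);
}

/* Encode slot and function as a devfn byte; slot 1f, function 0 is 0xf8. */
static uint8_t pci_devfn(uint8_t slot, uint8_t fn)
{
    return (uint8_t)(((slot & 0x1f) << 3) | (fn & 0x07));
}
```

With a description like this, pairing an Ibex Peak ID with a LynxPoint-era IGD is just a matter of what numbers the machine type puts in the struct, which is exactly why it needs some care.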
> > 
> > If we cannot get the paravirtualized backdoor, it would also make sense to:
> > 
> > - have drivers standardize on a single way to probe the southbridge
> > 
> > - make this be neither by class (because the firmware wants to distinguish
> > the actual ISA bridge from the stub, and it can do so by looking up the
> > class), nor by slot (because this conflicts with the Q35 chipset model that
> > has the southbridge at 1f.0).
> > 
> > mst's proposal was to probe by subsystem id.  I'm not sure I understood the
> > details exactly, but I trust him. :)  However, in case it wasn't clear I
> > think a paravirtualized backdoor would still be better.
> 
> OK, like this:
> 
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 651e65e..03f2829 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -433,6 +433,8 @@ void intel_detect_pch(struct drm_device *dev)
>  			unsigned short id = pch->device & INTEL_PCH_DEVICE_ID_MASK;
>  			dev_priv->pch_id = id;
>  
> +			if (pch->subsystem_vendor == PCI_VENDOR_ID_XEN)
> +				id = pch->device & INTEL_PCH_DEVICE_ID_MASK;
>  			if (id == INTEL_PCH_IBX_DEVICE_ID_TYPE) {
>  				dev_priv->pch_type = PCH_IBX;
>  				DRM_DEBUG_KMS("Found Ibex Peak PCH\n");

No!
The point is to avoid looking at the PCH at all, and only look at
the card itself.

> > 
> > >     b) One bridge - the one that QEMU emulates - and let's emulate
> > >        more of the registers (by emulate, I mean that for some we get
> > >        the data from the real hardware).
> > >
> > >           b1). We can't use the legacy because the registers are
> > >                above 256 (is that correct? Did I miss something?)
> > 
> > As I understand it, mst brought up Q35 because the northbridge configuration
> > space layout might be more similar to what the driver expects than for
> > PIIX4.  But I don't think anyone really said whether this is true or false.
> > 
> > I think Q35 is absolutely not a requirement for IGD passthrough, especially
> > until this statement is either proved or disproved.
> 
> OK, so let's drop that.
> > 
> > >4). Code does a bit of sysfs that could use some refactoring with
> > >    the KVM code.
> > >    Problem: More time needed to do the code restructuring.
> > 
> > FWIW, I don't really care about code sharing with KVM.  That's a separate
> > problem and it's not necessary to bring it up and muddy the waters even
> > more.
> > 
> 
> OK, let's drop that for now.
> > Paolo


