[Intel-gfx] i915 dma faults on Xen
Jason Andryuk
jandryuk at gmail.com
Mon Feb 22 12:49:25 UTC 2021
On Mon, Feb 22, 2021 at 5:18 AM Roger Pau Monné <roger.pau at citrix.com> wrote:
>
> On Fri, Feb 19, 2021 at 12:30:23PM -0500, Jason Andryuk wrote:
> > On Wed, Oct 21, 2020 at 9:59 AM Jan Beulich <jbeulich at suse.com> wrote:
> > >
> > > On 21.10.2020 15:36, Jason Andryuk wrote:
> > > > On Wed, Oct 21, 2020 at 8:53 AM Jan Beulich <jbeulich at suse.com> wrote:
> > > >>
> > > >> On 21.10.2020 14:45, Jason Andryuk wrote:
> > > >>> On Wed, Oct 21, 2020 at 5:58 AM Roger Pau Monné <roger.pau at citrix.com> wrote:
> > > > >>>> Hm, it's hard to tell what's going on. In my limited experience with
> > > > >>>> IOMMU faults on broken systems, there's a small range that initially
> > > >>>> triggers those, and then the device goes wonky and starts accessing a
> > > >>>> whole load of invalid addresses.
> > > >>>>
> > > >>>> You could try adding those manually using the rmrr Xen command line
> > > >>>> option [0], maybe you can figure out which range(s) are missing?
> > > >>>
> > > >>> They seem to change, so it's hard to know. Would there be harm in
> > > >>> adding one to cover the end of RAM ( 0x04,7c80,0000 ) to (
> > > >>> 0xff,ffff,ffff )? Maybe that would just quiet the pointless faults
> > > >>> while leaving the IOMMU enabled?
> > > >>
> > > >> While they may quieten the faults, I don't think those faults are
> > > >> pointless. They indicate some problem with the software (less
> > > >> likely the hardware, possibly the firmware) that you're using.
> > > >> Also there's the question of what the overall behavior is going
> > > >> to be when devices are permitted to access unpopulated address
> > > >> ranges. I assume you did check already that no devices have their
> > > >> BARs placed in that range?
> > > >
> > > > Isn't no-igfx already letting them try to read those unpopulated addresses?
> > >
> > > Yes, and it is for the reason that the documentation for the
> > > option says "If specifying `no-igfx` fixes anything, please
> > > > report the problem." I imply from it in particular that one
> > > better wouldn't use it for non-development purposes of whatever
> > > kind.
> >
> > I stopped seeing these DMA faults, but I didn't know what made them go
> > away. Then when working with an older 5.4.64 kernel, I saw them
> > again. Eric bisected down to the 5.4.y version of mainline linux
> > commit:
> >
> > commit 8195400f7ea95399f721ad21f4d663a62c65036f
> > Author: Chris Wilson <chris at chris-wilson.co.uk>
> > Date: Mon Oct 19 11:15:23 2020 +0100
> >
> > drm/i915: Force VT'd workarounds when running as a guest OS
> >
> > If i915.ko is being used as a passthrough device, it does not know if
> > the host is using intel_iommu. Mixing the iommu and gfx causes a few
> > issues (such as scanout overfetch) which we need to workaround inside
> > the driver, so if we detect we are running under a hypervisor, also
> > assume the device access is being virtualised.
>
> So the commit above fixes the DMA faults seen on Linux when using an
> i915 gfx card?

Yes, DMA faults are not seen with this commit. i915 behaves
differently when it detects that VT-d is active, and this commit
enables the VT-d behavior when running under any hypervisor.
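
To illustrate the change in behavior, here is a simplified, standalone
sketch. It is not the actual i915 code: struct platform, enum hyper_type
and both function names are made up for illustration. It only models the
idea from the commit message, that a driver running under a hypervisor
cannot tell whether the host uses the IOMMU and so assumes it does.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for state the real kernel gets from its IOMMU
 * driver and from x86 hypervisor detection. */
enum hyper_type { HYPER_NATIVE, HYPER_XEN, HYPER_OTHER };

struct platform {
    bool iommu_gfx_mapped;  /* this kernel sees the GPU behind VT-d */
    enum hyper_type hyper;  /* what this kernel is running on */
};

/* Before the commit (as modeled here): only the locally visible IOMMU
 * state decides whether the VT-d workarounds are applied. */
static bool vtd_active_before(const struct platform *p)
{
    return p->iommu_gfx_mapped;
}

/* After the commit (as modeled here): running under any hypervisor is
 * treated as if VT-d were active. */
static bool vtd_active_after(const struct platform *p)
{
    if (p->iommu_gfx_mapped)
        return true;
    return p->hyper != HYPER_NATIVE;
}
```

With iommu_gfx_mapped false and hyper set to HYPER_XEN, the first
function returns false and the second returns true, which corresponds
to the case where the faults stopped for us.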

Regards,
Jason