i915-related and general system freezes with specific kernel config // IOMMU

Daniel Vetter daniel.vetter at ffwll.ch
Sat Jan 19 05:27:48 PST 2013

Hi Mihai,

You have a gen4.5 chipset which is known to be utterly broken for
IOMMU+intel gpu. Looks like a few distros started enabling IOMMU by
default (fc18 has similar issues) and we've never added the proper
quirks. See https://bugzilla.kernel.org/show_bug.cgi?id=51921 for a
proposed patch to fix this (i.e. automatically set
intel_iommu=igfx_off for affected platforms). Testing highly welcome.
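
For anyone testing the workaround by hand, here is a minimal sketch (not
part of the proposed patch) that checks whether a kernel command line
carries intel_iommu=igfx_off. The function names are made up for
illustration, and the sketch assumes intel_iommu= takes a comma-separated
list of options:

```python
def iommu_params(cmdline: str) -> dict:
    """Collect the values passed via iommu= and intel_iommu=.

    Assumption: each parameter may carry several comma-separated options,
    e.g. intel_iommu=on,igfx_off.
    """
    found = {"iommu": [], "intel_iommu": []}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in found:
            found[key].extend(value.split(","))
    return found


def igfx_remapping_disabled(cmdline: str) -> bool:
    """True if intel_iommu=igfx_off appears on the given command line."""
    return "igfx_off" in iommu_params(cmdline)["intel_iommu"]


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Return the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read().strip()
```

On a running system, igfx_remapping_disabled(read_cmdline()) shows whether
the option actually reached the kernel, which is worth confirming before
reporting test results.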

Cheers, Daniel

On Sat, Jan 19, 2013 at 12:48 AM, Mihai Moldovan <ionic at ionic.de> wrote:
> Hi Daniel, David and everyone else,
> I'm experiencing system freezes on a box running the vanilla 3.7.2 kernel (and
> earlier versions, down to roughly 3.2) with a custom configuration.
> There are two problems:
>   [*] related to i915 with modeset enabled; upon loading the kernel module with
> modeset=1, the box will instantly freeze.
>   [*] seemingly unrelated to i915; the box will randomly freeze without any
> clear indication of why and with no apparent trigger.
> After months, nay, years of being "locked" into 3.0.2 for the random freezes and
> i915 problems, I started playing with the kernel again and out of sheer
> desperation installed the current debian testing kernel, based on 3.2.35.
> From what I could see, it worked fine... no more crashes, neither when loading
> i915, nor randomly after some time (well, at least not for a day.)
> This time out of frustration, I ripped the config file used by debian to build
> the kernel out of its deb package and rebuilt my (almost[1]) vanilla 3.7.2
> kernel with this configuration exactly, updated via the oldconfig target and
> changed to include AHCI, RAID and SCSI drivers statically, so that I wouldn't
> need some initramfs to boot my system ... and ... with this config, I am not
> experiencing any i915 problems nor system freezes?!
> I then tried to spot any "obvious" differences between the two config files and
> to "approximate" my config file to the debian config.
> Comparing the dmesg output from 3.7.2 built with the slightly modified debian
> config to my 3.7.2 built with my config, I came across IOMMU entries which
> differed. My kernel config enables Intel IOMMU by default, while the debian
> config doesn't.
> Looking up IOMMU stuff in Documentation/, I found out that IOMMU *may* have bugs
> with the internal graphics card and there is an option called
> intel_iommu=igfx_off to disable IOMMU remapping for the integrated graphics card...
> I tried booting "my" kernel with intel_iommu=igfx_off and lo and behold, no more
> crashes when loading i915 with modeset enabled! Yay... but anyway, that's
> definitely a kernel bug.
> Next, regarding the random freezes... the kernel booted with
> intel_iommu=igfx_off still froze randomly. It seems the random freeze issue is
> decoupled from the graphics issue.
> Testing further, I rebooted using iommu=off and intel_iommu=off. So far, I had
> no random crashes, but the system uptime of XXXXREPLACEMEXXXX minutes is too
> small to draw conclusions yet.
> Anyway, booting with both options made my USB ports unusable. Also, my PCIe and
> PCI WiFi cards stopped working. Seems like the kernel can't enumerate those
> devices due to... guess what, DMA remapping errors!
> Note that the debian-config kernel with CONFIG_INTEL_IOMMU=y and
> CONFIG_INTEL_IOMMU_DEFAULT_ON=n did not produce such errors. Both my USB ports
> and WiFi cards worked there.
> Any idea why that is?
> As I'm not sure who to CC exactly, I'm adding both the i915 and Intel IOMMU
> maintainers Daniel and David.
> I have included several files:
>   [*] the "debianish" config file
>   [*] my current config file (IOMMU still on by default)
>   [*] dmesg for the kernel built with the "debianish" config file
>   [*] dmesg for the kernel built with "my" config file, no IOMMU options passed
>   [*] dmesg for the kernel built with "my" config file, intel_iommu=igfx_off passed
>   [*] dmesg for the kernel built with "my" config file, iommu=off and
> intel_iommu=off passed
> Hope we can squash those bugs!
> Best regards,
> Mihai
> [1] only one "external" patch applied to ath9k, totally unrelated to the rest of
> the system, just changing regulatory stuff.

Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

