Device loses its IRQ number on driver unload?

Thomas Hellstrom thellstrom at vmware.com
Tue Mar 10 05:55:25 PDT 2015


On 03/09/2015 09:25 PM, Dave Airlie wrote:
> On 10 March 2015 at 02:02, Thomas Hellstrom <thellstrom at vmware.com> wrote:
>> On 03/09/2015 04:22 PM, Daniel Vetter wrote:
>>> On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
>>>> Hi,
>>>>
>>>> I'm not sure this started with 4.0 but when I rmmod the device driver
>>>> like so
>>>> rmmod vmwgfx
>>>>
>>>> The device loses its IRQ line as shown in lspci:
>>>>    Flags: bus master, medium devsel, latency 64 <irq missing here>
>>>>
>>>> and a subsequent modprobe will fail since pdev->irq is 0.
>>>>
>>>> Is anyone else seeing this with other drivers?
>>> I've occasionally seen (over the past couple of kernels) random zeros in pdev
>>> but dismissed it as broken machines or bugs in i915 (we have them ...).
>>> Usually the box died chasing a NULL pointer from pdev. Otherwise no.
>>> -Daniel
>> OK. Thanks for the info. Since in my case this is 100% reproducible I
>> guess I have an excellent opportunity to bisect the problem :-/
>>
> Does lspci -H1, or some similar option for direct hardware access, show it?
>
> Just to see whether it's the kernel copy or the hw register getting messed up.
>
> Dave.
Hi, Dave,

lspci -H1 does indeed show the IRQ number, so the hardware register is
intact. Bisecting shows that the commit introduced in 4.0 that breaks
this is

b4b55cda587442477a3a9f0669e26bba4b7800c0 is the first bad commit
commit b4b55cda587442477a3a9f0669e26bba4b7800c0
Author: Jiang Liu <jiang.liu at linux.intel.com>
Date:   Thu Feb 5 13:44:47 2015 +0800

    x86/PCI: Refine the way to release PCI IRQ resources


It's obvious from the commit message that unloading the driver *should*
drop the irq resource, but it's not obvious what's reallocating that
resource on driver load...

Anyway, it turns out that adding a pci_disable_device(pdev) call to the
pci driver's remove() method (vmw_remove() in my case) appears to fix
the problem: the device irq is removed on driver unload and enabled
again on driver load. There appears to be no pci_disable_device() on
driver exit in core drm.
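
One possible reading of why the extra pci_disable_device() helps,
written as a userspace toy model rather than kernel code. It assumes
that the PCI core keeps a per-device enable count and only sets up the
IRQ when that count goes from 0 to 1, and that since the commit above
the IRQ is released whenever the driver is unbound. Neither assumption
is confirmed in this thread. All toy_* names and the IRQ value are
hypothetical.

```c
/* Toy stand-in for struct pci_dev: only the two fields this sketch needs. */
struct toy_pci_dev {
    int enable_cnt;   /* models the PCI core's per-device enable count */
    unsigned int irq; /* models pdev->irq; 0 means "no IRQ assigned" */
};

#define TOY_HW_IRQ 16u /* hypothetical IRQ number, for illustration only */

/* Models pci_enable_device(): IRQ setup only runs on the 0 -> 1 step. */
static int toy_pci_enable_device(struct toy_pci_dev *pdev)
{
    if (++pdev->enable_cnt > 1)
        return 0; /* counted as already enabled: IRQ is not set up again */
    pdev->irq = TOY_HW_IRQ;
    return 0;
}

/* Models pci_disable_device(): drops one enable reference. */
static void toy_pci_disable_device(struct toy_pci_dev *pdev)
{
    if (pdev->enable_cnt > 0)
        pdev->enable_cnt--;
}

/* Models driver unbind under the assumed new behaviour: the IRQ is
 * released regardless of whether remove() disabled the device. */
static void toy_driver_unbind(struct toy_pci_dev *pdev, int remove_disables)
{
    if (remove_disables)
        toy_pci_disable_device(pdev); /* the call added to remove() */
    pdev->irq = 0;
}

/* Models probe(): fails with -1 when no IRQ is assigned after enabling. */
static int toy_driver_probe(struct toy_pci_dev *pdev)
{
    if (toy_pci_enable_device(pdev))
        return -1;
    return pdev->irq ? 0 : -1;
}

/* modprobe, rmmod, modprobe; returns the result of the second probe. */
static int toy_reload_cycle(int remove_disables)
{
    struct toy_pci_dev pdev = { 0, 0 };

    if (toy_driver_probe(&pdev))
        return -1;
    toy_driver_unbind(&pdev, remove_disables);
    return toy_driver_probe(&pdev);
}
```

In this model toy_reload_cycle(0) fails on the second probe because the
enable count never returned to zero, while toy_reload_cycle(1) succeeds.
If the real PCI core behaves the same way, that would match the
symptoms described above.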

However, it still beats me why other drm drivers aren't seeing this, and
IMHO that commit should probably add a warning message if the pci device
isn't disabled on pci driver unload.

/Thomas


