[Nouveau] 4.20.0-rc3 nouveau/Quadro P2000 Mobile: runpm causing ACPI errors, lockups

Michael S. Tsirkin mst at redhat.com
Thu Nov 29 17:12:19 UTC 2018


On Thu, Nov 29, 2018 at 11:53:53AM +0100, Karol Herbst wrote:
> On Thu, Nov 29, 2018 at 2:29 AM Michael S. Tsirkin <mst at redhat.com> wrote:
> >
> > On Thu, Nov 29, 2018 at 12:21:31AM +0100, Karol Herbst wrote:
> > > this was already debugged and there is no point in searching inside
> > > the firmware. It's not a firmware bug or anything.
> > >
> > > The proper fix is to do something inside Nouveau so that we don't
> > > upset the device and are able to runtime resume it again.
> > >
> > > The initial thing we do inside Nouveau to cause those issues is to run
> > > the so-called "DEVINIT" script inside the vbios to initialize the
> > > GPU. The problem is that it changes something in the PCIe configuration,
> > > so the GPU isn't able to runtime resume anymore. I am in contact
> > > with Nvidia about that issue and hopefully we get the proper answers.
> > > When I was digging into that myself I was able to make the situation
> > > more stable by setting the PCIe link speed to the boot defaults, but
> > > that was still pretty unstable.
> > >
> > > Anyway, because the binary driver fails here as well (through
> > > bumblebee and so on) there isn't much reverse engineering we can do
> > > besides guessing and trying it on literally every hardware until it
> > > works.
> > >
> > > We also have an upstream bug for this issue:
> > > https://bugzilla.kernel.org/show_bug.cgi?id=156341
> >
> > If you like, I can probably dump the PCIe registers on the card
> > and/or the PCIe port under Windows. The card works there :)
> > Let me know.
> >
> > --
> > MST
> 
> The problem is that we would need to know the register values right
> before suspending the GPU. If someone were able to trace all PCIe
> register reads and writes for the entire suspend/resume process,
> that would be very helpful.


Well, I can pass the card through to a VM and trace it on the hypervisor;
that isn't a problem. The tricky part is the ACPI tables: we would need to
somehow know which ones are relevant in order to pass them to the guest.
Any ideas on that?
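
As a rough illustration of the kind of register comparison discussed above,
here is a minimal Python sketch that diffs two PCI config-space snapshots and
decodes the PCIe link speed fields. It is not a full trace of every read and
write, only a before/after comparison. The helper names (read_config,
diff_config, find_pcie_cap, link_speeds) are made up for this example, and the
device address passed to read_config is a placeholder. The sysfs "config" file
and the register offsets follow the standard PCI/PCIe layout; note that
non-root users typically only get the first 64 bytes of config space.

```python
# Sketch only: compare two PCI config-space dumps and decode link speed.
# All function names here are hypothetical helpers, not an existing tool.
from pathlib import Path

PCI_STATUS = 0x06            # Status register (low byte holds bit 4)
PCI_STATUS_CAP_LIST = 0x10   # "capabilities list present" bit
PCI_CAPABILITY_LIST = 0x34   # pointer to the first capability
PCI_CAP_ID_EXP = 0x10        # PCI Express capability ID
PCI_EXP_LNKSTA = 0x12        # Link Status, relative to the PCIe capability
PCI_EXP_LNKCTL2 = 0x30       # Link Control 2, relative to the PCIe capability


def read_config(bdf):
    """Read raw config space from sysfs; bdf is e.g. '0000:01:00.0' (placeholder)."""
    return Path(f"/sys/bus/pci/devices/{bdf}/config").read_bytes()


def diff_config(before, after):
    """Return (offset, old, new) for every byte that differs."""
    n = min(len(before), len(after))
    return [(off, before[off], after[off])
            for off in range(n) if before[off] != after[off]]


def find_pcie_cap(cfg):
    """Walk the capability list and return the PCIe capability offset, or None."""
    if len(cfg) < 0x40 or not (cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST):
        return None
    ptr = cfg[PCI_CAPABILITY_LIST] & 0xFC
    seen = set()
    while ptr and ptr not in seen and ptr + 1 < len(cfg):
        seen.add(ptr)  # guard against a looping capability list
        if cfg[ptr] == PCI_CAP_ID_EXP:
            return ptr
        ptr = cfg[ptr + 1] & 0xFC
    return None


def link_speeds(cfg):
    """Return (current_speed, target_speed) encodings, or None if unavailable."""
    cap = find_pcie_cap(cfg)
    if cap is None or cap + PCI_EXP_LNKCTL2 + 2 > len(cfg):
        return None
    lnksta = int.from_bytes(cfg[cap + PCI_EXP_LNKSTA:cap + PCI_EXP_LNKSTA + 2], "little")
    lnkctl2 = int.from_bytes(cfg[cap + PCI_EXP_LNKCTL2:cap + PCI_EXP_LNKCTL2 + 2], "little")
    return lnksta & 0xF, lnkctl2 & 0xF  # bits 3:0 hold the speed encoding
```

One would take a snapshot with read_config() right before runtime suspend and
another after the resume attempt, then pass both to diff_config(). A snapshot
diff like this cannot show the order of accesses, so it is a weaker tool than
a hypervisor-side trace.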

-- 
MST

