[Nouveau] 4.20.0-rc3 nouveau/Quadro P2000 Mobile: runpm causing ACPI errors, lockups

Karol Herbst kherbst at redhat.com
Thu Nov 29 10:53:53 UTC 2018


On Thu, Nov 29, 2018 at 2:29 AM Michael S. Tsirkin <mst at redhat.com> wrote:
>
> On Thu, Nov 29, 2018 at 12:21:31AM +0100, Karol Herbst wrote:
> > this was already debugged and there is no point in searching inside
> > the Firmware. It's not a firmware bug or anything.
> >
> > The proper fix is to do something inside Nouveau so that we don't
> > upset the device and are able to runtime resume it again.
> >
> > The initial thing we do inside Nouveau to cause those issues is to run
> > the so-called "DEVINIT" script inside the vbios to initialize the
> > GPU. The problem is that it changes something in the PCIe
> > configuration so that the GPU isn't able to runtime resume anymore.
> > I am in contact with Nvidia about that issue and hopefully we get
> > the proper answers. When I was digging into that myself I was able
> > to make the situation more stable by setting the PCIe link speed to
> > the boot defaults, but that was still pretty unstable.
> >
> > Anyway, because the binary driver fails here as well (through
> > bumblebee and so on) there isn't much reverse engineering we can do
> > besides guessing and trying it on literally every piece of hardware
> > until it works.
> >
> > We also have an upstream bug for this issue:
> > https://bugzilla.kernel.org/show_bug.cgi?id=156341
>
> If you like I can probably dump the pcie registers on card
> and/or the pcie port under windows. The card works there :)
> Let me know.
>
> --
> MST

The problem is that we would need to know the register values right
before the GPU is suspended. If someone were able to trace all PCIe
register reads and writes for the entire suspend/resume process,
that would be very helpful.
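
As a rough illustration of what comparing register state could look
like, here is a minimal Python sketch that snapshots PCI config space
through sysfs, decodes the PCIe link speed fields, and diffs two
snapshots. It is not a trace of every register access (that would need
a proper tracing setup), only a before/after comparison. The function
names are made up for this example, and reading the full config space
normally requires root. Note also that reading the config file may
itself wake a runtime-suspended device, so treat the results with care.

```python
import struct
from pathlib import Path

PCI_STATUS = 0x06         # status register (low byte), bit 4 = capability list
PCI_CAP_PTR = 0x34        # offset of the first capability pointer
PCI_CAP_ID_EXP = 0x10     # capability ID of the PCI Express capability
PCI_EXP_LNKCAP = 0x0C     # Link Capabilities (32 bit), relative to the capability
PCI_EXP_LNKSTA = 0x12     # Link Status (16 bit)
PCI_EXP_LNKCTL2 = 0x30    # Link Control 2 (16 bit)


def read_config(bdf):
    """Read raw config space of a device, e.g. bdf = '0000:01:00.0' (assumed)."""
    return Path(f"/sys/bus/pci/devices/{bdf}/config").read_bytes()


def find_pcie_cap(cfg):
    """Walk the capability list; return the PCIe capability offset or None."""
    if len(cfg) < 0x40 or not (cfg[PCI_STATUS] & 0x10):
        return None
    ptr = cfg[PCI_CAP_PTR] & 0xFC
    seen = set()  # guard against malformed, looping capability lists
    while ptr >= 0x40 and ptr + 1 < len(cfg) and ptr not in seen:
        seen.add(ptr)
        if cfg[ptr] == PCI_CAP_ID_EXP:
            return ptr
        ptr = cfg[ptr + 1] & 0xFC
    return None


def link_info(cfg):
    """Decode link speed/width fields (speed 1 = 2.5 GT/s, 2 = 5 GT/s, ...)."""
    cap = find_pcie_cap(cfg)
    if cap is None or cap + PCI_EXP_LNKCTL2 + 2 > len(cfg):
        return None
    (lnkcap,) = struct.unpack_from("<I", cfg, cap + PCI_EXP_LNKCAP)
    (lnksta,) = struct.unpack_from("<H", cfg, cap + PCI_EXP_LNKSTA)
    (lnkctl2,) = struct.unpack_from("<H", cfg, cap + PCI_EXP_LNKCTL2)
    return {
        "max_speed": lnkcap & 0xF,
        "cur_speed": lnksta & 0xF,
        "cur_width": (lnksta >> 4) & 0x3F,
        "target_speed": lnkctl2 & 0xF,
    }


def diff_config(before, after):
    """Return (offset, old_byte, new_byte) for every byte that changed."""
    n = min(len(before), len(after))
    return [(off, before[off], after[off])
            for off in range(n) if before[off] != after[off]]
```

One would call read_config() for the GPU and for its upstream port
once at boot and once after Nouveau has run DEVINIT, then pass both
snapshots to diff_config() and link_info() to see which bytes and
link fields differ.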

