[Nouveau] Addressing the problem of noisy GPUs under Nouveau

Ilia Mirkin imirkin at alum.mit.edu
Wed Nov 22 02:06:01 UTC 2017


On Tue, Nov 21, 2017 at 8:29 PM, Andy Ritger <aritger at nvidia.com> wrote:
> Hi Martin,

Martin should have complete answers, but here's what I know:

>
> I was asked to clarify a few things:
>
> (1) Are all the user reports of loud fans on Fermi-era GPUs?

Yes. Although I believe some GK208 users are also having trouble,
including yours truly. (It's been quite a while since I've checked
though... my memory is weak in that regard.)

>
> (2) When the VBIOS POSTs the card, it loads initial ucode onto the Falcon
> processor (PMU), which will do basic fan management on its own.  We call this
> init ucode "IFR" (Init From ROM).  nvidia.ko will restore the IFR ucode when
> unloaded.  I assume the loud fan symptom occurs after Nouveau is loaded and
> running, correct?  I.e., this is a problem in Nouveau's fan control
> programming, rather than a problem in IFR.

Correct.

>
> (3) IFR will run until something else is loaded on the Falcon processor (PMU).
> On Fermi, I assume the Nouveau kernel driver is uploading the Nouveau-written
> ucode from here:
>
>     drivers/gpu/drm/nouveau/nvkm/subdev/pmu/fuc
>
> correct?  I only ask to rule out the possibility that IFR and Nouveau are both
> attempting to program fans simultaneously.  The symptoms you describe don't
> sound like that, but just double checking...

Correct.

>
> (4) Given the PMU ucode debacle, I'm embarrassed to ask, but at least on Fermi,
> how much does Nouveau strictly depend on Nouveau's PMU ucode?  Would it be an
> option to just let IFR continue to manage fans?

Reclocking is still on our horizon, which clearly won't happen without
nouveau PMU code loaded. Not sure what it's used for until reclocking
becomes a thing on Fermi.

>
> (5) Lastly, I was asked how Nouveau determines what fan speed to (attempt
> to) program.

I'll let Martin answer this, but as you're probably aware, there are two
different ways this can be done: there might be a PWM, or we might have
to toggle it manually. Maybe something else still.
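
To illustrate the manual-toggle case, here is a rough sketch only, not
Nouveau's actual code. The function name and the period parameter are made
up for the example. Driving a fan by toggling a line by hand amounts to
splitting a fixed period into an "on" time and an "off" time:

```python
def toggle_intervals(percent, period_us):
    """Split one software-PWM period into (on_us, off_us).

    Hypothetical helper: `percent` is the requested fan duty (0-100) and
    `period_us` is the toggle period in microseconds. Both names are
    assumptions made for this sketch.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if period_us <= 0:
        raise ValueError("period_us must be positive")
    # Time the line stays asserted, rounded down to a whole microsecond.
    on_us = period_us * percent // 100
    # The rest of the period is spent deasserted.
    return on_us, period_us - on_us
```

With a hardware PWM the same split is programmed once into registers
rather than being timed by the driver.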

Have a look at drm/nouveau/nvkm/subdev/therm/fan.c and the various
bits it ends up calling (pre-GF119 Fermis end up with the nv50
fan_set, I believe).

The bios stuff is parsed in nvkm/subdev/bios/fan.c and therm.c,
although I believe Martin's latest analysis is more advanced than
what's in that code.

Martin's question was very long, but it boils down to this:

How do we compute the correct values to write into the e114/e118 PWM
registers based on the VBIOS contents and the current state of the board
(like temperature)?

We generally do this right, but appear to get it extra-wrong for certain GPUs.
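
To make the shape of that computation concrete, here is a minimal sketch.
It assumes the two registers take a period (divider) count and a duty
count, which this mail does not spell out. It also assumes a simple linear
temperature-to-duty mapping. All function and parameter names are
hypothetical, and none of this is claimed to match what the hardware or
the VBIOS actually expects:

```python
def fan_percent(temp_c, min_temp, max_temp, min_duty, max_duty):
    """Linearly map a temperature to a fan duty percentage (assumed policy)."""
    if max_temp <= min_temp:
        raise ValueError("max_temp must be greater than min_temp")
    if temp_c <= min_temp:
        return min_duty
    if temp_c >= max_temp:
        return max_duty
    span = max_temp - min_temp
    return min_duty + (temp_c - min_temp) * (max_duty - min_duty) // span


def pwm_register_values(percent, pwm_clock_hz, pwm_freq_hz, inverted=False):
    """Return (divider, duty) counts for a requested fan percentage.

    Assumptions for this sketch: `pwm_clock_hz` is the PWM source clock,
    `pwm_freq_hz` is the fan PWM frequency one would hope to read from the
    VBIOS, and `inverted` marks a fan whose control line is active-low.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if pwm_freq_hz <= 0 or pwm_clock_hz < pwm_freq_hz:
        raise ValueError("need 0 < pwm_freq_hz <= pwm_clock_hz")
    divider = pwm_clock_hz // pwm_freq_hz    # clock ticks per PWM period
    duty = (divider * percent + 99) // 100   # round up to a whole tick
    if inverted:
        duty = divider - duty                # active-low: flip the duty
    return divider, duty


# Example: 60 C on a 40-80 C ramp from 30% to 100% duty,
# with an assumed 27 MHz PWM clock and a 2500 Hz fan frequency.
percent = fan_percent(60, 40, 80, 30, 100)
divider, duty = pwm_register_values(percent, 27_000_000, 2500)
```

If any of those inputs is misread from the VBIOS (frequency, inversion,
duty limits), the fan ends up at the wrong speed even though the
arithmetic itself is fine.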

Cheers,

  -ilia

