[Nouveau] Addressing the problem of noisy GPUs under Nouveau

Karol Herbst kherbst at redhat.com
Wed Nov 22 03:55:39 UTC 2017


On Wed, Nov 22, 2017 at 3:06 AM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> On Tue, Nov 21, 2017 at 8:29 PM, Andy Ritger <aritger at nvidia.com> wrote:
>> Hi Martin,
>
> Martin should have complete answers,
>
>>
>> I was asked to clarify a few things:
>>
>> (1) Are all the user reports of loud fans on Fermi-era GPUs?
>
> Yes. Although I believe some GK208 users are also having trouble,
> including yours truly. (It's been quite a while since I've checked
> though... my memory is weak in that regard.)
>

I think there are some Keplers where we drive the fans too loud? Maybe
it got fixed, but I am sure some users complained about this on Kepler
GPUs.

>>
>> (2) When the VBIOS POSTs the card, it loads initial ucode onto the Falcon
>> processor (PMU), which will do basic fan management on its own.  We call this
>> init ucode "IFR" (Init From ROM).  nvidia.ko will restore the IFR ucode when
>> unloaded.  I assume the loud fan symptom occurs after Nouveau is loaded and
>> running, correct?  I.e., this is a problem in Nouveau's fan control
>> programming, rather than a problem in IFR.
>
> Correct.
>
>>
>> (3) IFR will run until something else is loaded on the Falcon processor (PMU).
>> On Fermi, I assume the Nouveau kernel driver is uploading the Nouveau-written
>> ucode from here:
>>
>>     drivers/gpu/drm/nouveau/nvkm/subdev/pmu/fuc
>>
>> correct?  I only ask to rule out the possibility that IFR and Nouveau are both
>> attempting to program fans simultaneously.  The symptoms you describe don't
>> sound like that, but just double checking...
>
> Correct.
>
>>
>> (4) Given the PMU ucode debacle, I'm embarrassed to ask, but at least on Fermi,
>> how much does Nouveau strictly depend on Nouveau's PMU ucode?  Would it be an
>> option to just let IFR continue to manage fans?
>
> Reclocking is still on our horizon, which clearly won't happen without
> nouveau PMU code loaded. Not sure what it's used for until reclocking
> becomes a thing on Fermi.
>

Well, I plan to use the PMU for the PMU counter readout code. Not that
it matters much on Fermi...

>>
>> (5) Lastly, I was asked how Nouveau determines what fan speed to (attempt
>> to) program.
>
> I'll let Martin answer this, but as you're probably aware, there's 2
> different ways this can be done - there might be a PWM, we might have
> to toggle it manually. Maybe something else still.
>
> Have a look at drm/nouveau/nvkm/subdev/therm/fan.c and the various
> bits it ends up calling (pre-GF119 fermi's end up with the nv50
> fan_set, I believe).
>
> The bios stuff is parsed in nvkm/subdev/bios/fan.c and therm.c,
> although I believe Martin's latest analysis is more advanced than
> what's in that code.
>
> Martin's question was very long, but it boils down to this:
>
> How do we compute the correct values to write into the e114/e118 pwm
> registers based on the VBIOS contents and current state of the board
> (like temperature)?
>
> We generally do this right, but appear to get it extra-wrong for certain GPUs.
>

Well, the short answer is: Nouveau parses the VBIOS and sees what it has
to do. Apparently the result is wrong in some cases. I don't think
Nouveau tries to do anything else, like having its own curves for
calculating fan speeds.
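
To make the question concrete, here is a minimal sketch of the kind of
computation being discussed: pick a duty percentage from the current
temperature, then scale it to a PWM duty count for a given period. This
is not Nouveau's code. The struct, the field and function names, and the
linear policy are all assumptions made for illustration; the real limits
come from the VBIOS, and how they map onto the PWM registers is exactly
what is unclear for the affected boards.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fan policy. Field names are illustrative only and do
 * not correspond to Nouveau's structures or to any VBIOS table layout. */
struct fan_policy {
    int min_temp_c;    /* at or below this, run at min_duty_pct */
    int max_temp_c;    /* at or above this, run at max_duty_pct */
    int min_duty_pct;  /* lowest allowed duty, in percent */
    int max_duty_pct;  /* highest allowed duty, in percent */
};

/* Assumed policy: interpolate the duty linearly between the two
 * temperature bounds and clamp outside of them. */
static int fan_duty_pct(const struct fan_policy *p, int temp_c)
{
    assert(p->max_temp_c > p->min_temp_c);

    if (temp_c <= p->min_temp_c)
        return p->min_duty_pct;
    if (temp_c >= p->max_temp_c)
        return p->max_duty_pct;

    return p->min_duty_pct +
           (temp_c - p->min_temp_c) * (p->max_duty_pct - p->min_duty_pct) /
           (p->max_temp_c - p->min_temp_c);
}

/* Scale a percentage to a duty count for a PWM period of `divs` ticks.
 * The 64-bit intermediate avoids overflow for large periods. */
static uint32_t pwm_duty_count(uint32_t divs, int duty_pct)
{
    assert(duty_pct >= 0 && duty_pct <= 100);
    return (uint32_t)(((uint64_t)divs * (uint32_t)duty_pct) / 100);
}
```

With a made-up policy of 30% at 40 C rising to 100% at 90 C, a reading
of 65 C gives a 65% duty, which for a period of 1000 ticks is a duty
count of 650. If the min/max duty or the period is read wrongly from
the VBIOS, the fan ends up louder than intended even though the
temperature handling itself is fine.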

> Cheers,
>
>   -ilia

