[Nouveau] Addressing the problem of noisy GPUs under Nouveau

Martin Peres martin.peres at free.fr
Thu Nov 23 22:48:50 UTC 2017


On 23/11/17 10:06, John Hubbard wrote:
> On 11/22/2017 05:07 PM, Martin Peres wrote:
>> Hey,
>>
>> Thanks for your answer, Andy!
>>
>> On 22/11/17 04:06, Ilia Mirkin wrote:
>>> On Tue, Nov 21, 2017 at 8:29 PM, Andy Ritger <aritger at nvidia.com> wrote:
>>> Martin's question was very long, but it boils down to this:
>>>
>>> How do we compute the correct values to write into the e114/e118 pwm
>>> registers based on the VBIOS contents and current state of the board
>>> (like temperature).
>>
>> Unfortunately, it can also be the e11c/e120 couple, or 0x200d8/dc on
>> GF119+, or 0x200cd/d0 on Kepler+.
>>
>> At least, it looks like we know which PWM controller we need to drive, so
>> I did not want to muddy the water even more by giving register
>> addresses, rather concentrating on the problem at hand: How to compute
>> the duty value for the PWM controller.
>>
>>>
>>> We generally do this right, but appear to get it extra-wrong for certain GPUs.
>>
>> Yes... So far, we are always safe, but users tend to mind when their
>> computers sound like a jumbo jet at take-off... Who would have thought? :D
>>
>> Anyway, looking forward to your answer!
>>
>> Cheers,
>> Martin
> 
> 
> Hi Martin,
> 
> One of our firmware engineers thinks that this looks a lot like PWM inversion.
> For some SKUs, the interpretation of the PWM duty cycle is inverted. That 
> would probably make it *very* difficult to find a sensible algorithm that 
> covered all the SKUs, given that some are inverted and others are not.
> 
> For the noisy GPUs, a very useful experiment would be to try inverting it, 
> like this:
> 
> 	pwmDutyCycle = pwmPeriod - pwmDutyCycle;
> 
> ...and then see if fan control starts behaving closer to how you've actually 
> programmed it.
> 
> Would that be easy enough to try out? It should help narrow down the
> problem at least.
>

Hey John,

Unfortunately, we already know about PWM inversion: the GPIO entry
associated with the fan carries an "inverted" flag that tells us which
mode to use. Nouveau has supported this for a long time. At the very
least, this is not the problem on my GF108.
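
To make the inversion handling concrete, here is a minimal sketch of the
idea. It is not Nouveau's actual code: the function name and parameters
are hypothetical, and it assumes the duty and period are expressed in
the same PWM clock ticks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not taken from Nouveau):
 * turn a logical duty into the raw value to program, honouring the
 * "inverted" flag read from the fan's GPIO entry.
 */
static uint32_t pwm_raw_duty(uint32_t period, uint32_t duty, bool inverted)
{
	/* A duty longer than the period makes no sense. */
	assert(duty <= period);

	/* Same operation as pwmDutyCycle = pwmPeriod - pwmDutyCycle. */
	return inverted ? period - duty : duty;
}
```

With a period of 1000 ticks and a logical duty of 250, this gives 250
for a non-inverted fan and 750 for an inverted one.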

I am certain that the problem I am seeing is related to the vbios table
I wrote about (BIT P, offset 0x18). It is used to compute which PWM duty
I should use for both 0% and 100% fan speed.

Computing the value for 0% fan speed is difficult because of the
non-continuous nature of some of the functions[1], but I can always
over-approximate. However, I failed to accurately compute the duty I
need to write to get the 100% fan speed (I have cases where I greatly
over-estimate it...).
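
For reference, once the two end points are known, the rest is simple.
The sketch below assumes a linear mapping between them, which is an
assumption on my part. duty_min and duty_max stand for the duties at 0%
and 100% fan speed; how to derive them from the vbios table is exactly
the open question, so they are plain inputs here, and all names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): map a fan speed in percent
 * to a PWM duty by linear interpolation between the duty for 0%
 * (duty_min) and the duty for 100% (duty_max).
 */
static uint32_t fan_percent_to_duty(uint32_t duty_min, uint32_t duty_max,
				    uint32_t percent)
{
	assert(duty_min <= duty_max);

	/* Clamp out-of-range requests to full speed. */
	if (percent > 100)
		percent = 100;

	/*
	 * Round up so that any error over-approximates the duty:
	 * a fan that spins slightly too fast is the safe failure mode.
	 */
	return duty_min +
	       (uint32_t)(((uint64_t)(duty_max - duty_min) * percent + 99) / 100);
}
```

With duty_min = 300 and duty_max = 900, a request for 50% gives 600. An
over-estimated duty_max shifts every intermediate value up with it,
which matches the noisy behaviour.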

Could you please check out the vbios table I am pointing at? I am quite
sure that your documentation will be clearer than my babbling :D

Thanks,
Martin

[1] http://fs.mupuf.org/nvidia/fan_calib/pwm_offset.png

