[Nouveau] Addressing the problem of noisy GPUs under Nouveau

Martin Peres martin.peres at free.fr
Mon Jan 29 17:24:07 UTC 2018


On 29/01/18 09:51, John Hubbard wrote:
> On 01/28/2018 04:05 PM, Martin Peres wrote:
>> On 29/01/18 01:24, Martin Peres wrote:
>>> On 28/11/17 07:32, John Hubbard wrote:
>>>> On 11/23/2017 02:48 PM, Martin Peres wrote:
>>>>> On 23/11/17 10:06, John Hubbard wrote:
>>>>>> On 11/22/2017 05:07 PM, Martin Peres wrote:
>>>>>>> Hey,
>>>>>>>
>>>>>>> Thanks for your answer, Andy!
>>>>>>>
>>>>>>> On 22/11/17 04:06, Ilia Mirkin wrote:
>>>>>>>> On Tue, Nov 21, 2017 at 8:29 PM, Andy Ritger <aritger at nvidia.com> wrote:
>>>>>>>> Martin's question was very long, but it boils down to this:
>>>>>>>>
>>>>>>>> How do we compute the correct values to write into the e114/e118 pwm
>>>>>>>> registers based on the VBIOS contents and current state of the board
>>>>>>>> (like temperature)?
>>>>>>>
>>>>>>> Unfortunately, it can also be the e11c/e120 couple, or 0x200d8/dc on
>>>>>>> GF119+, or 0x200cd/d0 on Kepler+.
>>>>>>>
>>>>>>> At least, it looks like we know which PWM controller we need to drive, so
>>>>>>> I did not want to muddy the waters even more by giving register
>>>>>>> addresses, rather concentrating on the problem at hand: How to compute
>>>>>>> the duty value for the PWM controller.
>>>>>>>
>>>>>>>>
>>>>>>>> We generally do this right, but appear to get it extra-wrong for certain GPUs.
>>>>>>>
>>>>>>> Yes... So far, we are always safe, but users tend to mind when their
>>>>>>> computers sound like a jumbo jet at takeoff... Who would have thought? :D
>>>>>>>
>>>>>>> Anyway, looking forward to your answer!
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Martin
>>>>>>
>>>>>>
>>>>>> Hi Martin,
>>>>>>
>>>>>> One of our firmware engineers thinks that this looks a lot like PWM inversion.
>>>>>> For some SKUs, the interpretation of the PWM duty cycle is inverted. That 
>>>>>> would probably make it *very* difficult to find a sensible algorithm that 
>>>>>> covered all the SKUs, given that some are inverted and others are not.
>>>>>>
>>>>>> For the noisy GPUs, a very useful experiment would be to try inverting it, 
>>>>>> like this:
>>>>>>
>>>>>> 	pwmDutyCycle = pwmPeriod - pwmDutyCycle;
>>>>>>
>>>>>> ...and then see if fan control starts behaving closer to how you've actually 
>>>>>> programmed it.
>>>>>>
>>>>>> Would that be easy enough to try out? It should help narrow down the
>>>>>> problem at least.
>>>>>>
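The inversion experiment suggested above can be sketched as a small C helper. This is only an illustration under stated assumptions: the function name pwm_effective_duty and the "inverted" flag parameter are hypothetical, not Nouveau or NVIDIA identifiers, and the period and duty are assumed to be expressed in the same units.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a Nouveau function): return the duty value to
 * program, given the PWM period and the requested duty in the same units.
 * When the fan is flagged as inverted, the duty is mirrored around the
 * period, as in "pwmDutyCycle = pwmPeriod - pwmDutyCycle".
 */
static uint32_t pwm_effective_duty(uint32_t period, uint32_t duty, int inverted)
{
	/* A duty larger than the period would underflow when inverted. */
	assert(duty <= period);

	return inverted ? period - duty : duty;
}
```

With a period of 100 and a requested duty of 30, the non-inverted case keeps 30 while the inverted case yields 70.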
>>>>>
>>>>> Hey John,
>>>>>
>>>>> Unfortunately, we know about PWM inversion, and one can know which mode
>>>>> to use based on the GPIO entry associated with the fan (inverted). We have
>>>>> had support for this in Nouveau for a long time. At the very least, this
>>>>> is not the problem on my GF108.
>>>>>
>>>>> I am certain that the problem I am seeing is related to this vbios table
>>>>> I wrote about (BIT P, offset 0x18). It is used to compute what PWM duty
>>>>> I should use for both 0 and 100% of the fan speed.
>>>>>
>>>>> Computing the value for 0% fan speed is difficult because of the
>>>>> non-continuous nature of some of the functions[1], but I can always
>>>>> over-approximate. However, I failed to accurately compute the duty I
>>>>> need to write to get the 100% fan speed (I have cases where I greatly
>>>>> over-estimate it...).
>>>>>
>>>>> Could you please check out the vbios table I am pointing at? I am quite
>>>>> sure that your documentation will be clearer than my babbling :D
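To make the role of the two endpoint duties concrete, here is a minimal sketch of how a requested fan speed could be mapped to a PWM duty once the duties for 0% and 100% are known. The linear interpolation between the endpoints is an assumption made for illustration; the thread does not state how the vbios table defines the mapping, and the name fan_percent_to_duty and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a Nouveau function): map a fan speed in percent
 * to a PWM duty, given the duties that correspond to 0% and 100%.
 * ASSUMPTION: the mapping between the two endpoints is linear.
 */
static uint32_t fan_percent_to_duty(uint32_t duty_min, uint32_t duty_max,
				    uint32_t percent)
{
	assert(duty_min <= duty_max);

	/* Clamp out-of-range requests to full speed. */
	if (percent > 100)
		percent = 100;

	/* 64-bit intermediate avoids overflow in the multiplication. */
	return duty_min +
	       (uint32_t)(((uint64_t)(duty_max - duty_min) * percent) / 100);
}
```

Under this assumption, over-estimating the 100% duty stretches the whole range, so every intermediate speed comes out higher than intended, which would be consistent with the noisy-fan symptom described in this thread.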
>>>>
>>>> Yes. We will check on this. There has been some productive discussion 
>>>> internally, but it will take some more investigation.
>>>>
>>>> thanks,
>>>> John Hubbard
>>>
>>> Have the productive discussions panned out?
> 
> Yes, we concluded our discussions, and decided that I should study the situation 
> and write some documentation.  I just finished my research and writeup late last Friday, 
> though, so my colleagues haven't had a chance to review it. Not to put undue
> pressure on them, but I'm hoping that will go quickly now. The long pole is
> done. :)
> 
> I was going to wait until the review was done, to respond, but I wanted to ACK 
> this and to let you know that I do realize that the tables below are not directly 
> answering your question.
> 
> (What happened here is: the new tables below are not actually what I've 
> personally been working on; they just happen to be a very good set of supporting 
> documentation in the exact same area. One of our teammates was already working 
> on these independently, and managed to get them released.)

Thanks for the information and your work, it is greatly appreciated.

No need to hurry, I will be away from home for 2 weeks.

Thanks,
Martin
> 
> thanks,
> 


