Looking for a start point for fixing a bug

Адонай Элохим algonkvel at gmail.com
Sun Aug 10 21:03:57 PDT 2014


2014-08-11 1:53 GMT+04:00 Niels Ole Salscheider
<niels_ole at salscheider-online.de>:
> On Monday 11 August 2014, 01:19:32, Адонай Элохим wrote:
>> Hello again, hope you are still reading my texts...
>>
>> I digged through the code and narrowed down the issue I wanted to fix.
>> It appears to be related to the `bool thermal_active` dpm struct
>> member and this piece of code:
>>
>> if (rdev->asic->dpm.force_performance_level) {
>>         if (rdev->pm.dpm.thermal_active) {
>>             enum radeon_dpm_forced_level level = rdev->pm.dpm.forced_level;
>>             /* force low perf level for thermal */
>>             radeon_dpm_force_performance_level(rdev,
>> RADEON_DPM_FORCED_LEVEL_LOW);
>>             /* save the user's level */
>>             rdev->pm.dpm.forced_level = level;
>>         } else {
>>             /* otherwise, user selected level */
>>             radeon_dpm_force_performance_level(rdev,
>> rdev->pm.dpm.forced_level); }
>>     }
>>
>> I did a double check here - at boot `thermal_active` is `false` and
>> thus, performance level is properly initiated. But at resume from
>> suspend `thermal_active` is true and performance level is strictly
>> bound to low profile.
>> Besides you cannot change it via echo 1 > /sys/.../force_dpm_level,
>> again thanks to `thermal_active` checked there.
>>
>> Could you explain meaning of this small boolean to me? I'd like to
>> make a small neat patch fixing this, but I'm scared of doing it in
>> wrong way.
>> Sorry if I'm being too persistent.
>
> I think thermal_active means that the temperature got too high so that low
> clocks have to be used.
>
> Just some idea, but thermal.work only gets scheduled when the high to low
> temperature interrupt occurs. When the temperature is too high before suspend
> (so that thermal_active is true) and it gets low during standby this interrupt
> will not occur. thermal.work is therefore not scheduled...
>
> Ole
>

You were right, Ole. The driver thinks the temperature is too high.
Thanks a lot!
It seems the function ci_set_thermal_temperature_range is missing some lines:


diff --git a/drivers/gpu/drm/radeon/ci_dpm.c b/drivers/gpu/drm/radeon/ci_dpm.c
index 584090a..102a4bc 100644
--- a/radeon/ci_dpm.c
+++ b/radeon/ci_dpm.c
@@ -869,6 +869,9 @@ static int ci_set_thermal_temperature_range(struct
radeon_device *rdev,
        WREG32_SMC(CG_THERMAL_CTRL, tmp);
 #endif

+    rdev->pm.dpm.thermal.min_temp = low_temp;
+    rdev->pm.dpm.thermal.max_temp = high_temp;
+
        return 0;
 }


All other similar callbacks for different families of cards have these
lines. I wonder if there is any specific case for not doing this...
How do I propose it as a patch anyway?

>> Thanks,
>> Oleg
>>
>> 2014-07-22 20:05 GMT+04:00 Alex Deucher <alexdeucher at gmail.com>:
>> > On Tue, Jul 22, 2014 at 8:39 AM, Адонай Элохим <algonkvel at gmail.com>
> wrote:
>> >> Hello all!
>> >>
>> >> I have some spare time and knowledge in C to try to fix some bugs I am
>> >> seeing on my machine.
>> >> So I've checked out and compiled all git trees that I may need and now
>> >> I'm
>> >> beginning to read articles.
>> >>
>> >> And this is the point from where I don't know where to go. I want to fix
>> >> particular bug #79806 [1].
>> >> For me there are many places where this bug can hide - mesa? dri? radeon
>> >> kernel module? and I just don't know whether should I start reading
>> >> articles about mesa hacking or about dri architecture or about kernel
>> >> module development.
>> >>
>> >> Now I think the best thing for me is to start looking through radeon
>> >> kernel
>> >> module code (I've got ingenious idea that power management resides there)
>> >> and read more about its architecture. Is this right? I mean, I just want
>> >> to
>> >> find out, is this a right place to start looking at for this bug?
>> >
>> > The power management is handled in the kernel driver.  See radeon_pm.c
>> > and the relevant *_dpm.c files depending on what asic you have.
>> >
>> > Alex
>> >
>> >> P.S. Sorry for my English in case it's not good, I'm learning it now
>> >>
>> >> P.P.S. And thanks for your hard work!
>> >>
>> >> -------------------------------------------
>> >> [1] https://bugs.freedesktop.org/show_bug.cgi?id=79806
>> >>
>> >> _______________________________________________
>> >> dri-devel mailing list
>> >> dri-devel at lists.freedesktop.org
>> >> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>


More information about the dri-devel mailing list