[PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings

Christian König ckoenig.leichtzumerken at gmail.com
Wed Sep 8 09:38:13 UTC 2021


Am 08.09.21 um 11:29 schrieb Lazar, Lijo:
>
>
> On 9/8/2021 2:32 PM, Yu, Lang wrote:
>> [AMD Official Use Only]
>>
>>
>>
>>> -----Original Message-----
>>> From: Lazar, Lijo <Lijo.Lazar at amd.com>
>>> Sent: Wednesday, September 8, 2021 4:55 PM
>>> To: Yu, Lang <Lang.Yu at amd.com>; Christian König
>>> <ckoenig.leichtzumerken at gmail.com>; amd-gfx at lists.freedesktop.org
>>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Huang, Ray
>>> <Ray.Huang at amd.com>; Tian Tao <tiantao6 at hisilicon.com>
>>> Subject: Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings
>>>
>>>
>>>
>>> On 9/8/2021 1:14 PM, Yu, Lang wrote:
>>>> [AMD Official Use Only]
>>>>
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Lazar, Lijo <Lijo.Lazar at amd.com>
>>>>> Sent: Wednesday, September 8, 2021 3:36 PM
>>>>> To: Christian König <ckoenig.leichtzumerken at gmail.com>; Yu, Lang
>>>>> <Lang.Yu at amd.com>; amd-gfx at lists.freedesktop.org
>>>>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Huang, Ray
>>>>> <Ray.Huang at amd.com>; Tian Tao <tiantao6 at hisilicon.com>
>>>>> Subject: Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at
>>>>> warnings
>>>>>
>>>>>
>>>>>
>>>>> On 9/8/2021 12:07 PM, Christian König wrote:
>>>>>> Am 08.09.21 um 07:56 schrieb Lang Yu:
>>>>>>> sysfs_emit and sysfs_emit_at requrie a page boundary aligned buf
>>>>>>> address. Make them happy!
>>>>>>>
>>>>>>> Warning Log:
>>>>>>> [  492.545174] invalid sysfs_emit_at: buf:00000000f19bdfde at:0 [
>>>>>>> 492.546416] WARNING: CPU: 7 PID: 1304 at fs/sysfs/file.c:765
>>>>>>> sysfs_emit_at+0x4a/0xa0
>>>>>>> [  492.654805] Call Trace:
>>>>>>> [  492.655353]  ? smu_cmn_get_metrics_table+0x40/0x50 [amdgpu] [
>>>>>>> 492.656780]  vangogh_print_clk_levels+0x369/0x410 [amdgpu] [
>>>>>>> 492.658245]  vangogh_common_print_clk_levels+0x77/0x80 [amdgpu] [
>>>>>>> 492.659733]  ? preempt_schedule_common+0x18/0x30 [ 492.660713]
>>>>>>> smu_print_ppclk_levels+0x65/0x90 [amdgpu] [ 492.662107]
>>>>>>> amdgpu_get_pp_od_clk_voltage+0x13d/0x190 [amdgpu] [ 492.663620]
>>>>>>> dev_attr_show+0x1d/0x40
>>>>>>
>>>>>> Mhm, that at least partially doesn't looks like the right 
>>>>>> approach to me.
>>>>>>
>>>>>> Why do we have string printing and sysfs code in the hardware
>>>>>> version specific backend in the first place?
>>>>>>
>>>>>
>>>>> This is a callback meant for printing ASIC specific information to
>>>>> sysfs node. The buffer passed in sysfs read is passed as it is to 
>>>>> the callback API.
>>>>>
>>>>>> That stuff needs to be implemented for each hardware generation and
>>>>>> is now cluttered with sysfs buffer offset calculations.
>>>>>>
>>>>>
>>>>> Looks like the warning happened because of this usage.
>>>>>
>>>>>                   size = amdgpu_dpm_print_clock_levels(adev, 
>>>>> OD_SCLK, buf);
>>>>>                   size += amdgpu_dpm_print_clock_levels(adev, 
>>>>> OD_MCLK,
>>>>> buf+size);
>>>>>                   size += amdgpu_dpm_print_clock_levels(adev,
>>>>> OD_VDDC_CURVE, buf+size);
>>>>>                   size += amdgpu_dpm_print_clock_levels(adev,
>>>>> OD_VDDGFX_OFFSET, buf+size);
>>>>>                   size += amdgpu_dpm_print_clock_levels(adev,
>>>>> OD_RANGE,
>>>>> buf+size);
>>>>>                   size += amdgpu_dpm_print_clock_levels(adev, 
>>>>> OD_CCLK,
>>>>> buf+size);
>>>>>
>>>>>
>>>> [Yu, Lang]
>>>> Yes. So it is fine we just fix the caller 
>>>> amdgpu_get_pp_od_clk_voltage like
>>> following:
>>>>
>>>> static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
>>>>         struct device_attribute *attr,
>>>>         char *buf)
>>>> {
>>>>     struct drm_device *ddev = dev_get_drvdata(dev);
>>>>     struct amdgpu_device *adev = drm_to_adev(ddev);
>>>>     ssize_t size, offset;
>>>>     int ret, i;
>>>>     char temp_buf[512];
>>>>     char clock_type[] = {OD_SCLK, OD_MCLK, OD_VDDC_CURVE,
>>>>                          OD_VDDGFX_OFFSET, OD_RANGE, OD_CCLK};
>>>>
>>>>     if (amdgpu_in_reset(adev))
>>>>         return -EPERM;
>>>>     if (adev->in_suspend && !adev->in_runpm)
>>>>         return -EPERM;
>>>>
>>>>     ret = pm_runtime_get_sync(ddev->dev);
>>>>     if (ret < 0) {
>>>>         pm_runtime_put_autosuspend(ddev->dev);
>>>>         return ret;
>>>>     }
>>>>
>>>>     offset = 0;
>>>>
>>>>     if (adev->powerplay.pp_funcs->print_clock_levels) {
>>>>         for (i = 0; i < ARRAY_SIZE(clock_type); i++) {
>>>>             size = amdgpu_dpm_print_clock_levels(adev,
>>> clock_type[i], buf);
>>>>             if (offset + size > PAGE_SIZE)
>>>>                 break;
>>>>             memcpy(temp_buf + offset, buf, size);
>>>>             offset += size;
>>>>         }
>>>>         memcpy(buf, temp_buf, offset);
>>>>         size = offset;
>>>>     } else {
>>>>         size = sysfs_emit(buf, "\n");
>>>>     }
>>>>     pm_runtime_mark_last_busy(ddev->dev);
>>>>     pm_runtime_put_autosuspend(ddev->dev);
>>>>
>>>>     return size;
>>>> }
>>>>
>>> Prefer to avoid any extra stack or heap usage for buffer. Maybe 
>>> another arg to
>>> pass offset along with buf?
>>>
>> [Yu, Lang]
>> Actually, the buf address contains the offset(offset_in_page(buf)) .
>
> Though it's not a problem based on codeflow, static analysis tools 
> might complain.
>
>> Or we just rollback to sprintf/snprintf.
>>
>
> snprintf with (PAGE_SIZE-size) may be simpler. I think Darren took the 
> effort to convert these, he may have some other ideas.

This is not what I meant. See from the design point of view the 
print_clock_levels() callback is the bad idea to begin with.

What we should have instead is a callback which returns the clock as a 
value which is then printed in the amdgpu_get_pp_od_clk_voltage() function.

This avoids passing around the buffer and remaining size everywhere and 
also guarantees that the sysfs have unified printing over all hardware 
generations.

Regards,
Christian.



More information about the amd-gfx mailing list