[PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings

Christian König ckoenig.leichtzumerken at gmail.com
Thu Sep 9 08:01:12 UTC 2021


Am 09.09.21 um 05:28 schrieb Lazar, Lijo:
> On 9/9/2021 8:13 AM, Yu, Lang wrote:
>> [AMD Official Use Only]
>>
>> So the final decision is rollback to scnprintf().
>>
>> If we can define our own helper functions like sysfs_emit/sysfs_emit_at
>>
>> but without page boundary aligned limitation to make life easier?
>>
>
> No, we do want to make it clear that this function is used for sysfs 
> files and make use of the extra checks provided by the sysfs_emit* 
> functions. Looking at the origins of sysf_emit_at() specifically, 
> there are indeed some cases of printing more than one values per file 
> and multi-line usage.

Correct, but those are rather limited and well documented special cases. 
E.g. for example if you need to grab a lock to get multiple values which 
are supposed to be coherent to each other.

I think that's the case here, so printing multiple values is probably ok 
in general. But we still need to get the implementation straight.

> So I'm fine with your original patch. Maybe, you can make the 
> intention explicit by keeping the offset and buf start calculations in 
> a separate inline function.
>     smu_get_sysfs_buf()

Exactly that is what is not ok. So once more the intended use case of 
those functions is:

offs = sysfs_emit(page, ...);
offs += sysfs_emit_at(page, offs, ....);
offs += sysfs_emit_at(page, offs, ....);
...

Another possible alternative which I think should be allowed is:

offs = 0;
for_each_clock_in_my_device(..) {
     offs += sysfs_emit_at(page, offs, ....);
}

But when you are calculating the initial offset manually then there is 
certainly something wrong here and that is not the intended usage pattern.

Regards,
Christian.

>
> Thanks,
> Lijo
>
>> Regards,
>>
>> Lang
>>
>> *From:* Powell, Darren <Darren.Powell at amd.com>
>> *Sent:* Thursday, September 9, 2021 6:18 AM
>> *To:* Christian König <ckoenig.leichtzumerken at gmail.com>; Lazar, Lijo 
>> <Lijo.Lazar at amd.com>; Yu, Lang <Lang.Yu at amd.com>; 
>> amd-gfx at lists.freedesktop.org
>> *Cc:* Deucher, Alexander <Alexander.Deucher at amd.com>; Huang, Ray 
>> <Ray.Huang at amd.com>; Tian Tao <tiantao6 at hisilicon.com>
>> *Subject:* Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings
>>
>> [AMD Official Use Only]
>>
>> ------------------------------------------------------------------------
>>
>> *From:*Christian König <ckoenig.leichtzumerken at gmail.com 
>> <mailto:ckoenig.leichtzumerken at gmail.com>>
>> *Sent:* Wednesday, September 8, 2021 8:43 AM
>> *To:* Lazar, Lijo <Lijo.Lazar at amd.com <mailto:Lijo.Lazar at amd.com>>; 
>> Yu, Lang <Lang.Yu at amd.com <mailto:Lang.Yu at amd.com>>; 
>> amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org> 
>> <amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>>
>> *Cc:* Deucher, Alexander <Alexander.Deucher at amd.com 
>> <mailto:Alexander.Deucher at amd.com>>; Huang, Ray <Ray.Huang at amd.com 
>> <mailto:Ray.Huang at amd.com>>; Tian Tao <tiantao6 at hisilicon.com 
>> <mailto:tiantao6 at hisilicon.com>>; Powell, Darren 
>> <Darren.Powell at amd.com <mailto:Darren.Powell at amd.com>>
>> *Subject:* Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings
>>
>> Am 08.09.21 um 12:22 schrieb Lazar, Lijo:
>>  > On 9/8/2021 3:08 PM, Christian König wrote:
>>  >> Am 08.09.21 um 11:29 schrieb Lazar, Lijo:
>>  >>> On 9/8/2021 2:32 PM, Yu, Lang wrote:
>>  >>>> [AMD Official Use Only]
>>  >>>>> -----Original Message-----
>>  >>>>> From: Lazar, Lijo <Lijo.Lazar at amd.com 
>> <mailto:Lijo.Lazar at amd.com>>
>>  >>>>> Sent: Wednesday, September 8, 2021 4:55 PM
>>  >>>>> To: Yu, Lang <Lang.Yu at amd.com <mailto:Lang.Yu at amd.com>>; 
>> Christian König
>>  >>>>> <ckoenig.leichtzumerken at gmail.com 
>> <mailto:ckoenig.leichtzumerken at gmail.com>>; 
>> amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>
>>  >>>>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com 
>> <mailto:Alexander.Deucher at amd.com>>; Huang, Ray
>>  >>>>> <Ray.Huang at amd.com <mailto:Ray.Huang at amd.com>>; Tian Tao 
>> <tiantao6 at hisilicon.com <mailto:tiantao6 at hisilicon.com>>
>>  >>>>> Subject: Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at
>>  >>>>> warnings
>>  >>>>>
>>  >>>>>
>>  >>>>>
>>  >>>>> On 9/8/2021 1:14 PM, Yu, Lang wrote:
>>  >>>>>> [AMD Official Use Only]
>>  >>>>>>
>>  >>>>>>
>>  >>>>>>
>>  >>>>>>> -----Original Message-----
>>  >>>>>>> From: Lazar, Lijo <Lijo.Lazar at amd.com 
>> <mailto:Lijo.Lazar at amd.com>>
>>  >>>>>>> Sent: Wednesday, September 8, 2021 3:36 PM
>>  >>>>>>> To: Christian König <ckoenig.leichtzumerken at gmail.com 
>> <mailto:ckoenig.leichtzumerken at gmail.com>>; Yu, Lang
>>  >>>>>>> <Lang.Yu at amd.com <mailto:Lang.Yu at amd.com>>; 
>> amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>
>>  >>>>>>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com 
>> <mailto:Alexander.Deucher at amd.com>>; Huang, Ray
>>  >>>>>>> <Ray.Huang at amd.com <mailto:Ray.Huang at amd.com>>; Tian Tao 
>> <tiantao6 at hisilicon.com <mailto:tiantao6 at hisilicon.com>>
>>  >>>>>>> Subject: Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at
>>  >>>>>>> warnings
>>  >>>>>>>
>>  >>>>>>>
>>  >>>>>>>
>>  >>>>>>> On 9/8/2021 12:07 PM, Christian König wrote:
>>  >>>>>>>> Am 08.09.21 um 07:56 schrieb Lang Yu:
>>  >>>>>>>>> sysfs_emit and sysfs_emit_at requrie a page boundary 
>> aligned buf
>>  >>>>>>>>> address. Make them happy!
>>  >>>>>>>>>
>>  >>>>>>>>> Warning Log:
>>  >>>>>>>>> [  492.545174] invalid sysfs_emit_at: buf:00000000f19bdfde 
>> at:0 [
>>  >>>>>>>>> 492.546416] WARNING: CPU: 7 PID: 1304 at fs/sysfs/file.c:765
>>  >>>>>>>>> sysfs_emit_at+0x4a/0xa0
>>  >>>>>>>>> [  492.654805] Call Trace:
>>  >>>>>>>>> [  492.655353]  ? smu_cmn_get_metrics_table+0x40/0x50 
>> [amdgpu] [
>>  >>>>>>>>> 492.656780] vangogh_print_clk_levels+0x369/0x410 [amdgpu] [
>>  >>>>>>>>> 492.658245] vangogh_common_print_clk_levels+0x77/0x80 
>> [amdgpu] [
>>  >>>>>>>>> 492.659733]  ? preempt_schedule_common+0x18/0x30 [ 
>> 492.660713]
>>  >>>>>>>>> smu_print_ppclk_levels+0x65/0x90 [amdgpu] [ 492.662107]
>>  >>>>>>>>> amdgpu_get_pp_od_clk_voltage+0x13d/0x190 [amdgpu] [ 
>> 492.663620]
>>  >>>>>>>>> dev_attr_show+0x1d/0x40
>>  >>>>>>>>
>>  >>>>>>>> Mhm, that at least partially doesn't looks like the right
>>  >>>>>>>> approach to me.
>>  >>>>>>>>
>>  >>>>>>>> Why do we have string printing and sysfs code in the hardware
>>  >>>>>>>> version specific backend in the first place?
>>  >>>>>>>>
>>  >>>>>>>
>>  >>>>>>> This is a callback meant for printing ASIC specific 
>> information to
>>  >>>>>>> sysfs node. The buffer passed in sysfs read is passed as it is
>>  >>>>>>> to the callback API.
>>  >>>>>>>
>>  >>>>>>>> That stuff needs to be implemented for each hardware 
>> generation
>>  >>>>>>>> and
>>  >>>>>>>> is now cluttered with sysfs buffer offset calculations.
>>  >>>>>>>>
>>  >>>>>>>
>>  >>>>>>> Looks like the warning happened because of this usage.
>>  >>>>>>>
>>  >>>>>>>                   size = amdgpu_dpm_print_clock_levels(adev,
>>  >>>>>>> OD_SCLK, buf);
>>  >>>>>>>                   size += amdgpu_dpm_print_clock_levels(adev,
>>  >>>>>>> OD_MCLK,
>>  >>>>>>> buf+size);
>>  >>>>>>>                   size += amdgpu_dpm_print_clock_levels(adev,
>>  >>>>>>> OD_VDDC_CURVE, buf+size);
>>  >>>>>>>                   size += amdgpu_dpm_print_clock_levels(adev,
>>  >>>>>>> OD_VDDGFX_OFFSET, buf+size);
>>  >>>>>>>                   size += amdgpu_dpm_print_clock_levels(adev,
>>  >>>>>>> OD_RANGE,
>>  >>>>>>> buf+size);
>>  >>>>>>>                   size += amdgpu_dpm_print_clock_levels(adev,
>>  >>>>>>> OD_CCLK,
>>  >>>>>>> buf+size);
>>  >>>>>>>
>>  >>>>>>>
>>  >>>>>> [Yu, Lang]
>>  >>>>>> Yes. So it is fine we just fix the caller
>>  >>>>>> amdgpu_get_pp_od_clk_voltage like
>>  >>>>> following:
>>  >>>>>>
>>  >>>>>> static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
>>  >>>>>>         struct device_attribute *attr,
>>  >>>>>>         char *buf)
>>  >>>>>> {
>>  >>>>>>     struct drm_device *ddev = dev_get_drvdata(dev);
>>  >>>>>>     struct amdgpu_device *adev = drm_to_adev(ddev);
>>  >>>>>>     ssize_t size, offset;
>>  >>>>>>     int ret, i;
>>  >>>>>>     char temp_buf[512];
>>  >>>>>>     char clock_type[] = {OD_SCLK, OD_MCLK, OD_VDDC_CURVE,
>>  >>>>>> OD_VDDGFX_OFFSET, OD_RANGE, OD_CCLK};
>>  >>>>>>
>>  >>>>>>     if (amdgpu_in_reset(adev))
>>  >>>>>>         return -EPERM;
>>  >>>>>>     if (adev->in_suspend && !adev->in_runpm)
>>  >>>>>>         return -EPERM;
>>  >>>>>>
>>  >>>>>>     ret = pm_runtime_get_sync(ddev->dev);
>>  >>>>>>     if (ret < 0) {
>>  >>>>>> pm_runtime_put_autosuspend(ddev->dev);
>>  >>>>>>         return ret;
>>  >>>>>>     }
>>  >>>>>>
>>  >>>>>>     offset = 0;
>>  >>>>>>
>>  >>>>>>     if (adev->powerplay.pp_funcs->print_clock_levels) {
>>  >>>>>>         for (i = 0; i < ARRAY_SIZE(clock_type); i++) {
>>  >>>>>>             size = amdgpu_dpm_print_clock_levels(adev,
>>  >>>>> clock_type[i], buf);
>>  >>>>>>             if (offset + size > PAGE_SIZE)
>>  >>>>>>                 break;
>>  >>>>>>             memcpy(temp_buf + offset, buf, size);
>>  >>>>>>             offset += size;
>>  >>>>>>         }
>>  >>>>>>         memcpy(buf, temp_buf, offset);
>>  >>>>>>         size = offset;
>>  >>>>>>     } else {
>>  >>>>>>         size = sysfs_emit(buf, "\n");
>>  >>>>>>     }
>>  >>>>>>     pm_runtime_mark_last_busy(ddev->dev);
>>  >>>>>>     pm_runtime_put_autosuspend(ddev->dev);
>>  >>>>>>
>>  >>>>>>     return size;
>>  >>>>>> }
>>  >>>>>>
>>  >>>>> Prefer to avoid any extra stack or heap usage for buffer. Maybe
>>  >>>>> another arg to
>>  >>>>> pass offset along with buf?
>>  >>>>>
>>  >>>> [Yu, Lang]
>>  >>>> Actually, the buf address contains the 
>> offset(offset_in_page(buf)) .
>>  >>>
>>  >>> Though it's not a problem based on codeflow, static analysis tools
>>  >>> might complain.
>>  >>>
>>  >>>> Or we just rollback to sprintf/snprintf.
>>  >>>>
>>  >>>
>>  >>> snprintf with (PAGE_SIZE-size) may be simpler. I think Darren took
>>  >>> the effort to convert these, he may have some other ideas.
>> The changes I made were intended to simply replace snprintf with 
>> sysfs_emit as per
>>
>> linux/Documentation/filesystems/sysfs.rst:246,247
>>
>>   -  show() should only use sysfs_emit() or sysfs_emit_at() when 
>> formatting
>>
>>      the value to be returned to user space.
>>
>> I specifically tried not to change the design, but only as I didn't have
>>
>> background in Power Management.
>>
>>  >>
>>  >> This is not what I meant. See from the design point of view the
>>  >> print_clock_levels() callback is the bad idea to begin with.
>>  >>
>>  >> What we should have instead is a callback which returns the clock as
>>  >> a value which is then printed in the amdgpu_get_pp_od_clk_voltage()
>>  >> function.
>>  >>
>>  >> This avoids passing around the buffer and remaining size everywhere
>>  >> and also guarantees that the sysfs have unified printing over all
>>  >> hardware generations.
>>  >>
>>  >
>>  > The scenario is one node used for multiple parameters - OD_SCLK,
>>  > OD_CCLK, OD_VDDGFX_OFFSET etc.(mostly to avoid cluttering sysfs with
>>  > lots of nodes). On top of it, the parameters supported (for ex: CCLK
>>  > is not valid on dGPUs),  the number of levels supported etc. vary
>>  > across ASICs. There has to be multiple calls or the call needs to
>>  > return multiple values for a single parameter (for ex: up to 4, 8 or
>>  > 16 levels of GFXCLK depending on ASIC).
>>
>> Well exactly that is questionable design for sysfs.
>>
>> See the sysfs_emit() and sysfs_emit_at() functions are designed the way
>> they are because you should have only one value per file, which is then
>> printed at exactly one location.
>>
>> Take a look at the documentation for sysfs for more details.
>>
>>  > I don't know the history of the callback, mostly it was considered
>>  > more efficient to print it directly rather than fetch and print.
>>  > Alex/Evan may know the details.
>>
>> Yeah, somebody with a bit more background in power management needs to
>> take a closer look at this here. Just keep me looped in.
>>
>> Regards,
>> Christian.
>>
>>  >
>>  > Thanks,
>>  > Lijo
>>  >
>>  >> Regards,
>>  >> Christian.
>>  >>
>>



More information about the amd-gfx mailing list