[PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings
Lazar, Lijo
lijo.lazar at amd.com
Thu Sep 9 03:28:00 UTC 2021
On 9/9/2021 8:13 AM, Yu, Lang wrote:
> [AMD Official Use Only]
>
> So the final decision is rollback to scnprintf().
>
> If we can define our own helper functions like sysfs_emit/sysfs_emit_at
>
> but without page boundary aligned limitation to make life easier?
>
No, we do want to make it clear that this function is used for sysfs
files and make use of the extra checks provided by the sysfs_emit*
functions. Looking at the origins of sysf_emit_at() specifically, there
are indeed some cases of printing more than one values per file and
multi-line usage.
So I'm fine with your original patch. Maybe, you can make the intention
explicit by keeping the offset and buf start calculations in a separate
inline function.
smu_get_sysfs_buf()
Thanks,
Lijo
> Regards,
>
> Lang
>
> *From:* Powell, Darren <Darren.Powell at amd.com>
> *Sent:* Thursday, September 9, 2021 6:18 AM
> *To:* Christian König <ckoenig.leichtzumerken at gmail.com>; Lazar, Lijo
> <Lijo.Lazar at amd.com>; Yu, Lang <Lang.Yu at amd.com>;
> amd-gfx at lists.freedesktop.org
> *Cc:* Deucher, Alexander <Alexander.Deucher at amd.com>; Huang, Ray
> <Ray.Huang at amd.com>; Tian Tao <tiantao6 at hisilicon.com>
> *Subject:* Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings
>
> [AMD Official Use Only]
>
> ------------------------------------------------------------------------
>
> *From:*Christian König <ckoenig.leichtzumerken at gmail.com
> <mailto:ckoenig.leichtzumerken at gmail.com>>
> *Sent:* Wednesday, September 8, 2021 8:43 AM
> *To:* Lazar, Lijo <Lijo.Lazar at amd.com <mailto:Lijo.Lazar at amd.com>>; Yu,
> Lang <Lang.Yu at amd.com <mailto:Lang.Yu at amd.com>>;
> amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>
> <amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>>
> *Cc:* Deucher, Alexander <Alexander.Deucher at amd.com
> <mailto:Alexander.Deucher at amd.com>>; Huang, Ray <Ray.Huang at amd.com
> <mailto:Ray.Huang at amd.com>>; Tian Tao <tiantao6 at hisilicon.com
> <mailto:tiantao6 at hisilicon.com>>; Powell, Darren <Darren.Powell at amd.com
> <mailto:Darren.Powell at amd.com>>
> *Subject:* Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings
>
> Am 08.09.21 um 12:22 schrieb Lazar, Lijo:
> > On 9/8/2021 3:08 PM, Christian König wrote:
> >> Am 08.09.21 um 11:29 schrieb Lazar, Lijo:
> >>> On 9/8/2021 2:32 PM, Yu, Lang wrote:
> >>>> [AMD Official Use Only]
> >>>>> -----Original Message-----
> >>>>> From: Lazar, Lijo <Lijo.Lazar at amd.com <mailto:Lijo.Lazar at amd.com>>
> >>>>> Sent: Wednesday, September 8, 2021 4:55 PM
> >>>>> To: Yu, Lang <Lang.Yu at amd.com <mailto:Lang.Yu at amd.com>>;
> Christian König
> >>>>> <ckoenig.leichtzumerken at gmail.com
> <mailto:ckoenig.leichtzumerken at gmail.com>>;
> amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>
> >>>>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com
> <mailto:Alexander.Deucher at amd.com>>; Huang, Ray
> >>>>> <Ray.Huang at amd.com <mailto:Ray.Huang at amd.com>>; Tian Tao
> <tiantao6 at hisilicon.com <mailto:tiantao6 at hisilicon.com>>
> >>>>> Subject: Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at
> >>>>> warnings
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 9/8/2021 1:14 PM, Yu, Lang wrote:
> >>>>>> [AMD Official Use Only]
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Lazar, Lijo <Lijo.Lazar at amd.com <mailto:Lijo.Lazar at amd.com>>
> >>>>>>> Sent: Wednesday, September 8, 2021 3:36 PM
> >>>>>>> To: Christian König <ckoenig.leichtzumerken at gmail.com
> <mailto:ckoenig.leichtzumerken at gmail.com>>; Yu, Lang
> >>>>>>> <Lang.Yu at amd.com <mailto:Lang.Yu at amd.com>>;
> amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>
> >>>>>>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com
> <mailto:Alexander.Deucher at amd.com>>; Huang, Ray
> >>>>>>> <Ray.Huang at amd.com <mailto:Ray.Huang at amd.com>>; Tian Tao
> <tiantao6 at hisilicon.com <mailto:tiantao6 at hisilicon.com>>
> >>>>>>> Subject: Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at
> >>>>>>> warnings
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 9/8/2021 12:07 PM, Christian König wrote:
> >>>>>>>> Am 08.09.21 um 07:56 schrieb Lang Yu:
> >>>>>>>>> sysfs_emit and sysfs_emit_at requrie a page boundary aligned buf
> >>>>>>>>> address. Make them happy!
> >>>>>>>>>
> >>>>>>>>> Warning Log:
> >>>>>>>>> [ 492.545174] invalid sysfs_emit_at: buf:00000000f19bdfde at:0 [
> >>>>>>>>> 492.546416] WARNING: CPU: 7 PID: 1304 at fs/sysfs/file.c:765
> >>>>>>>>> sysfs_emit_at+0x4a/0xa0
> >>>>>>>>> [ 492.654805] Call Trace:
> >>>>>>>>> [ 492.655353] ? smu_cmn_get_metrics_table+0x40/0x50 [amdgpu] [
> >>>>>>>>> 492.656780] vangogh_print_clk_levels+0x369/0x410 [amdgpu] [
> >>>>>>>>> 492.658245] vangogh_common_print_clk_levels+0x77/0x80 [amdgpu] [
> >>>>>>>>> 492.659733] ? preempt_schedule_common+0x18/0x30 [ 492.660713]
> >>>>>>>>> smu_print_ppclk_levels+0x65/0x90 [amdgpu] [ 492.662107]
> >>>>>>>>> amdgpu_get_pp_od_clk_voltage+0x13d/0x190 [amdgpu] [ 492.663620]
> >>>>>>>>> dev_attr_show+0x1d/0x40
> >>>>>>>>
> >>>>>>>> Mhm, that at least partially doesn't looks like the right
> >>>>>>>> approach to me.
> >>>>>>>>
> >>>>>>>> Why do we have string printing and sysfs code in the hardware
> >>>>>>>> version specific backend in the first place?
> >>>>>>>>
> >>>>>>>
> >>>>>>> This is a callback meant for printing ASIC specific information to
> >>>>>>> sysfs node. The buffer passed in sysfs read is passed as it is
> >>>>>>> to the callback API.
> >>>>>>>
> >>>>>>>> That stuff needs to be implemented for each hardware generation
> >>>>>>>> and
> >>>>>>>> is now cluttered with sysfs buffer offset calculations.
> >>>>>>>>
> >>>>>>>
> >>>>>>> Looks like the warning happened because of this usage.
> >>>>>>>
> >>>>>>> size = amdgpu_dpm_print_clock_levels(adev,
> >>>>>>> OD_SCLK, buf);
> >>>>>>> size += amdgpu_dpm_print_clock_levels(adev,
> >>>>>>> OD_MCLK,
> >>>>>>> buf+size);
> >>>>>>> size += amdgpu_dpm_print_clock_levels(adev,
> >>>>>>> OD_VDDC_CURVE, buf+size);
> >>>>>>> size += amdgpu_dpm_print_clock_levels(adev,
> >>>>>>> OD_VDDGFX_OFFSET, buf+size);
> >>>>>>> size += amdgpu_dpm_print_clock_levels(adev,
> >>>>>>> OD_RANGE,
> >>>>>>> buf+size);
> >>>>>>> size += amdgpu_dpm_print_clock_levels(adev,
> >>>>>>> OD_CCLK,
> >>>>>>> buf+size);
> >>>>>>>
> >>>>>>>
> >>>>>> [Yu, Lang]
> >>>>>> Yes. So it is fine we just fix the caller
> >>>>>> amdgpu_get_pp_od_clk_voltage like
> >>>>> following:
> >>>>>>
> >>>>>> static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
> >>>>>> struct device_attribute *attr,
> >>>>>> char *buf)
> >>>>>> {
> >>>>>> struct drm_device *ddev = dev_get_drvdata(dev);
> >>>>>> struct amdgpu_device *adev = drm_to_adev(ddev);
> >>>>>> ssize_t size, offset;
> >>>>>> int ret, i;
> >>>>>> char temp_buf[512];
> >>>>>> char clock_type[] = {OD_SCLK, OD_MCLK, OD_VDDC_CURVE,
> >>>>>> OD_VDDGFX_OFFSET, OD_RANGE, OD_CCLK};
> >>>>>>
> >>>>>> if (amdgpu_in_reset(adev))
> >>>>>> return -EPERM;
> >>>>>> if (adev->in_suspend && !adev->in_runpm)
> >>>>>> return -EPERM;
> >>>>>>
> >>>>>> ret = pm_runtime_get_sync(ddev->dev);
> >>>>>> if (ret < 0) {
> >>>>>> pm_runtime_put_autosuspend(ddev->dev);
> >>>>>> return ret;
> >>>>>> }
> >>>>>>
> >>>>>> offset = 0;
> >>>>>>
> >>>>>> if (adev->powerplay.pp_funcs->print_clock_levels) {
> >>>>>> for (i = 0; i < ARRAY_SIZE(clock_type); i++) {
> >>>>>> size = amdgpu_dpm_print_clock_levels(adev,
> >>>>> clock_type[i], buf);
> >>>>>> if (offset + size > PAGE_SIZE)
> >>>>>> break;
> >>>>>> memcpy(temp_buf + offset, buf, size);
> >>>>>> offset += size;
> >>>>>> }
> >>>>>> memcpy(buf, temp_buf, offset);
> >>>>>> size = offset;
> >>>>>> } else {
> >>>>>> size = sysfs_emit(buf, "\n");
> >>>>>> }
> >>>>>> pm_runtime_mark_last_busy(ddev->dev);
> >>>>>> pm_runtime_put_autosuspend(ddev->dev);
> >>>>>>
> >>>>>> return size;
> >>>>>> }
> >>>>>>
> >>>>> Prefer to avoid any extra stack or heap usage for buffer. Maybe
> >>>>> another arg to
> >>>>> pass offset along with buf?
> >>>>>
> >>>> [Yu, Lang]
> >>>> Actually, the buf address contains the offset(offset_in_page(buf)) .
> >>>
> >>> Though it's not a problem based on codeflow, static analysis tools
> >>> might complain.
> >>>
> >>>> Or we just rollback to sprintf/snprintf.
> >>>>
> >>>
> >>> snprintf with (PAGE_SIZE-size) may be simpler. I think Darren took
> >>> the effort to convert these, he may have some other ideas.
> The changes I made were intended to simply replace snprintf with
> sysfs_emit as per
>
> linux/Documentation/filesystems/sysfs.rst:246,247
>
> - show() should only use sysfs_emit() or sysfs_emit_at() when formatting
>
> the value to be returned to user space.
>
> I specifically tried not to change the design, but only as I didn't have
>
> background in Power Management.
>
> >>
> >> This is not what I meant. See from the design point of view the
> >> print_clock_levels() callback is the bad idea to begin with.
> >>
> >> What we should have instead is a callback which returns the clock as
> >> a value which is then printed in the amdgpu_get_pp_od_clk_voltage()
> >> function.
> >>
> >> This avoids passing around the buffer and remaining size everywhere
> >> and also guarantees that the sysfs have unified printing over all
> >> hardware generations.
> >>
> >
> > The scenario is one node used for multiple parameters - OD_SCLK,
> > OD_CCLK, OD_VDDGFX_OFFSET etc.(mostly to avoid cluttering sysfs with
> > lots of nodes). On top of it, the parameters supported (for ex: CCLK
> > is not valid on dGPUs), the number of levels supported etc. vary
> > across ASICs. There has to be multiple calls or the call needs to
> > return multiple values for a single parameter (for ex: up to 4, 8 or
> > 16 levels of GFXCLK depending on ASIC).
>
> Well exactly that is questionable design for sysfs.
>
> See the sysfs_emit() and sysfs_emit_at() functions are designed the way
> they are because you should have only one value per file, which is then
> printed at exactly one location.
>
> Take a look at the documentation for sysfs for more details.
>
> > I don't know the history of the callback, mostly it was considered
> > more efficient to print it directly rather than fetch and print.
> > Alex/Evan may know the details.
>
> Yeah, somebody with a bit more background in power management needs to
> take a closer look at this here. Just keep me looped in.
>
> Regards,
> Christian.
>
> >
> > Thanks,
> > Lijo
> >
> >> Regards,
> >> Christian.
> >>
>
More information about the amd-gfx
mailing list