[PATCH] drm/amd/display: port the workload profile setting logic into dm

Alex Deucher alexdeucher at gmail.com
Thu Mar 27 14:08:59 UTC 2025


On Thu, Mar 27, 2025 at 4:22 AM Feng, Kenneth <Kenneth.Feng at amd.com> wrote:
>
> -----Original Message-----
> From: Alex Deucher <alexdeucher at gmail.com>
> Sent: Wednesday, March 26, 2025 11:08 PM
> To: Feng, Kenneth <Kenneth.Feng at amd.com>
> Cc: amd-gfx at lists.freedesktop.org; Wang, Yang(Kevin) <KevinYang.Wang at amd.com>; Deucher, Alexander <Alexander.Deucher at amd.com>; Wentland, Harry <Harry.Wentland at amd.com>
> Subject: Re: [PATCH] drm/amd/display: port the workload profile setting logic into dm
>
> On Wed, Mar 26, 2025 at 1:22 AM Kenneth Feng <kenneth.feng at amd.com> wrote:
> >
> > Port the workload profile setting logic into dm before MALL optimization.
> >
> > Background:
> > The MALL optimization strategy has changed in the firmware. Previously,
> > the firmware did not care what the workload type was: once there was a
> > request from DAL for MALL, the firmware immediately triggered the MALL
> > setting sequence on the SoC, the so-called D0i3.x idle power sequence.
> > Now, before the D0i3.x sequence starts, the firmware always checks whether
> > the workload type is the default; if it is not, it aborts the D0i3.x sequence.
> >
> > Issue:
> > Due to this strategy change, the task moves to the driver: check whether
> > gfx is really idle and, if it is, reset the workload to default.
> > Without this, when DAL's work task for MALL optimization sends the
> > optimization request to DMCUB->pmfw, the workload type is still 3D
> > fullscreen or compute, so MALL will never be applied.
> >
> > Why:
> > The idle task that sets the workload type back to default currently runs
> > at a 1 second interval. DAL's work task to optimize MALL always starts
> > before that idle task does. There is no way to ask the idle task in the
> > base driver to reset the workload type before DAL's MALL setting work
> > task kicks off. A workaround would be to set the idle task interval to
> > 10 milliseconds, but that causes call trace issues when the workqueues
> > are flushed.
>
> That should already be fixed by this commit:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=de35994ecd2dd6148ab5a6c5050a1670a04dec77
>
> >
> > Side effect:
> > The solution is to port the logic from the idle thread into DAL: check
> > the fences to make sure gfx is idle, then reset the workload type. It is
> > fine that DAL's work task does not restore the workload type to 3D
> > fullscreen or compute when it exits the MALL optimization, since the work
> > task in the base driver sets the workload type back once there are jobs
> > in the ring.
>
> I guess this works because the workload ref count gets clamped to 0, but this call is not balanced.  It also doesn't handle the VIDEO profile that gets set when using VCN or the COMPUTE profile when KFD queues are active.  Those would also prevent the idle optimizations.
> Also, what happens if the profile changes after DC optimizations are enabled?  Does that cause the optimizations to exit, or will they stay intact until DC tells them to exit?
>
> So I think we have two options:
> 1. Always disable the 3D, compute, and video profiles when entering the DAL optimization. Subsequently, additional job submissions may change the workload.  Will that be a problem?
> 2. Add a helper to pause workload profiles while the DC optimization is active.  If the profile only has to be changed while enabling the DC optimization, we can just call it right before and right after the dc optimizations.  Something like the attached patches should be a good starting point.
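
For illustration, a minimal sketch of the idea behind option 2 (not the
attached patches; the workload_profile_paused flag and the helper internals
are assumptions, while the helper name and the existing gfx.workload_profile_*
fields come from the discussion and the patch below):

/*
 * Hypothetical sketch of a pause helper -- internals are illustrative only.
 * While "paused", the submission path would skip re-enabling a profile, so
 * the bootup default profile stays in place for the duration of the DC
 * request.
 */
void amdgpu_dpm_pause_power_profile(struct amdgpu_device *adev, bool pause)
{
	enum PP_SMC_POWER_PROFILE profile = adev->gfx.num_gfx_rings ?
		PP_SMC_POWER_PROFILE_FULLSCREEN3D : PP_SMC_POWER_PROFILE_COMPUTE;
	int r;

	mutex_lock(&adev->gfx.workload_profile_mutex);
	adev->gfx.workload_profile_paused = pause;	/* hypothetical flag */
	if (pause && adev->gfx.workload_profile_active) {
		/* drop back to the bootup default profile */
		r = amdgpu_dpm_switch_power_profile(adev, profile, false);
		if (r)
			dev_warn(adev->dev, "(%d) failed to disable power profile\n", r);
		adev->gfx.workload_profile_active = false;
	}
	/* on unpause, the next job submission re-selects its profile */
	mutex_unlock(&adev->gfx.workload_profile_mutex);
}
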
>
> Alex
>
> Hi Alex,
> In the attached patch, I guess smu->pause_workload is not needed since the workload can still be switched after the idle optimization is triggered. And smu_pause_power_profile may not need the bool flag, since I still think we don't need to worry about the imbalance in the workload setting.
> That said, we just need to set the workload to default when the idle optimization is requested. When the idle optimization is cancelled by DAL, there is no need to restore the previous workload setting since the ring work will set it.
> Attached is my patch. It works on my system. Let me know your thoughts.

This will race with any command submissions unless you add a lock.  I
think you want something like:

/* switch to the bootup default profile */
amdgpu_dpm_pause_power_profile(adev, true);
dc_allow_idle_optimizations(dm->dc, true);
/* resume existing profiles */
amdgpu_dpm_pause_power_profile(adev, false);
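
For context, roughly where that sequence would sit in the DM vblank control
worker touched by the patch below (placement is a sketch only, assuming a
pause helper along the lines above, with the rest of the worker omitted):

	/* inside amdgpu_dm_crtc_vblank_control_worker(), illustrative only */
	struct amdgpu_device *adev = drm_to_adev(dm->ddev);

	mutex_lock(&dm->dc_lock);

	/* ... existing vblank/PSR handling ... */

	if (dm->active_vblank_irq_count == 0) {
		/* switch to the bootup default profile before the MALL request */
		amdgpu_dpm_pause_power_profile(adev, true);
		dc_allow_idle_optimizations(dm->dc, true);
		/* resume: later submissions re-select their profile */
		amdgpu_dpm_pause_power_profile(adev, false);
	}

	mutex_unlock(&dm->dc_lock);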

Alex

> Thanks.
>
> Kenneth
>
> >
> > Signed-off-by: Kenneth Feng <kenneth.feng at amd.com>
> > ---
> >  .../amd/display/amdgpu_dm/amdgpu_dm_crtc.c    | 29 ++++++++++++++++++-
> >  1 file changed, 28 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
> > index 36a830a7440f..2adb3b72ed05 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
> > @@ -244,6 +244,20 @@ static void amdgpu_dm_crtc_vblank_control_worker(struct work_struct *work)
> >         struct vblank_control_work *vblank_work =
> >                 container_of(work, struct vblank_control_work, work);
> >         struct amdgpu_display_manager *dm = vblank_work->dm;
> > +       u32 i, fences = 0;
> > +       int r;
> > +       enum PP_SMC_POWER_PROFILE profile;
> > +       struct amdgpu_device *adev = drm_to_adev(dm->ddev);
> > +
> > +       if (adev->gfx.num_gfx_rings)
> > +               profile = PP_SMC_POWER_PROFILE_FULLSCREEN3D;
> > +       else
> > +               profile = PP_SMC_POWER_PROFILE_COMPUTE;
> > +
> > +       for (i = 0; i < AMDGPU_MAX_GFX_RINGS; ++i)
> > +               fences += amdgpu_fence_count_emitted(&adev->gfx.gfx_ring[i]);
> > +       for (i = 0; i < (AMDGPU_MAX_COMPUTE_RINGS * AMDGPU_MAX_GC_INSTANCES); ++i)
> > +               fences += amdgpu_fence_count_emitted(&adev->gfx.compute_ring[i]);
> >
> >         mutex_lock(&dm->dc_lock);
> >
> > @@ -271,8 +285,21 @@ static void amdgpu_dm_crtc_vblank_control_worker(struct work_struct *work)
> >                         vblank_work->acrtc->dm_irq_params.allow_sr_entry);
> >         }
> >
> > -       if (dm->active_vblank_irq_count == 0)
> > +       if (dm->active_vblank_irq_count == 0) {
> > +               if (adev->gfx.num_gfx_rings && !fences && !atomic_read(&adev->gfx.total_submission_cnt)) {
> > +                       mutex_lock(&adev->gfx.workload_profile_mutex);
> > +                       if (adev->gfx.workload_profile_active) {
> > +                               r = amdgpu_dpm_switch_power_profile(adev, profile, false);
> > +                               if (r)
> > +                                       dev_warn(adev->dev, "(%d) failed to disable %s power profile mode\n", r,
> > +                                                profile == PP_SMC_POWER_PROFILE_FULLSCREEN3D ?
> > +                                                "fullscreen 3D" : "compute");
> > +                               adev->gfx.workload_profile_active = false;
> > +                       }
> > +                       mutex_unlock(&adev->gfx.workload_profile_mutex);
> > +               }
> >                 dc_allow_idle_optimizations(dm->dc, true);
> > +       }
> >
> >         mutex_unlock(&dm->dc_lock);
> >
> > --
> > 2.34.1
> >

