[Intel-gfx] [PATCH] drm/i915/display: Enable second VDSC engine for higher moderates
Kulkarni, Vandita
vandita.kulkarni at intel.com
Mon Jan 10 07:15:04 UTC 2022
Revisiting this thread after an update from the bspec.
> -----Original Message-----
> From: Nikula, Jani <jani.nikula at intel.com>
> Sent: Tuesday, September 14, 2021 8:40 PM
> To: Kulkarni, Vandita <vandita.kulkarni at intel.com>; Lisovskiy, Stanislav
> <stanislav.lisovskiy at intel.com>
> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>;
> intel-gfx at lists.freedesktop.org; Navare, Manasi D <manasi.d.navare at intel.com>
> Subject: RE: [Intel-gfx] [PATCH] drm/i915/display: Enable second VDSC
> engine for higher moderates
>
> On Tue, 14 Sep 2021, "Kulkarni, Vandita" <vandita.kulkarni at intel.com>
> wrote:
> >> -----Original Message-----
> >> From: Nikula, Jani <jani.nikula at intel.com>
> >> Sent: Tuesday, September 14, 2021 7:33 PM
> >> To: Lisovskiy, Stanislav <stanislav.lisovskiy at intel.com>
> >> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>; Kulkarni, Vandita
> >> <vandita.kulkarni at intel.com>; intel-gfx at lists.freedesktop.org;
> >> Navare, Manasi D <manasi.d.navare at intel.com>
> >> Subject: Re: [Intel-gfx] [PATCH] drm/i915/display: Enable second VDSC
> >> engine for higher moderates
> >>
> >> On Tue, 14 Sep 2021, "Lisovskiy, Stanislav"
> >> <stanislav.lisovskiy at intel.com>
> >> wrote:
> >> > On Tue, Sep 14, 2021 at 04:04:25PM +0300, Lisovskiy, Stanislav wrote:
> >> >> On Tue, Sep 14, 2021 at 03:04:11PM +0300, Jani Nikula wrote:
> >> >> > On Tue, 14 Sep 2021, "Lisovskiy, Stanislav" <stanislav.lisovskiy at intel.com> wrote:
> >> >> > > On Tue, Sep 14, 2021 at 10:48:46AM +0300, Ville Syrjälä wrote:
> >> >> > >> On Tue, Sep 14, 2021 at 07:31:46AM +0000, Kulkarni, Vandita wrote:
> >> >> > >> > > -----Original Message-----
> >> >> > >> > > From: Ville Syrjälä <ville.syrjala at linux.intel.com>
> >> >> > >> > > Sent: Tuesday, September 14, 2021 12:59 PM
> >> >> > >> > > To: Kulkarni, Vandita <vandita.kulkarni at intel.com>
> >> >> > >> > > Cc: intel-gfx at lists.freedesktop.org; Nikula, Jani
> >> >> > >> > > <jani.nikula at intel.com>; Navare, Manasi D
> >> >> > >> > > <manasi.d.navare at intel.com>
> >> >> > >> > > Subject: Re: [Intel-gfx] [PATCH] drm/i915/display: Enable
> >> >> > >> > > second VDSC engine for higher moderates
> >> >> > >> > >
> >> >> > >> > > On Mon, Sep 13, 2021 at 08:09:23PM +0530, Vandita Kulkarni wrote:
> >> >> > >> > > > Each VDSC operates with 1ppc throughput, hence enable
> >> >> > >> > > > the second VDSC engine when moderate is higher than the
> >> >> > >> > > > current cdclk.
> >> >> > >> > > >
> >> >> > >> > > > Signed-off-by: Vandita Kulkarni
> >> >> > >> > > > <vandita.kulkarni at intel.com>
> >> >> > >> > > > ---
> >> >> > >> > > > drivers/gpu/drm/i915/display/intel_dp.c | 12
> >> >> > >> > > > ++++++++++--
> >> >> > >> > > > 1 file changed, 10 insertions(+), 2 deletions(-)
> >> >> > >> > > >
> >> >> > >> > > > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c
> >> >> > >> > > > b/drivers/gpu/drm/i915/display/intel_dp.c
> >> >> > >> > > > index 161c33b2c869..55878f65f724 100644
> >> >> > >> > > > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> >> >> > >> > > > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> >> >> > >> > > > @@ -70,6 +70,7 @@
> >> >> > >> > > > #include "intel_tc.h"
> >> >> > >> > > > #include "intel_vdsc.h"
> >> >> > >> > > > #include "intel_vrr.h"
> >> >> > >> > > > +#include "intel_cdclk.h"
> >> >> > >> > > >
> >> >> > >> > > > #define DP_DPRX_ESI_LEN 14
> >> >> > >> > > >
> >> >> > >> > > > @@ -1291,10 +1292,13 @@ static int
> >> >> > >> > > > intel_dp_dsc_compute_config(struct
> >> >> > >> > > intel_dp *intel_dp,
> >> >> > >> > > > struct
> drm_connector_state
> >> *conn_state,
> >> >> > >> > > > struct link_config_limits
> *limits) {
> >> >> > >> > > > + struct intel_cdclk_state *cdclk_state;
> >> >> > >> > > > struct intel_digital_port *dig_port =
> >> dp_to_dig_port(intel_dp);
> >> >> > >> > > > struct drm_i915_private *dev_priv =
> to_i915(dig_port-
> >> >> > >> > > >base.base.dev);
> >> >> > >> > > > const struct drm_display_mode *adjusted_mode =
> >> >> > >> > > > &pipe_config->hw.adjusted_mode;
> >> >> > >> > > > + struct intel_atomic_state *state =
> >> >> > >> > > > +
> to_intel_atomic_state(pipe_config-
> >> >> > >> > > >uapi.state);
> >> >> > >> > > > int pipe_bpp;
> >> >> > >> > > > int ret;
> >> >> > >> > > >
> >> >> > >> > > > @@ -1373,12 +1377,16 @@ static int
> >> >> > >> > > > intel_dp_dsc_compute_config(struct
> >> >> > >> > > intel_dp *intel_dp,
> >> >> > >> > > > }
> >> >> > >> > > > }
> >> >> > >> > > >
> >> >> > >> > > > + cdclk_state = intel_atomic_get_cdclk_state(state);
> >> >> > >> > > > + if (IS_ERR(cdclk_state))
> >> >> > >> > > > + return PTR_ERR(cdclk_state);
> >> >> > >> > > > +
> >> >> > >> > > > /*
> >> >> > >> > > > * VDSC engine operates at 1 Pixel per clock, so if
> >> >> > >> > > > peak pixel
> >> rate
> >> >> > >> > > > - * is greater than the maximum Cdclock and if slice
> count is
> >> even
> >> >> > >> > > > + * is greater than the current Cdclock and if slice
> >> >> > >> > > > +count is even
> >> >> > >> > > > * then we need to use 2 VDSC instances.
> >> >> > >> > > > */
> >> >> > >> > > > - if (adjusted_mode->crtc_clock > dev_priv-
> >max_cdclk_freq
> >> ||
> >> >> > >> > > > + if (adjusted_mode->crtc_clock >
> >> >> > >> > > > +cdclk_state->actual.cdclk ||
> >> >> > >> > >
> >> >> > >> > > This is wrong. We compute the cdclk based on the
> >> >> > >> > > requirements of the mode/etc., not the other way around.
> >> >> > >
> >> >> > > According to the BSpec guideline, we decide whether to enable or
> >> >> > > disable the second VDSC engine based on that condition. As I
> >> >> > > understand it, that one is about DSC config calculation, based on
> >> >> > > the CDCLK which was calculated.
> >> >> >
> >> >> > Point is, at the time compute_config gets called, what
> >> >> > guarantees are there that cdclk_state->actual.cdclk contains
> >> >> > anything useful?
> >> >> > This is the design we have.
> >> >>
> >> >> That is actually good question, was willing to check that as well.
> >> >>
> >> >> >
> >> >> > > If we bump up CDCLK to avoid this, will we then ever use a
> >> >> > > second VDSC?
> >> >> >
> >> >> > I think we'll eventually need better logic than unconditionally
> >> >> > bumping to max, and it needs to take *both* the cdclk and the
> >> >> > number of dsc engines into account. The referenced bspec only
> >> >> > has the vdsc clock perspective, not overall perspective.
> >> >>
> >> >> What we need to clarify here is how this is supposed to work in
> >> >> theory.
> >> >> Basically same issue can be fixed by both increasing the CDCLK or
> >> >> enabling 2nd VDSC engine.
> >> >> There should be some guideline telling us, how to prioritize.
> >> >> From an overall perspective, as I understand it, by default we are
> >> >> able to keep CDCLK 2 times lower than the pixel rate (see
> >> >> intel_pixel_rate_to_cdclk). However, due to the VDSC limitation that
> >> >> it can handle only 1 ppc, this is no longer applicable (at least as
> >> >> of BSpec 49259), so we have to increase the number of VDSC instances
> >> >> then.
> >> >>
> >> >> So the question is now - what is more optimal here?
> >> >> Also if we bump up CDCLK (which we have done many times already in
> >> >> fact), we then need to add some logic to intel_compute_min_cdclk
> >> >> to check if we are using DSC or not, because otherwise we don't
> >> >> really need to do that.
> >>
> >> intel_compute_min_cdclk() already needs to be dsc aware when slice
> >> count is 1 and we can't use two dsc engines anyway. See the recent
> >> commit fe01883fdcef ("drm/i915: Get proper min cdclk if vDSC enabled").
> >>
> >> Looking again, I'm not sure that does the right decision for when
> >> dsc.slice_count > 1, but dsc.split == false. It should probably use
> >> dsc.split for the decision.
> >>
> >> >>
> >> >> Stan
> >> >
> >> > Checked and indeed, encoder->compute_config is called way before,
> >> > basically CDCLK calculation is called almost in the end of
> >> > atomic_check, so in compute_config, there would be an old CDCLK
> >> > value copied from previous cdclk state, but not the last one.
> >> >
> >> > Vandita, this means we actually can't do it that way, if you want
> >> > to do anything with VDSC based on CDCLK this has to be done _after_
> >> > intel_compute_min_cdclk was called. Which is not very sweet, I guess.
> >> >
> >> > So as of current architecture, it seems that the easiest way is
> >> > indeed to bump the CDCLK or we need to figure the way how to enable
> >> > 2nd VDSC somewhere else, after CDCLK was calculated.
> >>
> >> Alternatively, we could use two dsc engines more aggressively, but
> >> that decision currently can't take overall chosen cdclk into account.
> >>
> >> We'll end up sometimes unnecessarily using a too high cdclk or two
> >> dsc engines, just have to pick the poison.
> >>
> >> I think trying to do dsc decisions after intel_compute_min_cdclk()
> >> gets way too complicated.
> >
> > In this case, can we just use the 2nd VDSC engine if slice_count is 2
> > or more? That would mean we always operate in joiner enabled mode
> > (small joiner) of all the compression modes of operation mentioned in
> > the table in bspec 49259, because we are still going to hit the max
> > cdclk restriction for higher resolutions, and many lower resolutions
> > wouldn't need max cdclk.
> > And eventually, once we have more details on cdclk vs. 2 VDSC engines,
> > we could add the logic to choose one over the other?
> >
> > I see that in the case of DSI we set split = true for slice_count > 1,
> > but that would need a different set of checks; that's a TBD.
> >
> > Or do you suggest that, for now, I replace the max cdclk condition
> > slice_count = 1 (what we are doing now) with compression = true and
> > split = false?
>
> I think the check in intel_compute_min_cdclk() should be:
>
> 	if (crtc_state->dsc.compression_enable && !crtc_state->dsc.dsc_split)
>
> That's a separate change.
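For reference, a minimal standalone sketch of how the check suggested above could feed into the min cdclk computation. The struct and function names are hypothetical stand-ins, not the actual i915 code, and clamping to the pixel rate is an assumption based on the 1 ppc limit discussed in this thread:

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the DSC bits of the crtc state. */
struct dsc_sketch_state {
	bool compression_enable;	/* DSC is in use on this pipe */
	bool dsc_split;			/* two VDSC engines share the work */
	int pixel_rate;			/* kHz */
};

/*
 * A single VDSC engine handles 1 pixel per clock, so with DSC enabled
 * and no split, cdclk must be at least the pixel rate (assumption).
 * Otherwise the previously computed minimum is returned unchanged.
 */
static int sketch_dsc_min_cdclk(const struct dsc_sketch_state *s, int min_cdclk)
{
	if (s->compression_enable && !s->dsc_split &&
	    s->pixel_rate > min_cdclk)
		return s->pixel_rate;

	return min_cdclk;
}
```

With this shape, a pipe using both engines (dsc_split set) or no DSC at all keeps whatever minimum the other constraints produced.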
>
> Enabling two dsc engines more aggressively... I don't mind doing it
> unconditionally when slice count > 1 for starters. But I think we'll need to
> improve this going forward, including fixing the mode valid checks etc. as
> we've discussed.
The design recommendation is to use 2 VDSC instances, so that cdclk can
stay as low as possible, as long as the following constraint is met:
DP/HDMI PPR spec provided slice size < DPCD provided MaxSliceWidth
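To illustrate, a rough sketch of how this constraint could gate the use of the second engine. The function and parameter names are hypothetical, and requiring an even slice count greater than 1 is an assumption carried over from the earlier patch comment and the discussion above, not something the constraint itself states:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether to use two VDSC instances.
 *
 * slice_count     - number of DSC slices per line
 * slice_width     - slice size derived from the DP/HDMI PPR spec
 * max_slice_width - MaxSliceWidth reported by the sink via DPCD
 */
static bool sketch_use_two_vdsc(int slice_count, int slice_width,
				int max_slice_width)
{
	/* Assumption: the slices must divide evenly across two engines. */
	if (slice_count < 2 || slice_count % 2 != 0)
		return false;

	/* Constraint from the recommendation: strict "less than". */
	return slice_width < max_slice_width;
}
```

For example, 2 slices of width 1920 against a MaxSliceWidth of 2560 would pick two engines, while a single slice would not.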
Thanks,
Vandita
>
> Ville, any objections?
>
> BR,
> Jani.
>
>
> >
> > Thanks,
> > Vandita
> >>
> >> BR,
> >> Jani
> >>
> >>
> >>
> >>
> >> >
> >> > Stan
> >> >
> >> >>
> >> >> >
> >> >> > BR,
> >> >> > Jani.
> >> >> >
> >> >> > > Another thing is that probably enabling second VDSC is cheaper
> >> >> > > in terms of power consumption, than bumping up the CDCLK.
> >> >> > >
> >> >> > > Stan
> >> >> > >
> >> >> > >> >
> >> >> > >> > Okay, so you suggest that we set the cdclk to max when we
> >> >> > >> > have such a requirement, rather than enabling the second
> >> >> > >> > engine?
> >> >> > >>
> >> >> > >> That seems like the easiest solution. Another option might be
> >> >> > >> to come up with some lower dotclock limit for the use of the
> >> >> > >> second vdsc. But not sure we know where the tipping point is
> >> >> > >> wrt. power consumption.
> >> >> > >>
> >> >> > >> --
> >> >> > >> Ville Syrjälä
> >> >> > >> Intel
> >> >> >
> >> >> > --
> >> >> > Jani Nikula, Intel Open Source Graphics Center
> >>
> >> --
> >> Jani Nikula, Intel Open Source Graphics Center
>
> --
> Jani Nikula, Intel Open Source Graphics Center