[Intel-gfx] [PATCH] drm/i915/display: Enable second VDSC engine for higher moderates

Tue Sep 14 15:10:07 UTC 2021

On Tue, 14 Sep 2021, "Kulkarni, Vandita" <vandita.kulkarni at intel.com> wrote:
>> -----Original Message-----
>> From: Nikula, Jani <jani.nikula at intel.com>
>> Sent: Tuesday, September 14, 2021 7:33 PM
>> To: Lisovskiy, Stanislav <stanislav.lisovskiy at intel.com>
>> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>; Kulkarni, Vandita
>> <vandita.kulkarni at intel.com>; intel-gfx at lists.freedesktop.org; Navare,
>> Manasi D <manasi.d.navare at intel.com>
>> Subject: Re: [Intel-gfx] [PATCH] drm/i915/display: Enable second VDSC
>> engine for higher moderates
>> 
>> On Tue, 14 Sep 2021, "Lisovskiy, Stanislav" <stanislav.lisovskiy at intel.com>
>> wrote:
>> > On Tue, Sep 14, 2021 at 04:04:25PM +0300, Lisovskiy, Stanislav wrote:
>> >> On Tue, Sep 14, 2021 at 03:04:11PM +0300, Jani Nikula wrote:
>> >> > On Tue, 14 Sep 2021, "Lisovskiy, Stanislav"
>> <stanislav.lisovskiy at intel.com> wrote:
>> >> > > On Tue, Sep 14, 2021 at 10:48:46AM +0300, Ville Syrjälä wrote:
>> >> > >> On Tue, Sep 14, 2021 at 07:31:46AM +0000, Kulkarni, Vandita wrote:
>> >> > >> > > -----Original Message-----
>> >> > >> > > From: Ville Syrjälä <ville.syrjala at linux.intel.com>
>> >> > >> > > Sent: Tuesday, September 14, 2021 12:59 PM
>> >> > >> > > To: Kulkarni, Vandita <vandita.kulkarni at intel.com>
>> >> > >> > > Cc: intel-gfx at lists.freedesktop.org; Nikula, Jani
>> >> > >> > > <jani.nikula at intel.com>; Navare, Manasi D
>> >> > >> > > <manasi.d.navare at intel.com>
>> >> > >> > > Subject: Re: [Intel-gfx] [PATCH] drm/i915/display: Enable
>> >> > >> > > second VDSC engine for higher moderates
>> >> > >> > >
>> >> > >> > > On Mon, Sep 13, 2021 at 08:09:23PM +0530, Vandita Kulkarni
>> wrote:
>> >> > >> > > > Each VDSC operates with 1ppc throughput, hence enable the
>> >> > >> > > > second VDSC engine when moderate is higher that the current
>> cdclk.
>> >> > >> > > >
>> >> > >> > > > Signed-off-by: Vandita Kulkarni
>> >> > >> > > > <vandita.kulkarni at intel.com>
>> >> > >> > > > ---
>> >> > >> > > >  drivers/gpu/drm/i915/display/intel_dp.c | 12 ++++++++++--
>> >> > >> > > >  1 file changed, 10 insertions(+), 2 deletions(-)
>> >> > >> > > >
>> >> > >> > > > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c
>> >> > >> > > > b/drivers/gpu/drm/i915/display/intel_dp.c
>> >> > >> > > > index 161c33b2c869..55878f65f724 100644
>> >> > >> > > > --- a/drivers/gpu/drm/i915/display/intel_dp.c
>> >> > >> > > > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
>> >> > >> > > > @@ -70,6 +70,7 @@
>> >> > >> > > >  #include "intel_tc.h"
>> >> > >> > > >  #include "intel_vdsc.h"
>> >> > >> > > >  #include "intel_vrr.h"
>> >> > >> > > > +#include "intel_cdclk.h"
>> >> > >> > > >
>> >> > >> > > >  #define DP_DPRX_ESI_LEN 14
>> >> > >> > > >
>> >> > >> > > > @@ -1291,10 +1292,13 @@ static int
>> >> > >> > > > intel_dp_dsc_compute_config(struct
>> >> > >> > > intel_dp *intel_dp,
>> >> > >> > > >  				       struct drm_connector_state
>> *conn_state,
>> >> > >> > > >  				       struct link_config_limits *limits)  {
>> >> > >> > > > +	struct intel_cdclk_state *cdclk_state;
>> >> > >> > > >  	struct intel_digital_port *dig_port =
>> dp_to_dig_port(intel_dp);
>> >> > >> > > >  	struct drm_i915_private *dev_priv = to_i915(dig_port-
>> >> > >> > > >base.base.dev);
>> >> > >> > > >  	const struct drm_display_mode *adjusted_mode =
>> >> > >> > > >  		&pipe_config->hw.adjusted_mode;
>> >> > >> > > > +	struct intel_atomic_state *state =
>> >> > >> > > > +				to_intel_atomic_state(pipe_config-
>> >> > >> > > >uapi.state);
>> >> > >> > > >  	int pipe_bpp;
>> >> > >> > > >  	int ret;
>> >> > >> > > >
>> >> > >> > > > @@ -1373,12 +1377,16 @@ static int
>> >> > >> > > > intel_dp_dsc_compute_config(struct
>> >> > >> > > intel_dp *intel_dp,
>> >> > >> > > >  		}
>> >> > >> > > >  	}
>> >> > >> > > >
>> >> > >> > > > +	cdclk_state = intel_atomic_get_cdclk_state(state);
>> >> > >> > > > +	if (IS_ERR(cdclk_state))
>> >> > >> > > > +		return PTR_ERR(cdclk_state);
>> >> > >> > > > +
>> >> > >> > > >  	/*
>> >> > >> > > >  	 * VDSC engine operates at 1 Pixel per clock, so if peak pixel
>> rate
>> >> > >> > > > -	 * is greater than the maximum Cdclock and if slice count is
>> even
>> >> > >> > > > +	 * is greater than the current Cdclock and if slice
>> >> > >> > > > +count is even
>> >> > >> > > >  	 * then we need to use 2 VDSC instances.
>> >> > >> > > >  	 */
>> >> > >> > > > -	if (adjusted_mode->crtc_clock > dev_priv->max_cdclk_freq
>> ||
>> >> > >> > > > +	if (adjusted_mode->crtc_clock >
>> >> > >> > > > +cdclk_state->actual.cdclk ||
>> >> > >> > >
>> >> > >> > > This is wrong. We compute the cdclk based on the
>> >> > >> > > requirements of the mode/etc., not the other way around.
>> >> > >
>> >> > > According to BSpec guideline, we decide whether we enable or
>> >> > > disable second VDSC engine, based on that condition. As I
>> >> > > understand that one is about DSC config calculation, based on CDCLK
>> which was calculated.
>> >> >
>> >> > Point is, at the time compute_config gets called, what guarantees
>> >> > are there that cdclk_state->actual.cdclk contains anything useful?
>> >> > This is the design we have.
>> >>
>> >> That is actually good question, was willing to check that as well.
>> >>
>> >> >
>> >> > > If we bump up CDCLK, to avoid this, will we even then use a second
>> VDSC ever?
>> >> >
>> >> > I think we'll eventually need better logic than unconditionally
>> >> > bumping to max, and it needs to take *both* the cdclk and the
>> >> > number of dsc engines into account. The referenced bspec only has
>> >> > the vdsc clock perspective, not overall perspective.
>> >>
>> >> What we need to clarify here is that how this is supposed to work in
>> theory.
>> >> Basically same issue can be fixed by both increasing the CDCLK or
>> >> enabling 2nd VDSC engine.
>> >> There should be some guideline telling us, how to prioritize.
>> >> From overall perspective as I understand, by default, we are able to
>> >> keep CDCLK 2 times less than pixel rate(see
>> >> intel_pixel_rate_to_cdclk), however due to that VDSC limitation that
>> >> it can use only 1 ppc this becomes, not applicable anymore(at least
>> >> as of BSpec 49259), so we have to increase amount of VDSC instances
>> then.
>> >>
>> >> So the question is now - what is more optimal here?
>> >> Also if we bump up CDCLK(which we have done many times already in
>> >> fact), we then need to add some logic to intel_compute_min_cdclk to
>> >> check if we are using DSC or not, because otherwise we don't really need
>> to do that.
>> 
>> intel_compute_min_cdclk() already needs to be dsc aware when slice count
>> is 1 and we can't use two dsc engines anyway. See the recent commit
>> fe01883fdcef ("drm/i915: Get proper min cdclk if vDSC enabled").
>> 
>> Looking again, I'm not sure that does the right decision for when
>> dsc.slice_count > 1, but dsc.split == false. It should probably use dsc.split for
>> the decision.
>> 
>> >>
>> >> Stan
>> >
>> > Checked and indeed, encoder->compute_config is called way before,
>> > basically CDCLK calculation is called almost in the end of
>> > atomic_check, so in compute_config, there would be an old CDCLK value
>> > copied from previous cdclk state, but not the last one.
>> >
>> > Vandita, this means we actually can't do it that way, if you want to
>> > do anything with VDSC based on CDCLK this has to be done _after_
>> > intel_compute_min_cdclk was called. Which is not very sweet, I guess.
>> >
>> > So as of current architecture, it seems that the easiest way is indeed
>> > to bump the CDCLK or we need to figure the way how to enable 2nd VDSC
>> > somewhere else, after CDCLK was calculated.
>> 
>> Alternatively, we could use two dsc engines more aggressively, but that
>> decision currently can't take overall chosen cdclk into account.
>> 
>> We'll end up sometimes unnecessarily using a too high cdclk or two dsc
>> engines, just have to pick the poison.
>> 
>> I think trying to do dsc decisions after intel_compute_min_cdclk() gets way
>> too complicated.
>
> In this case, can we just use the 2nd VDSC engine if slice_count is 2 or more?
> Which would mean we always operate in joiner enabled mode(small joiner) of all the compression modes of operation
> mentioned in the table bspec: 49259
> Because we are still going to hit the max cdclk restriction for higher resolutions, and many lower resolutions wouldn’t need max cdclk.
> And eventually once we have more details on cd clk vs 2VDSC engine we could add
> the logic to choose one over the other?
>
> I see that in case of DSI we do split = true, for slice_count > 1 but that would need a different set of checks, thats a TBD.
>
> Or Do you suggest I just do this for now max cdclk when slice_count =1  (what we are doing now) replace with compression = true and split = false

I think the check in intel_compute_min_cdclk() should be:

	if (crtc_state->dsc.compression_enable && !crtc_state->dsc.dsc_split)

That's a separate change.

Enabling two dsc engines more aggressively... I don't mind doing it
unconditionally when slice count > 1 for starters. But I think we'll
need to improve this going forward, including fixing the mode valid
checks etc. as we've discussed.

Ville, any objections?

BR,
Jani.

>  
> Thanks,
> Vandita
>> 
>> BR,
>> Jani
>> 
>> 
>> 
>> 
>> >
>> > Stan
>> >
>> >>
>> >> >
>> >> > BR,
>> >> > Jani.
>> >> >
>> >> > > Another thing is that probably enabling second VDSC is cheaper in
>> >> > > terms of power consumption, than bumping up the CDCLK.
>> >> > >
>> >> > > Stan
>> >> > >
>> >> > >> >
>> >> > >> > Okay , So you suggest that we set the cd clock to max when we
>> have such requirement, than enabling the second engine?
>> >> > >>
>> >> > >> That seems like the easiest solution. Another option might be to
>> >> > >> come up with some lower dotclock limit for the use of the second
>> >> > >> vdsc. But not sure we know where the tipping point is wrt. powr
>> consumption.
>> >> > >>
>> >> > >> --
>> >> > >> Ville Syrjälä
>> >> > >> Intel
>> >> >
>> >> > --
>> >> > Jani Nikula, Intel Open Source Graphics Center
>> 
>> --
>> Jani Nikula, Intel Open Source Graphics Center

-- 
Jani Nikula, Intel Open Source Graphics Center