[Intel-gfx] [PATCH 2/6] drm/i915: Fix bw atomic check when switching between SAGV vs. no SAGV

Lisovskiy, Stanislav stanislav.lisovskiy at intel.com
Tue Feb 15 16:33:42 UTC 2022


On Tue, Feb 15, 2022 at 01:26:50PM +0200, Ville Syrjälä wrote:
> On Tue, Feb 15, 2022 at 01:02:48PM +0200, Lisovskiy, Stanislav wrote:
> > On Tue, Feb 15, 2022 at 12:10:19PM +0200, Ville Syrjälä wrote:
> > > On Tue, Feb 15, 2022 at 10:59:57AM +0200, Lisovskiy, Stanislav wrote:
> > > > On Mon, Feb 14, 2022 at 10:26:39PM +0200, Ville Syrjälä wrote:
> > > > > On Mon, Feb 14, 2022 at 07:03:05PM +0200, Lisovskiy, Stanislav wrote:
> > > > > > On Mon, Feb 14, 2022 at 12:24:57PM +0200, Ville Syrjälä wrote:
> > > > > > > On Mon, Feb 14, 2022 at 12:05:36PM +0200, Lisovskiy, Stanislav wrote:
> > > > > > > > On Mon, Feb 14, 2022 at 11:18:07AM +0200, Ville Syrjala wrote:
> > > > > > > > > From: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > > > > > > > 
> > > > > > > > > If the only thing that is changing is SAGV vs. no SAGV, but
> > > > > > > > > the number of active planes and the total data rates end up
> > > > > > > > > unchanged, we currently bail out of intel_bw_atomic_check()
> > > > > > > > > early and forget to actually compute the new QGV point
> > > > > > > > > mask, and thus won't actually enable/disable SAGV as requested.
> > > > > > > > > This ends poorly if we end up running with SAGV enabled
> > > > > > > > > when we shouldn't, and usually results in underruns.
> > > > > > > > > To fix this, let's go through the QGV point mask computation
> > > > > > > > > if someone else has already added the bw state for us.
> > > > > > > > 
> > > > > > > > Haven't been looking at this in a while. Despite us having gone
> > > > > > > > through a few revisions of this together, there are still some bugs :(
> > > > > > > > 
> > > > > > > > I thought SAGV vs. no SAGV can't change if the active planes 
> > > > > > > > or the data rate didn't change? Because that means we probably
> > > > > > > > still have the same ddb allocations, which means the SAGV state
> > > > > > > > will just stay the same.
> > > > > > > 
> > > > > > > SAGV can change due to watermarks/ddb allocations. The easiest
> > > > > > > way to trip this up is to try to use the async flip wm0/ddb 
> > > > > > > optimization. That immediately forgets to turn off SAGV and
> > > > > > > we get underruns, which is how I noticed this. And I don't
> > > > > > > immediately see any easy proof that this couldn't also happen
> > > > > > > due to some other plane changes.
> > > > > > 
> > > > > > That's the way it was initially implemented, even before SAGV support was added.
> > > > > 
> > > > > Yeah, it wasn't a problem as long as SAGV was not enabled.
> > > > > 
> > > > > > I think it dates back to when the very first bw check was implemented.
> > > > > > 
> > > > > > commit c457d9cf256e942138a54a2e80349ee7fe20c391
> > > > > > Author: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > > > > Date:   Fri May 24 18:36:14 2019 +0300
> > > > > > 
> > > > > >     drm/i915: Make sure we have enough memory bandwidth on ICL
> > > > > > 
> > > > > > +int intel_bw_atomic_check(struct intel_atomic_state *state)
> > > > > > +{
> > > > > > +       struct drm_i915_private *dev_priv = to_i915(state->base.dev);
> > > > > > +       struct intel_crtc_state *new_crtc_state, *old_crtc_state;
> > > > > > +       struct intel_bw_state *bw_state = NULL;
> > > > > > +       unsigned int data_rate, max_data_rate;
> > > > > > +       unsigned int num_active_planes;
> > > > > > +       struct intel_crtc *crtc;
> > > > > > +       int i;
> > > > > > +
> > > > > > +       /* FIXME earlier gens need some checks too */
> > > > > > +       if (INTEL_GEN(dev_priv) < 11)
> > > > > > +               return 0;
> > > > > > +
> > > > > > +       for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state,
> > > > > > +                                           new_crtc_state, i) {
> > > > > > +               unsigned int old_data_rate =
> > > > > > +                       intel_bw_crtc_data_rate(old_crtc_state);
> > > > > > +               unsigned int new_data_rate =
> > > > > > +                       intel_bw_crtc_data_rate(new_crtc_state);
> > > > > > +               unsigned int old_active_planes =
> > > > > > +                       intel_bw_crtc_num_active_planes(old_crtc_state);
> > > > > > +               unsigned int new_active_planes =
> > > > > > +                       intel_bw_crtc_num_active_planes(new_crtc_state);
> > > > > > +
> > > > > > +               /*
> > > > > > +                * Avoid locking the bw state when
> > > > > > +                * nothing significant has changed.
> > > > > > +                */
> > > > > > +               if (old_data_rate == new_data_rate &&
> > > > > > +                   old_active_planes == new_active_planes)
> > > > > > +                       continue;
> > > > > > +
> > > > > > +               bw_state  = intel_atomic_get_bw_state(state);
> > > > > > +               if (IS_ERR(bw_state))
> > > > > > +                       return PTR_ERR(bw_state);
> > > > > > 
> > > > > > However, what can cause the watermarks/ddb to change, besides a plane
> > > > > > state change and/or a change in the active planes? We change the
> > > > > > watermarks when we change the ddb allocations, and we change the ddb
> > > > > > allocations when the active planes and/or the data rate have changed.
> > > > > 
> > > > > The bw code only cares about the aggregate numbers from all the planes.
> > > > > The planes could still change in some funny way where e.g. some plane
> > > > > frees up some bandwidth, but the other planes gobble up the exact same
> > > > > amount and thus the aggregate numbers the bw atomic check cares about
> > > > > do not change but the watermarks/ddb do.
> > > > > 
> > > > > And as mentioned, the async flip wm0/ddb optimization makes this trivial
> > > > > to trip up since it will want to disable SAGV as there is not enough ddb
> > > > > for the SAGV watermark. And async flip specifically isn't even allowed
> > > > > to change anything that would affect the bandwidth utilization, and neither
> > > > > is it allowed to enable/disable planes.
> > > > 
> > > > I think setting the ddb to the minimum in case of the async flip optimization
> > > > was purely our idea - BSpec/HSD only mentions forbidding wm levels > 0 in case
> > > > of async flips, and says nothing about limiting the ddb allocations.
> > > 
> > > Reducing just the watermark doesn't really make sense 
> > > if the goal is to keep the DBUF level to a minimum. Also
> > > I don't think there are any proper docs for this thing. The
> > > only thing we have just has some vague notes about using
> > > "minimum watermarks", whatever that means.
> > 
> > Was that the goal? I thought limiting the watermarks would by itself also
> > limit package C states, thus affecting memory clocks and latency. The doc
> > really doesn't say anything about keeping the Dbuf allocations
> > to a minimum. 
> 
> The goal is to minimize the amount of data in the FIFO.
> 
> > 
> > > 
> > > > 
> > > > Was a bit suspicious about that whole change, to be honest - and yep, now it seems to
> > > > cause some unexpected side effects.
> > > 
> > > The bw_state vs. SAGV bug is there regardless of the wm0 optimization.
> > 
> > I agree there is a bug. The bug is that the initial bw checks relied on
> > comparing the total data rate + active planes, while they should have
> > accounted for per-plane data rate usage.
> > 
> > This should have been changed in the SAGV patches, but it probably went
> > unnoticed by both you and me.
> > 
> > > 
> > > Also the SAGV watermark is not the minimum watermark (if that is
> > > what the doc really means by that), the normal WM0 is the minimum watermark.
> > > So even if we interpret the doc to say that we should just disable all
> > > watermark levels except the smallest one (normal WM0) without changing
> > > the ddb allocations we would still end up disabling SAGV.
> > 
> > That's actually a good question. Did they mean disabling all "regular" wm
> > levels, or the SAGV one as well? Probably they meant what you say, but it
> > would be nice to know exactly.
> 
> They said neither. It's just "program minimum watermarks" which
> could mean anything really. They do explicitly say "DBUF level
> can also adversely affect flip performance." which I think is
> the whole point of this exercise.
> 
> > 
> > Anyway, my point here is that we probably shouldn't use new_bw_state as a
> > way to check whether the plane allocations have changed. That's just confusing.
> 
> We are not checking if plane allocations have changed. We are
> trying to determine if anything in the bw_state has changed.
> If we have said state already then something in it may have 
> changed and we have to recalculate anything that may depend
> on those changed things, namely pipe_sagv_reject->qgv_point_mask.
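> 
> To illustrate (just a sketch, not the literal patch):
> 
> 	new_bw_state = intel_atomic_get_new_bw_state(state);
> 	/*
> 	 * NULL just means no one has added the bw state to the
> 	 * atomic state, so nothing in it can have changed and
> 	 * there is nothing to recalculate.
> 	 */
> 	if (!new_bw_state)
> 		return 0;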

I think it is just not very intuitive that we use whether or not we can
get new_bw_state as a way to check if something has changed.
It would be nice to put that in some kind of a wrapper like "has_new_bw_state"
or "bw_state_changed". For anyone not quite familiar with the state paradigm
we use, it looks pretty confusing that we first get new_bw_state using
intel_atomic_get_new_bw_state and then immediately override it with
intel_atomic_get_bw_state, and that whether or not we can get new_bw_state
acts as a check that nothing in the bw_state has changed.
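
Something like this, perhaps (just a sketch, the name is made up):

static bool intel_bw_state_changed(struct intel_atomic_state *state)
{
	/*
	 * intel_atomic_get_new_bw_state() returns NULL unless someone
	 * has already added the bw state to the atomic state, i.e.
	 * unless something in it may have changed.
	 */
	return intel_atomic_get_new_bw_state(state) != NULL;
}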

Moreover, ideally intel_bw_atomic_check should probably handle all that
SAGV stuff as well, i.e. I would suggest moving the pipe_sagv_reject mask
setting, based on the skl_compute_wm results, into that function.
I don't see any issue here: in skl_compute_wm we just calculate the 
SAGV wm, then in intel_bw_atomic_check we just call intel_compute_sagv_mask,
which then calls tgl_crtc_can_enable_sagv for each crtc and sets this mask.
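
In intel_bw_atomic_check() that could look roughly like this (hand-wavy
sketch, error handling and the rest of the checks omitted):

int intel_bw_atomic_check(struct intel_atomic_state *state)
{
	int ret;

	/*
	 * skl_compute_wm() has already computed the SAGV watermarks,
	 * so here we only derive pipe_sagv_reject (and from it the
	 * qgv_point_mask) from those results.
	 */
	ret = intel_compute_sagv_mask(state);
	if (ret)
		return ret;

	/* ... the existing data rate vs. QGV point checks ... */

	return 0;
}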

I think by doing this in intel_bw_atomic_check we would achieve both what
you wanted to do, and it would be more obvious why things happen that way.

Stan

> 
> I think ideally we'd not even modify the bw_state directly from the
> watermark code and we'd instead defer that to bw atomic check entirely.
> But this SAGV vs. DDB business is your typical chicken vs. egg situation,
> so I'm not sure that is possible to do. Would need to spend a few minutes
> thinking about it I guess.
> 
> > 
> > Maybe for you, as an i915 guru, that's obvious, but not for someone else
> > who might touch the code, and we are doing open source here.
> > 
> > Can we just add a check which explicitly compares the per-plane data rates?
> 
> There is nothing interesting about per-plane data rates.
> 
> > So that we bail out from that first loop not only when the total data
> > rate/active planes have changed, but also when a per-plane data rate has changed?
> > That might actually save us in the future as well, if we ever get into a
> > situation where the bw_state doesn't change but the ddb allocations do.
> > 
> > I know you might say it shouldn't happen, but there is always some new stuff coming.
> > 
> > Stan
> > 
> > > 
> > > > Also, we are now forcing the recalculation to always be done no matter
> > > > what, and using the new bw state for that in a bit of a counterintuitive
> > > > way, which I don't like.
> > > > I'm not even sure that will always work, as we are not guaranteed to get
> > > > a non-NULL new_bw_state object from calling intel_atomic_get_new_bw_state.
> > > > For that purpose we typically call intel_atomic_get_bw_state, which is
> > > > supposed to create it, and it is called only here and for the CDCLK
> > > > recalculation, which happens in intel_cdclk_atomic_check right after this one.
> > > 
> > > If there is no bw_state then bw_state->pipe_sagv_reject can't have
> > > changed and there is nothing to recalculate.
> > > 
> > > > 
> > > > So if we haven't called intel_atomic_get_bw_state beforehand - which we
> > > > didn't, because there are only 2 places where the new bw state is supposed
> > > > to be created so that it is usable by intel_atomic_get_new_bw_state - I
> > > > think we will (or might) get NULL here, because intel_atomic_get_bw_state
> > > > hasn't been called yet.
> > > 
> > > Yes, NULL is perfectly fine.
> > > 
> > > -- 
> > > Ville Syrjälä
> > > Intel
> 
> -- 
> Ville Syrjälä
> Intel

