[Intel-gfx] [PATCH v2] drm/i915: Enable second dbuf slice for ICL
Lisovskiy, Stanislav
stanislav.lisovskiy at intel.com
Thu Oct 17 08:05:39 UTC 2019
On Wed, 2019-10-16 at 16:41 -0700, Matt Roper wrote:
> On Wed, Oct 09, 2019 at 10:39:08AM +0300, Stanislav Lisovskiy wrote:
> > Also implemented algorithm for choosing DBuf slice configuration
> > based on active pipes, pipe ratio as stated in BSpec 12716.
> >
> > Now pipe allocation still stays proportional to pipe width as
> > before,
> > however within allowed DBuf slice for this particular
> > configuration.
> >
> > v2: Remove unneeded check from commit as ddb enabled slices might
> > now differ from hw state.
>
> I can't seem to find v1 of this patch in my archives; can you
> elaborate on this?
Hi, thanks for the good points in the review. I initially submitted v1
only to TryBot, then figured out some issues and sent v2 to the
mailing list.
> It looks like a bit of a mismatch in terms of how we're interpreting
> 'ddb.enabled_slices' in different parts of the driver. During hw
> readout we're treating the value as the number of powered up slices
> (which will always be 2 since we always power them both up in
> icl_display_core_init -> icl_dbuf_enable even when we're not going to
> use both), whereas in the atomic check code we're interpreting the
> value as the number of slices we want to distribute our plane
> allocations over. We may want to break these out into two separate
> fields (powered slices vs utilized slices) that we track separately.
I had exactly the same concern. I wanted to always treat this field as
the number of slices we have available; however, if we are going to
enable/disable slices dynamically, probably the only way is to split it.
>
> >
> > Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy at intel.com>
> > ---
> >  drivers/gpu/drm/i915/display/intel_display.c |   6 -
> >  drivers/gpu/drm/i915/intel_pm.c              | 208 ++++++++++++++++++-
> >  2 files changed, 202 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> > index 1a533ccdb54f..4683731ac1ca 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > @@ -12989,12 +12989,6 @@ static void verify_wm_state(struct intel_crtc *crtc,
> >  	skl_ddb_get_hw_state(dev_priv, &hw->ddb);
> >  	sw_ddb = &dev_priv->wm.skl_hw.ddb;
> >  
> > -	if (INTEL_GEN(dev_priv) >= 11 &&
> > -	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > -		DRM_ERROR("mismatch in DBUF Slices (expected %u, got %u)\n",
> > -			  sw_ddb->enabled_slices,
> > -			  hw->ddb.enabled_slices);
> > -
>
> Related to the comment above, we probably do want to make sure that
> the number of powered up dbuf slices matches what we expect (which
> today is always 2, but that might change in the future if we try to
> be more intelligent and power down the second slice when it isn't
> needed)
>
> If we're already confirming that the planes' hw/sw DDB allocations
> themselves match farther down this function, then we're effectively
> already checking the distribution of planes between the two slices.
>
> > /* planes */
> > for_each_universal_plane(dev_priv, pipe, plane) {
> > struct skl_plane_wm *hw_plane_wm, *sw_plane_wm;
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index bfcf03ab5245..0fbeea61299f 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -3616,7 +3616,7 @@ static u8 intel_enabled_dbuf_slices_num(struct drm_i915_private *dev_priv)
> >  	 * only that 1 slice enabled until we have a proper way for on-demand
> >  	 * toggling of the second slice.
> >  	 */
>
>
> This comment is/was also outdated. Also note that this change is
> going to impact TGL as well even though you haven't added the TGL
> handling for slice assignment yet.
>
> > - if (0 && (num_active > 1 || total_data_bw >= GBps(12))) {
> > + if ((num_active > 1 || total_data_bw >= GBps(12))) {
>
> Where does the 12 GBps number come from? I know the comment above
> this says that's the maximum BW supported by a single DBuf slice, but
> I can't find where this is mentioned in the bspec (and I'm not sure
> if that would apply to all gen11+ platforms or whether that was an
> ICL-specific number from when the comment/code was first written).
I had the same question actually - neither I nor Ville found any
mention of 12 GB/s in bspec, so I decided to leave it as is, at least
until we figure out where it came from.
>
> >  		ddb->enabled_slices = 2;
> >  	} else {
> >  		ddb->enabled_slices = 1;
> > @@ -3831,6 +3831,35 @@ static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv,
> >  	return ddb_size;
> >  }
> >
> > +/*
> > + * Calculate initial DBuf slice offset, based on slice size
> > + * and mask (i.e. if slice size is 1024 and the second slice is
> > + * enabled the offset would be 1024)
> > + */
> > +static u32 skl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
> > +					   u32 slice_size, u32 ddb_size)
> > +{
> > +	u32 offset = 0;
> > +
> > +	if (!dbuf_slice_mask)
> > +		return 0;
> > +
> > +	while (!(dbuf_slice_mask & 1)) {
> > +		dbuf_slice_mask >>= 1;
> > +		offset += slice_size;
> > +		if (offset >= ddb_size)
> > +			break;
> > +	}
> > +
> > +	return offset;
> > +}
>
>
> We can probably move this call down into the function below. AFAICS,
> pipe ratio doesn't matter anymore once we get to TGL and doesn't
> matter on earlier platforms that only have one slice, so there's no
> need to calculate it if we aren't going to use it.
Good point, I agree it is probably not commonly needed there. Also
thanks for spotting that I forgot to invert the pipe_ratio value.
>
> > +
> > +	/*
> > +	 * Get allowed DBuf slices for correspondent pipe and platform.
> > +	 */
> > +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv, for_pipe,
> > +						     active_pipes, pipe_ratio);
> > +
> > +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes %x\n",
> > +		      dbuf_slice_mask, for_pipe, active_pipes);
> > +
> > +	/*
> > +	 * Figure out at which DBuf slice we start, i.e. if we start at
> > +	 * DBuf S2 and slice size is 1024, the offset would be 1024
> > +	 */
> > +	offset = skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > +						 slice_size, ddb_size);
> > +
> > +	/*
> > +	 * Figure out total size of allowed DBuf slices, which is basically
> > +	 * a number of allowed slices for that pipe multiplied by slice size.
> > +	 * We might still have some dbuf slices disabled in case if those
> > +	 * are not needed based on bandwidth requirements and num_active pipes,
> > +	 * so stick to real ddb size if it happens to be less. Inside of this
> > +	 * range ddb entries are still allocated in proportion to display width.
> > +	 */
> > +	ddb_range_size = min(hweight8(dbuf_slice_mask) * slice_size,
> > +			     (unsigned int)ddb_size);
> > +
> >  	/*
> >  	 * Watermark/ddb requirement highly depends upon width of the
> >  	 * framebuffer, So instead of allocating DDB equally among pipes
> > @@ -3890,10 +3967,23 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
> >  			&crtc_state->base.adjusted_mode;
> >  		enum pipe pipe = crtc->pipe;
> >  		int hdisplay, vdisplay;
> > +		uint_fixed_16_16_t ratio = skl_get_pipe_ratio(crtc_state);
> > +		u32 pipe_dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv,
> > +								      pipe,
> > +								      active_pipes,
> > +								      ratio);
>
> Minor nitpick, but the lines here are a bit over 80 characters. You
> may
> want to break the line after the equals sign.
>
> >
> > if (!crtc_state->base.enable)
> > continue;
> >
> > +		/*
> > +		 * According to BSpec pipe can share one dbuf slice with another pipes or pipe can use
> > +		 * multiple dbufs, in both cases we account for other pipes only if they have
> > +		 * exactly same mask.
> > +		 */
>
> Some more long lines that need to be wrapped.
>
> > + if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > + continue;
> > +
> >  		drm_mode_get_hv_timing(adjusted_mode, &hdisplay, &vdisplay);
> >  		total_width += hdisplay;
> > 
> > @@ -3903,8 +3993,11 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
> >  			pipe_width = hdisplay;
> >  	}
> > 
> > -	alloc->start = ddb_size * width_before_pipe / total_width;
> > -	alloc->end = ddb_size * (width_before_pipe + pipe_width) / total_width;
> > +	alloc->start = offset + ddb_range_size * width_before_pipe / total_width;
> > +	alloc->end = offset + ddb_range_size * (width_before_pipe + pipe_width) / total_width;
> > +
> > +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe, alloc->start, alloc->end);
> >  }
> >
> > static int skl_compute_wm_params(const struct intel_crtc_state *crtc_state,
> > @@ -4255,6 +4348,109 @@ skl_get_total_relative_data_rate(struct intel_crtc_state *crtc_state,
> >  	return total_data_rate;
> > }
> >
> > +uint_fixed_16_16_t
> > +skl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
>
> I think this should be named icl_get_pipe_ratio() given that it comes
> from a gen11-specific page of the bspec?
>
> > +{
> > + struct drm_plane *plane;
> > + const struct drm_plane_state *drm_plane_state;
> > + uint_fixed_16_16_t pipe_downscale;
> > + uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > +
> > + if (!crtc_state->base.enable)
>
> This function only gets called from
> skl_ddb_get_pipe_allocation_limits() which tests
> crtc_state->base.active at the beginning and bails out if the CRTC
> isn't active. active && !enable shouldn't be possible, so I'd add a
> WARN_ON() here to make that assertion clear.
>
> > + return max_downscale;
> > +
> > +	drm_atomic_crtc_state_for_each_plane_state(plane, drm_plane_state, &crtc_state->base) {
> > +		uint_fixed_16_16_t plane_downscale;
> > +		const struct intel_plane_state *plane_state =
> > +			to_intel_plane_state(drm_plane_state);
> > +
> > +		if (!intel_wm_plane_visible(crtc_state, plane_state))
> > +			continue;
> > +
> > +		plane_downscale = skl_plane_downscale_amount(crtc_state, plane_state);
> > +
> > +		max_downscale = max_fixed16(plane_downscale, max_downscale);
> > +	}
> > +
> > +	pipe_downscale = skl_pipe_downscale_amount(crtc_state);
> > +
> > +	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
> > +
> > +	return pipe_downscale;
>
> Wouldn't the pipe ratio be 1/pipe_downscale?
>
>
> > +}
> > +
> > +#define DBUF_S1_BIT 1
> > +#define DBUF_S2_BIT 2
> > +
> > +struct dbuf_slice_conf_entry {
> > + u32 dbuf_mask[I915_MAX_PIPES];
> > + u32 active_pipes;
> > +};
> > +
> > +
> > +/*
> > + * Table taken from Bspec 12716
> > + * Pipes do have some preferred DBuf slice affinity,
> > + * plus there are some hardcoded requirements on how
> > + * those should be distributed for multipipe scenarios.
> > + * For more DBuf slices algorithm can get even more messy
> > + * and less readable, so decided to use a table almost
> > + * as is from BSpec itself - that way it is at least easier
> > + * to compare, change and check.
> > + */
> > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > +{ { 0, 0, 0, 0 }, 0 },
> > +{ { DBUF_S1_BIT | DBUF_S2_BIT, 0, 0, 0 }, BIT(PIPE_A) },
> > +{ { DBUF_S1_BIT, 0, 0, 0 }, BIT(PIPE_A) },
> > +{ { 0, DBUF_S1_BIT | DBUF_S2_BIT, 0, 0 }, BIT(PIPE_B) },
> > +{ { 0, DBUF_S1_BIT, 0, 0 }, BIT(PIPE_B) },
> > +{ { 0, 0, DBUF_S1_BIT | DBUF_S2_BIT }, BIT(PIPE_C) },
> > +{ { 0, 0, DBUF_S2_BIT, 0 }, BIT(PIPE_C) },
> > +{ { DBUF_S1_BIT, DBUF_S2_BIT, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) },
> > +{ { DBUF_S1_BIT, 0, DBUF_S2_BIT, 0 }, BIT(PIPE_A) | BIT(PIPE_C) },
> > +{ { 0, DBUF_S1_BIT, DBUF_S2_BIT, 0 }, BIT(PIPE_B) | BIT(PIPE_C) },
> > +{ { DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT, 0 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) },
>
> You might want to align some of the columns here to make it slightly
> easier to read. And even though the bspec puts the pipes in the final
> column, I think it would be more natural for readability to move that
> first and/or put it on a line by itself. You've already got a line
> here that exceeds 80 characters by a bit and once you add a TGL table
> with four pipes you're going to have even longer lines.
>
>
> > +};
> > +
> > +/*
> > + * This function finds an entry with same enabled pipe configuration and
> > + * returns correspondent DBuf slice mask as stated in BSpec for particular
> > + * platform.
> > + */
> > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > +				     u32 active_pipes,
> > +				     uint_fixed_16_16_t pipe_ratio)
> > +{
> > + int i;
> > + for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > + if (icl_allowed_dbufs[i].active_pipes == active_pipes)
> > {
> > + /*
> > + * According to BSpec 12716: if pipe_ratio >=
> > 88.8
> > + * use single pipe, even if two are accessible.
> > + */
> > + if (pipe_ratio.val >= div_fixed16(888, 10).val)
> > + ++i;
> > + return icl_allowed_dbufs[i].dbuf_mask[pipe];
> > + }
> > + }
> > + return 0;
> > +}
> > +
> > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> > +			       int pipe, u32 active_pipes,
> > +			       uint_fixed_16_16_t pipe_ratio)
> > +{
> > +	if (IS_GEN(dev_priv, 11))
> > +		return icl_get_allowed_dbuf_mask(pipe, active_pipes,
> > +						 pipe_ratio);
> > +	/*
> > +	 * For anything else just return one slice yet.
> > +	 * Should be extended for other platforms.
> > +	 */
>
> Note that you've already adjusted the DDB size for TGL in
> intel_get_ddb_size(), so we probably want to add TGL's table at the
> same time as the gen11 table.
>
>
> Matt
>
> > + return DBUF_S1_BIT;
> > +}
> > +
> > static u64
> > icl_get_total_relative_data_rate(struct intel_crtc_state *crtc_state,
> > 				 u64 *plane_data_rate)
> > --
> > 2.17.1
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
>