[Intel-gfx] [PATCH] drm/i915: program wm blocks to at least blocks required per line

Wed Apr 6 12:48:02 UTC 2022

On Mon, Apr 04, 2022 at 04:49:18PM +0300, Vinod Govindapillai wrote:
> In configurations with single DRAM channel, for usecases like
> 4K 60 Hz, FIFO underruns are observed quite frequently. Looks
> like the wm0 watermark values need to bumped up because the wm0
> memory latency calculations are probably not taking the DRAM
> channel's impact into account.
> 
> As per the Bspec 49325, if the ddb allocation can hold at least
> one plane_blocks_per_line we should have selected method2.
> Assuming that modern HW versions have enough dbuf to hold
> at least one line, set the wm blocks to equivalent to blocks
> per line.
> 
> cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> cc: Stanislav Lisovskiy <stanislav.lisovskiy at intel.com>
> 
> Signed-off-by: Vinod Govindapillai <vinod.govindapillai at intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 8824f269e5f5..ae28a8c63ca4 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -5474,7 +5474,24 @@ static void skl_compute_plane_wm(const struct intel_crtc_state *crtc_state,
>  		}
>  	}
>  
> -	blocks = fixed16_to_u32_round_up(selected_result) + 1;
> +	/*
> +	 * Lets have blocks at minimum equivalent to plane_blocks_per_line
> +	 * as there will be at minimum one line for lines configuration.
> +	 *
> +	 * As per the Bspec 49325, if the ddb allocation can hold at least
> +	 * one plane_blocks_per_line, we should have selected method2 in
> +	 * the above logic. Assuming that modern versions have enough dbuf
> +	 * and method2 guarantees blocks equivalent to at least 1 line,
> +	 * select the blocks as plane_blocks_per_line.
> +	 *
> +	 * TODO: Revisit the logic when we have better understanding on DRAM
> +	 * channels' impact on the level 0 memory latency and the relevant
> +	 * wm calculations.
> +	 */
> +	blocks = skl_wm_has_lines(dev_priv, level) ?
> +			max_t(u32, fixed16_to_u32_round_up(selected_result) + 1,
> +				  fixed16_to_u32_round_up(wp->plane_blocks_per_line)) :
> +			fixed16_to_u32_round_up(selected_result) + 1;

That's looks rather convoluted.

  blocks = fixed16_to_u32_round_up(selected_result) + 1;
+ /* blah */
+ if (has_lines)
+	blocks = max(blocks, fixed16_to_u32_round_up(wp->plane_blocks_per_line));

Also since Art said nothing like this should actually be needed
I think the comment should make it a bit more clear that this
is just a hack to work around the underruns with some single
memory channel configurations.

>  	lines = div_round_up_fixed16(selected_result,
>  				     wp->plane_blocks_per_line);
>  
> -- 
> 2.25.1

-- 
Ville Syrjälä
Intel