[PATCH v4 7/9] drm/msm/dpu: handle perf mode in _dpu_core_perf_crtc_update_bus()

Abhinav Kumar quic_abhinavk at quicinc.com
Fri Jan 17 20:28:17 UTC 2025



On 1/15/2025 5:14 PM, Dmitry Baryshkov wrote:
> On Wed, Jan 15, 2025 at 04:40:39PM -0800, Abhinav Kumar wrote:
>>
>>
>> On 1/15/2025 4:32 PM, Dmitry Baryshkov wrote:
>>> On Wed, Jan 15, 2025 at 11:41:27AM -0800, Abhinav Kumar wrote:
>>>>
>>>>
>>>> On 1/15/2025 12:27 AM, Dmitry Baryshkov wrote:
>>>>> On Tue, Jan 14, 2025 at 01:18:26PM -0800, Abhinav Kumar wrote:
>>>>>>
>>>>>>
>>>>>> On 1/14/2025 3:10 AM, Dmitry Baryshkov wrote:
>>>>>>> On Mon, Jan 13, 2025 at 07:38:16PM -0800, Abhinav Kumar wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 1/5/2025 7:07 PM, Dmitry Baryshkov wrote:
>>>>>>>>> Move perf mode handling for the bandwidth to
>>>>>>>>> _dpu_core_perf_crtc_update_bus() rather than overriding per-CRTC data
>>>>>>>>> and then aggregating known values.
>>>>>>>>>
>>>>>>>>> Note, this changes the fix_core_ab_vote handling. Previously it
>>>>>>>>> would be multiplied by the number of CRTCs; now it is used directly
>>>>>>>>> for interconnect voting. This better reflects user requirements when
>>>>>>>>> different resolutions are set on different CRTCs: instead of using
>>>>>>>>> the same bandwidth for each CRTC (which is incorrect), the user can
>>>>>>>>> now calculate the overall bandwidth required by all outputs and use
>>>>>>>>> that value.
>>>>>>>>>
>>>>>>>>
>>>>>>>> There are two things this change is doing:
>>>>>>>>
>>>>>>>> 1) Dropping the core_clk_rate setting, because it's already handled
>>>>>>>> inside _dpu_core_perf_get_core_clk_rate() and hence
>>>>>>>> dpu_core_perf_crtc_update() will still work.
>>>>>>>>
>>>>>>>> and
>>>>>>>>
>>>>>>>> 2) Moving the ab/ib setting to _dpu_core_perf_crtc_update_bus().
>>>>>>>>
>>>>>>>> Can we split this into two changes, so that it's clear that dropping
>>>>>>>> the core_clk_rate setting will not cause an issue?
>>>>>>>
>>>>>>> Ack
>>>>>>>
>>>>>>
>>>>>> Actually, I think this is incorrect.
>>>>>>
>>>>>> If the user puts in an incorrect value beyond the bounds, the existing
>>>>>> code rejects it by failing in _dpu_core_perf_calc_crtc().
>>>>>
>>>>> This function doesn't perform any validation nor returns an error code.
>>>>> Probably you've meant some other function.
>>>>>
>>>>
>>>> Sorry, let me explain a little more to complete the flow I am seeing.
>>>>
>>>> _dpu_core_perf_calc_crtc() gets called by dpu_core_perf_crtc_check().
>>>>
>>>> That one checks against erroneous values.
>>>>
>>>>                   if (!threshold) {
>>>>                           DPU_ERROR("no bandwidth limits specified\n");
>>>>                           return -E2BIG;
>>>>                   } else if (bw > threshold) {
>>>>                           DPU_ERROR("exceeds bandwidth: %ukb > %ukb\n", bw,
>>>>                                           threshold);
>>>>                           return -E2BIG;
>>>>                   }
>>>
>>> Here we are checking that the selected set of modes doesn't exceed the
>>> defined platform limits. However, I think it should be possible for the
>>> user to attempt to overcome the predefined bandwidth limitations in an
>>> attempt to debug an issue. The ICC framework handles that perfectly
>>> (and, if you check, until sync_state is reached all BWs are assumed to
>>> be UINT_MAX). Maybe I should document in the commit message that after
>>> this commit forced BWs are not subject to the catalog limitations.
>>>
>>
>> hmmm, yes, this was the validation I was referring to.
>>
>> I didn't get why a user should be allowed to go beyond the platform
>> limits; what purpose does that serve? It's not leading to any conclusion
>> or towards the resolution of the issue. With the platform validation we
>> are not only enforcing the limits but also making sure that random
>> values given by the user don't cause more harm than good.
> 
> If debugfs files are being used to overwrite the data, then the user is
> an advanced user. Possible use cases include explicitly overclocking
> the platform, performing validation checks, or just attempting to
> understand underfill issues. Thus I believe the advanced user should be
> given the power to shoot themselves in the foot by specifying higher
> values than those in the catalog. As I wrote, the ICC driver already
> uses UINT_MAX for bandwidth values during system bootup. RPM(h) will
> enforce bandwidth limitations on those votes.
> 

As per our discussion, there are two benefits of going beyond the 
DPU-advertised platform limits:

1) Making sure the catalog limits are indeed correct
2) If (1) is not an issue, it allows a developer to go beyond the 
advertised value and see whether this makes any difference to the issue

Although (2) really falls outside the scope of display debugging, it's 
still a data point.

So with the commit text adjusted to note this change, and with the patch 
split discussed earlier in this thread, we can go ahead with this.

Thanks

>>
>>>>
>>>>>>
>>>>>> Now, if we move it to _dpu_core_perf_crtc_update_bus(), this is beyond the
>>>>>> check phase so incorrect values cannot be rejected.
>>>>>>
>>>>>> So we will still need to preserve overriding the values in
>>>>>> _dpu_core_perf_calc_crtc().
>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov at linaro.org>
>>>>>>>>> ---
>>>>>>>>>       drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c | 40 +++++++++++++--------------
>>>>>>>>>       1 file changed, 19 insertions(+), 21 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
>>>>>>>>> index 70f43e8359caee2082f2ca9944a17a6a20aa3d49..7ff3405c6867556a8dc776783b91f1da6c86ef3f 100644
>>>>>>>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
>>>>>>>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
>>>>>>>>> @@ -118,22 +118,9 @@ static void _dpu_core_perf_calc_crtc(const struct dpu_core_perf *core_perf,
>>>>>>>>>       		return;
>>>>>>>>>       	}
>>>>>>>>> -	memset(perf, 0, sizeof(struct dpu_core_perf_params));
>>>>>>>>> -
>>>>>>>>> -	if (core_perf->perf_tune.mode == DPU_PERF_MODE_MINIMUM) {
>>>>>>>>> -		perf->bw_ctl = 0;
>>>>>>>>> -		perf->max_per_pipe_ib = 0;
>>>>>>>>> -		perf->core_clk_rate = 0;
>>>>>>>>> -	} else if (core_perf->perf_tune.mode == DPU_PERF_MODE_FIXED) {
>>>>>>>>> -		perf->bw_ctl = core_perf->fix_core_ab_vote * 1000ULL;
>>>>>>>>> -		perf->max_per_pipe_ib = core_perf->fix_core_ib_vote;
>>>>>>>>> -		perf->core_clk_rate = core_perf->fix_core_clk_rate;
>>>>>>>>> -	} else {
>>>>>>>>> -		perf->bw_ctl = _dpu_core_perf_calc_bw(perf_cfg, crtc);
>>>>>>>>> -		perf->max_per_pipe_ib = perf_cfg->min_dram_ib;
>>>>>>>>> -		perf->core_clk_rate = _dpu_core_perf_calc_clk(perf_cfg, crtc, state);
>>>>>>>>> -	}
>>>>>>>>> -
>>>>>>>>> +	perf->bw_ctl = _dpu_core_perf_calc_bw(perf_cfg, crtc);
>>>>>>>>> +	perf->max_per_pipe_ib = perf_cfg->min_dram_ib;
>>>>>>>>> +	perf->core_clk_rate = _dpu_core_perf_calc_clk(perf_cfg, crtc, state);
>>>>>>>>>       	DRM_DEBUG_ATOMIC(
>>>>>>>>>       		"crtc=%d clk_rate=%llu core_ib=%u core_ab=%u\n",
>>>>>>>>>       			crtc->base.id, perf->core_clk_rate,
>>>>>>>>> @@ -222,18 +209,29 @@ static int _dpu_core_perf_crtc_update_bus(struct dpu_kms *kms,
>>>>>>>>>       {
>>>>>>>>>       	struct dpu_core_perf_params perf = { 0 };
>>>>>>>>>       	int i, ret = 0;
>>>>>>>>> -	u64 avg_bw;
>>>>>>>>> +	u32 avg_bw;
>>>>>>>>> +	u32 peak_bw;
>>>>>>>>>       	if (!kms->num_paths)
>>>>>>>>>       		return 0;
>>>>>>>>> -	dpu_core_perf_aggregate(crtc->dev, dpu_crtc_get_client_type(crtc), &perf);
>>>>>>>>> +	if (kms->perf.perf_tune.mode == DPU_PERF_MODE_MINIMUM) {
>>>>>>>>> +		avg_bw = 0;
>>>>>>>>> +		peak_bw = 0;
>>>>>>>>> +	} else if (kms->perf.perf_tune.mode == DPU_PERF_MODE_FIXED) {
>>>>>>>>> +		avg_bw = kms->perf.fix_core_ab_vote;
>>>>>>>>> +		peak_bw = kms->perf.fix_core_ib_vote;
>>>>>>>>> +	} else {
>>>>>>>>> +		dpu_core_perf_aggregate(crtc->dev, dpu_crtc_get_client_type(crtc), &perf);
>>>>>>>>> +
>>>>>>>>> +		avg_bw = div_u64(perf.bw_ctl, 1000); /*Bps_to_icc*/
>>>>>>>>> +		peak_bw = perf.max_per_pipe_ib;
>>>>>>>>> +	}
>>>>>>>>> -	avg_bw = perf.bw_ctl;
>>>>>>>>> -	do_div(avg_bw, (kms->num_paths * 1000)); /*Bps_to_icc*/
>>>>>>>>> +	avg_bw /= kms->num_paths;
>>>>>>>>>       	for (i = 0; i < kms->num_paths; i++)
>>>>>>>>> -		icc_set_bw(kms->path[i], avg_bw, perf.max_per_pipe_ib);
>>>>>>>>> +		icc_set_bw(kms->path[i], avg_bw, peak_bw);
>>>>>>>>>       	return ret;
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>
>>>>>
>>>
> 

