[Freedreno] [PATCH v2] drm/msm/dpu: improve DSC allocation
Dmitry Baryshkov
dmitry.baryshkov at linaro.org
Wed Dec 6 09:35:54 UTC 2023
On 05/12/2023 22:51, Kuogee Hsieh wrote:
>
> On 12/5/2023 11:23 AM, Dmitry Baryshkov wrote:
>> On Tue, 5 Dec 2023 at 20:12, Kuogee Hsieh <quic_khsieh at quicinc.com>
>> wrote:
>>>
>>> On 12/4/2023 4:23 PM, Dmitry Baryshkov wrote:
>>>> On Tue, 5 Dec 2023 at 01:55, Kuogee Hsieh <quic_khsieh at quicinc.com>
>>>> wrote:
>>>>> A DCE (Display Compression Engine) contains two DSC hard slice
>>>>> encoders. Each DCE start with even DSC encoder index followed by
>>>>> an odd DSC encoder index. Each encoder can work independently.
>>>>> But Only two DSC encoders from same DCE can be paired to work
>>>>> together to support merge mode. In addition, the DSC with even
>>>>> index have to mapping to even pingpong index and DSC with odd
>>>>> index have to mapping to odd pingpong index at its data path.
>>>>> This patch improve DSC allocation mechanism with consideration
>>>>> of above factors.
>>>>>
>>>>> Changes in V2:
>>>>> -- split _dpu_rm_reserve_dsc() into _dpu_rm_reserve_dsc_single() and
>>>>> _dpu_rm_reserve_dsc_pair()
>>>> Please don't send the new iteration of the patch if the discussion
>>>> is ongoing.
>>> Got it, thanks.
>>>> Quoting v1 review:
>>>>
>>>> Are the limitations (odd:odd, allocation in pairs, etc) applicable to
>>>> v1.1 encoders?
>>>>
>>>> I assume that at least 'allocate two consecutive DSC for DSC merge' is
>>>> not applicable, since there are no separate DCE units.
>>> yes, you are correct in the hardware point of view.
>>>
>>> However, at software, we can think think of dsc index 0 and 1 are bound
>>> to DCE #1, index 2 and 3 are bound to DCE #2 and etc in regardless of
>>> v1.1 or v1.2.
>>>
>>> By doing that,this dsc allocation algorithm should be able to apply to
>>> both.
>>>
>>> There is no case to have dsc index 1 and dsc index 2 bind together (skip
>>> dsc index 0) to support merge mode.
>> Yes. However this might cause issues on the platforms which have DSI,
>> DP and just two DSC encoders. Consider RM allocating two odd (or two
>> even) PP blocks. One for DSI, one for DP. Then if we need DSC on both
>> interfaces, the RM won't be able to allocate it.
>
>
> I am not sure this case is possible.
>
> DSC + pingpong allocation is base on Layer mixer which is allocated
> sequentially.
Not sequentially, but also in pairs. Yes, LM_5 (a pair for LM_2 on
sdm845) is connected to PINGPONG_3. However all this doesn't make things
easier to understand or to snoop bugs. I'd prefer to keep a simple
allocation code for older DSC blocks. Especially since I might have to
touch LM allocation for MDP 1.x platforms.
> ex, first lm --> pingpong --> dsc allocate completed then followed by
> next lm --> pingpong --> dsc allocation.
>
> therefore it is not possible to have case with two odd pingpong index to
> map two odd dsc index.
>
> With this algorithm, there is one case (below) which can not be
> supported is,
>
> dsc_0 for pingpong-0 of stand alone mode + dsc-1 and dsc-2 for
> pingpong-1 and ping pong-2 to support merge mode for DSC v1.1.
>
> However there is no hardware configuration which only have 3 or 5 dsc
> encoders due to dsc always come in pair except some low end chip which
> mdp come with only one dsc encoder.
>
>
>
>
>>
>>>>> Signed-off-by: Kuogee Hsieh <quic_khsieh at quicinc.com>
>>>>> ---
>>>>> drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 171
>>>>> ++++++++++++++++++++++++++++++---
>>>>> 1 file changed, 156 insertions(+), 15 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>>>>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>>>>> index 17ecf23..dafe1ee 100644
>>>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>>>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>>>>> @@ -470,33 +470,174 @@ static int _dpu_rm_reserve_ctls(
>>>>> return 0;
>>>>> }
>>>>>
>>>>> -static int _dpu_rm_reserve_dsc(struct dpu_rm *rm,
>>>>> +static int _dpu_rm_reserve_dsc_single(struct dpu_rm *rm,
>>>>> struct dpu_global_state
>>>>> *global_state,
>>>>> - struct drm_encoder *enc,
>>>>> + uint32_t enc_id,
>>>>> const struct msm_display_topology
>>>>> *top)
>>>>> {
>>>>> - int num_dsc = top->num_dsc;
>>>>> - int i;
>>>>> + int num_dsc = 0;
>>>>> + int i, pp_idx;
>>>>> + int dsc_idx[DSC_MAX - DSC_0];
>>>>> + uint32_t pp_to_enc_id[PINGPONG_MAX - PINGPONG_0];
You don't need pp_to_enc_id copy in the _single usecase. Just use the
exisiting rm data.
>>>>> + int pp_max = PINGPONG_MAX - PINGPONG_0;
inline
>>>>> +
>>>>> + for (i = 0; i < DSC_MAX - DSC_0; i++)
>>>>> + dsc_idx[i] = 0;
You know, this is called memset.
>>>>> +
>>>>> + /* fill working copy with pingpong list */
>>>>> + memcpy(pp_to_enc_id, global_state->pingpong_to_enc_id,
>>>>> sizeof(pp_to_enc_id));
>>>>> +
>>>>> + for (i = 0; i < ARRAY_SIZE(rm->dsc_blks) && num_dsc >=
>>>>> top->num_dsc; i++) {
>>>>> + if (!rm->dsc_blks[i])
>>>>> + continue;
>>>>>
>>>>> - /* check if DSC required are allocated or not */
>>>>> - for (i = 0; i < num_dsc; i++) {
>>>>> - if (!rm->dsc_blks[i]) {
>>>>> - DPU_ERROR("DSC %d does not exist\n", i);
>>>>> - return -EIO;
>>>>> + if (global_state->dsc_to_enc_id[i]) /* used */
>>>>> + continue;
>>>>> +
>>>>> + /*
>>>>> + * find the pingpong index which had been reserved
>>>>> + * previously at layer mixer allocation
>>>>> + */
>>>>> + for (pp_idx = 0; pp_idx < pp_max; pp_idx++) {
>>>>> + if (pp_to_enc_id[pp_idx] == enc_id)
!reserved_by_other()
>>>>> + break;
>>>>> }
>>>>>
>>>>> - if (global_state->dsc_to_enc_id[i]) {
>>>>> - DPU_ERROR("DSC %d is already allocated\n", i);
>>>>> - return -EIO;
>>>>> + /*
>>>>> + * dsc even index must map to pingpong even index
>>>>> + * dsc odd index must map to pingpong odd index
>>>>> + */
>>>>> + if ((i & 0x01) != (pp_idx & 0x01))
>>>>> + continue;
>>>>> +
>>>>> + dsc_idx[num_dsc++] = i + 1; /* found, start
>>>>> from 1 */
It's not +1. It is DSC_0 + i.
>>>>> + }
>>>>> +
>>>>> + if (num_dsc < top->num_dsc) {
>>>>> + DPU_ERROR("DSC allocation failed num_dsc=%d
>>>>> required=%d\n",
>>>>> + num_dsc,
>>>>> top->num_dsc);
>>>>> + return -ENAVAIL;
>>>>> + }
>>>>> +
>>>>> + /* reserve dsc */
>>>>> + for (i = 0; i < top->num_dsc; i++) {
>>>>> + int j;
>>>>> +
>>>>> + j = dsc_idx[i];
>>>>> + if (j)
>>>>> + global_state->dsc_to_enc_id[j-1] = enc_id;
>>>>> + }
>>>>> +
>>>>> + return 0;
>>>>> +}
>>>>> +
>>>>> +static int _dpu_rm_reserve_dsc_pair(struct dpu_rm *rm,
>>>>> + struct dpu_global_state *global_state,
>>>>> + uint32_t enc_id,
>>>>> + const struct msm_display_topology *top)
>>>>> +{
>>>>> + int num_dsc = 0;
>>>>> + int i, pp_idx;
>>>>> + int dsc_idx[DSC_MAX - DSC_0];
>>>>> + uint32_t pp_to_enc_id[PINGPONG_MAX - PINGPONG_0];
>>>>> + int pp_max = PINGPONG_MAX - PINGPONG_0;
>>>>> +
>>>>> + for (i = 0; i < DSC_MAX - DSC_0; i++)
>>>>> + dsc_idx[i] = 0;
>>>>> +
>>>>> + /* fill working copy with pingpong list */
>>>>> + memcpy(pp_to_enc_id, global_state->pingpong_to_enc_id,
>>>>> sizeof(pp_to_enc_id));
>>>>> +
>>>>> + for (i = 0; i < ARRAY_SIZE(rm->dsc_blks) && num_dsc >=
>>>>> top->num_dsc; i++) {
Much easier. Just do i+= 2 here and in the loop check that the second
half of DCE is not used.
>>>>> + if (!rm->dsc_blks[i])
>>>>> + continue;
>>>>> +
>>>>> + if (global_state->dsc_to_enc_id[i]) { /* used */
>>>>> + /* consective dsc index to be paired */
>>>>> + if (num_dsc) { /* already start pairing,
>>>>> re start search */
>>>>> + num_dsc = 0;
>>>>> + /* fill working copy with pingpong
>>>>> list */
>>>>> + memcpy(pp_to_enc_id,
>>>>> global_state->pingpong_to_enc_id,
>>>>> +
>>>>> sizeof(pp_to_enc_id));
>>>>> + }
>>>>> + continue;
>>>>> + }
>>>>> +
>>>>> + /* odd index can not become start of pairing */
>>>>> + if (i & 0x01 && !num_dsc)
>>>>> + continue;
>>>>> +
>>>>> + /*
>>>>> + * find the pingpong index which had been reserved
>>>>> + * previously at layer mixer allocation
>>>>> + */
>>>>> + for (pp_idx = 0; pp_idx < pp_max; pp_idx++) {
>>>>> + if (pp_to_enc_id[pp_idx] == enc_id)
>>>>> + break;
>>>>> }
>>>>> +
>>>>> + /*
>>>>> + * dsc even index must map to pingpong even index
>>>>> + * dsc odd index must map to pingpong odd index
>>>>> + */
>>>>> + if ((i & 0x01) != (pp_idx & 0x01))
>>>>> + continue;
>>>>> +
>>>>> + /*
>>>>> + * delete pp_idx so that next pp_idx can be paired
>>>>> with
>>>>> + * next dsc_idx
>>>>> + */
>>>>> + pp_to_enc_id[pp_idx] = 0xffff;
>>>>> +
>>>>> + dsc_idx[num_dsc++] = i + 1; /* found, start
>>>>> from 1 */
>>>>> }
>>>>>
>>>>> - for (i = 0; i < num_dsc; i++)
>>>>> - global_state->dsc_to_enc_id[i] = enc->base.id;
>>>>> + if (num_dsc < top->num_dsc) {
>>>>> + DPU_ERROR("DSC allocation failed num_dsc=%d
>>>>> required=%d\n",
>>>>> + num_dsc,
>>>>> top->num_dsc);
>>>>> + return -ENAVAIL;
>>>>> + }
>>>>> +
>>>>> + /* reserve dsc */
>>>>> + for (i = 0; i < top->num_dsc; i++) {
>>>>> + int j;
>>>>> +
>>>>> + j = dsc_idx[i];
>>>>> + if (j)
>>>>> + global_state->dsc_to_enc_id[j-1] = enc_id;
>>>>> + }
>>>>>
>>>>> return 0;
>>>>> }
>>>>>
>>>>> +static int _dpu_rm_reserve_dsc(struct dpu_rm *rm,
>>>>> + struct dpu_global_state *global_state,
>>>>> + uint32_t enc_id,
>>>>> + const struct msm_display_topology *top)
>>>>> +{
>>>>> + if (!top->num_dsc || !top->num_intf)
>>>>> + return 0;
>>>>> +
>>>>> + /*
>>>>> + * Truth:
>>>>> + * 1) every layer mixer only connects to one pingpong
>>>>> + * 2) no pingpong split -- which is two layer mixers shared
>>>>> one pingpong
>>>>> + * 3) each DSC engine contains two dsc encoders
>>>>> + * -- index(0,1), index (2,3),... etc
>>>> Does this apply to v1.1 encoders?
>>>>
>>>>> + * 4) dsc pair can only happens with same DSC engine
>>>>> + * 5) odd pingpong connect to odd dsc
>>>>> + * 6) even pingpong connect to even dsc
>>>>> + * 7) pair: encoder +--> pp_idx_0 --> dsc_idx_0
>>>>> + +--> pp_idx_1 --> dsc_idx_1
>>>>> + */
>>>>> +
>>>>> + /* num_dsc should be either 1, 2 or 4 */
>>>>> + if (top->num_dsc > top->num_intf) /* merge mode */
>>>>> + return _dpu_rm_reserve_dsc_pair(rm, global_state,
>>>>> enc_id, top);
>>>>> + else
>>>>> + return _dpu_rm_reserve_dsc_single(rm, global_state,
>>>>> enc_id, top);
Can we end up with num_dsc = 2, num_intf = 2 and
_dpu_rm_reserve_dsc_single() allocating two DSC blocks despite its name?
>>>>> +}
>>>>> +
>>>>> static int _dpu_rm_make_reservation(
>>>>> struct dpu_rm *rm,
>>>>> struct dpu_global_state *global_state,
>>>>> @@ -518,7 +659,7 @@ static int _dpu_rm_make_reservation(
>>>>> return ret;
>>>>> }
>>>>>
>>>>> - ret = _dpu_rm_reserve_dsc(rm, global_state, enc,
>>>>> &reqs->topology);
>>>>> + ret = _dpu_rm_reserve_dsc(rm, global_state, enc->base.id,
>>>>> &reqs->topology);
Is there any reason to change this?
>>>>> if (ret)
>>>>> return ret;
>>>>>
>>>>> --
>>>>> 2.7.4
>>>>>
>>
>>
--
With best wishes
Dmitry
More information about the Freedreno
mailing list