[Freedreno] [RFC PATCH 1/2] drm/msm/dpu: add dsc helper functions

Jessica Zhang quic_jesszhan at quicinc.com
Thu Mar 16 01:28:20 UTC 2023



On 2/27/2023 1:24 PM, Abhinav Kumar wrote:
> 
> 
> On 2/27/2023 11:25 AM, Dmitry Baryshkov wrote:
>> 27 февраля 2023 г. 19:59:35 GMT+02:00, Abhinav Kumar 
>> <quic_abhinavk at quicinc.com> пишет:
>>>
>>>
>>> On 2/27/2023 4:45 AM, Dmitry Baryshkov wrote:
>>>> On Mon, 27 Feb 2023 at 01:49, Abhinav Kumar 
>>>> <quic_abhinavk at quicinc.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 2/26/2023 5:09 AM, Dmitry Baryshkov wrote:
>>>>>> On 26/02/2023 02:47, Abhinav Kumar wrote:
>>>>>>> Hi Dmitry
>>>>>>>
>>>>>>> On 2/25/2023 7:23 AM, Dmitry Baryshkov wrote:
>>>>>>>> On 25/02/2023 02:36, Abhinav Kumar wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2/24/2023 3:53 PM, Dmitry Baryshkov wrote:
>>>>>>>>>> On Sat, 25 Feb 2023 at 00:26, Abhinav Kumar
>>>>>>>>>> <quic_abhinavk at quicinc.com> wrote:
>>>>>>>>>>> On 2/24/2023 1:36 PM, Dmitry Baryshkov wrote:
>>>>>>>>>>>> 24 февраля 2023 г. 23:23:03 GMT+02:00, Abhinav Kumar
>>>>>>>>>>>> <quic_abhinavk at quicinc.com> пишет:
>>>>>>>>>>>>> On 2/24/2023 1:13 PM, Dmitry Baryshkov wrote:
>>>>>>>>>>>>>> On 24/02/2023 21:40, Kuogee Hsieh wrote:
>>>>>>>>>>>>>>> Add DSC helper functions based on DSC configuration profiles
>>>>>>>>>>>>>>> to produce
>>>>>>>>>>>>>>> DSC related runtime parameters through both table look up 
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> runtime
>>>>>>>>>>>>>>> calculation to support DSC on DPU.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There are 6 different DSC configuration profiles are 
>>>>>>>>>>>>>>> supported
>>>>>>>>>>>>>>> currently.
>>>>>>>>>>>>>>> DSC configuration profiles are differiented by 5 keys, DSC
>>>>>>>>>>>>>>> version (V1.1),
>>>>>>>>>>>>>>> chroma (444/422/420), colorspace (RGB/YUV), bpc(8/10),
>>>>>>>>>>>>>>> bpp (6/7/7.5/8/9/10/12/15) and SCR (0/1).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Only DSC version V1.1 added and V1.2 will be added later.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> These helpers should go to
>>>>>>>>>>>>>> drivers/gpu/drm/display/drm_dsc_helper.c
>>>>>>>>>>>>>> Also please check that they can be used for i915 or for 
>>>>>>>>>>>>>> amdgpu
>>>>>>>>>>>>>> (ideally for both of them).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> No, it cannot. So each DSC encoder parameter is calculated 
>>>>>>>>>>>>> based
>>>>>>>>>>>>> on the HW core which is being used.
>>>>>>>>>>>>>
>>>>>>>>>>>>> They all get packed to the same DSC structure which is the
>>>>>>>>>>>>> struct drm_dsc_config but the way the parameters are 
>>>>>>>>>>>>> computed is
>>>>>>>>>>>>> specific to the HW.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This DPU file helper still uses the drm_dsc_helper's
>>>>>>>>>>>>> drm_dsc_compute_rc_parameters() like all other vendors do but
>>>>>>>>>>>>> the parameters themselves are very HW specific and belong to
>>>>>>>>>>>>> each vendor's dir.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is not unique to MSM.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Lets take a few other examples:
>>>>>>>>>>>>>
>>>>>>>>>>>>> AMD:
>>>>>>>>>>>>> https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/amd/display/dc/dml/dsc/rc_calc_fpu.c#L165
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> i915:
>>>>>>>>>>>>> https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/i915/display/intel_vdsc.c#L379
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I checked several values here. Intel driver defines more 
>>>>>>>>>>>> bpc/bpp
>>>>>>>>>>>> combinations, but the ones which are defined in intel_vdsc 
>>>>>>>>>>>> and in
>>>>>>>>>>>> this patch seem to match. If there are major differences there,
>>>>>>>>>>>> please point me to the exact case.
>>>>>>>>>>>>
>>>>>>>>>>>> I remember that AMD driver might have different values.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Some values in the rc_params table do match. But the
>>>>>>>>>>> rc_buf_thresh[] doesnt.
>>>>>>>>>>
>>>>>>>>>> Because later they do:
>>>>>>>>>>
>>>>>>>>>> vdsc_cfg->rc_buf_thresh[i] = rc_buf_thresh[i] >> 6;
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/i915/display/intel_vdsc.c#L40
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Vs
>>>>>>>>>>>
>>>>>>>>>>> +static u16 dpu_dsc_rc_buf_thresh[DSC_NUM_BUF_RANGES - 1] = {
>>>>>>>>>>> +               0x0e, 0x1c, 0x2a, 0x38, 0x46, 0x54,
>>>>>>>>>>> +               0x62, 0x69, 0x70, 0x77, 0x79, 0x7b, 0x7d, 0x7e
>>>>>>>>>>> +};
>>>>>>>>>>
>>>>>>>>>> I'd prefer to have 896, 1792, etc. here, as those values come 
>>>>>>>>>> from the
>>>>>>>>>> standard. As it's done in the Intel driver.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Got it, thanks
>>>>>>>>>
>>>>>>>>>>> I dont know the AMD calculation very well to say that moving 
>>>>>>>>>>> this
>>>>>>>>>>> to the
>>>>>>>>>>> helper is going to help.
>>>>>>>>>>
>>>>>>>>>> Those calculations correspond (more or less) at the first 
>>>>>>>>>> glance to
>>>>>>>>>> what intel does for their newer generations. I think that's 
>>>>>>>>>> not our
>>>>>>>>>> problem for now.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Well, we have to figure out if each value matches and if each of
>>>>>>>>> them come from the spec for us and i915 and from which section. So
>>>>>>>>> it is unfortunately our problem.
>>>>>>>>
>>>>>>>> Otherwise it will have to be handled by Marijn, me or anybody else
>>>>>>>> wanting to hack up the DSC code. Or by anybody adding DSC 
>>>>>>>> support to
>>>>>>>> the next platform and having to figure out the difference between
>>>>>>>> i915, msm and their platform.
>>>>>>>>
>>>>>>>
>>>>>>> Yes, I wonder why the same doubt didn't arise when the other vendors
>>>>>>> added their support both from other maintainers and others.
>>>>>>>
>>>>>>> Which makes me think that like I wrote in my previous response, 
>>>>>>> these
>>>>>>> are "recommended" values in the spec but its not mandatory.
>>>>>>
>>>>>> I think, it is because there were no other drivers to compare. In 
>>>>>> other
>>>>>> words, for a first driver it is pretty logical to have everything
>>>>>> handled on its own. As soon as we start getting other 
>>>>>> implementations of
>>>>>> a feature, it becomes logical to think if the code can be 
>>>>>> generalized.
>>>>>> This is what we see we with the HDCP series or with the code being 
>>>>>> moved
>>>>>> to DP helpers.
>>>>>>
>>>>>
>>>>> We were not the second, MSM was/is the third to add support for DSC 
>>>>> afer
>>>>> i915 and AMD. Thats what made me think why whoever was the second 
>>>>> didnt
>>>>> end up generalizing. Was it just missed out or was it intentionally 
>>>>> left
>>>>> in the vendor driver.
>>>>
>>>> I didn't count AMD here, since it calculates some of the params rather
>>>> than using the fixed ones from the model.
>>>>
>>>>>
>>>>>>>
>>>>>>> Moving this to the drm_dsc_helper is generalizing the tables and not
>>>>>>> giving room for the vendors to customize even if they want to (which
>>>>>>> the spec does allow).
>>>>>>
>>>>>> That depends on the API you select. For example, in
>>>>>> intel_dsc_compute_params() I see customization being applied to
>>>>>> rc_buf_thresh in 6bpp case. I'd leave that to the i915 driver.
>>>>>>
>>>>>
>>>>> Thanks for going through the i915 to figure out that the 6bpp is 
>>>>> handled
>>>>> in a customized way. So what you are saying is let the helper first 
>>>>> fill
>>>>> up the recommended values of the spec, whatever is changed from 
>>>>> that let
>>>>> the vendor driver override that.
>>>>>
>>>>> Thats where the case-by-case handling comes.
>>>>>
>>>>> Why not we do this way? Like you mentioned lets move these tables 
>>>>> to the
>>>>> drm_dsc_helper and let MSM driver first use those.
>>>>>
>>>>> Then in a separate patchset if i915 and AMD would like to move to 
>>>>> that,
>>>>> let them handle it for their respective drivers instead of MSM going
>>>>> through whats customized for each calculation and doing it.
>>>>>
>>>>> I am hesitant to take up that effort.
>>>>
>>>> Writing a tool to convert model's rc_Nbpc_Mbpp_foo.cfg into C
>>>> languages structures used by Intel code took 15-20 minutes. Plugging
>>>> generated structures took another 5 minutes. I will send the patches
>>>> later today or tomorrow, as I find a time slot to clean them. Thank
>>>> you for spending more time on arguing than it took me to generate &
>>>> verify the data.
>>>>
>>>
>>> Great, we will wait for your patches. We didnt intend to spend time 
>>> on this at this point. We always wanted to take it up in a separate 
>>> series of moving the tables.
>>
>> Getting rid of msm_display_dsc_config and then making use of 
>> drm_dsc_compute_rc_parameters() was bad enough. So, let's get things 
>> done in a good way now, rather than at some random point later.
>>
> 
> Alright, we will wait for your change then :)
> 
>>
>>>
>>> You preferred not to wait. Upto you.
>>>
>>> So thanks for doing it.
>>>
>>>>>
>>>>> If the recommended values work for the vendor, they can clean it up 
>>>>> and
>>>>> move to the drm_dsc_helper themselves and preserving their
>>>>> customizations rather than one vendor doing it for all of them.
>>>>>
>>>>>> In case the driver needs to perform customization of the params, 
>>>>>> nothing
>>>>>> stops it drop applying after filling all the RC params in the
>>>>>> drm_dsc_config struct via the generic helper.
>>>>>>
>>>>>>
>>>>>>> So if this has any merit and if you or Marijn would like to take it
>>>>>>> up, go for it. We would do the same thing as either of you would 
>>>>>>> have
>>>>>>> to in terms of figuring out the difference between msm and the 
>>>>>>> i915 code.
>>>>>>>
>>>>>>> This is not a generic API we are trying to put in a helper, these 
>>>>>>> are
>>>>>>> hard-coded tables so there is a difference between looking at 
>>>>>>> these Vs
>>>>>>> looking at some common code which can move to the core.
>>>>>>>
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Also, i think its too risky to change other drivers to use
>>>>>>>>>>> whatever math
>>>>>>>>>>> we put in the drm_dsc_helper to compute thr RC params because
>>>>>>>>>>> their code
>>>>>>>>>>> might be computing and using this tables differently.
>>>>>>>>>>>
>>>>>>>>>>> Its too much ownership for MSM developers to move this to
>>>>>>>>>>> drm_dsc_helper
>>>>>>>>>>> and own that as it might cause breakage of basic DSC even if 
>>>>>>>>>>> some
>>>>>>>>>>> values
>>>>>>>>>>> are repeated.
>>>>>>>>>>
>>>>>>>>>> It's time to stop thinking about ownership and start thinking 
>>>>>>>>>> about
>>>>>>>>>> shared code. We already have two instances of DSC tables. I don't
>>>>>>>>>> think having a third instance, which is a subset of an existing
>>>>>>>>>> dataset, would be beneficial to anybody.
>>>>>>>>>> AMD has complicated code which supports half-bit bpp and 
>>>>>>>>>> calculates
>>>>>>>>>> some of the parameters. But sharing data with the i915 driver is
>>>>>>>>>> straightforward.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sorry, but I would like to get an ack from i915 folks if this 
>>>>>>>>> is going
>>>>>>>>> to be useful to them if we move this to helper because we have to
>>>>>>>>> look at every table. Not just one.
>>>>>>>>
>>>>>>>> Added i915 maintainers to the CC list for them to be able to 
>>>>>>>> answer.
>>>>>>>>
>>>>>>>
>>>>>>> Thanks, lets wait to hear from them about where finally these tables
>>>>>>> should go but thats can be taken up as a separate effort too.
>>>>>>>
>>>>>>>>>
>>>>>>>>> Also, this is just 1.1, we will add more tables for 1.2. So we 
>>>>>>>>> will
>>>>>>>>> have to end up changing both 1.1 and 1.2 tables as they are
>>>>>>>>> different for QC.
>>>>>>>>
>>>>>>>> I haven't heard back from Kuogee about the possible causes of using
>>>>>>>> rc/qp values from 1.2 even for 1.1 panels. Maybe you can comment on
>>>>>>>> that? In other words, can we always stick to the values from 1.2
>>>>>>>> standard? What will be the drawback?
>>>>>>>>
>>>>>>>> Otherwise, we'd have to have two different sets of values, like you
>>>>>>>> do in your vendor driver.
>>>>>>>>
>>>>>>>
>>>>>>> I have responded to this in the other email.
>>>>>>>
>>>>>>> All this being said, even if the rc tables move the drm_dsc_helper
>>>>>>> either now or later on, we will still need MSM specific calculations
>>>>>>> for many of the other encoder parameters (which are again either
>>>>>>> hard-coded or calculated). Please refer to the
>>>>>>> sde_dsc_populate_dsc_config() downstream. And yes, you will not find
>>>>>>> those in the DP spec directly.
>>>>>>>
>>>>>>> So we will still need a dsc helper for MSM calculations to be common
>>>>>>> for DSI / DP irrespective of where the tables go.
>>>>>>>
>>>>>>> So, lets finalize that first.
>>>>>>
>>>>>> I went on and trimmed sde_dsc_populate_dsc_config() to remove
>>>>>> duplication with the drm_dsc_compute_rc_parameters() (which we 
>>>>>> already
>>>>>> use for the MSM DSI DSC).
>>>>>>
>>>>>> Not much is left:
>>>>>>
>>>>>> dsc->first_line_bpg_offset set via the switch
>>>>>>
>>>>>> dsc->line_buf_depth = bpc + 1;
>>>>>> dsc->mux_word_size = bpc > 10 ? DSC_MUX_WORD_SIZE_12_BPC:
>>>>>>            DSC_MUX_WORD_SIZE_8_10_BPC;
>>>>>>
>>>>>> if ((dsc->dsc_version_minor == 0x2) && (dsc->native_420))
>>>>>>        dsc->nsl_bpg_offset = (2048 *
>>>>>>                 (DIV_ROUND_UP(dsc->second_line_bpg_offset,
>>>>>>                                    (dsc->slice_height - 1))));
>>>>>>
>>>>>> dsc->initial_scale_value = 8 * dsc->rc_model_size /
>>>>>>                            (dsc->rc_model_size - 
>>>>>> dsc->initial_offset);
>>>>>>
>>>>>>
>>>>>> mux_word_size comes from the standard (must)
>>>>>> initial_scale_value calculation is recommended, but not required
>>>>>> nsl_bpg_offset follows the standard (must), also see below (*).
>>>>>>
>>>>>> first_line_bpg_offset calculation differs between three drivers. The
>>>>>> standard also provides a recommended formulas. I think we can 
>>>>>> leave it
>>>>>> as is for now.
>>>>>>
>>>>>> I think, that mux_word_size and nsl_bpg_offset calculation should be
>>>>>> moved to drm_dsc_compute_rc_parameters(), while leaving
>>>>>> initial_scale_value in place (in the driver code).
>>>>>>
>>>>>> * I think nsl_bpg_offset is slightly incorrectly calculated. Standard
>>>>>> demands that it is set to 'second_line_bpg_offset / (slice_height 
>>>>>> - 1),
>>>>>> rounded up to 16 fraction bits', while SDE driver code sets it to the
>>>>>> value rounded up to the next integer (having 16 fraction bits
>>>>>> representation).
>>>>>>
>>>>>> In my opinion correct calculation should be:
>>>>>> dsc->nsl_bpg_offset = DIV_ROUND_UP(2048 * 
>>>>>> dsc->second_line_bpg_offset,
>>>>>>                                    (dsc->slice_height - 1));
>>>>>>
>>>>>> Could you please check, which one is correct according to the 
>>>>>> standard?
>>>>>>
>>>>>>
>>>>>
>>>>> Sure, i will check about nsl_bpg_offset. But sorry if I was not more
>>>>> clear about this but sde_dsc_populate_dsc_config() is only one example
>>>>> which from your analysis can be moved to the drm_dsc_helper() but not
>>>>> the initial line calculation _dce_dsc_initial_line_calc(),
>>>>> _dce_dsc_ich_reset_override_needed() , _dce_dsc_setup_helper().
>>>>
>>>> The initial_line is already calculated in dpu_encoder.c. As for the
>>>> _dce_dsc_ich_reset_override_needed(), I don't think we support partial
>>>> updates in the upstream driver.
>>>>
>>>>>
>>>>> All of these are again common between DSI and DP.
>>>>>
>>>>> So in addition to thinking about what can be moved to the 
>>>>> drm_dsc_helper
>>>>> also think about what is specific to MSM but common to DSI and DP 
>>>>> modules.
>>>>>
>>>>> That was the bigger picture I was trying to convey.
>>>>
>>>
>>> _dce_dsc_initial_line_calc which will get expanded with v1.2 gets 
>>> added has much more than whats there in upstream today.
>>>
>>> Dumping everything in dpu_encoder is not the solution. Sorry.
>>
>> But it is still the DPU thing. So, no problems.
>>
> 
> I am not fully convinced. We will wait for your post, then see how the 
> code looks.

Hi Dmitry and Abhinav,

Just wanted to follow up on this thread. I've gone over the MSM-specific 
DSC params for DP and DSI and have found a few shared calculations and 
variables between both DSI and DP paths:

- (as mentioned earlier in the thread) almost all the calculations in 
dpu_dsc_populate_dsc_config() match dsi_populate_dsc_params() [1]. The 
only difference in the math I'm seeing is initial_scale_value.

- dsc_extra_pclk_cycle_cnt and dce_bytes_per_line, which were introduced 
in Kuogee's v1 DSC series [2], are used for DSI, DP, and the DPU timing 
engine. dsc_extra_pclk_cycle_cnt is calculated based on pclk_per_line 
(which is calculated differently between DP and DSI), but 
dce_bytes_per_line is calculated the same way between DP and DSI.

To avoid having to duplicate math in 2 different places, I think it 
would help to have these calculations in some msm_dsc_helper.c file. Any 
thoughts on this?

Thanks,

Jessica Zhang

[1] 
https://elixir.bootlin.com/linux/v6.3-rc2/source/drivers/gpu/drm/msm/dsi/dsi_host.c#L1756

[2] https://patchwork.freedesktop.org/patch/519845/?series=113240&rev=1

> 
>>>
>>>>
>>>>
>>


More information about the dri-devel mailing list