[Freedreno] [RFC PATCH v2 06/13] drm/msm/dpu: switch RM to use crtc_id rather than enc_id for allocation

Tue May 23 07:25:59 UTC 2023

On 23/05/2023 01:22, Abhinav Kumar wrote:
> Sorry for the delay, other topics delayed my response on this one.
> 
> On 5/18/2023 6:50 PM, Dmitry Baryshkov wrote:
>> On 19/05/2023 02:46, Abhinav Kumar wrote:
>>>
>>>
>>> On 3/20/2023 6:18 PM, Dmitry Baryshkov wrote:
>>>> Up to now the driver has been using encoder to allocate hardware
>>>> resources. Switch it to use CRTC id in preparation for the next step.
>>>>
>>>
>>> This decision to use encoder id instead of CRTC has been there 
>>> downstream for quite sometime. So most of the delay in reviewing this 
>>> series was trying to understand why we had this in the first place 
>>> and who knew that story.
>>>
>>> You are right that from the perspective of clone mode mapping 
>>> resources to CRTC is correct but using encoder id is what has been 
>>> working so far without too much difficulty with a little more 
>>> management  But another use-case can get simplified with this.
>>
>> Thank you for the historical perspective and for the feedback.
>>
>> I think that keeping resource allocation in dpu_encoder was required 
>> when INTF/WB themselves were allocated through the RM. However as 
>> INTF/WB blocks are now allocated in a static way, it doesn't make so 
>> much sense anymore.
>>
> 
> No, whether intf/wb block themselves are allocated through RM or not did 
> not really go into this. It was just about considering where all 
> hardware blocks make sense to be mapped : crtc or encoder. At the end, 
> considering the dsc, cwb and some more blocks encoder id was used.

Ack, thanks for the historical perspective. Didn't know that.

> 
>>>
>>> There is another angle to this. There are hardware blocks which can 
>>> do writeback and the physical display concurrently. We call it 
>>> concurrent writeback or CWB. This is present even on some of the 
>>> chipsets already supported upstream.
>>>
>>> Now, lets say we start a concurrent writeback session , in todays 
>>> code we will allocate the resources with the encoder id of the 
>>> writeback's encoder and the other physical display's encoder.
>>>
>>> When the session stops, we can just deallocate the resources of the 
>>> writeback encoder without touching the other encoder. So it will 
>>> become easier to just free up the resources mapped to the encoder.
>>
>> I have not looked into CWB programming. However from your description 
>> it would be as easy to do a full reallocation of the pipeline as just 
>> dropping the CWB/extra encoder. In fact this is what the driver was 
>> doing: in case of a modeset, drop all old resources and allocate full 
>> set of resources.
>>
> 
> Correct and the reason it was able to seamlessly drop all the old 
> resources was because of the encoder_id mapping, for the cwb use-case 
> using crtc id will not be so seamless to release the resources.

Can you please tell, why? At all the times we can drop all resources and 
then reacquire them. In the worst case it results in wasted time, but 
there should no be troubles doing this acquisition.

Also see below.	

>>> With clone mode implemented with CRTC id to map resources, we will 
>>> have to probably do some extra book-keeping to handle concurrent 
>>> writeback.
>>
>> Probably not much. We'd have to describe the topology/requirements and 
>> then pass that to RM. I have been waiting for this patchset (and up to 
>> some point the DSC/ctl) to be reviewed before finalizing/submitting 
>> the patches that rework the CTL interface to use this kind of data 
>> structure.
>>
> 
> There is some effort there from what I can see in the cwb case. I am 
> unable to visualize how your rework will help that case. If you want to 
> move this mapping to crtc id to that series to convince us how, then it 
> is a better fit for that series.

-ENOTFINISHED. Anyway, I think it is separate from the topology changes too.

> 
>>> Thats the use-case which gets impacted with this but for others, 
>>> there shouldnt be a major impact from what we see.
>>>
>>> That being said, what benefit are you seeing from making that change 
>>> now for this series? Why is it specifically needed for virtual planes?
>>>
>>> I see in the commit text that you have mentioned this is in 
>>> preparation for next step and next step talks about clone mode. But 
>>> clone mode is not there yet. So why this change now?
>>
>> There were several items that triggered this review.
>>
>> First thing first. Current design allocates resources from 
>> dpu_encoder_virt_atomic_check(), then in 
>> dpu_encoder_virt_atomic_mode_set() the driver has to poke manually in 
>> the cstate and fill CTL and LM. This kept on bugging me for some time. 
>> The encoder should not be poking into the CRTC state.
>>
> 
> Interesting point, but the DRM documentation seems to allow that and I 
> think now thats one of the positives to have things in encoder's atomic 
> check.
> 
> 803      * This callback is used to validate encoder state for atomic 
> drivers.
> 804      * Since the encoder is the object connecting the CRTC and 
> connector it
> 805      * gets passed both states, to be able to validate interactions and
> 806      * update the CRTC to match what the encoder needs for the 
> requested
> 807      * connector.
> 808      *
> 
> Encoder is the place where we have both the crtc and the connector state 
> being passed down to. the crtc's atomic check doesnt have the states of 
> encoder. So it just seems the encoder's atomic check is more centralized 
> for the entire pipeline.

First. Once [1] lands, the driver will not use connector state.

Regarding the encoder vs crtc state. Currently the CRTC's atomic_check() 
code can not influence resource allocation. Encoder's atomic_check() 
happens earlier. This results in code like msm_atomic_check().

If display resources are allocated from CRTC's atomic_check(), the 
mentioned function can go away by moving this check to 
dpu_crtc_atomic_check().

Last, but not least, let me point our the text you have quoted: "... 
update the CRTC to match what the encoder needs for the requested 
connector.". In our case the driver doesn't update the CRTC state 
according to the needs of the requested connector. Instead, it updates 
the CRTC state for what is needed for the CRTC. It is the CRTC itself 
who needs one or two LMs, not the connector.

[1] https://patchwork.freedesktop.org/series/117979/

> 
>> Then came the virtual planes. I think you'd agree that for the virtual 
>> planes we assign SSPPs to the CRTCs. My initial design kept enc_id for 
>> all the resources except the SSPP blocks (which were allocated per 
>> crtc_id). This was not perfect from the symmetry point of view.
>>
> 
> Yes for SSPPs, since they are connected to LMs and LM is mapped to CRTC 
> yes its right that it will be better of to map to CRTC , only if we 
> think about these two blocks in isolation. But if I would think that if 
> we want to validate the pipeline encoder is more central.

I don't agree here. Especially if we have cloned output (=CWB?) support.
There will be two encoders/bridge-chains/connectors being driven by a 
single CRTC. CRTC is the spider in the centre of the web with the 
threads going around to the connectors.

And this is pretty much supported by the fact that the encoder doesn't 
have its own atomic_state. IIRC, quite frequently encoder is just a shim 
between the CRTC and bridge-chain/connector.

>> Above all, filling the cstate in mode_set was too late for 
>> atomic_check to look into allocated LM to see if it supports 
>> MIXER_SOURCESPLIT or not. See dpu_plane_atomic_check().
>>
>> I started by moving the cstate filling from the 
>> dpu_encoder_virt_atomic_mode_set() to dpu_encoder_virt_atomic_check(). 
>> And then it just became natural to turn it to be CRTC-centric code. 
>> The encoder doesn't have to peek into CRTC state. CRTC/plane do not 
>> have to delay the checks becasuse the necessary data is provided by 
>> the other party at a later point.
>>
> 
> I agree that moving from mode_set() to atomic_check() for the cstate had 
> to be done. But like I wrote encoder being passed a crtc state is 
> exactly for this purpose as its central to crtc and connector.
> 
>>> Resource allocation is centered around LMs for the blocks we have 
>>> seen so far like LM, SSPP  but  ....
>>
>> And LM is a CRTC.
>>
>>> DSC is already an exception because in earlier chipsets we saw that 
>>> PP and DSC go together and cannot be muxed but later on that changed.
>>>
>>> Now, I have explained this case but I am not getting why we need to 
>>> make the change for this series. Is it absolutely necessary?
>>
>> Strictly, we don't. I can work around all the deficiencies of the 
>> current code. But it is more natural to do it now rather than later.
>>
> 
> Overall, like I wrote in my last response, I am not against the idea but 
> from your reasoning so far and from the responses i have given above, I 
> do not think that this series demands this change necessarily. I think 
> without this change, the virtual plane series itself is pretty small to 
> review and will be an easier task to land that first. We should deal 
> with it on a need basis rather than just deciding to use virtual planes 
> to make this design change.

To save time on arguing, I'll take a look at rebasing patches. Please 
continue review with patches 1,2,9-13.

> 
>>>
>>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov at linaro.org>
>>>> ---
>>>>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c |  16 +--
>>>>   drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h     |  10 +-
>>>>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c      | 119 
>>>> ++++++++++----------
>>>>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h      |  15 ++-
>>>>   4 files changed, 77 insertions(+), 83 deletions(-)
>>
>>
>> [trimmed the patch contents, it is not relevant to this discussion]
>>

-- 
With best wishes
Dmitry