[PATCH 0/2] drm/amdgpu/display: Make multi-plane configurations more flexible

Leo Li sunpeng.li at amd.com
Wed Apr 17 18:51:59 UTC 2024




On 2024-04-16 10:10, Harry Wentland wrote:
> 
> 
> On 2024-04-16 04:01, Pekka Paalanen wrote:
>> On Mon, 15 Apr 2024 18:33:39 -0400
>> Leo Li <sunpeng.li at amd.com> wrote:
>>
>>> On 2024-04-15 04:19, Pekka Paalanen wrote:
>>>> On Fri, 12 Apr 2024 16:14:28 -0400
>>>> Leo Li <sunpeng.li at amd.com> wrote:
>>>>    
>>>>> On 2024-04-12 11:31, Alex Deucher wrote:
>>>>>> On Fri, Apr 12, 2024 at 11:08 AM Pekka Paalanen
>>>>>> <pekka.paalanen at collabora.com> wrote:
>>>>>>>
>>>>>>> On Fri, 12 Apr 2024 10:28:52 -0400
>>>>>>> Leo Li <sunpeng.li at amd.com> wrote:
>>>>>>>      
>>>>>>>> On 2024-04-12 04:03, Pekka Paalanen wrote:
>>>>>>>>> On Thu, 11 Apr 2024 16:33:57 -0400
>>>>>>>>> Leo Li <sunpeng.li at amd.com> wrote:
>>>>>>>>>      
>>>>>>>
>>>>>>> ...
>>>>>>>      
>>>>>>>>>> That begs the question of what can be nailed down and what can be left to
>>>>>>>>>> independent implementation. I guess things like which plane should be enabled
>>>>>>>>>> first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed)
>>>>>>>>>> can be defined. How to handle atomic test failures could be as well.
>>>>>>>>>
>>>>>>>>> What room is there for the interpretation of zpos values?
>>>>>>>>>
>>>>>>>>> I thought they are unambiguous already: only the relative numerical
>>>>>>>>> order matters, and that uniquely defines the KMS plane ordering.
>>>>>>>>
>>>>>>>> The zpos value of the PRIMARY plane relative to OVERLAYS, for example, as a way
>>>>>>>> for vendors to communicate overlay, underlay, or mixed-arrangement support. I
>>>>>>>> don't think allowing OVERLAYs to be placed under the PRIMARY is currently
>>>>>>>> documented as a way to support underlay.
>>>>>>>
>>>>>>> I always thought it's obvious that the zpos numbers dictate the plane
>>>>>>> order without any other rules. After all, we have the universal planes
>>>>>>> concept, where the plane type is only informational to aid heuristics
>>>>>>> rather than defining anything.
>>>>>>>
>>>>>>> Only if the zpos property does not exist, the plane types would come
>>>>>>> into play.
>>>>>>>
>>>>>>> Of course, if there actually exists userspace that fails if zpos allows
>>>>>>> an overlay type plane to be placed below primary, or fails if primary
>>>>>>> zpos is not zero, then DRM needs a new client cap.
>>>>>
>>>>> Right, it wasn't immediately clear to me that the API allowed placement of
>>>>> things beneath the PRIMARY. But reading the docs for drm_plane_create_zpos*,
>>>>> there's nothing that forbids it.
>>>>>   
>>>>>>>      
>>>>>>>> libliftoff for example, assumes that the PRIMARY has the lowest zpos. So
>>>>>>>> underlay arrangements will use an OVERLAY for the scanout plane, and the PRIMARY
>>>>>>>> for the underlay view.
>>>>>>>
>>>>>>> That's totally ok. It works, right? Plane type does not matter if the
>>>>>>> KMS driver accepts the configuration.
>>>>>>>
>>>>>>> What is a "scanout plane"? Aren't all KMS planes by definition scanout
>>>>>>> planes?
>>>>>
>>>>> Pardon my terminology, I thought the scanout plane was where weston rendered
>>>>> non-offloadable surfaces to. I guess it's more correct to call it the "render
>>>>> plane". On weston, it seems to be always assigned to the PRIMARY.
>>>>>   
>>>>
>>>> The assignment restriction is just technical design debt. It is
>>>> limiting. There is no other good reason for it, than when lighting
>>>> up a CRTC for the first time, Weston should do it with the renderer FB
>>>> only, on the plane that is most likely to succeed i.e. PRIMARY. After
>>>> the CRTC is lit, there should be no built-in limitations in what can go
>>>> where.
>>>>
>>>> The reason for this is that if a CRTC can be activated, it must always
>>>> be able to show the renderer FB without incurring a modeset. This is
>>>> important for ensuring that the fallback compositing (renderer) is
>>>> always possible. So we start with that configuration, and everything
>>>> else is optional bonus.
>>>
>>> Genuinely curious - What exactly is limiting with keeping the renderer FB on
>>> PRIMARY? IOW, what is the additional benefit of placing the renderer FB on
>>> something other than PRIMARY?
>>
>> The limitations come from a combination of hardware limitations.
>> Perhaps zpos is not mutable, or maybe other planes cannot arbitrarily
>> move between above and below the primary. This reduces the number of
>> possible configurations, which might cause off-loading to fail.
>>
>> I think older hardware has more of these arbitrary restrictions.

I see. I was thinking that drivers can do under-the-hood stuff to present a
mutable zpos to clients, even if their hardware planes cannot be arbitrarily
rearranged, by mapping the PRIMARY to a different hardware plane. But not all
planes have the same function, so this sounds more complicated than helpful.

>>
>>>>>
>>>>> For libliftoff, using OVERLAYs as the render plane and PRIMARY as the underlay
>>>>> plane would work. But I think keeping the render plane on PRIMARY (a la weston)
>>>>> makes underlay arrangements easier to allocate, and would be nice to incorporate
>>>>> into a shared algorithm.
>>>>
>>>> If zpos exists, I don't think such limitation is a good idea. It will
>>>> just limit the possible configurations for no reason.
>>>>
>>>> With zpos, the KMS plane type should be irrelevant for their
>>>> z-ordering. Underlay vs. overlay completely loses its meaning at the
>>>> KMS level.
>>>
>>> Right, the plane types lose their meanings. But at least with the way
>>> libliftoff builds the plane arrangement, where we first allocate the renderer fb
>>> matters.
>>>
>>> libliftoff incrementally builds the atomic state by adding a single plane to the
>>> atomic state, then testing it. It essentially does a depth-first-search of all
>>> possible arrangements, pruning the search on atomic test fail. The state that
>>> offloads the largest number of FBs will be the arrangement used.
>>>
>>> Of course, it's unlikely that the entire DFS tree will be traversed in time for a
>>> frame. So the key is to search the most probable and high-benefit branches
>>> first, while minimizing the # of atomic tests needed, before a hard-coded
>>> deadline is hit.
>>>
>>> Following this algorithm, the PRIMARY needs to be enabled first, followed by all
>>> the secondary planes. After a plane is enabled, it's not preferred to change
>>> its assigned FB, since that can cause the state to be rejected (in actuality,
>>> not just the FB, but also any color and transformation stuffs associated with
>>> the surface). It is preferable to build on the state by enabling another
>>> fb->plane. This is where changing a plane's zpos to be above/below the PRIMARY
>>> is advantageous, rather than changing the FBs assigned, to accommodate
>>> overlay/underlay arrangements.
>>
>> This all sounds reasonable, but why limit this to only the renderer FB
>> on primary plane? The same idea should apply equally to any FB on any
>> plane. Then one needs more heuristics on when to stop the search short,
>> and when to reconsider each FB-plane assignment in case new candidates
>> have appeared but the old ones have not disappeared.

libliftoff starts the search by assigning the renderer FB, if one is provided by
the compositor, to PRIMARY. I think the reason is to always have the renderer
option available for FBs that need it. Eventually, if the search tree is
traversed enough, an arrangement that does not need the renderer fb may be
found, if all the FBs can be assigned, and there are enough planes for them. But
we may not get there before the deadline.

Perhaps having more time to search is the solution here.

(p.s. if a candidate FB is added or removed, libliftoff starts the search anew)
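
A rough Python sketch of the incremental search described above, under stated
assumptions. It is not libliftoff's code: `allocate_planes`, `atomic_test`,
`max_tests` and `initial` are hypothetical names, the frame deadline is modelled
as a budget of atomic tests, and `atomic_test` stands in for a TEST_ONLY atomic
commit. The renderer FB is pinned to PRIMARY through `initial` before the search
starts.

```python
def allocate_planes(planes, fbs, atomic_test, max_tests, initial=None):
    """Depth-first search for the state that offloads the most FBs.

    planes:      plane names, in the order they should be enabled
    fbs:         candidate FB names
    atomic_test: hypothetical callback, dict(plane -> fb) -> bool
    max_tests:   stand-in for the deadline, counted in atomic tests
    initial:     assignments fixed up front, e.g. renderer FB on PRIMARY
    """
    best = {}
    tests_left = max_tests

    def search(state, plane_idx, remaining):
        nonlocal best, tests_left
        # Remember the largest accepted state seen so far.
        if len(state) > len(best):
            best = dict(state)
        if plane_idx == len(planes):
            return
        plane = planes[plane_idx]
        for fb in remaining:
            if tests_left == 0:
                return  # deadline hit, keep whatever we have
            tests_left -= 1
            state[plane] = fb
            if atomic_test(state):
                # Build on the accepted state instead of changing it.
                rest = [f for f in remaining if f != fb]
                search(state, plane_idx + 1, rest)
            # On failure the whole branch is pruned.
            del state[plane]
        # Also consider leaving this plane disabled.
        search(state, plane_idx + 1, remaining)

    search(dict(initial or {}), 0, list(fbs))
    return best
```

With a small budget the function returns early with a partial arrangement,
which is the "we may not get there before the deadline" case.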

>>
>>> I imagine that any algorithm which incrementally builds up the plane arrangement
>>> will have a similar preference. Of course, it's entirely possible that such an
>>> algorithm isn't the best, I admittedly have not thought much about other
>>> possibilities, yet...
>>
>> It's a complicated problem, indeed. Maybe there needs to be a background
>> task that is not limited by the page flip deadline and can do an
>> exhaustive search over many refresh periods.
>>
> 
> That would be nice. Kick this off when there is a configuration change,
> e.g., user starts video playback, opens a new video, etc.
> 
> One would need to avoid doing too much of that, though, as one could
> envision scenarios where this happens frequently and could have its
> own impact on power by keeping the CPU busy.
> 
> Harry

I recall emersion had a similar suggestion for libliftoff by caching the
incomplete plane arrangement for further processing on future frames once the
deadline is reached. It avoids the need for a separate task.
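
One way to picture the caching idea, as a Python sketch only. This is an
assumption about how such a cache could look, not how libliftoff does or would
implement it; `search_steps` and `CachedSearch` are made-up names, and
`atomic_test` again stands in for a TEST_ONLY commit. The search is written as
a generator that pauses after every atomic test, so the partially explored tree
survives between frames.

```python
def search_steps(planes, fbs, atomic_test, state=None, idx=0):
    """Resumable DFS: yields once per atomic test.

    Yields the accepted state, or None if the test was rejected.
    """
    state = {} if state is None else state
    if idx == len(planes):
        return
    for fb in fbs:
        candidate = {**state, planes[idx]: fb}
        ok = atomic_test(candidate)
        yield candidate if ok else None
        if ok:
            rest = [f for f in fbs if f != fb]
            yield from search_steps(planes, rest, atomic_test, candidate, idx + 1)
    # Also explore leaving this plane disabled.
    yield from search_steps(planes, fbs, atomic_test, state, idx + 1)


class CachedSearch:
    """Keeps the paused search and the best state across frames."""

    def __init__(self, planes, fbs, atomic_test):
        self._steps = search_steps(planes, fbs, atomic_test)
        self.best = {}
        self.done = False

    def run(self, budget):
        """Spend at most `budget` atomic tests, then return best so far."""
        for _ in range(budget):
            try:
                result = next(self._steps)
            except StopIteration:
                self.done = True
                break
            if result is not None and len(result) > len(self.best):
                self.best = result
        return self.best
```

Each frame would call `run()` with whatever budget it can afford. If a
candidate FB is added or removed, the cached object has to be thrown away and
a new one created, matching the restart behaviour noted earlier.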

Having more time to do a more exhaustive search would make zpos meaningless
outside of determining the correct z-ordering, as pq previously mentioned. It
would support hardware that has zpos limitations. It is more complex, but maybe
that's fine, as long as the complexity doesn't bleed into other parts of the
compositor.

There are still ways to limit the # of atomic tests needed for the search, which
will help speed things up (already considered by libliftoff today):

* IN_FORMATS property for what FB formats a plane supports
* zpos property for correct z-ordering
* Occlusion rules. An FB occluded by a rendered FB or underlay-ed FB cannot be
  overlay-ed, for example
* And potentially more
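
The checks listed above could be sketched as a cheap filter that runs before
an atomic test is spent. This is illustrative Python, not libliftoff code:
`Plane`, `Layer`, `worth_testing` and `render_zpos` are hypothetical names,
formats are plain strings, and zpos / z values are assumed to be distinct.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plane:
    name: str
    formats: frozenset  # formats advertised through IN_FORMATS
    zpos: int


@dataclass(frozen=True)
class Layer:
    name: str
    fmt: str
    z: int        # stacking order wanted by the compositor
    rect: tuple   # (x1, y1, x2, y2)


def overlaps(a, b):
    """True if two (x1, y1, x2, y2) rectangles intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def worth_testing(plane, layer, assigned, rendered, render_zpos):
    """Return False when an atomic test is known to be pointless.

    assigned:    list of (Plane, Layer) pairs already in the state
    rendered:    layers that fall back to the renderer FB
    render_zpos: zpos of the plane showing the renderer FB
    """
    # Format: the plane must support the FB format.
    if layer.fmt not in plane.formats:
        return False
    # zpos: plane order must match layer order for every assigned pair.
    for p, l in assigned:
        if (plane.zpos > p.zpos) != (layer.z > l.z):
            return False
    # Occlusion: a layer covered by a rendered layer cannot be overlay-ed.
    if plane.zpos > render_zpos:
        for r in rendered:
            if r.z > layer.z and overlaps(r.rect, layer.rect):
                return False
    return True
```

In this sketch the underlay half of the occlusion rule falls out of the zpos
check: a layer below an underlay-ed layer can only go on a plane with an even
lower zpos.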

Thanks,
Leo

> 
>>
>> Thanks,
>> pq
> 

