[PATCH v2] drm/amd/display: Fix two cursor duplication when using overlay
Harry Wentland
harry.wentland at amd.com
Tue Aug 24 16:48:53 UTC 2021
On 2021-08-24 10:56 a.m., Kazlauskas, Nicholas wrote:
> On 2021-08-24 9:59 a.m., Simon Ser wrote:
>> Hi Rodrigo!
>>
>> Thanks a lot for your reply! Comments below, please bear with me: I'm
>> a bit familiar with the cursor issues, but my knowledge of AMD hw is
>> still severely lacking.
>>
>> On Wednesday, August 18th, 2021 at 15:18, Rodrigo Siqueira <Rodrigo.Siqueira at amd.com> wrote:
>>
>>> On 08/18, Simon Ser wrote:
>>>> Hm. This patch causes a regression for me. I was using primary + overlay
>>>> not covering the whole primary plane + cursor before. This patch breaks it.
>>>
>>> Which branch are you using? Recently, I reverted part of that patch,
>>> see:
>>>
>>> Revert "drm/amd/display: Fix overlay validation by considering cursors"
>>
>> Right. This revert actually makes things worse. Prior to the revert the
>> overlay could be enabled without the cursor. With the revert the overlay
>> cannot be enabled at all, even if the cursor is disabled.
>>
>>>> This patch makes the overlay plane very useless for me, because the primary
>>>> plane is always under the overlay plane.
>>>
>>> I'm curious about your use case with overlay planes. Could you help me
>>> to understand it better? If possible, describe:
>>>
>>> 1. Context and scenario
>>> 2. Compositor
>>> 3. Kernel version
>>> 4. Which IGT test, if you know of one, describes your test?
>>>
>>> I'm investigating overlay issues in our driver, and a userspace
>>> perspective might help me.
>>
>> I'm working on gamescope [1], Valve's gaming compositor. Our use-cases include
>> displaying (from bottom to top) a game in the background, a notification popup
>> over it in the overlay plane, and a cursor in the cursor plane. All of the
>> planes might be rotated. The game's buffer might be scaled and might not cover
>> the whole CRTC.
>>
>> libliftoff [2] is used to provide vendor-agnostic KMS plane offload. In other
>> words, I'd prefer to avoid relying too much on hardware specific details, e.g.
>> I'd prefer to avoid hole-punching via an underlay (it might work on AMD hw, but
>> will fail on many other drivers).
>
> Hi Simon,
>
> Siqueira explained a bit below, but the problem is that we don't have dedicated cursor planes in hardware.
>
> It's easiest to understand the hardware cursor as being constrained within the DRM plane specifications. Each DRM plane maps to 1 (or 2) hardware pipes and the cursor has to be drawn along with it. The cursor will inherit the scale, bounds, and color management associated with the underlying pipes.
>

To elaborate on this a bit more, each HW plane's scanout engine
has the ability to scan out a cursor, in addition to the plane's
framebuffer. This cursor is drawn onto the plane at the scanout
phase. Any further scaling, color processing, or other operation
on the pipe will equally apply to the cursor as to the framebuffer
itself.

Our driver will look at the cursor position and place the cursor
with the topmost HW plane at that position.

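
A minimal sketch of the placement rule just described, written in Python for brevity rather than as driver code. The `Plane` type and `plane_for_cursor` helper are hypothetical names, not amdgpu identifiers, and the sketch assumes the cursor can be treated as a single point instead of a rectangle:

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Plane:
    """Hypothetical stand-in for a HW plane: a CRTC-space rectangle plus z-order."""
    name: str
    x: int
    y: int
    w: int
    h: int
    zpos: int  # higher values are closer to the viewer

    def contains(self, px: int, py: int) -> bool:
        # Right and bottom edges are exclusive.
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


def plane_for_cursor(planes: Sequence[Plane], px: int, py: int) -> Optional[Plane]:
    """Return the topmost plane under (px, py), or None if no plane covers it."""
    hits = [p for p in planes if p.contains(px, py)]
    return max(hits, key=lambda p: p.zpos, default=None)
```

With a full-screen primary plane and a smaller overlay above it, the helper picks the overlay while the pointer is inside the overlay's bounds and the primary plane elsewhere. A `None` result corresponds to the case where no pipe is available to carry the cursor at that position.
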
> From the kernel display driver perspective that makes things quite difficult with the existing DRM API - we can only really guarantee you get HW cursor when the framebuffer covers the entire screen and it is unscaled or matches the scaling expected by the user.
>
> Hole punching generally satisfies both of these since it's a transparent framebuffer that covers the entire screen.
>
> The case that's slightly more complicated is when the overlay doesn't cover the entire screen but the primary plane does. We can still enable the cursor if the primary plane and overlay have a matching scale and color management - our display hardware can draw the cursor on multiple pipes. (Note: this statement only applies for DCN2.1+)
>
> If the overlay plane does not cover the entire screen and the scale or the color management differs then we cannot enable the HW cursor plane. As you mouse over the bounds of the overlay you will see the cursor drawn differently on the primary and overlay pipe.
>
> If the overlay plane and primary plane do not cover the entire screen then you will lose HW cursor outside of the union of their bounds.
>
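
The cases Nicholas lists above can be summarised as a small decision function. This is only a sketch in Python under stated assumptions, not the validation the driver performs: `PlaneState`, `CursorSupport`, and `classify_hw_cursor` are hypothetical names, scale is reduced to a single number, color management to an opaque label, and the DCN2.1+ multi-pipe cursor capability to one boolean:

```python
from dataclasses import dataclass
from enum import Enum


class CursorSupport(Enum):
    FULL = "full"                # HW cursor usable across the whole screen
    PARTIAL = "partial"          # HW cursor lost outside the planes' bounds
    UNSUPPORTED = "unsupported"  # cursor would look different per pipe


@dataclass(frozen=True)
class PlaneState:
    covers_screen: bool
    scale: float     # 1.0 means unscaled (simplification: one factor per plane)
    color_mgmt: str  # opaque label for the plane's color pipeline setup


def classify_hw_cursor(primary: PlaneState, overlay: PlaneState,
                       multi_pipe_cursor: bool) -> CursorSupport:
    """Classify a primary + overlay configuration, following the cases above."""
    # A full-screen overlay (e.g. hole punching) always carries the cursor.
    # Assumption: its scale is the one the user expects.
    if overlay.covers_screen:
        return CursorSupport.FULL

    # Otherwise the cursor crosses between pipes, so both must render it
    # identically and the hardware must draw it on multiple pipes.
    same_look = (overlay.scale == primary.scale
                 and overlay.color_mgmt == primary.color_mgmt)
    if not (same_look and multi_pipe_cursor):
        return CursorSupport.UNSUPPORTED

    # Matching pipes: the cursor is only visible where some plane exists.
    return CursorSupport.FULL if primary.covers_screen else CursorSupport.PARTIAL
```

Under these assumptions, a full-screen primary plane with a smaller overlay at a different scale is classified as unsupported, which is the configuration discussed next.
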
> Correct me if I'm wrong, but I think your usecase [1] falls under the category where:
> 1. Primary plane covers entire screen
> 2. Overlay plane does not cover the entire screen
> 3. Overlay plane is scaled
>

If I understood Simon right the primary plane (bottom-most,
game plane) might not cover the entire screen, which is fine.

Is the Steam overlay always the size of the crtc, or does it
match the size of the game plane, or is it unrelated to either?

If the overlay is covering the entire screen and always active
you should be good. If the overlay appears and disappears the
cursor drawing would switch between the overlay and the game
plane.

> This isn't a supported configuration because HW cursor cannot be drawn in the same position on both pipes.
>
> I think you can see a similar usecase to [1] on Windows, but the difference is that the cursor is drawn on the "primary plane" instead of on top of the primary and overlay. I don't remember if DRM has a requirement that the cursor plane must be topmost, but we can't enable [1] as long as it is.
>
> I don't know if you have more usecases in mind than [1], but just as some general recommendations I think you should only really use overlays when they fall under one of two categories:
>
> 1. You want to save power:
>
> You will burn additional power for the overlay pipe.
>
> But you will save power in use cases like video playback - where the decoder produces the framebuffer and we can avoid a shader composited copy with its associated GFX engine overhead and memory traffic.
>
> 2. You want more performance:
>
> You will burn additional power for the overlay pipe.
>
> On bandwidth constrained systems you can save significant memory bandwidth by avoiding the shader composition by allowing for direct scanout of game or other application buffers.
>
> Your usecase [1] falls under this category, but as an aside I discourage trying to design usecases where the compositor requires the overlay for functional purposes.
>
> Best regards,
> Nicholas Kazlauskas
>
>>
>> I'm usually using the latest kernel (at the time of writing, v5.13.10), but I
>> often test with drm-tip or agd5f's amd-staging-drm-next, especially when
>> working on amdgpu patches.
>>
>> My primary hardware of interest is RDNA 2 based (the upcoming Steam Deck), but
>> of course it's better if gamescope can run on a wide range of hardware.
>>
>> I don't know if there's an IGT covering my use-case.
>>
>> [1]: https://github.com/Plagman/gamescope
>> [2]: https://github.com/emersion/libliftoff
>>
>>>>> Basically, we cannot draw the cursor at the same size and position on
>>>>> two separate pipes since it uses extra bandwidth and DML only runs
>>>>> with one cursor.
>>>>
>>>> I don't understand this limitation. Why would it be necessary to draw the
>>>> cursor on two separate pipes? Isn't it only necessary to draw it once on
>>>> the overlay pipe, and not draw it on the primary pipe?
>>>
>>> I will try to provide some background. Harry and Nick, feel free to
>>> correct me or add extra information.
>>>
>>> In the amdgpu driver and from the DRM perspective, we expose cursors as
>>> a plane, but we don't have a real plane dedicated to cursors from the
>>> hardware perspective. We have part of our HUBP handling cursors (see
>>> commit "drm/amd/display: Add DCN3.1 DCHHUB" for a hardware block
>>> overview), which requires a different way to deal with the cursor plane
>>> since they are not planes in the hardware.
>>
>> What are DCHUBBUB and MMHUBBUB responsible for? Is one handling the primary
>> plane and the other handling the overlay plane? Or something else entirely?
>>

MMHUBBUB > DCHUBBUB > HUBP (for each pipe)

MMHUBBUB is irrelevant if DWB (display writeback) is not used. DWB is not
enabled in the driver.

DCHUBBUB is the overall scanout engine for all DC pipes and includes a
HUBP per pipe.

HUBP will have requestors for the primary framebuffer, DCC meta, dynamic
metadata (for things like Dolby HDR, though it's not fully implemented),
and cursor data.

Harry

>>> As a result, we have some
>>> limitations, such as not supporting multiple cursors with overlay; to
>>> support this, we need to deal with these aspects:
>>
>> Hm, but I don't want to draw multiple cursors. I want to draw a single
>> cursor. If all planes are enabled, can't we paint the cursor only on the
>> overlay plane and not paint the cursor on the primary plane?
>>
>> Or maybe it's impossible to draw the cursor on the overlay plane outside
>> of the overlay plane bounds?
>>
>> I'm also confused by the commit message in "drm/amd/display: Fix two cursor
>> duplication when using overlay", because an overlay which doesn't cover the
>> whole CRTC used to work perfectly fine, even with the cursor plane enabled.
>>
>>> 1. We need to make multiple cursors match in the same position and size.
>>> Again, keep in mind that cursors are handled in the HUBP, which is the
>>> first part of our pipe, and it is not a plane.
>>>
>>> 2. Fwiu, our Display Mode Library (DML), has gaps with multiple cursor
>>> support, which can lead to bandwidth problems such as underflow. Part of
>>> these limitations came from DCN1.0; the new ASIC probably can support
>>> multiple cursors without issues.
>>>
>>> Additionally, we fully support a strategy named underlay, which inverts
>>> the logic around the overlay. The idea is to put the DE in the overlay
>>> plane covering the entire screen and the other fb in the primary plane
>>> behind the overlay (DE); this can be useful for video playback
>>> scenarios.
>>
>> Yeah, as I said above this requires knowing a lot about the target hardware,
>> which is a bit unfortunate. This requires hole-punching and significantly
>> changes the composition logic.
>>
>