[PATCH v2] drm/amd/display: Fix two cursor duplication when using overlay

Kazlauskas, Nicholas nicholas.kazlauskas at amd.com
Tue Aug 24 14:56:51 UTC 2021


On 2021-08-24 9:59 a.m., Simon Ser wrote:
> Hi Rodrigo!
> 
> Thanks a lot for your reply! Comments below, please bear with me: I'm
> a bit familiar with the cursor issues, but my knowledge of AMD hw is
> still severely lacking.
> 
> On Wednesday, August 18th, 2021 at 15:18, Rodrigo Siqueira <Rodrigo.Siqueira at amd.com> wrote:
> 
>> On 08/18, Simon Ser wrote:
>>> Hm. This patch causes a regression for me. I was using primary + overlay
>>> not covering the whole primary plane + cursor before. This patch breaks it.
>>
>> Which branch are you using? Recently, I reverted part of that patch,
>> see:
>>
>>    Revert "drm/amd/display: Fix overlay validation by considering cursors"
> 
> Right. This revert actually makes things worse. Prior to the revert the
> overlay could be enabled without the cursor. With the revert the overlay
> cannot be enabled at all, even if the cursor is disabled.
> 
>>> This patch makes the overlay plane very useless for me, because the primary
>>> plane is always under the overlay plane.
>>
>> I'm curious about your use case with overlay planes. Could you help me
>> to understand it better? If possible, describe:
>>
>> 1. Context and scenario
>> 2. Compositor
>> 3. Kernel version
>> 4. If you know, which IGT test describes your test?
>>
>> I'm investigating overlay issues in our driver, and a userspace
>> perspective might help me.
> 
> I'm working on gamescope [1], Valve's gaming compositor. Our use-cases include
> displaying (from bottom to top) a game in the background, a notification popup
> over it in the overlay plane, and a cursor in the cursor plane. All of the
> planes might be rotated. The game's buffer might be scaled and might not cover
> the whole CRTC.
> 
> libliftoff [2] is used to provide vendor-agnostic KMS plane offload. In other
> words, I'd prefer to avoid relying too much on hardware specific details, e.g.
> I'd prefer to avoid hole-punching via an underlay (it might work on AMD hw, but
> will fail on many other drivers).

Hi Simon,

Siqueira explained a bit below, but the problem is that we don't have 
dedicated cursor planes in hardware.

It's easiest to understand the hardware cursor as being constrained within 
the DRM plane specifications. Each DRM plane maps to 1 (or 2) hardware 
pipes and the cursor has to be drawn along with it. The cursor will 
inherit the scale, bounds, and color management associated with the 
underlying pipes.

From the kernel display driver perspective that makes things quite 
difficult with the existing DRM API - we can only really guarantee you 
get HW cursor when the framebuffer covers the entire screen and it is 
unscaled or matches the scaling expected by the user.

Hole punching generally satisfies both of these since it's a transparent 
framebuffer that covers the entire screen.

The case that's slightly more complicated is when the overlay doesn't 
cover the entire screen but the primary plane does. We can still enable 
the cursor if the primary plane and overlay have a matching scale and 
color management - our display hardware can draw the cursor on multiple 
pipes. (Note: this statement only applies for DCN2.1+)

If the overlay plane does not cover the entire screen and the scale or 
the color management differs then we cannot enable the HW cursor plane. 
If we did, as you moused over the bounds of the overlay you would see 
the cursor drawn differently on the primary and overlay pipes.

If the overlay plane and primary plane do not cover the entire screen 
then you will lose HW cursor outside of the union of their bounds.
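
As a rough illustration of the cases above, here is a minimal Python 
sketch. It is not the amdgpu validation code: the Plane type, its fields 
and hw_cursor_ok() are hypothetical names invented for this example, and 
the DCN2.1+ multi-pipe capability is reduced to a boolean flag. It only 
checks that the cursor would be drawn consistently everywhere; the cursor 
still inherits the scale of the plane it is drawn with.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plane:
    # Hypothetical fields; they only model what the text talks about.
    covers_screen: bool  # destination rectangle spans the whole CRTC
    scale: tuple         # (horizontal, vertical) scaling factors
    color_mgmt: str      # opaque label for the color management setup


def hw_cursor_ok(primary, overlay=None, multi_pipe_cursor=True):
    """Return True if the cursor would be drawn consistently on screen."""
    planes = [primary] if overlay is None else [primary, overlay]
    top = planes[-1]
    # A topmost plane covering the whole screen carries the cursor alone
    # (this includes the hole-punching layout).
    if top.covers_screen:
        return True
    # No plane covers the screen: the cursor is lost outside the planes.
    if not any(p.covers_screen for p in planes):
        return False
    # The cursor has to be drawn on two pipes: this needs hardware support
    # (DCN2.1+ according to the text) and matching scale and color
    # management on both planes.
    if not multi_pipe_cursor:
        return False
    return (primary.scale == overlay.scale
            and primary.color_mgmt == overlay.color_mgmt)
```

With a full-screen primary, a partial overlay at the same scale passes 
this check, while the same overlay at a different scale does not.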

Correct me if I'm wrong, but I think your usecase [1] falls under the 
category where:
1. Primary plane covers entire screen
2. Overlay plane does not cover the entire screen
3. Overlay plane is scaled

This isn't a supported configuration because HW cursor cannot be drawn in 
the same position on both pipes.
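
To make the mismatch concrete, here is a small Python sketch under a 
simplified model: it assumes the cursor is scaled by exactly the same 
factor as the plane it is drawn with. The function name and the plane 
sizes are made up for illustration and are not taken from the driver.

```python
def cursor_size_on_screen(cursor_px, src_size, dst_size):
    """Size of the cursor after it inherits the plane's scaling.

    Simplified model (an assumption): the scale factor is the plane's
    destination size divided by its source size, per axis.
    """
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    return (cursor_px[0] * sx, cursor_px[1] * sy)


# Unscaled full-screen primary plane: the 64x64 cursor stays 64x64.
on_primary = cursor_size_on_screen((64, 64), (1920, 1080), (1920, 1080))

# Overlay scaled by 1.5x: the same cursor becomes 96x96 on that pipe.
on_overlay = cursor_size_on_screen((64, 64), (400, 100), (600, 150))
```

In this model the cursor would change size as it crosses the overlay 
bounds, which is the visible glitch described above.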

I think you can see a similar usecase to [1] on Windows, but the 
difference is that the cursor is drawn on the "primary plane" instead of 
on top of the primary and overlay. I don't remember if DRM has a 
requirement that the cursor plane must be topmost, but we can't enable 
[1] as long as it is.

I don't know if you have more usecases in mind than [1], but just as 
some general recommendations I think you should only really use overlays 
when they fall under one of two categories:

1. You want to save power:

You will burn additional power for the overlay pipe.

But you will save power in use cases like video playback - where the 
decoder produces the framebuffer and we can avoid a shader composited 
copy with its associated GFX engine overhead and memory traffic.

2. You want more performance:

You will burn additional power for the overlay pipe.

On bandwidth-constrained systems you can save significant memory 
bandwidth by scanning out game or other application buffers directly 
instead of going through shader composition.

Your usecase [1] falls under this category, but as an aside I discourage 
trying to design usecases where the compositor requires the overlay for 
functional purposes.
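
For a sense of scale, here is a back-of-the-envelope Python estimate of 
the memory traffic a composited copy adds. It assumes one full read of 
the application buffer and one full write of the composited result per 
refresh, and it ignores compression, caches and partial updates, so 
treat the number as an order of magnitude only. The function name and 
the example mode are assumptions made for illustration.

```python
def copy_traffic_bytes_per_s(width, height, bytes_per_pixel, refresh_hz):
    """Extra memory traffic of a composited copy, in bytes per second."""
    frame_bytes = width * height * bytes_per_pixel
    # One read of the source buffer plus one write of the composited
    # output per refresh; direct scanout avoids both.
    return 2 * frame_bytes * refresh_hz


# 1920x1080, 4 bytes per pixel, 60 Hz: roughly 1 GB/s of avoidable traffic.
extra = copy_traffic_bytes_per_s(1920, 1080, 4, 60)
```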

Best regards,
Nicholas Kazlauskas

> 
> I'm usually using the latest kernel (at the time of writing, v5.13.10), but I
> often test with drm-tip or agd5f's amd-staging-drm-next, especially when
> working on amdgpu patches.
> 
> My primary hardware of interest is RDNA 2 based (the upcoming Steam Deck), but
> of course it's better if gamescope can run on a wide range of hardware.
> 
> I don't know if there's an IGT covering my use-case.
> 
> [1]: https://github.com/Plagman/gamescope
> [2]: https://github.com/emersion/libliftoff
> 
>>>> Basically, we cannot draw the cursor at the same size and position on
>>>> two separate pipes since it uses extra bandwidth and DML only runs
>>>> with one cursor.
>>>
>>> I don't understand this limitation. Why would it be necessary to draw the
>>> cursor on two separate pipes? Isn't it only necessary to draw it once on
>>> the overlay pipe, and not draw it on the primary pipe?
>>
>> I will try to provide some background. Harry and Nick, feel free to
>> correct me or add extra information.
>>
>> In the amdgpu driver and from the DRM perspective, we expose cursors as
>> a plane, but we don't have a real plane dedicated to cursors from the
>> hardware perspective. We have part of our HUBP handling cursors (see
>> commit "drm/amd/display: Add DCN3.1 DCHHUB" for a hardware block
>> overview), which requires a different way to deal with the cursor plane
>> since they are not planes in the hardware.
> 
> What are DCHUBBUB and MMHUBBUB responsible for? Is one handling the primary
> plane and the other handling the overlay plane? Or something else entirely?
> 
>> As a result, we have some
>> limitations, such as not supporting multiple cursors with overlay; to
>> support this, we need to deal with these aspects:
> 
> Hm, but I don't want to draw multiple cursors. I want to draw a single
> cursor. If all planes are enabled, can't we paint the cursor only on the
> overlay plane and not paint the cursor on the primary plane?
> 
> Or maybe it's impossible to draw the cursor on the overlay plane outside
> of the overlay plane bounds?
> 
> I'm also confused by the commit message in "drm/amd/display: Fix two cursor
> duplication when using overlay", because an overlay which doesn't cover the
> whole CRTC used to work perfectly fine, even with the cursor plane enabled.
> 
>> 1. We need to make multiple cursors match in position and size.
>> Again, keep in mind that cursors are handled in the HUBP, which is the
>> first part of our pipe, and it is not a plane.
>>
>> 2. Fwiu, our Display Mode Library (DML), has gaps with multiple cursor
>> support, which can lead to bandwidth problems such as underflow. Part of
>> these limitations came from DCN1.0; the new ASIC probably can support
>> multiple cursors without issues.
>>
>> Additionally, we fully support a strategy named underlay, which inverts
>> the logic around the overlay. The idea is to put the DE in the overlay
>> plane covering the entire screen and the other fb in the primary plane
>> behind the overlay (DE); this can be useful for playback video
>> scenarios.
> 
> Yeah, as I said above this requires knowing a lot about the target hardware,
> which is a bit unfortunate. This requires hole-punching and significantly
> changes the composition logic.
> 


