[PATCH] drm/amd/display: Fix overlay validation by considering cursors
Harry Wentland
harry.wentland at amd.com
Mon Jun 7 18:37:07 UTC 2021
On 2021-06-07 2:19 p.m., Sean Paul wrote:
> On Tue, May 18, 2021 at 2:58 PM Rodrigo Siqueira
> <Rodrigo.Siqueira at amd.com> wrote:
>>
>> On 05/14, Mark Yacoub wrote:
>>> On Fri, May 14, 2021 at 12:31 PM Mark Yacoub <markyacoub at google.com> wrote:
>>>>
>>>> On Fri, May 14, 2021 at 11:28 AM Harry Wentland <harry.wentland at amd.com> wrote:
>>>>>
>>>>> On 2021-05-14 7:47 a.m., Rodrigo Siqueira wrote:
>>>>>> A few weeks ago, we saw a two cursor issue in a ChromeOS system. We
>>>>>> fixed it in the commit:
>>>>>>
>>>>>> drm/amd/display: Fix two cursor duplication when using overlay
>>>>>> (read the commit message for more details)
>>>>>>
>>>>>> After this change, we noticed that some IGT subtests related to
>>>>>> kms_plane and kms_plane_scaling started to fail. After investigating
>>>>>> this issue, we noticed that all subtests that fail have a primary plane
>>>>>> covering the overlay plane, which is currently rejected by amdgpu dm.
>>>>>> Fail those IGT tests highlight that our verification was too broad and
>>>>>> compromises the overlay usage in our drive. This patch fixes this issue
>>> nit: s/drive/driver?
>>>>>> by ensuring that we only reject commits where the primary plane is not
>>>>>> fully covered by the overlay when the cursor hardware is enabled. With
>>>>>> this fix, all IGT tests start to pass again, which means our overlay
>>>>>> support works as expected.
>>>>>>
>>>>>> Cc: Tianci.Yin <tianci.yin at amd.com>
>>>>>> Cc: Harry Wentland <harry.wentland at amd.com>
>>>>>> Cc: Nicholas Choi <nicholas.choi at amd.com>
>>>>>> Cc: Bhawanpreet Lakha <bhawanpreet.lakha at amd.com>
>>>>>> Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas at amd.com>
>>>>>> Cc: Mark Yacoub <markyacoub at google.com>
>>>>>> Cc: Daniel Wheeler <daniel.wheeler at amd.com>
>>>>>>
>>>>>> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira at amd.com>
>>>>>> ---
>>>>>> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 +++++++++-
>>>>>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>>>>> index ccd67003b120..9c2537a17a7b 100644
>>>>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>>>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>>>>> @@ -10067,7 +10067,7 @@ static int validate_overlay(struct drm_atomic_state *state)
>>>>>> int i;
>>>>>> struct drm_plane *plane;
>>>>>> struct drm_plane_state *old_plane_state, *new_plane_state;
>>>>>> - struct drm_plane_state *primary_state, *overlay_state = NULL;
>>>>>> + struct drm_plane_state *primary_state, *cursor_state, *overlay_state = NULL;
>>>>>>
>>>>>> /* Check if primary plane is contained inside overlay */
>>>>>> for_each_oldnew_plane_in_state_reverse(state, plane, old_plane_state, new_plane_state, i) {
>>>>>> @@ -10097,6 +10097,14 @@ static int validate_overlay(struct drm_atomic_state *state)
>>>>>> if (!primary_state->crtc)
>>>>>> return 0;
>>>>>>
>>>>>> + /* check if cursor plane is enabled */
>>>>>> + cursor_state = drm_atomic_get_plane_state(state, overlay_state->crtc->cursor);
>>>>>> + if (IS_ERR(cursor_state))
>>>>>> + return PTR_ERR(cursor_state);
>>>>>> +
>>>>>> + if (drm_atomic_plane_disabling(plane->state, cursor_state))
>>>>>> + return 0;
>>>>>> +
>>>>>
>>>>> I thought this breaks Chrome's compositor since it can't handle an atomic commit rejection
>>>>> based solely on whether a cursor is enabled or not.
>>> For context: To use overlays (the old/current async way), Chrome tests
>>> if an overlay strategy will work by doing an atomic commit with a TEST
>>> flag to check if the combination would work. If it works, it flags
>>> this planes configuration as valid. As it's valid, it performs an
>>> atomic page flip (which could also include the cursor).
>>> So to Harry's point, you would pass an atomic test but fail on an
>>> atomic page flip with the exact same configuration that's been flagged
>>> as valid. Failing a pageflip causes Chrome to crash (Reset the GPU
>>> process cause something went horribly wrong when it shouldn't).
>>
>> Hi Mark and Sean,
>>
>> I don't know if I fully comprehended the scenario which in my patch
>> might cause a ChromeOS crash, but this is what I understood:
>>
>
> Chrome compositor only does an atomic test when the layout geometry
> changes (ie: plane is added/removed/resized/moved). This does not take
> into consideration the cursor. Once the atomic test has been validated
> for a given layout geometry (set of planes/fbs along with their sizes
> and locations), Chrome assumes this will continue to be valid.
>
> So the situation I'm envisioning is if the cursor is hidden, an
> overlay-based strategy will pass in the kernel. If at any time the
> cursor becomes visible, the kernel will start failing commits which
> causes Chrome's GPU process to crash.
>
> In general I'm uneasy with using the cursor in the atomic check since
> it feels like it could be racy independent of the issue above. I
> haven't dove into the helper code enough to prove it, just a
> spidey-sense.
>
Isn't the cursor supposed to be just another plane? If so, shouldn't
adding/removing the cursor trigger an atomic test?
Is Chrome's compositor doing cursor update through legacy IOCTLs or
through the ATOMIC IOCTL?
Thanks,
Harry
> Sean
>
>> There is a chance that we pass the atomic check
>> (DRM_MODE_ATOMIC_TEST_ONLY - TEST) but fails in the atomic page flip
>> because, after use TEST, the compositor might slightly change the commit
>> config by adding the cursor. The reason behind that came from the
>> assumption that adds the cursor in the atomic commit config after the
>> test is harmless. Is my understand correct? Please, correct me if I'm
>> wrong.
>>
>> If the above reasoning is correct, I think the compositor should not
>> assume anything extra from what it got from the atomic check.
>>
>> Best Regards,
>> Siqueira
>>
>>>>>
>>>>> Harry
>>>>>
>>>>>> /* Perform the bounds check to ensure the overlay plane covers the primary */
>>>>>> if (primary_state->crtc_x < overlay_state->crtc_x ||
>>>>>> primary_state->crtc_y < overlay_state->crtc_y ||
>>>>>>
>>>>>
>>
>> --
>> Rodrigo Siqueira
>> https://siqueira.tech/>
More information about the amd-gfx
mailing list