[PATCH] drm/amd/display: Fix overlay validation by considering cursors

Sean Paul seanpaul at google.com
Mon Jun 7 18:45:37 UTC 2021


On Mon, Jun 7, 2021 at 2:37 PM Harry Wentland <harry.wentland at amd.com>
wrote:

> On 2021-06-07 2:19 p.m., Sean Paul wrote:
> > On Tue, May 18, 2021 at 2:58 PM Rodrigo Siqueira
> > <Rodrigo.Siqueira at amd.com> wrote:
> >>
> >> On 05/14, Mark Yacoub wrote:
> >>> On Fri, May 14, 2021 at 12:31 PM Mark Yacoub <markyacoub at google.com>
> wrote:
> >>>>
> >>>> On Fri, May 14, 2021 at 11:28 AM Harry Wentland <
> harry.wentland at amd.com> wrote:
> >>>>>
> >>>>> On 2021-05-14 7:47 a.m., Rodrigo Siqueira wrote:
> >>>>>> A few weeks ago, we saw a two cursor issue in a ChromeOS system. We
> >>>>>> fixed it in the commit:
> >>>>>>
> >>>>>>  drm/amd/display: Fix two cursor duplication when using overlay
> >>>>>>  (read the commit message for more details)
> >>>>>>
> >>>>>> After this change, we noticed that some IGT subtests related to
> >>>>>> kms_plane and kms_plane_scaling started to fail. After investigating
> >>>>>> this issue, we noticed that all subtests that fail have a primary
> plane
> >>>>>> covering the overlay plane, which is currently rejected by amdgpu
> dm.
> >>>>>> Fail those IGT tests highlight that our verification was too broad
> and
> >>>>>> compromises the overlay usage in our drive. This patch fixes this
> issue
> >>> nit: s/drive/driver?
> >>>>>> by ensuring that we only reject commits where the primary plane is
> not
> >>>>>> fully covered by the overlay when the cursor hardware is enabled.
> With
> >>>>>> this fix, all IGT tests start to pass again, which means our overlay
> >>>>>> support works as expected.
> >>>>>>
> >>>>>> Cc: Tianci.Yin <tianci.yin at amd.com>
> >>>>>> Cc: Harry Wentland <harry.wentland at amd.com>
> >>>>>> Cc: Nicholas Choi <nicholas.choi at amd.com>
> >>>>>> Cc: Bhawanpreet Lakha <bhawanpreet.lakha at amd.com>
> >>>>>> Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas at amd.com>
> >>>>>> Cc: Mark Yacoub <markyacoub at google.com>
> >>>>>> Cc: Daniel Wheeler <daniel.wheeler at amd.com>
> >>>>>>
> >>>>>> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira at amd.com>
> >>>>>> ---
> >>>>>>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 +++++++++-
> >>>>>>  1 file changed, 9 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>>>>> index ccd67003b120..9c2537a17a7b 100644
> >>>>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>>>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>>>>> @@ -10067,7 +10067,7 @@ static int validate_overlay(struct
> drm_atomic_state *state)
> >>>>>>       int i;
> >>>>>>       struct drm_plane *plane;
> >>>>>>       struct drm_plane_state *old_plane_state, *new_plane_state;
> >>>>>> -     struct drm_plane_state *primary_state, *overlay_state = NULL;
> >>>>>> +     struct drm_plane_state *primary_state, *cursor_state,
> *overlay_state = NULL;
> >>>>>>
> >>>>>>       /* Check if primary plane is contained inside overlay */
> >>>>>>       for_each_oldnew_plane_in_state_reverse(state, plane,
> old_plane_state, new_plane_state, i) {
> >>>>>> @@ -10097,6 +10097,14 @@ static int validate_overlay(struct
> drm_atomic_state *state)
> >>>>>>       if (!primary_state->crtc)
> >>>>>>               return 0;
> >>>>>>
> >>>>>> +     /* check if cursor plane is enabled */
> >>>>>> +     cursor_state = drm_atomic_get_plane_state(state,
> overlay_state->crtc->cursor);
> >>>>>> +     if (IS_ERR(cursor_state))
> >>>>>> +             return PTR_ERR(cursor_state);
> >>>>>> +
> >>>>>> +     if (drm_atomic_plane_disabling(plane->state, cursor_state))
> >>>>>> +             return 0;
> >>>>>> +
> >>>>>
> >>>>> I thought this breaks Chrome's compositor since it can't handle an
> atomic commit rejection
> >>>>> based solely on whether a cursor is enabled or not.
> >>> For context: To use overlays (the old/current async way), Chrome tests
> >>> if an overlay strategy will work by doing an atomic commit with a TEST
> >>> flag to check if the combination would work. If it works, it flags
> >>> this planes configuration as valid. As it's valid, it performs an
> >>> atomic page flip (which could also include the cursor).
> >>> So to Harry's point, you would pass an atomic test but fail on an
> >>> atomic page flip with the exact same configuration that's been flagged
> >>> as valid. Failing a pageflip causes Chrome to crash (Reset the GPU
> >>> process cause something went horribly wrong when it shouldn't).
> >>
> >> Hi Mark and Sean,
> >>
> >> I don't know if I fully comprehended the scenario which in my patch
> >> might cause a ChromeOS crash, but this is what I understood:
> >>
> >
> > Chrome compositor only does an atomic test when the layout geometry
> > changes (ie: plane is added/removed/resized/moved). This does not take
> > into consideration the cursor. Once the atomic test has been validated
> > for a given layout geometry (set of planes/fbs along with their sizes
> > and locations), Chrome assumes this will continue to be valid.
> >
> > So the situation I'm envisioning is if the cursor is hidden, an
> > overlay-based strategy will pass in the kernel. If at any time the
> > cursor becomes visible, the kernel will start failing commits which
> > causes Chrome's GPU process to crash.
> >
> > In general I'm uneasy with using the cursor in the atomic check since
> > it feels like it could be racy independent of the issue above. I
> > haven't dove into the helper code enough to prove it, just a
> > spidey-sense.
> >
>
> Isn't the cursor supposed to be just another plane? If so, shouldn't
> adding/removing the cursor trigger an atomic test?
>
>
Chrome is using legacy cursor.

Yes it will trigger an atomic test in the kernel, and that atomic test will
fail. Unfortunately Chrome does not expect a failure so it will crash.

Sean


Is Chrome's compositor doing cursor update through legacy IOCTLs or
> through the ATOMIC IOCTL?
>
> Thanks,
> Harry
>
> > Sean
> >
> >> There is a chance that we pass the atomic check
> >> (DRM_MODE_ATOMIC_TEST_ONLY - TEST) but fails in the atomic page flip
> >> because, after use TEST, the compositor might slightly change the commit
> >> config by adding the cursor. The reason behind that came from the
> >> assumption that adds the cursor in the atomic commit config after the
> >> test is harmless. Is my understand correct? Please, correct me if I'm
> >> wrong.
> >>
> >> If the above reasoning is correct, I think the compositor should not
> >> assume anything extra from what it got from the atomic check.
> >>
> >> Best Regards,
> >> Siqueira
> >>
> >>>>>
> >>>>> Harry
> >>>>>
> >>>>>>       /* Perform the bounds check to ensure the overlay plane
> covers the primary */
> >>>>>>       if (primary_state->crtc_x < overlay_state->crtc_x ||
> >>>>>>           primary_state->crtc_y < overlay_state->crtc_y ||
> >>>>>>
> >>>>>
> >>
> >> --
> >> Rodrigo Siqueira
> >> https://siqueira.tech/>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20210607/3224cbff/attachment-0001.htm>


More information about the amd-gfx mailing list