[PATCH v3] drm/rockchip: update cursors asynchronously through atomic.
Gustavo Padovan
gustavo.padovan at collabora.com
Mon Nov 26 23:54:06 UTC 2018
Hi Tomasz,
On 11/23/18 12:27 AM, Tomasz Figa wrote:
> Hi Helen,
>
> On Fri, Nov 23, 2018 at 8:31 AM Helen Koike <helen.koike at collabora.com> wrote:
>> Hi Tomasz,
>>
>> On 11/20/18 4:48 AM, Tomasz Figa wrote:
>>> Hi Helen,
>>>
>>> On Tue, Nov 20, 2018 at 4:08 AM Helen Koike <helen.koike at collabora.com> wrote:
>>>> From: Enric Balletbo i Serra <enric.balletbo at collabora.com>
>>>>
>>>> Add support to async updates of cursors by using the new atomic
>>>> interface for that.
>>>>
>>>> Signed-off-by: Enric Balletbo i Serra <enric.balletbo at collabora.com>
>>>> [updated for upstream]
>>>> Signed-off-by: Helen Koike <helen.koike at collabora.com>
>>>>
>>>> ---
>>>> Hello,
>>>>
>>>> This is the third version of the async-plane update suport to the
>>>> Rockchip driver.
>>>>
>>> Thanks for a quick respin. Please see my comments inline. (I'll try to
>>> be better at responding from now on...)
>>>
>>>> I tested running igt kms_cursor_legacy and kms_atomic tests using a 96Boards Ficus.
>>>>
>>>> Note that before the patch, the following igt tests failed:
>>>>
>>>> basic-flip-before-cursor-atomic
>>>> basic-flip-before-cursor-legacy
>>>> cursor-vs-flip-atomic
>>>> cursor-vs-flip-legacy
>>>> cursor-vs-flip-toggle
>>>> flip-vs-cursor-atomic
>>>> flip-vs-cursor-busy-crc-atomic
>>>> flip-vs-cursor-busy-crc-legacy
>>>> flip-vs-cursor-crc-atomic
>>>> flip-vs-cursor-crc-legacy
>>>> flip-vs-cursor-legacy
>>>>
>>>> Full log: https://people.collabora.com/~koike/results-4.20/html/
>>>>
>>>> Now with the patch applied the following were fixed:
>>>> basic-flip-before-cursor-atomic
>>>> basic-flip-before-cursor-legacy
>>>> flip-vs-cursor-atomic
>>>> flip-vs-cursor-legacy
>>>>
>>>> Full log: https://people.collabora.com/~koike/results-4.20-async/html/
>>> Could you also test modetest, with the -C switch to test the legacy
>>> cursor API? I remember it triggering crashes due to synchronization
>>> issues easily.
>> Sure. I tested with
>> $ modetest -M rockchip -s 37:1920x1080 -C
>>
>> I also vary the mode but I couldn't trigger any crashes.
>>
>>>> Tomasz, as you mentined in v2 about waiting the hardware before updating
>>>> the framebuffer, now I call the loop you pointed out in the async path,
>>>> was that what you had in mind? Or do you think I would make sense to
>>>> call the vop_crtc_atomic_flush() instead of just exposing that loop?
>>>>
>>>> Thanks
>>>> Helen
>>>>
>>>> Changes in v3:
>>>> - Rebased on top of drm-misc
>>>> - Fix missing include in rockchip_drm_vop.c
>>>> - New function vop_crtc_atomic_commit_flush
>>>>
>>>> Changes in v2:
>>>> - v2: https://patchwork.freedesktop.org/patch/254180/
>>>> - Change the framebuffer as well to cover jumpy cursor when hovering
>>>> text boxes or hyperlink. (Tomasz)
>>>> - Use the PSR inhibit mechanism when accessing VOP hardware instead of
>>>> PSR flushing (Tomasz)
>>>>
>>>> Changes in v1:
>>>> - Rebased on top of drm-misc
>>>> - In async_check call drm_atomic_helper_check_plane_state to check that
>>>> the desired plane is valid and update various bits of derived state
>>>> (clipped coordinates etc.)
>>>> - In async_check allow to configure new scaling in the fast path.
>>>> - In async_update force to flush all registered PSR encoders.
>>>> - In async_update call atomic_update directly.
>>>> - In async_update call vop_cfg_done needed to set the vop registers and take effect.
>>>>
>>>> drivers/gpu/drm/rockchip/rockchip_drm_fb.c | 36 -------
>>>> drivers/gpu/drm/rockchip/rockchip_drm_psr.c | 37 +++++++
>>>> drivers/gpu/drm/rockchip/rockchip_drm_psr.h | 3 +
>>>> drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 108 +++++++++++++++++---
>>>> 4 files changed, 131 insertions(+), 53 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
>>>> index ea18cb2a76c0..08bec50d9c5d 100644
>>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
>>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
>>>> @@ -127,42 +127,6 @@ rockchip_user_fb_create(struct drm_device *dev, struct drm_file *file_priv,
>>>> return ERR_PTR(ret);
>>>> }
>>>>
>>>> -static void
>>>> -rockchip_drm_psr_inhibit_get_state(struct drm_atomic_state *state)
>>>> -{
>>>> - struct drm_crtc *crtc;
>>>> - struct drm_crtc_state *crtc_state;
>>>> - struct drm_encoder *encoder;
>>>> - u32 encoder_mask = 0;
>>>> - int i;
>>>> -
>>>> - for_each_old_crtc_in_state(state, crtc, crtc_state, i) {
>>>> - encoder_mask |= crtc_state->encoder_mask;
>>>> - encoder_mask |= crtc->state->encoder_mask;
>>>> - }
>>>> -
>>>> - drm_for_each_encoder_mask(encoder, state->dev, encoder_mask)
>>>> - rockchip_drm_psr_inhibit_get(encoder);
>>>> -}
>>>> -
>>>> -static void
>>>> -rockchip_drm_psr_inhibit_put_state(struct drm_atomic_state *state)
>>>> -{
>>>> - struct drm_crtc *crtc;
>>>> - struct drm_crtc_state *crtc_state;
>>>> - struct drm_encoder *encoder;
>>>> - u32 encoder_mask = 0;
>>>> - int i;
>>>> -
>>>> - for_each_old_crtc_in_state(state, crtc, crtc_state, i) {
>>>> - encoder_mask |= crtc_state->encoder_mask;
>>>> - encoder_mask |= crtc->state->encoder_mask;
>>>> - }
>>>> -
>>>> - drm_for_each_encoder_mask(encoder, state->dev, encoder_mask)
>>>> - rockchip_drm_psr_inhibit_put(encoder);
>>>> -}
>>>> -
>>>> static void
>>>> rockchip_atomic_helper_commit_tail_rpm(struct drm_atomic_state *old_state)
>>>> {
>>>> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_psr.c b/drivers/gpu/drm/rockchip/rockchip_drm_psr.c
>>>> index 01ff3c858875..22a70ab6e214 100644
>>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_psr.c
>>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_psr.c
>>>> @@ -13,6 +13,7 @@
>>>> */
>>>>
>>>> #include <drm/drmP.h>
>>>> +#include <drm/drm_atomic.h>
>>>> #include <drm/drm_crtc_helper.h>
>>>>
>>>> #include "rockchip_drm_drv.h"
>>>> @@ -109,6 +110,42 @@ int rockchip_drm_psr_inhibit_put(struct drm_encoder *encoder)
>>>> }
>>>> EXPORT_SYMBOL(rockchip_drm_psr_inhibit_put);
>>>>
>>>> +void rockchip_drm_psr_inhibit_get_state(struct drm_atomic_state *state)
>>>> +{
>>>> + struct drm_crtc *crtc;
>>>> + struct drm_crtc_state *crtc_state;
>>>> + struct drm_encoder *encoder;
>>>> + u32 encoder_mask = 0;
>>>> + int i;
>>>> +
>>>> + for_each_old_crtc_in_state(state, crtc, crtc_state, i) {
>>>> + encoder_mask |= crtc_state->encoder_mask;
>>>> + encoder_mask |= crtc->state->encoder_mask;
>>>> + }
>>>> +
>>>> + drm_for_each_encoder_mask(encoder, state->dev, encoder_mask)
>>>> + rockchip_drm_psr_inhibit_get(encoder);
>>>> +}
>>>> +EXPORT_SYMBOL(rockchip_drm_psr_inhibit_get_state);
>>>> +
>>>> +void rockchip_drm_psr_inhibit_put_state(struct drm_atomic_state *state)
>>>> +{
>>>> + struct drm_crtc *crtc;
>>>> + struct drm_crtc_state *crtc_state;
>>>> + struct drm_encoder *encoder;
>>>> + u32 encoder_mask = 0;
>>>> + int i;
>>>> +
>>>> + for_each_old_crtc_in_state(state, crtc, crtc_state, i) {
>>>> + encoder_mask |= crtc_state->encoder_mask;
>>>> + encoder_mask |= crtc->state->encoder_mask;
>>>> + }
>>>> +
>>>> + drm_for_each_encoder_mask(encoder, state->dev, encoder_mask)
>>>> + rockchip_drm_psr_inhibit_put(encoder);
>>>> +}
>>>> +EXPORT_SYMBOL(rockchip_drm_psr_inhibit_put_state);
>>>> +
>>>> /**
>>>> * rockchip_drm_psr_inhibit_get - acquire PSR inhibit on given encoder
>>>> * @encoder: encoder to obtain the PSR encoder
>>>> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_psr.h b/drivers/gpu/drm/rockchip/rockchip_drm_psr.h
>>>> index 860c62494496..25350ba3237b 100644
>>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_psr.h
>>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_psr.h
>>>> @@ -20,6 +20,9 @@ void rockchip_drm_psr_flush_all(struct drm_device *dev);
>>>> int rockchip_drm_psr_inhibit_put(struct drm_encoder *encoder);
>>>> int rockchip_drm_psr_inhibit_get(struct drm_encoder *encoder);
>>>>
>>>> +void rockchip_drm_psr_inhibit_get_state(struct drm_atomic_state *state);
>>>> +void rockchip_drm_psr_inhibit_put_state(struct drm_atomic_state *state);
>>>> +
>>>> int rockchip_drm_psr_register(struct drm_encoder *encoder,
>>>> int (*psr_set)(struct drm_encoder *, bool enable));
>>>> void rockchip_drm_psr_unregister(struct drm_encoder *encoder);
>>>> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>> index fb70fb486fbf..176d6e8207ed 100644
>>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>> @@ -15,6 +15,7 @@
>>>> #include <drm/drm.h>
>>>> #include <drm/drmP.h>
>>>> #include <drm/drm_atomic.h>
>>>> +#include <drm/drm_atomic_uapi.h>
>>>> #include <drm/drm_crtc.h>
>>>> #include <drm/drm_crtc_helper.h>
>>>> #include <drm/drm_flip_work.h>
>>>> @@ -819,10 +820,99 @@ static void vop_plane_atomic_update(struct drm_plane *plane,
>>>> spin_unlock(&vop->reg_lock);
>>>> }
>>>>
>>>> +static int vop_plane_atomic_async_check(struct drm_plane *plane,
>>>> + struct drm_plane_state *state)
>>>> +{
>>>> + struct vop_win *vop_win = to_vop_win(plane);
>>>> + const struct vop_win_data *win = vop_win->data;
>>>> + int min_scale = win->phy->scl ? FRAC_16_16(1, 8) :
>>>> + DRM_PLANE_HELPER_NO_SCALING;
>>>> + int max_scale = win->phy->scl ? FRAC_16_16(8, 1) :
>>>> + DRM_PLANE_HELPER_NO_SCALING;
>>>> + struct drm_crtc_state *crtc_state;
>>>> + int ret;
>>>> +
>>>> + if (plane != state->crtc->cursor)
>>>> + return -EINVAL;
>>>> +
>>>> + if (!plane->state)
>>>> + return -EINVAL;
>>>> +
>>>> + if (!plane->state->fb)
>>>> + return -EINVAL;
>>>> +
>>>> + if (state->state)
>>>> + crtc_state = drm_atomic_get_existing_crtc_state(state->state,
>>>> + state->crtc);
>>>> + else /* Special case for asynchronous cursor updates. */
>>>> + crtc_state = plane->crtc->state;
>>>> +
>>>> + ret = drm_atomic_helper_check_plane_state(plane->state,
>>>> + crtc_state,
>>>> + min_scale, max_scale,
>>>> + true, true);
>>>> + return ret;
>>>> +}
>>>> +
>>>> +static void vop_crtc_atomic_commit_flush(struct drm_crtc *crtc,
>>>> + struct drm_crtc_state *old_crtc_state)
>>>> +{
>>>> + struct drm_atomic_state *old_state = old_crtc_state->state;
>>>> + struct drm_plane_state *old_plane_state, *new_plane_state;
>>>> + struct vop *vop = to_vop(crtc);
>>>> + struct drm_plane *plane;
>>>> + int i;
>>>> +
>>>> + for_each_oldnew_plane_in_state(old_state, plane, old_plane_state,
>>>> + new_plane_state, i) {
>>> Hmm, from what I can see, we're not going through the full atomic
>>> commit sequence, with state flip, so I'm not sure where we would get
>>> the new state here from.
>>>
>>>> + if (!old_plane_state->fb)
>>>> + continue;
>>>> +
>>>> + if (old_plane_state->fb == new_plane_state->fb)
>>>> + continue;
>>>> +
>>>> + drm_framebuffer_get(old_plane_state->fb);
>>>> + WARN_ON(drm_crtc_vblank_get(crtc) != 0);
>>>> + drm_flip_work_queue(&vop->fb_unref_work, old_plane_state->fb);
>>>> + set_bit(VOP_PENDING_FB_UNREF, &vop->pending);
>>>> + }
>>>> +}
>>>> +
>>>> +static void vop_plane_atomic_async_update(struct drm_plane *plane,
>>>> + struct drm_plane_state *new_state)
>>>> +{
>>>> + struct vop *vop = to_vop(plane->state->crtc);
>>>> +
>>>> + if (vop->crtc.state->state)
>>>> + vop_crtc_atomic_commit_flush(&vop->crtc, vop->crtc.state);
>>> Since we just operate on one plane here, we could just do like this:
>>>
>>> if (plane->state->fb && plane->state->fb != new_state->fb) {
>>> drm_framebuffer_get(plane->state->fb);
>>> WARN_ON(drm_crtc_vblank_get(crtc) != 0);
>>> drm_flip_work_queue(&vop->fb_unref_work, plane->state->fb);
>>> set_bit(VOP_PENDING_FB_UNREF, &vop->pending);
>>> }
>>>
>>> However, we cannot simply to this here, because it races with the
>>> vblank interrupt. We need to program the hw plane with the new fb
>>> first and trigger the update. This needs all the careful handling that
>>> is done in vop_crtc_atomic_flush() and so my original suggestion to
>>> just call it.
>> vop_crtc_atomic_flush() also updates the crtc->state->event, I don't
>> think we want that.
> Good point, we don't want to touch the event.
>
>> And actually I don't think we have this race condition, please see below
>>
> Just to clarify, the race conditions I'm talking here are not anything
> new to this patch alone. I had actually observed those when
> implementing the Chrome OS downstream async cursor code before:
>
> https://chromium.googlesource.com/chromiumos/third_party/kernel/+/65d4ff0af3f8c7ebecad8f3b402b546f277d9225/drivers/gpu/drm/rockchip/rockchip_drm_vop.c#1015
>
> Note that the code there actually duplicates the current plane state,
> updates the copy, swaps both states and then update the hardware based
> on both new and old states, as the regular atomic commit flow would
> do. With that, no reference counts are dropped until the old state
> gets destroyed at the end of the function. Also note the if block that
> handles the condition of (plane_state->fb && plane_state->fb !=
> plane->state->fb) (plane_state is the old state after the swap).
>
>>> Of course to call it in its current shape, one needs to have a full
>>> atomic state from a commit, after a flip, but we only have the new
>>> plane state here. Perhaps you could duplicate existing state, update
>>> the desired plane state, flip and then call vop_crtc_atomic_flush()?
>> Could you please clarify your proposal? You mean duplicating
>> plane->state ? I'm trying to see how this would fit it in the code.
>> The drm_atomic_state structure at plate->state->state is actually always
>> NULL (as an async update is applied right away).
>>
>>
>>> Best regards,
>>> Tomasz
>>>
>> From your comment in v2:
>>
>>> Isn't this going to drop the old fb reference on the floor without
>>> waiting for the hardware to actually stop scanning out from it?
>> I've been trying to analyze this better, I also got some help from
>> Gustavo Padovan, and I think there is no problem here as the
>> configuration we are doing here will just be taken into consideration by
>> the hardware in the next vblank, so I guess we can set and re-set
>> plane->fb as much as we want.
> The point here is not about setting and resetting the plane->fb
> pointer. It's about what happens inside drm_atomic_set_fb_for_plane().
>
> It calls drm_framebuffer_get() for the new fb and
> drm_framebuffer_put() for the old fb. In result, if the fb changes,
> the old fb, which had its reference count incremented in the atomic
> commit that set it to the plane before, has its reference count
> decremented. Moreover, if the new reference count becomes 0,
> drm_framebuffer_put() will immediately free the buffer.
>
> Freeing a buffer when the hardware is still scanning out of it isn't a
> good idea, is it?
My understanding is that in the async update path we never touch the fb
that is still being scanned out - what you are referring as old fb. That
is the design of async update. drm_atomic_set_fb_for_plane() replace
pointers and refs between the fb we just received from userspace with
the one on plane->state (that after the atomic swap holds the new state)
which won't be scanned out until the next flip. However I don't
understand this part of the rockchip hardware, the fb is not the one
being scanned out. Is this still a problem?
Regards,
Gustavo
--
Gustavo Padovan
Collabora Ltd
More information about the dri-devel
mailing list