[PATCH 3/3] drm/komeda - Fix handling of pending crtc state commit to avoid lock-up

Carsten Haitzler carsten.haitzler at foss.arm.com
Fri Jul 15 15:59:56 UTC 2022



On 7/11/22 11:13, Liviu Dudau wrote:
> On Fri, Jul 08, 2022 at 07:03:37PM +0100, Carsten Haitzler wrote:
>>
>>
>> On 7/8/22 17:02, Liviu Dudau wrote:
>>> On Mon, Jun 06, 2022 at 12:47:14PM +0100, carsten.haitzler at foss.arm.com wrote:
>>>> From: Carsten Haitzler <carsten.haitzler at arm.com>
>>>
>>> Hi Carsten,
>>>
>>>>
>>>> Sometimes there is an extra dcm crtc state in the pipeline whose
>>>> penting vblank event has not been handled yet. We would previously
>>>> throw this out and the vblank event never triggers leading to hard
>>>> lockups higher up the stack where an expected vlank event never comes
>>>> back (screen freezes).
>>>>
>>>> This fixes that by tracking a pending crtc state that needs handling
>>>> and handle it producing a vlank event next vblank if it had not
>>>> laready been handled before. This fixes the hangs and ensures our
>>>> display keeps updating seamlessly and is certainly a much better state
>>>> to be in than after some time ending up with a mysteriously frozen
>>>> screen and a lot of kernle messages complaining about this too.
>>>
>>> Sorry it took me so long to review and reply to this patch, but I had this nagging
>>
>> No worries. :)
>>
>>> feeling that the patch is not actually correct so I've tried to track the actual
>>> issue. It turns out that the problem is easy to reproduce in a different setup with
>>> Mali D71 and it comes from the fact that Komeda doesn't properly wait for the
>>> hardware to signal acceptance of config valid on setting new commits. I have created
>>> a new patch that I would be happy if you can test to try to fix the actual issue.
>>
>> This works (tested).
> 
> Thank you very much for testing this! Can I add your Tested-by?

Indeed. Go for it.

>>   One errant "flip without commit":
>>
>> [    9.402559] fbcon: Taking over console
>> [    9.525373] [drm:komeda_print_events [komeda]] *ERROR* err detect: gcu:
>> MERR, pipes[0]: FLIP, pipes[1]: None
>> [    9.525455] [drm] CRTC[0]: FLIP happened but no pending commit.
>> [    9.542215] Console: switching to colour frame buffer device 240x67
>>
> 
> Is this with your 2/3 patch applied and coming out from the firmware having already
> initialised the hardware? If so, the handover probably doesn't quiescence the
> interrupts correctly so an interrupt is pending when the kernel driver is
> initialised. That's another area worth looking at for the handover purposes.

Yeah. the firmware is not doing a very clean handover - it's leaving the 
hardware in whatever state it has. it probably could shut down 
interrupts on handover, but not something to worry about in the kernel 
driver at this point.

>> But nothing worrying. It does work, though doesn't compile due to:
>>
>> drivers/gpu/drm/arm/display/komeda/komeda_kms.c: In function
>> ‘komeda_kms_atomic_commit_hw_done’:
>> drivers/gpu/drm/arm/display/komeda/komeda_kms.c:77:9: error: ‘for’ loop
>> initial declarations are only allowed in C99 or C11 mode
>>     77 |         for (int i = 0; i < kms->n_crtcs; i++) {
>>        |         ^~~
>> drivers/gpu/drm/arm/display/komeda/komeda_kms.c:77:9: note: use option
>> ‘-std=c9
>> ’, ‘-std=gnu99’, ‘-std=c11’ or ‘-std=gnu11’ to compile your code
>>
>> but that was a trivial fixup.
> 
> Interesting that I'm not seeing that, probably due to using GCC12? Anyway, I'll fix
> that and send a proper patch.

I tend to use newer gcc's indeed, but I am using gcc11 in this case.

>> Your commit handler does sit and wait. I guess I avoided that and had it
>> still deferred and handled next time the vblank interrupt fires. Yours is a
>> bit shorter and less complex so it wins. :)
> 
> Yes, that's the essence of my issue with your patch. Delaying the handling of the
> event until the next vblank means older software that doesn't use the timestamping
> from the vblank event will get wrong framerates (playing video might be affected).

But this is a rare situation. I certainly was very happy going from "my 
entire display locks up and only a reboot really fixes it" to "look ma - 
it works and I didn't even see a stuttered frame!" :) But your fix is 
better in that regard.

> Waiting here when we're also calling drm_atomic_helper_wait_for_flip_done() after
> drm_atomic_helper_commit_hw_done() feels wrong, but then the later is checking if we
> have consumed the event so we have to. Maybe the introduction of the
> drm_atomic_helper_fake_vblank() is needed in komeda as well like the generic
> commit_tail helper function does? I need to investigate more the next time I get some
> spare cycles on komeda.
> 
> I will send a new email with the updated patch and your Tested-by if that's OK.

All happy and fine with that.

> Best regards,
> Liviu
> 
>>
>>> --8<---------------------------------------------------------------------------
>>>
>>>   From 76f9e5fed216a815e9eb56152f85454521079f10 Mon Sep 17 00:00:00 2001
>>> From: Liviu Dudau <liviu.dudau at arm.com>
>>> Date: Fri, 8 Jul 2022 16:39:21 +0100
>>> Subject: [PATCH] drm/komeda: Fix handling of atomic commits in the
>>>    atomic_commit_tail hook
>>>
>>> Komeda driver relies on the generic DRM atomic helper functions to handle
>>> commits. It only implements an atomic_commit_tail hook for the
>>> mode_config_helper_funcs and even that one is pretty close to the generic
>>> implementation with the exception of additional dma_fence signalling.
>>>
>>> What the generic helper framework doesn't do is waiting for the actual
>>> hardware to signal that the commit parameters have been written into the
>>> appropriate registers. As we signal CRTC events only on the irq handlers,
>>> we need to flush the configuration and wait for the hardware to respond.
>>>
>>> Add the Komeda specific implementation for atomic_commit_hw_done() that
>>> flushes and waits for flip done before calling drm_atomic_helper_commit_hw_done().
>>>
>>> The fix was prompted by a patch from Carsten Haitzler where he was trying to
>>> solve the same issue but in a different way that I think can lead to wrong
>>> event signaling to userspace.
>>>
>>> Reported-by: Carsten Haitzler <carsten.haitzler at arm.com>
>>> Signed-off-by: Liviu Dudau <liviu.dudau at arm.com>
>>> ---
>>>    .../gpu/drm/arm/display/komeda/komeda_crtc.c  |  4 ++--
>>>    .../gpu/drm/arm/display/komeda/komeda_kms.c   | 20 ++++++++++++++++++-
>>>    .../gpu/drm/arm/display/komeda/komeda_kms.h   |  2 ++
>>>    3 files changed, 23 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_crtc.c b/drivers/gpu/drm/arm/display/komeda/komeda_crtc.c
>>> index 59172acb973803d..292f533d8cf0de6 100644
>>> --- a/drivers/gpu/drm/arm/display/komeda/komeda_crtc.c
>>> +++ b/drivers/gpu/drm/arm/display/komeda/komeda_crtc.c
>>> @@ -235,7 +235,7 @@ void komeda_crtc_handle_event(struct komeda_crtc   *kcrtc,
>>>    			crtc->state->event = NULL;
>>>    			drm_crtc_send_vblank_event(crtc, event);
>>>    		} else {
>>> -			DRM_WARN("CRTC[%d]: FLIP happen but no pending commit.\n",
>>> +			DRM_WARN("CRTC[%d]: FLIP happened but no pending commit.\n",
>>>    				 drm_crtc_index(&kcrtc->base));
>>>    		}
>>>    		spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
>>> @@ -286,7 +286,7 @@ komeda_crtc_atomic_enable(struct drm_crtc *crtc,
>>>    	komeda_crtc_do_flush(crtc, old);
>>>    }
>>> -static void
>>> +void
>>>    komeda_crtc_flush_and_wait_for_flip_done(struct komeda_crtc *kcrtc,
>>>    					 struct completion *input_flip_done)
>>>    {
>>> diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
>>> index 93b7f09b96ca950..1a53bd87e81d8ad 100644
>>> --- a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
>>> +++ b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
>>> @@ -69,6 +69,24 @@ static const struct drm_driver komeda_kms_driver = {
>>>    	.minor = 1,
>>>    };
>>> +static void komeda_kms_atomic_commit_hw_done(struct drm_atomic_state *state)
>>> +{
>>> +	struct drm_device *dev = state->dev;
>>> +	struct komeda_kms_dev *kms = to_kdev(dev);
>>> +
>>> +	for (int i = 0; i < kms->n_crtcs; i++) {
>>> +		struct komeda_crtc *kcrtc = &kms->crtcs[i];
>>> +
>>> +		if (kcrtc->base.state->active) {
>>> +			struct completion *flip_done = NULL;
>>> +			if (kcrtc->base.state->event)
>>> +				flip_done = kcrtc->base.state->event->base.completion;
>>> +			komeda_crtc_flush_and_wait_for_flip_done(kcrtc, flip_done);
>>> +		}
>>> +	}
>>> +	drm_atomic_helper_commit_hw_done(state);
>>> +}
>>> +
>>>    static void komeda_kms_commit_tail(struct drm_atomic_state *old_state)
>>>    {
>>>    	struct drm_device *dev = old_state->dev;
>>> @@ -81,7 +99,7 @@ static void komeda_kms_commit_tail(struct drm_atomic_state *old_state)
>>>    	drm_atomic_helper_commit_modeset_enables(dev, old_state);
>>> -	drm_atomic_helper_commit_hw_done(old_state);
>>> +	komeda_kms_atomic_commit_hw_done(old_state);
>>>    	drm_atomic_helper_wait_for_flip_done(dev, old_state);
>>> diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_kms.h b/drivers/gpu/drm/arm/display/komeda/komeda_kms.h
>>> index 456f3c435719317..bf6e8fba5061335 100644
>>> --- a/drivers/gpu/drm/arm/display/komeda/komeda_kms.h
>>> +++ b/drivers/gpu/drm/arm/display/komeda/komeda_kms.h
>>> @@ -182,6 +182,8 @@ void komeda_kms_cleanup_private_objs(struct komeda_kms_dev *kms);
>>>    void komeda_crtc_handle_event(struct komeda_crtc   *kcrtc,
>>>    			      struct komeda_events *evts);
>>> +void komeda_crtc_flush_and_wait_for_flip_done(struct komeda_crtc *kcrtc,
>>> +					      struct completion *input_flip_done);
>>>    struct komeda_kms_dev *komeda_kms_attach(struct komeda_dev *mdev);
>>>    void komeda_kms_detach(struct komeda_kms_dev *kms);
> 


More information about the dri-devel mailing list