[PATCH v14 RESEND 5/6] drm/imx: Introduce i.MX8qm/qxp DPU DRM

Tue Sep 5 03:32:47 UTC 2023

On Tuesday, August 22, 2023 8:59 PM, Maxime Ripard <mripard at kernel.org> wrote:
> 
> Hi,

Hi,

> 
> Aside from the discussion on the binding and the general architecture, I
> have some comments there.

Thanks for your comments.

> 
> On Tue, Aug 22, 2023 at 04:59:48PM +0800, Liu Ying wrote:
> > +int dpu_cf_init(struct dpu_soc *dpu, unsigned int index,
> > +		unsigned int id, enum dpu_unit_type type,
> > +		unsigned long pec_base, unsigned long base)
> > +{
> > +	struct dpu_constframe *cf;
> > +
> > +	cf = devm_kzalloc(dpu->dev, sizeof(*cf), GFP_KERNEL);
> > +	if (!cf)
> > +		return -ENOMEM;
> > +
> > +	dpu->cf_priv[index] = cf;
> 
> You can't store structures related to KMS in a device managed structure.
> The DRM KMS device will stick around (and be accessible from userspace)
> after the device has been removed until the last application closed its
> file descriptor to the device.

The DRM device is registered after component_bind_all() is called in
dpu_drm_bind().  The CRTC components' platform devices are created
in the dpu_core_probe() where the device managed resources are
created.   So, it looks those resources are safe because the DRM device
will be unregistered before those resources are freed.

> 
> This can be checked by enabling KASAN and manually unbinding the driver
> through sysfs.

I enabled KASAN and manually unbound the dpu-core driver with command:

echo 56180000.dpu > /sys/bus/platform/drivers/dpu-core/56180000.dpu/driver/unbind 

KASAN didin't report memory issue regarding those device managed
resources.  However, it did report another issue in dpu_drm_unbind(),
where drm_device should be got from drv_data->drm_dev instead of
dev_get_drvdata(dev).  I'll fix that in next version.

BTW, the dpu-core driver was successfully bound again after unbinding with
command:

echo  56180000.dpu > /sys/bus/platform/drivers/dpu-core/bind

> 
> > +	cf->pec_base = devm_ioremap(dpu->dev, pec_base, SZ_16);
> > +	if (!cf->pec_base)
> > +		return -ENOMEM;
> > +
> > +	cf->base = devm_ioremap(dpu->dev, base, SZ_32);
> > +	if (!cf->base)
> > +		return -ENOMEM;
> 
> For the same reason, you need to protect any access to a device managed
> resource (so clocks, registers, regulators, etc.) by a call to
> drm_dev_enter/drm_dev_exit and you need to call drm_dev_unplug instead
> of drm_dev_unregister.

That's a good point.  I've tried to do that, but it turns out that the display controller
cannot be enabled again after binding the dpu-core driver manually again.  It seems
that the display controller requires a proper disablement procedure, but the "driver
instance overview " kdoc mentions the shortcoming of no proper disablement if
drm_dev_unplug() is used:

"""
* Drivers that want to support device unplugging (USB, DT overlay unload) should
 * use drm_dev_unplug() instead of drm_dev_unregister(). The driver must protect
 * regions that is accessing device resources to prevent use after they're
 * released. This is done using drm_dev_enter() and drm_dev_exit(). There is one
 * shortcoming however, drm_dev_unplug() marks the drm_device as unplugged before
 * drm_atomic_helper_shutdown() is called. This means that if the disable code
 * paths are protected, they will not run on regular driver module unload,
 * possibly leaving the hardware enabled.
"""

A DPU reset in dpu_core() might be helpful, but I'm not sure if there is any
reset line provided by the embodying system.

Even if the reset works, the 2nd DPU instance in i.MX8qm would be a problem,
because it won't be reset or properly disabled if the 1st DPU instance is unbound.
Although the two DPU instances could be wrapped by two DRM devices, I tend
not to do that because downstream bridges in future SoCs might be able to mux
to different DPU instances at runtime.

Due to the disablement issue, can we set drm_dev_enter/exit/unplug aside first?

> 
> > +static int dpu_crtc_pm_runtime_get_sync(struct dpu_crtc *dpu_crtc)
> > +{
> > +	int ret;
> > +
> > +	ret = pm_runtime_get_sync(dpu_crtc->dev->parent);
> > +	if (ret < 0) {
> > +		pm_runtime_put_noidle(dpu_crtc->dev->parent);
> > +		dpu_crtc_err(&dpu_crtc->base,
> > +			     "failed to get parent device RPM sync: %d\n", ret);
> > +	}
> > +
> > +	return ret;
> > +}
> 
> That's pm_runtime_resume_and_get.

Ok, will use it.

> 
> > +static int dpu_crtc_pm_runtime_put(struct dpu_crtc *dpu_crtc)
> > +{
> > +	int ret;
> > +
> > +	ret = pm_runtime_put(dpu_crtc->dev->parent);
> > +	if (ret < 0)
> > +		dpu_crtc_err(&dpu_crtc->base,
> > +			     "failed to put parent device RPM: %d\n", ret);
> > +
> > +	return ret;
> > +}
> > +
> > +static void dpu_crtc_mode_set_nofb(struct drm_crtc *crtc)
> > +{
> > +	struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc);
> > +	struct drm_display_mode *adj = &crtc->state->adjusted_mode;
> > +	enum dpu_link_id cf_link;
> > +
> > +	dpu_crtc_dbg(crtc, "mode " DRM_MODE_FMT "\n",
> DRM_MODE_ARG(adj));
> > +
> > +	/* request power-on when we start to set mode for CRTC */
> > +	dpu_crtc_pm_runtime_get_sync(dpu_crtc);
> 
> From the drm_crtc_helper_funcs documentation:
> 
> """
> 	 * Note that the display pipe is completely off when this function is
> 	 * called. Atomic drivers which need hardware to be running before
> they
> 	 * program the new display mode (e.g. because they implement
> runtime PM)
> 	 * should not use this hook. This is because the helper library calls
> 	 * this hook only once per mode change and not every time the display
> 	 * pipeline is suspended using either DPMS or the new "ACTIVE"
> property.
> 	 * Which means register values set in this callback might get reset
> when
> 	 * the CRTC is suspended, but not restored.  Such drivers should
> instead
> 	 * move all their CRTC setup into the @atomic_enable callback.
> """

I can use drm_atomic_helper_commit_tail() but not
drm_atomic_helper_commit_tail_rpm() because the planes need to be
updated prior to modeset_enables(where trigger shadow registers in
plane's HW resources to take effect).   Using the former one means that
RPM needs to be get/put in drm_atomic_helper_commit_planes(), which
doesn't seem good because in some cases the power of the display controller
might be lost after RPM put and I'm not sure if the registers set there will
be lost or not.   So, to avoid the issue the documentation mentions,
crtc_state->mode_changed is forced to be true in dpu_crtc_atomic_check()
if the CRTC is changed to active.  Is this acceptable?

> 
> > +static void dpu_crtc_atomic_enable(struct drm_crtc *crtc,
> > +				   struct drm_atomic_state *state)
> > +{
> > +	struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc);
> > +	unsigned long flags;
> > +
> > +	drm_crtc_vblank_on(crtc);
> > +
> > +	enable_irq(dpu_crtc->dec_shdld_irq);
> > +	enable_irq(dpu_crtc->ed_cont_shdld_irq);
> > +	enable_irq(dpu_crtc->ed_safe_shdld_irq);
> > +
> > +	dpu_fg_enable_clock(dpu_crtc->fg);
> > +	dpu_ed_pec_sync_trigger(dpu_crtc->ed_cont);
> > +	dpu_ed_pec_sync_trigger(dpu_crtc->ed_safe);
> > +	if (crtc->state->gamma_lut)
> > +		dpu_crtc_set_gammacor(dpu_crtc);
> > +	else
> > +		dpu_crtc_disable_gammacor(dpu_crtc);
> > +	dpu_fg_shdtokgen(dpu_crtc->fg);
> > +
> > +	/* don't relinquish CPU until TCON is set to operation mode */
> > +	local_irq_save(flags);
> > +	preempt_disable();
> > +	dpu_fg_enable(dpu_crtc->fg);
> 
> That's super fishy. You shouldn't need that, at all. What is going on
> there?

This aims to fully workaround the below errata recently released by
NXP.

ERR010910: DC: Display Subsystem First Frame Programming Restriction
Link: https://www.nxp.com/docs/en/errata/IMX8_1N94W.pdf

In short, to make sure the display controller can be enabled properly with
prefetch engine(DPRC + PRG), the TCON must be switch from bypass mode
to operation mode after FrameGen(FG) generates the first frame.

Timing is critical here, so local irq and preemption are disabled during the
switch procedure.

BTW, the driver always use prefetch engines for KMS, although they can
be bypassed.

> 
> > +
> > +	/*
> > +	 * TKT320590:
> 
> Those are NXP internal references as far as as I can tell. They
> shouldn't be here.

Ok, will change it to be ERR010910.

> 
> > +	 * Turn TCON into operation mode as soon as the first dumb
> > +	 * frame is generated by DPU(we don't relinquish CPU to ensure
> > +	 * this).  This makes DPR/PRG be able to evade the frame.
> > +	 */
> > +	DPU_CRTC_WAIT_FOR_FRAMEGEN_FRAME_CNT_MOVING(dpu_crtc-
> >fg);
> > +	dpu_tcon_set_operation_mode(dpu_crtc->tcon);
> > +	local_irq_restore(flags);
> > +	preempt_enable();
> > +
> > +	DPU_CRTC_WAIT_FOR_COMPLETION_TIMEOUT(ed_safe_shdld_done);
> > +	DPU_CRTC_WAIT_FOR_COMPLETION_TIMEOUT(ed_cont_shdld_done);
> > +	DPU_CRTC_WAIT_FOR_COMPLETION_TIMEOUT(dec_shdld_done);
> > +
> > +	disable_irq(dpu_crtc->ed_safe_shdld_irq);
> > +	disable_irq(dpu_crtc->ed_cont_shdld_irq);
> > +	disable_irq(dpu_crtc->dec_shdld_irq);
> > +
> > +	DPU_CRTC_WAIT_FOR_FRAMEGEN_SECONDARY_SYNCUP(dpu_crtc-
> >fg);
> 
> The dance around the interrupts doesn't look great either. This need a
> proper description of the problem this was trying to solve. Also, what
> happens if any of those interrupts fail to trigger before you timeout?

Hmm, this is just following the display controller spec which provides detail
steps to enable the display pipeline which include waiting for the interrupts
and the FrameGen primary or secondary channel syncup.

If the interrupts fail to trigger and we timeout, then there must be something
wrong either caused by DPU HW itself or driver bug.   Here, warnings are
generated only.

> 
> > +	DPU_CRTC_CHECK_FRAMEGEN_FIFO(dpu_crtc->fg);
> > +
> > +	dpu_crtc_queue_state_event(crtc);
> > +}
> > +
> > +static void dpu_crtc_atomic_disable(struct drm_crtc *crtc,
> > +				    struct drm_atomic_state *state)
> > +{
> > +	struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc);
> > +	struct drm_plane *plane;
> > +	struct drm_plane_state *old_plane_state;
> > +	struct dpu_plane_state *old_dpstate;
> > +	struct dpu_fetchunit *fu;
> > +	struct dpu_dprc *dprc;
> > +	const struct dpu_fetchunit_ops *fu_ops;
> > +	unsigned long flags;
> > +	int i;
> > +
> > +	enable_irq(dpu_crtc->dec_seq_complete_irq);
> > +
> > +	/* don't relinquish CPU until DPRC repeat_en is disabled */
> > +	local_irq_save(flags);
> > +	preempt_disable();
> > +	/*
> > +	 * Sync to FrameGen frame counter moving so that
> > +	 * FrameGen can be disabled in the next frame.
> > +	 */
> > +	DPU_CRTC_WAIT_FOR_FRAMEGEN_FRAME_CNT_MOVING(dpu_crtc-
> >fg);
> > +	dpu_fg_disable(dpu_crtc->fg);
> > +	/*
> > +	 * There is one frame leftover after FrameGen disablement.
> > +	 * Sync to FrameGen frame counter moving so that
> > +	 * DPRC repeat_en can be disabled in the next frame.
> > +	 */
> > +	DPU_CRTC_WAIT_FOR_FRAMEGEN_FRAME_CNT_MOVING(dpu_crtc-
> >fg);
> > +
> > +	for_each_old_plane_in_state(state, plane, old_plane_state, i) {
> > +		old_dpstate = to_dpu_plane_state(old_plane_state);
> > +
> > +		if (!old_plane_state->fb)
> > +			continue;
> > +
> > +		if (old_plane_state->crtc != crtc)
> > +			continue;
> > +
> > +		fu = old_dpstate->source;
> > +
> > +		fu_ops = dpu_fu_get_ops(fu);
> > +
> > +		dprc = fu_ops->get_dprc(fu);
> > +		dpu_dprc_disable_repeat_en(dprc);
> > +	}
> > +
> > +	local_irq_restore(flags);
> > +	preempt_enable();
> > +
> > +
> 	DPU_CRTC_WAIT_FOR_COMPLETION_TIMEOUT(dec_seq_complete_d
> one);
> > +
> > +	disable_irq(dpu_crtc->dec_seq_complete_irq);
> > +
> > +	dpu_fg_disable_clock(dpu_crtc->fg);
> > +
> > +	drm_crtc_vblank_off(crtc);
> > +
> > +	spin_lock_irq(&crtc->dev->event_lock);
> > +	if (crtc->state->event && !crtc->state->active) {
> > +		drm_crtc_send_vblank_event(crtc, crtc->state->event);
> > +		crtc->state->event = NULL;
> > +	}
> > +	spin_unlock_irq(&crtc->dev->event_lock);
> > +
> > +	/* request power-off when CRTC is disabled */
> > +	dpu_crtc_pm_runtime_put(dpu_crtc);
> > +}
> 
> Same story than in atomic_enable here.

Similar to atomic_enable, strict procedure needs to be followed
to disable the CRTC properly, including disabling FrameGen by syncing
to frame counter moving and disabling DPRC repeat_en as soon as
possible.   The critical timing is achieved by disabling local irq and
preemption during the procedure.

> 
> 
> > +static int legacyfb_depth = 32;
> > +module_param(legacyfb_depth, uint, 0444);
> 
> No custom module parameter

Ok, will remove them.

> 
> > +static void dpu_atomic_put_plane_state(struct drm_atomic_state *state,
> > +				       struct drm_plane *plane)
> > +{
> > +	int index = drm_plane_index(plane);
> > +
> > +	plane->funcs->atomic_destroy_state(plane, state->planes[index].state);
> > +	state->planes[index].ptr = NULL;
> > +	state->planes[index].state = NULL;
> > +	state->planes[index].old_state = NULL;
> > +	state->planes[index].new_state = NULL;
> > +
> > +	drm_modeset_unlock(&plane->mutex);
> > +
> > +	dpu_plane_dbg(plane, "put state\n");
> > +}
> > +
> > +static void dpu_atomic_put_crtc_state(struct drm_atomic_state *state,
> > +				      struct drm_crtc *crtc)
> > +{
> > +	int index = drm_crtc_index(crtc);
> > +
> > +	crtc->funcs->atomic_destroy_state(crtc, state->crtcs[index].state);
> > +	state->crtcs[index].ptr = NULL;
> > +	state->crtcs[index].state = NULL;
> > +	state->crtcs[index].old_state = NULL;
> > +	state->crtcs[index].new_state = NULL;
> > +
> > +	drm_modeset_unlock(&crtc->mutex);
> > +
> > +	dpu_crtc_dbg(crtc, "put state\n");
> > +}
> > +
> > +static void
> > +dpu_atomic_put_possible_states_per_crtc(struct drm_crtc_state
> *crtc_state)
> > +{
> > +	struct drm_atomic_state *state = crtc_state->state;
> > +	struct drm_crtc *crtc = crtc_state->crtc;
> > +	struct drm_plane *plane;
> > +	struct drm_plane_state *old_plane_state, *new_plane_state;
> > +	struct dpu_plane_state *old_dpstate, *new_dpstate;
> > +
> > +	drm_atomic_crtc_state_for_each_plane(plane, crtc_state) {
> > +		old_plane_state = drm_atomic_get_old_plane_state(state,
> plane);
> > +		new_plane_state = drm_atomic_get_new_plane_state(state,
> plane);
> > +
> > +		old_dpstate = to_dpu_plane_state(old_plane_state);
> > +		new_dpstate = to_dpu_plane_state(new_plane_state);
> > +
> > +		/* Should be enough to check the below HW plane resources.
> */
> > +		if (old_dpstate->stage.ptr != new_dpstate->stage.ptr ||
> > +		    old_dpstate->source != new_dpstate->source ||
> > +		    old_dpstate->blend != new_dpstate->blend)
> > +			return;
> > +	}
> > +
> > +	drm_atomic_crtc_state_for_each_plane(plane, crtc_state)
> > +		dpu_atomic_put_plane_state(state, plane);
> > +
> > +	dpu_atomic_put_crtc_state(state, crtc);
> > +}
> 
> That's super suspicious too. Are you really going around and dropping
> and destroying plane and crtc states in a global state?

Yes.

> 
> At the very least, you need to describe what this addresses and why you
> think it's a good solution.

This is the solution to assign HW resources of a plane group to the two CRTCs
in one DPU or one CRTC group _dynamically_ at runtime.  Dpu.h has some
comments which hint this:

"""
/*
 * fetchunit/scaler/layerblend resources of a plane group are
 * shared by the two CRTCs in a CRTC group
 */
"""

I can add a DPU display controller block diagram in dpu_kms.c to tell the HW
architecture and some SW architecture to clarify this more.

Regards,
Liu Ying

> 
> I kind of skimmed over the last part of the driver, but we should really
> address these first comments. There's a larger discussion on the fact
> that this driver does much more that it should and needs to (especially in
> atomic_check, but not only), and this applies to the rest of patch.
> 
> Maxime