[PATCH v4 3/8] accel/qaic: Add MHI controller

Manivannan Sadhasivam manivannan.sadhasivam at linaro.org
Mon Mar 27 06:59:55 UTC 2023


On Fri, Mar 24, 2023 at 09:26:49AM -0600, Jeffrey Hugo wrote:
> On 3/24/2023 4:26 AM, Manivannan Sadhasivam wrote:
> > On Mon, Mar 20, 2023 at 09:11:09AM -0600, Jeffrey Hugo wrote:
> > > An AIC100 device contains a MHI interface with a number of different
> > > channels for controlling different aspects of the device. The MHI
> > > controller works with the MHI bus to enable and drive that interface.
> > > 
> > > AIC100 uses the BHI protocol in PBL to load SBL. The MHI controller
> > > expects the SBL to be located at /lib/firmware/qcom/aic100/sbl.bin and
> > > expects the MHI bus to manage the process of loading and sending SBL to
> > > the device.
> > > 
> > > Signed-off-by: Jeffrey Hugo <quic_jhugo at quicinc.com>
> > > Reviewed-by: Carl Vanderlip <quic_carlv at quicinc.com>
> > > Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy at quicinc.com>
> > > Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka at linux.intel.com>
> > > ---
> > >   drivers/accel/qaic/mhi_controller.c | 563 ++++++++++++++++++++++++++++++++++++
> > >   drivers/accel/qaic/mhi_controller.h |  16 +
> > >   2 files changed, 579 insertions(+)
> > >   create mode 100644 drivers/accel/qaic/mhi_controller.c
> > >   create mode 100644 drivers/accel/qaic/mhi_controller.h
> > > 
> > > diff --git a/drivers/accel/qaic/mhi_controller.c b/drivers/accel/qaic/mhi_controller.c
> > > new file mode 100644
> > > index 0000000..777dfbe
> > > --- /dev/null
> > > +++ b/drivers/accel/qaic/mhi_controller.c
> > > @@ -0,0 +1,563 @@
> > > +// SPDX-License-Identifier: GPL-2.0-only
> > > +
> > > +/* Copyright (c) 2019-2021, The Linux Foundation. All rights reserved. */
> > > +/* Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. */
> > > +
> > > +#include <linux/delay.h>
> > > +#include <linux/err.h>
> > > +#include <linux/memblock.h>
> > > +#include <linux/mhi.h>
> > > +#include <linux/moduleparam.h>
> > > +#include <linux/pci.h>
> > > +#include <linux/sizes.h>
> > > +
> > > +#include "mhi_controller.h"
> > > +#include "qaic.h"
> > > +
> > > +#define MAX_RESET_TIME_SEC 25
> > > +
> > > +static unsigned int mhi_timeout_ms = 2000; /* 2 sec default */
> > > +module_param(mhi_timeout_ms, uint, 0600);
> > > +MODULE_PARM_DESC(mhi_timeout_ms, "MHI controller timeout value");
> > > +
> > > +static struct mhi_channel_config aic100_channels[] = {
> > > +	{
> > > +		.name = "QAIC_LOOPBACK",
> > 
> > Why do you need QAIC_ prefix for channel names?
> 
> To avoid existing and anticipated conflicts.
> 
> As you are aware, the channel name becomes critical for the bus device and
> is the key that the consumer driver will probe on.
> 
> Sadly, that is rife for conflicts.  You can only have one driver for a
> particular MHI device (channel).  Multiple drivers can register for it, but
> only the first one will bind to the device.  This creates a race condition.
> Whoever is able to register with the bus first, owns all instances of that
> device.  That also means that particular driver on the bus also needs to be
> able to handle all instances of that device.
> 
> The WWAN subsystem already claims DIAG.  You and I both know from the WWAN
> subsystem creation experience, the Net folks don't want a common framework
> that can service multiple types of devices.  QAIC devices are not WWAN
> devices, and were an argument for having a WWAN specific thing.  So, I can't
> leverage WWAN, and frankly I shouldn't because my device is not a WWAN
> device.  The WWAN userspace shouldn't try to use ACCEL/QAIC devices (one of
> the reasons for having ACCEL instead of DRM).  Therefore DIAG devices are
> WWAN exclusive, and I need to have a different device.  "DIAG2" seems like a
> poor name.  If the QAIC DIAG device is going to be QAIC specific, having
> QAIC in the name to isolate and identify it seems like the best option.
> 
> I anticipate similar conflicts with
> SAHARA/QDSS/DEBUG/TIMESYNC/LOGGING/LOOPBACK.  All of these are "common" with
> other existing MHI devices.
> 

Hmm, this is something I anticipated to happen... :/

> I anticipate future conflicts with STATUS/RAS/TELEMETRY/CONTROL/SSR. These
> are rather generic channel names.  It seems likely that a future WWAN device
> or other MHI device would want a channel with the same name as one of these.
> I'd like to leave that open as a possibility by not exclusivly claiming the
> sole use to one of these names.
> 
> Arguably this is an internal implementation detail with how the MHI bus
> operates and could be fixed at first look.  However I don't think that is
> the case because it looks like the WWAN subsystem is exposing these names to
> userspace, which creates a uAPI that cannot be broken. Therefore I think we
> are rather quite stuck with this situation and what I have proposed with
> this patch is the best thing I've come up with to address the problem.  If
> you have an alternate suggestion, I'm willing to discuss with you.
> 

I think what you have is the best for now. Only downside of this approach is
the code duplication among the client drivers but I think we compromised this
during the WWAN framework discussion.

> > 
> > > +		.num = 0,
> > > +		.num_elements = 32,
> > > +		.local_elements = 0,
> > > +		.event_ring = 0,
> > > +		.dir = DMA_TO_DEVICE,
> > > +		.ee_mask = MHI_CH_EE_AMSS,
> > > +		.pollcfg = 0,
> > > +		.doorbell = MHI_DB_BRST_DISABLE,
> > > +		.lpm_notify = false,
> > > +		.offload_channel = false,
> > > +		.doorbell_mode_switch = false,
> > > +		.auto_queue = false,
> > > +		.wake_capable = false,
> > > +	},
> > 
> > [...]
> > 
> > > +static struct mhi_event_config aic100_events[] = {
> > > +	{
> > > +		.num_elements = 32,
> > > +		.irq_moderation_ms = 0,
> > > +		.irq = 0,
> > > +		.channel = U32_MAX,
> > > +		.priority = 1,
> > > +		.mode = MHI_DB_BRST_DISABLE,
> > > +		.data_type = MHI_ER_CTRL,
> > > +		.hardware_event = false,
> > > +		.client_managed = false,
> > > +		.offload_channel = false,
> > > +	},
> > > +};
> > > +
> > 
> > It'd be nice to use macros for defining the channels and events as done in the
> > pci_generic driver.
> 
> I think the pci_generic driver has a usecase for using a macro in that it is
> servicing multiple devices, with different configuration.  Right now, we
> only have the one device with the one config.  I suspect that will change in
> the future, but I don't have concrete information at the time to inform a
> proper design.
> 
> I feel this should be left until such time the multi-device scenario becomes
> realized.
> 

Ok, this sounds reasonable to me.

> > 
> > > +static struct mhi_controller_config aic100_config = {
> > > +	.max_channels = 128,
> > > +	.timeout_ms = 0, /* controlled by mhi_timeout */
> > > +	.buf_len = 0,
> > > +	.num_channels = ARRAY_SIZE(aic100_channels),
> > > +	.ch_cfg = aic100_channels,
> > > +	.num_events = ARRAY_SIZE(aic100_events),
> > > +	.event_cfg = aic100_events,
> > > +	.use_bounce_buf = false,
> > > +	.m2_no_db = false,
> > > +};
> > > +
> > > +static int mhi_read_reg(struct mhi_controller *mhi_cntl, void __iomem *addr, u32 *out)
> > > +{
> > > +	u32 tmp = readl_relaxed(addr);
> > > +
> > > +	if (tmp == U32_MAX)
> > > +		return -EIO;
> > > +
> > > +	*out = tmp;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static void mhi_write_reg(struct mhi_controller *mhi_cntl, void __iomem *addr, u32 val)
> > > +{
> > > +	writel_relaxed(val, addr);
> > > +}
> > > +
> > > +static int mhi_runtime_get(struct mhi_controller *mhi_cntl)
> > > +{
> > > +	return 0;
> > > +}
> > > +
> > > +static void mhi_runtime_put(struct mhi_controller *mhi_cntl)
> > > +{
> > > +}
> > > +
> > > +static void mhi_status_cb(struct mhi_controller *mhi_cntl, enum mhi_callback reason)
> > > +{
> > > +	struct qaic_device *qdev = pci_get_drvdata(to_pci_dev(mhi_cntl->cntrl_dev));
> > > +
> > > +	/* this event occurs in atomic context */
> > > +	if (reason == MHI_CB_FATAL_ERROR)
> > > +		pci_err(qdev->pdev, "Fatal error received from device. Attempting to recover\n");
> > 
> > Why no dev_err()?
> 
> pci_err is more specific than dev_err.  It is built upon dev_err. pci_err
> seems to be preferred for pci devices, and also matches uses elsewhere in
> the driver.
> 

Ok.

> > 
> > > +	/* this event occurs in non-atomic context */
> > > +	if (reason == MHI_CB_SYS_ERROR)
> > > +		qaic_dev_reset_clean_local_state(qdev, true);
> > > +}
> > > +
> > > +static int mhi_reset_and_async_power_up(struct mhi_controller *mhi_cntl)
> > > +{
> > > +	char time_sec = 1;
> > 
> > u8?
> 
> Eh.  Ok I guess.  I usually reserve the size specific types for things where
> that size is required, such as sending data over a network.
> 
> > 
> > > +	int current_ee;
> > > +	int ret;
> > > +
> > > +	/* Reset the device to bring the device in PBL EE */
> > > +	mhi_soc_reset(mhi_cntl);
> > > +
> > > +	/*
> > > +	 * Keep checking the execution environment(EE) after every 1 second
> > > +	 * interval.
> > > +	 */
> > > +	do {
> > > +		msleep(1000);
> > > +		current_ee = mhi_get_exec_env(mhi_cntl);
> > > +	} while (current_ee != MHI_EE_PBL && time_sec++ <= MAX_RESET_TIME_SEC);
> > > +
> > > +	/* If the device is in PBL EE retry power up */
> > > +	if (current_ee == MHI_EE_PBL)
> > > +		ret = mhi_async_power_up(mhi_cntl);
> > > +	else
> > > +		ret = -EIO;
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +struct mhi_controller *qaic_mhi_register_controller(struct pci_dev *pci_dev, void __iomem *mhi_bar,
> > > +						    int mhi_irq)
> > > +{
> > > +	struct mhi_controller *mhi_cntl;
> > 
> > Cosmetic change: We use "mhi_cntrl" in other controller drivers. So it is
> > better to follow the same pattern here also.
> 
> If you insist.  "cntl" is the more common abbreviation.  The MHI bus is the
> first place I recall seeing "cntrl".
> 

For some reason, all MHI controller drivers have picked up this notation. So
I'd like to keep it same.

> > 
> > > +	int ret;
> > > +
> > > +	mhi_cntl = devm_kzalloc(&pci_dev->dev, sizeof(*mhi_cntl), GFP_KERNEL);
> > > +	if (!mhi_cntl)
> > > +		return ERR_PTR(-ENOMEM);
> > > +
> > > +	mhi_cntl->cntrl_dev = &pci_dev->dev;
> > > +
> > > +	/*
> > > +	 * Covers the entire possible physical ram region. Remote side is
> > > +	 * going to calculate a size of this range, so subtract 1 to prevent
> > > +	 * rollover.
> > > +	 */
> > > +	mhi_cntl->iova_start = 0;
> > > +	mhi_cntl->iova_stop = PHYS_ADDR_MAX - 1;
> > > +	mhi_cntl->status_cb = mhi_status_cb;
> > > +	mhi_cntl->runtime_get = mhi_runtime_get;
> > > +	mhi_cntl->runtime_put = mhi_runtime_put;
> > > +	mhi_cntl->read_reg = mhi_read_reg;
> > > +	mhi_cntl->write_reg = mhi_write_reg;
> > > +	mhi_cntl->regs = mhi_bar;
> > > +	mhi_cntl->reg_len = SZ_4K;
> > 
> > Is this size fixed for all AIC100 revisions? I think you should get this value
> > from pci_resource_len() to avoid issues later.
> 
> Yes, this size is burned into the silicon with no provision for ever
> changing it.
> 

Fine then!

Thanks,
Mani

> > 
> > Thanks,
> > Mani
> > 
> > > +	mhi_cntl->nr_irqs = 1;
> > > +	mhi_cntl->irq = devm_kmalloc(&pci_dev->dev, sizeof(*mhi_cntl->irq), GFP_KERNEL);
> > > +
> > > +	if (!mhi_cntl->irq)
> > > +		return ERR_PTR(-ENOMEM);
> > > +
> > > +	mhi_cntl->irq[0] = mhi_irq;
> > > +	mhi_cntl->fw_image = "qcom/aic100/sbl.bin";
> > > +
> > > +	/* use latest configured timeout */
> > > +	aic100_config.timeout_ms = mhi_timeout_ms;
> > > +	ret = mhi_register_controller(mhi_cntl, &aic100_config);
> > > +	if (ret) {
> > > +		pci_err(pci_dev, "mhi_register_controller failed %d\n", ret);
> > > +		return ERR_PTR(ret);
> > > +	}
> > > +
> > > +	ret = mhi_prepare_for_power_up(mhi_cntl);
> > > +	if (ret) {
> > > +		pci_err(pci_dev, "mhi_prepare_for_power_up failed %d\n", ret);
> > > +		goto prepare_power_up_fail;
> > > +	}
> > > +
> > > +	ret = mhi_async_power_up(mhi_cntl);
> > > +	/*
> > > +	 * If EIO is returned it is possible that device is in SBL EE, which is
> > > +	 * undesired. SOC reset the device and try to power up again.
> > > +	 */
> > > +	if (ret == -EIO && MHI_EE_SBL == mhi_get_exec_env(mhi_cntl)) {
> > > +		pci_err(pci_dev, "Found device in SBL at MHI init. Attempting a reset.\n");
> > > +		ret = mhi_reset_and_async_power_up(mhi_cntl);
> > > +	}
> > > +
> > > +	if (ret) {
> > > +		pci_err(pci_dev, "mhi_async_power_up failed %d\n", ret);
> > > +		goto power_up_fail;
> > > +	}
> > > +
> > > +	return mhi_cntl;
> > > +
> > > +power_up_fail:
> > > +	mhi_unprepare_after_power_down(mhi_cntl);
> > > +prepare_power_up_fail:
> > > +	mhi_unregister_controller(mhi_cntl);
> > > +	return ERR_PTR(ret);
> > > +}
> > > +
> > > +void qaic_mhi_free_controller(struct mhi_controller *mhi_cntl, bool link_up)
> > > +{
> > > +	mhi_power_down(mhi_cntl, link_up);
> > > +	mhi_unprepare_after_power_down(mhi_cntl);
> > > +	mhi_unregister_controller(mhi_cntl);
> > > +}
> > > +
> > > +void qaic_mhi_start_reset(struct mhi_controller *mhi_cntl)
> > > +{
> > > +	mhi_power_down(mhi_cntl, true);
> > > +}
> > > +
> > > +void qaic_mhi_reset_done(struct mhi_controller *mhi_cntl)
> > > +{
> > > +	struct pci_dev *pci_dev = container_of(mhi_cntl->cntrl_dev, struct pci_dev, dev);
> > > +	int ret;
> > > +
> > > +	ret = mhi_async_power_up(mhi_cntl);
> > > +	if (ret)
> > > +		pci_err(pci_dev, "mhi_async_power_up failed after reset %d\n", ret);
> > > +}
> > > diff --git a/drivers/accel/qaic/mhi_controller.h b/drivers/accel/qaic/mhi_controller.h
> > > new file mode 100644
> > > index 0000000..c105e93
> > > --- /dev/null
> > > +++ b/drivers/accel/qaic/mhi_controller.h
> > > @@ -0,0 +1,16 @@
> > > +/* SPDX-License-Identifier: GPL-2.0-only
> > > + *
> > > + * Copyright (c) 2019-2020, The Linux Foundation. All rights reserved.
> > > + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
> > > + */
> > > +
> > > +#ifndef MHICONTROLLERQAIC_H_
> > > +#define MHICONTROLLERQAIC_H_
> > > +
> > > +struct mhi_controller *qaic_mhi_register_controller(struct pci_dev *pci_dev, void __iomem *mhi_bar,
> > > +						    int mhi_irq);
> > > +void qaic_mhi_free_controller(struct mhi_controller *mhi_cntl, bool link_up);
> > > +void qaic_mhi_start_reset(struct mhi_controller *mhi_cntl);
> > > +void qaic_mhi_reset_done(struct mhi_controller *mhi_cntl);
> > > +
> > > +#endif /* MHICONTROLLERQAIC_H_ */
> > > -- 
> > > 2.7.4
> > > 
> > 
> 

-- 
மணிவண்ணன் சதாசிவம்


More information about the dri-devel mailing list