[PATCH v7 03/13] PCI: Reimplement plural devres functions

Ilpo Järvinen ilpo.jarvinen at linux.intel.com
Thu Jun 13 16:54:38 UTC 2024


On Wed, 5 Jun 2024, Philipp Stanner wrote:

> When the original PCI devres API was implemented, priority was given to
> the creation of a set of "plural functions" such as
> pcim_request_regions(). These functions have bit masks as parameters to
> specify which BARs shall get mapped. Most users, however, only use those
> to map 1-3 BARs.
> 
> A complete set of "singular functions" does not exist.
> 
> As functions mapping / requesting multiple BARs at once have (almost) no
> mechanism in C to return the resources to the caller of the plural
> function, the PCI devres API utilizes the iomap-table administrated by the
> function pcim_iomap_table().
> 
> The entire PCI devres API was strongly tied to that table
> which only allows for mapping whole, complete BARs, as the BAR's index
> is used as table index. Consequently, it's not possible to, e.g., have a
> pcim_iomap_range() function with that mechanism.
> 
> An additional problem is hat the PCI devres API has been ipmlemented in
> a sort of "hybrid-mode": Some unmanaged functions have managed
> counterparts (e.g.: pci_iomap() <-> pcim_iomap()), making their managed
> nature obvious to the programmer. However, the region-request functions
> in pci.c, prefixed with pci_, behave either managed or unmanaged,
> depending on whether pci_enable_device() or pcim_enable_device() has
> been called in advance.
> 
> This hybrid API is confusing and should be more cleanly separated by
> providing always-managed functions prefixed with pcim_.
> 
> Thus, the existing PCI devres API is not desirable because:
>   a) The vast majority of the users of the plural functions only ever
>      sets a single bit in the bit mask, consequently making them singular
>      functions anyways.
>   b) There is no mechanism to request / iomap only part of a BAR.
>   c) The iomap-table mechanism is over-engineered and complicated. Even
>      worse, some users index over the table administration function
>      directly:
>      void __iomem *mapping = pcim_iomap_table(pdev)[my_index];
>      This can not perform bounds checks; an invalid index won't cause
>      return of -EINVAL or even NULL, resulting in undefined behavior.
>   d) region-request functions being sometimes managed and sometimes not
>      is bug-provoking.
> 
> Implement a set of internal helper functions that don't have the problem
> of a hybrid nature that their counter parts in pci.c have. Write those
> helpers in a generic manner so that they can easily be extended to,
> e.g., ranged mappings and requests.
> 
> Implement a set of singular functions that use devres as it's intended
> and use those singular functions to reimplement the plural functions.
> 
> Signed-off-by: Philipp Stanner <pstanner at redhat.com>

>  /*
> - * PCI iomap devres
> + * On the state of PCI's devres implementation:
> + *
> + * The older devres API for PCI has two significant problems:
> + *
> + * 1. It is very strongly tied to the statically allocated mapping table in
> + *    struct pcim_iomap_devres below. This is mostly solved in the sense of the
> + *    pcim_ functions in this file providing things like ranged mapping by
> + *    bypassing this table, wheras the functions that were present in the old
> + *    API still enter the mapping addresses into the table for users of the old
> + *    API.
> + * 2. The region-request-functions in pci.c do become managed IF the device has
> + *    been enabled with pcim_enable_device() instead of pci_enable_device().
> + *    This resulted in the API becoming inconsistent: Some functions have an
> + *    obviously managed counter-part (e.g., pci_iomap() <-> pcim_iomap()),
> + *    whereas some don't and are never managed, while others don't and are
> + *    _sometimes_ managed (e.g. pci_request_region()).
> + *    Consequently, in the new API, region requests performed by the pcim_
> + *    functions are automatically cleaned up through the devres callback
> + *    pcim_addr_resource_release(), while requests performed by
> + *    pcim_enable_device() + pci_*region*() are automatically cleaned up
> + *    through the for-loop in pcim_release().
> + *
> + * TODO 1:
> + * Remove the legacy table entirely once all calls to pcim_iomap_table() in
> + * the kernel have been removed.
> + *
> + * TODO 2:
> + * Port everyone calling pcim_enable_device() + pci_*region*() to using the
> + * pcim_ functions. Then, remove all devres functionality from pci_*region*()
> + * functions and remove the associated cleanups described above in point #2.
>   */
> -#define PCIM_IOMAP_MAX	PCI_STD_NUM_BARS
>  
> +/*
> + * Legacy struct storing addresses to whole mapped BARs.
> + */
>  struct pcim_iomap_devres {
> -	void __iomem *table[PCIM_IOMAP_MAX];
> +	void __iomem *table[PCI_STD_NUM_BARS];
> +};

> +/**
> + * __pcim_request_region_range - Request a ranged region
> + * @pdev: PCI device the region belongs to
> + * @bar: The BAR the region is within
> + * @offset: offset from the BAR's start address
> + * @maxlen: length in bytes, beginning at @offset
> + * @name: name associated with the request
> + * @req_flags: flags for the request. For example for kernel-exclusive requests.
> + *
> + * Returns: 0 on success, a negative error code on failure.
> + *
> + * Request a ranged region within a device's PCI BAR. This function performs
> + * sanity checks on the input.
> + */
> +static int __pcim_request_region_range(struct pci_dev *pdev, int bar,
> +		unsigned long offset, unsigned long maxlen,
> +		const char *name, int req_flags)
> +{
> +	resource_size_t start = pci_resource_start(pdev, bar);
> +	resource_size_t len = pci_resource_len(pdev, bar);
> +	unsigned long dev_flags = pci_resource_flags(pdev, bar);
> +
> +	if (start == 0 || len == 0) /* That's an unused BAR. */
> +		return 0;
> +	if (len <= offset)
> +		return  -EINVAL;

Extra space.

> +
> +	start += offset;
> +	len -= offset;
> +
> +	if (len > maxlen && maxlen != 0)
> +		len = maxlen;
> +
> +	if (dev_flags & IORESOURCE_IO) {
> +		if (!request_region(start, len, name))
> +			return -EBUSY;
> +	} else if (dev_flags & IORESOURCE_MEM) {
> +		if (!__request_mem_region(start, len, name, req_flags))
> +			return -EBUSY;
> +	} else {
> +		/* That's not a device we can request anything on. */
> +		return -ENODEV;
> +	}
> +
> +	return 0;
> +}
> +
> +static void __pcim_release_region_range(struct pci_dev *pdev, int bar,
> +		unsigned long offset, unsigned long maxlen)
> +{
> +	resource_size_t start = pci_resource_start(pdev, bar);
> +	resource_size_t len = pci_resource_len(pdev, bar);
> +	unsigned long flags = pci_resource_flags(pdev, bar);
> +
> +	if (len <= offset || start == 0)
> +		return;
> +
> +	if (len == 0 || maxlen == 0) /* This an unused BAR. Do nothing. */
> +		return;
> +
> +	start += offset;
> +	len -= offset;
> +
> +	if (len > maxlen)
> +		len = maxlen;
> +
> +	if (flags & IORESOURCE_IO)
> +		release_region(start, len);
> +	else if (flags & IORESOURCE_MEM)
> +		release_mem_region(start, len);
> +}

> @@ -473,19 +945,14 @@ EXPORT_SYMBOL(pcim_iomap_regions_request_all);
>   */
>  void pcim_iounmap_regions(struct pci_dev *pdev, int mask)
>  {
> -	void __iomem * const *iomap;
> -	int i;
> -
> -	iomap = pcim_iomap_table(pdev);
> -	if (!iomap)
> -		return;
> +	short bar;

The current best practice is to use unsigned for loop vars that will never 
be negative.

I don't entirely follow what is reasoning behind making it short instead 
of unsigned int?

> -	for (i = 0; i < PCIM_IOMAP_MAX; i++) {
> -		if (!mask_contains_bar(mask, i))
> +	for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {

Is this change minimal if it contains variable renames like this?
Was "i" not "bar" even if it was given as a parameter to 
mask_contains_bar()?

-- 
 i.

> +		if (!mask_contains_bar(mask, bar))
>  			continue;
>  
> -		pcim_iounmap(pdev, iomap[i]);
> -		pci_release_region(pdev, i);
> +		pcim_iounmap_region(pdev, bar);
> +		pcim_remove_bar_from_legacy_table(pdev, bar);
>  	}
>  }
>  EXPORT_SYMBOL(pcim_iounmap_regions);
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e5f243dd4288..b5d21d8207d6 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3880,6 +3880,16 @@ void pci_release_region(struct pci_dev *pdev, int bar)
>  		release_mem_region(pci_resource_start(pdev, bar),
>  				pci_resource_len(pdev, bar));
>  
> +	/*
> +	 * This devres utility makes this function sometimes managed
> +	 * (when pcim_enable_device() has been called before).
> +	 * This is bad because it conflicts with the pcim_ functions being
> +	 * exclusively responsible for managed pci. Its "sometimes yes, sometimes
> +	 * no" nature can cause bugs.
> +	 *
> +	 * TODO: Remove this once all users that use pcim_enable_device() PLUS
> +	 * a region request function have been ported to using pcim_ functions.
> +	 */
>  	dr = find_pci_dr(pdev);
>  	if (dr)
>  		dr->region_mask &= ~(1 << bar);




More information about the dri-devel mailing list