[PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier

Rob Herring robh at kernel.org
Tue Sep 7 15:33:24 UTC 2021


On Fri, Sep 3, 2021 at 10:36 AM Thierry Reding <thierry.reding at gmail.com> wrote:
>
> On Fri, Sep 03, 2021 at 09:36:33AM -0500, Rob Herring wrote:
> > On Fri, Sep 3, 2021 at 8:52 AM Thierry Reding <thierry.reding at gmail.com> wrote:
> > >
> > > On Fri, Sep 03, 2021 at 08:20:55AM -0500, Rob Herring wrote:
> > > > On Wed, Sep 1, 2021 at 9:13 AM Thierry Reding <thierry.reding at gmail.com> wrote:
> > > > >
> > > > > On Fri, Jul 02, 2021 at 05:16:25PM +0300, Dmitry Osipenko wrote:
> > > > > > 01.07.2021 21:14, Thierry Reding пишет:
> > > > > > > On Tue, Jun 08, 2021 at 06:51:40PM +0200, Thierry Reding wrote:
> > > > > > >> On Fri, May 28, 2021 at 06:54:55PM +0200, Thierry Reding wrote:
> > > > > > >>> On Thu, May 20, 2021 at 05:03:06PM -0500, Rob Herring wrote:
> > > > > > >>>> On Fri, Apr 23, 2021 at 06:32:30PM +0200, Thierry Reding wrote:
> > > > > > >>>>> From: Thierry Reding <treding at nvidia.com>
> > > > > > >>>>>
> > > > > > >>>>> Reserved memory region phandle references can be accompanied by a
> > > > > > >>>>> specifier that provides additional information about how that specific
> > > > > > >>>>> reference should be treated.
> > > > > > >>>>>
> > > > > > >>>>> One use-case is to mark a memory region as needing an identity mapping
> > > > > > >>>>> in the system's IOMMU for the device that references the region. This is
> > > > > > >>>>> needed for example when the bootloader has set up hardware (such as a
> > > > > > >>>>> display controller) to actively access a memory region (e.g. a boot
> > > > > > >>>>> splash screen framebuffer) during boot. The operating system can use the
> > > > > > >>>>> identity mapping flag from the specifier to make sure an IOMMU identity
> > > > > > >>>>> mapping is set up for the framebuffer before IOMMU translations are
> > > > > > >>>>> enabled for the display controller.
> > > > > > >>>>>
> > > > > > >>>>> Signed-off-by: Thierry Reding <treding at nvidia.com>
> > > > > > >>>>> ---
> > > > > > >>>>>  .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
> > > > > > >>>>>  include/dt-bindings/reserved-memory.h         |  8 +++++++
> > > > > > >>>>>  2 files changed, 29 insertions(+)
> > > > > > >>>>>  create mode 100644 include/dt-bindings/reserved-memory.h
> > > > > > >>>>
> > > > > > >>>> Sorry for being slow on this. I have 2 concerns.
> > > > > > >>>>
> > > > > > >>>> First, this creates an ABI issue. A DT with cells in 'memory-region'
> > > > > > >>>> will not be understood by an existing OS. I'm less concerned about this
> > > > > > >>>> if we address that with a stable fix. (Though I'm pretty sure we've
> > > > > > >>>> naively added #?-cells in the past ignoring this issue.)
> > > > > > >>>
> > > > > > >>> A while ago I had proposed adding memory-region*s* as an alternative
> > > > > > >>> name for memory-region to make the naming more consistent with other
> > > > > > >>> types of properties (think clocks, resets, gpios, ...). If we added
> > > > > > >>> that, we could easily differentiate between the "legacy" cases where
> > > > > > >>> no #memory-region-cells was allowed and the new cases where it was.
> > > > > > >>>
> > > > > > >>>> Second, it could be the bootloader setting up the reserved region. If a
> > > > > > >>>> node already has 'memory-region', then adding more regions is more
> > > > > > >>>> complicated compared to adding new properties. And defining what each
> > > > > > >>>> memory-region entry is or how many in schemas is impossible.
> > > > > > >>>
> > > > > > >>> It's true that updating the property gets a bit complicated, but it's
> > > > > > >>> not exactly rocket science. We really just need to splice the array. I
> > > > > > >>> have a working implemention for this in U-Boot.
> > > > > > >>>
> > > > > > >>> For what it's worth, we could run into the same issue with any new
> > > > > > >>> property that we add. Even if we renamed this to iommu-memory-region,
> > > > > > >>> it's still possible that a bootloader may have to update this property
> > > > > > >>> if it already exists (it could be hard-coded in DT, or it could have
> > > > > > >>> been added by some earlier bootloader or firmware).
> > > > > > >>>
> > > > > > >>>> Both could be addressed with a new property. Perhaps something like
> > > > > > >>>> 'iommu-memory-region = <&phandle>;'. I think the 'iommu' prefix is
> > > > > > >>>> appropriate given this is entirely because of the IOMMU being in the
> > > > > > >>>> mix. I might feel differently if we had other uses for cells, but I
> > > > > > >>>> don't really see it in this case.
> > > > > > >>>
> > > > > > >>> I'm afraid that down the road we'll end up with other cases and then we
> > > > > > >>> might proliferate a number of *-memory-region properties with varying
> > > > > > >>> prefixes.
> > > > > > >>>
> > > > > > >>> I am aware of one other case where we might need something like this: on
> > > > > > >>> some Tegra SoCs we have audio processors that will access memory buffers
> > > > > > >>> using a DMA engine. These processors are booted from early firmware
> > > > > > >>> using firmware from system memory. In order to avoid trashing the
> > > > > > >>> firmware, we need to reserve memory. We can do this using reserved
> > > > > > >>> memory nodes. However, the audio DMA engine also uses the SMMU, so we
> > > > > > >>> need to make sure that the firmware memory is marked as reserved within
> > > > > > >>> the SMMU. This is similar to the identity mapping case, but not exactly
> > > > > > >>> the same. Instead of creating a 1:1 mapping, we just want that IOVA
> > > > > > >>> region to be reserved (i.e. IOMMU_RESV_RESERVED instead of
> > > > > > >>> IOMMU_RESV_DIRECT{,_RELAXABLE}).
> > > > > > >>>
> > > > > > >>> That would also fall into the IOMMU domain, but we can't reuse the
> > > > > > >>> iommu-memory-region property for that because then we don't have enough
> > > > > > >>> information to decide which type of reservation we need.
> > > > > > >>>
> > > > > > >>> We could obviously make iommu-memory-region take a specifier, but we
> > > > > > >>> could just as well use memory-regions in that case since we have
> > > > > > >>> something more generic anyway.
> > > > > > >>>
> > > > > > >>> With the #memory-region-cells proposal, we can easily extend the cell in
> > > > > > >>> the specifier with an additional MEMORY_REGION_IOMMU_RESERVE flag to
> > > > > > >>> take that other use case into account. If we than also change to the new
> > > > > > >>> memory-regions property name, we avoid the ABI issue (and we gain a bit
> > > > > > >>> of consistency while at it).
> > > > > > >>
> > > > > > >> Ping? Rob, do you want me to add this second use-case to the patch
> > > > > > >> series to make it more obvious that this isn't just a one-off thing? Or
> > > > > > >> how do we proceed?
> > > > > > >
> > > > > > > Rob, given that additional use-case, do you want me to run with this
> > > > > > > proposal and send out an updated series?
> > > > > >
> > > > > >
> > > > > > What about variant with a "descriptor" properties that will describe
> > > > > > each region:
> > > > > >
> > > > > > fb_desc: display-framebuffer-memory-descriptor {
> > > > > >       needs-identity-mapping;
> > > > > > }
> > > > > >
> > > > > > display at 52400000 {
> > > > > >       memory-region = <&fb ...>;
> > > > > >       memory-region-descriptor = <&fb_desc ...>;
> > > > > > };
> > > > > >
> > > > > > It could be a more flexible/extendible variant.
> > > > >
> > > > > This problem recently came up on #dri-devel again. Adding Alyssa and
> > > > > Sven who are facing a similar challenge on their work on Apple M1 (if I
> > > > > understood correctly). Also adding dri-devel for visibility since this
> > > > > is a very common problem for display in particular.
> > > > >
> > > > > On M1 the situation is slightly more complicated: the firmware will
> > > > > allocate a couple of buffers (including the framebuffer) in high memory
> > > > > (> 4 GiB) and use the IOMMU to map that into an IOVA region below 4 GiB
> > > > > so that the display hardware can access it. This makes it impossible to
> > > > > bypass the IOMMU like we do on other chips (in particular to work around
> > > > > the fault-by-default policy of the ARM SMMU driver). It also means that
> > > > > in addition to the simple reserved regions I mentioned we need for audio
> > > > > use-cases and identity mapping use-cases we need for display on Tegra,
> > > > > we now also need to be able to convey physical to IOVA mappings.
> > > > >
> > > > > Fitting the latter into the original proposal sounds difficult. A quick
> > > > > fix would've been to generate a mapping table in memory and pass that to
> > > > > the kernel using a reserved-memory node (similar to what's done for
> > > > > example on Tegra for the EMC frequency table on Tegra210) and mark it as
> > > > > such using a special flag. But that then involves two layers of parsing,
> > > > > which seems a bit suboptimal. Another way to shoehorn that into the
> > > > > original proposal would've been to add flags for physical and virtual
> > > > > address regions and use pairs to pass them using special flags. Again,
> > > > > this is a bit wonky because it needs these to be carefully parsed and
> > > > > matched up.
> > > > >
> > > > > Another downside is that we now have a situation where some of these
> > > > > regions are no longer "reserved-memory regions" in the traditional
> > > > > sense. This would require an additional flag in the reserved-memory
> > > > > region nodes to prevent the IOVA regions from being reserved. By the
> > > > > way, this is something that would also be needed for the audio use-case
> > > > > I mentioned before, because the physical memory at that address can
> > > > > still be used by an operating system.
> > > > >
> > > > > A more general solution would be to draw a bit from Dmitry's proposal
> > > > > and introduce a new top-level "iov-reserved-memory" node. This could be
> > > > > modelled on the existing reserved-memory node, except that the physical
> > > > > memory pages for regions represented by child nodes would not be marked
> > > > > as reserved. Only the IOVA range described by the region would be
> > > > > reserved subsequently by the IOMMU framework and/or IOMMU driver.
> > > > >
> > > > > The simplest case where we just want to reserve some IOVA region could
> > > > > then be done like this:
> > > > >
> > > > >         iov-reserved-memory {
> > > > >                 /*
> > > > >                  * Probably safest to default to <2>, <2> here given
> > > > >                  * that most IOMMUs support either > 32 bits of IAS
> > > > >                  * or OAS.
> > > > >                  */
> > > > >                 #address-cells = <2>;
> > > > >                 #size-cells = <2>;
> > > > >
> > > > >                 firmware: firmware at 80000000 {
> > > > >                         reg = <0 0x80000000 0 0x01000000>;
> > > > >                 };
> > > > >         };
> > > > >
> > > > >         audio at 30000000 {
> > > > >                 ...
> > > > >                 iov-memory-regions = <&firmware>;
> > > > >                 ...
> > > > >         };
> > > > >
> > > > > Mappings could be represented by an IOV reserved region taking a
> > > > > reference to the reserved-region that they map:
> > > > >
> > > > >         reserved-memory {
> > > > >                 #address-cells = <2>;
> > > > >                 #size-cells = <2>;
> > > > >
> > > > >                 /* 16 MiB of framebuffer at top-of-memory */
> > > > >                 framebuffer: framebuffer at 1,ff000000 {
> > > > >                         reg = <0x1 0xff000000 0 0x01000000>;
> > > > >                         no-map;
> > > > >                 };
> > > > >         };
> > > > >
> > > > >         iov-reserved-memory {
> > > > >                 /* IOMMU supports only 32-bit output address space */
> > > > >                 #address-cells = <1>;
> > > > >                 #size-cells = <1>;
> > > > >
> > > > >                 /* 16 MiB of framebuffer mapped to top of IOVA */
> > > > >                 fb: fb at ff000000 {
> > > > >                         reg = <0 0xff000000 0 0x01000000>;
> > > > >                         memory-region = <&framebuffer>;
> > > > >                 };
> > > > >         };
> > > > >
> > > > >         display at 40000000 {
> > > > >                 ...
> > > > >                 /* optional? */
> > > > >                 memory-region = <&framebuffer>;
> > > > >                 iov-memory-regions = <&fb>;
> > > > >                 ...
> > > > >         };
> > > > >
> > > > > It's interesting how identity mapped regions now become a trivial
> > > > > special case of mappings. All that is needed is to make the reg property
> > > > > of the IOV reserved region correspond to the reg property of the normal
> > > > > reserved region. Alternatively, as a small optimization for lazy people
> > > > > like me, we could just allow these cases to omit the reg property and
> > > > > instead inherit it from the referenced reserved region.
> > > > >
> > > > > As the second example shows it might be convenient if memory-region
> > > > > could be derived from iov-memory-regions. This could be useful for cases
> > > > > where the driver wants to do something with the physical pages of the
> > > > > reserved region (such as mapping them and copying out the framebuffer
> > > > > data to another buffer so that the reserved memory can be recycled). If
> > > > > we have the IOV reserved region, we could provide an API to extract the
> > > > > physical reserved region (if it exists). That way we could avoid
> > > > > referencing it twice in DT. Then again, there's something elegant about
> > > > > the explicit second reference to. It indicates the intent that we may
> > > > > want to use the region for something other than just the IOV mapping.
> > > > >
> > > > > Anyway, this has been long enough. Let me know what you think. Alyssa,
> > > > > Sven, it'd be interesting to hear if you think this could work as a
> > > > > solution to the problem on M1.
> > > > >
> > > > > Rob, I think you might like this alternative because it basically gets
> > > > > rid of all the points in the original proposal that you were concerned
> > > > > about. Let me know what you think.
> > > >
> > > > Couldn't we keep this all in /reserved-memory? Just add an iova
> > > > version of reg. Perhaps abuse 'assigned-address' for this purpose. The
> > > > issue I see would be handling reserved iova areas without a physical
> > > > area. That can be handled with just a iova and no reg. We already have
> > > > a no reg case.
> > >
> > > I had thought about that initially. One thing I'm worried about is that
> > > every child node in /reserved-memory will effectively cause the memory
> > > that it described to be reserved. But we don't want that for regions
> > > that are "virtual only" (i.e. IOMMU reservations).
> >
> > By virtual only, you mean no physical mapping, just a region of
> > virtual space, right? For that we'd have no 'reg' and therefore no
> > (physical) reservation by the OS. It's similar to non-static regions.
> > You need a specific handler for them. We'd probably want a compatible
> > as well for these virtual reservations.
>
> Yeah, these would be purely used for reserving regions in the IOVA so
> that they won't be used by the IOVA allocator. Typically these would be
> used for cases where those addresses have some special meaning.
>
> Do we want something like:
>
>         compatible = "iommu-reserved";
>
> for these? Or would that need to be:
>
>         compatible = "linux,iommu-reserved";
>
> ? There seems to be a mix of vendor-prefix vs. non-vendor-prefix
> compatible strings in the reserved-memory DT bindings directory.

I would not use 'linux,' here.

>
> On the other hand, do we actually need the compatible string? Because we
> don't really want to associate much extra information with this like we
> do for example with "shared-dma-pool". The logic to handle this would
> all be within the IOMMU framework. All we really need is for the
> standard reservation code to skip nodes that don't have a reg property
> so we don't reserve memory for "virtual-only" allocations.

It doesn't hurt to have one and I can imagine we might want to iterate
over all the nodes. It's slightly easier and more common to iterate
over compatible nodes rather than nodes with some property.

> > Are these being global in DT going to be a problem? Presumably we have
> > a virtual space per IOMMU. We'd know which IOMMU based on a device's
> > 'iommus' and 'memory-region' properties, but within /reserved-memory
> > we wouldn't be able to distinguish overlapping addresses from separate
> > address spaces. Or we could have 2 different IOVAs for 1 physical
> > space. That could be solved with something like this:
> >
> > iommu-addresses = <&iommu1 <address cells> <size cells>>;
>
> The only case that would be problematic would be if we have overlapping
> physical regions, because that will probably trip up the standard code.
>
> But this could also be worked around by looking at iommu-addresses. For
> example, if we had something like this:
>
>         reserved-memory {
>                 fb_dc0: fb at 80000000 {
>                         reg = <0x80000000 0x01000000>;
>                         iommu-addresses = <0xa0000000 0x01000000>;
>                 };
>
>                 fb_dc1: fb at 80000000 {

You can't have 2 nodes with the same name (actually, you can, they
just get merged together). Different names with the same unit-address
is a dtc warning. I'd really like to make that a full blown
overlapping region check.

>                         reg = <0x80000000 0x01000000>;
>                         iommu-addresses = <0xb0000000 0x01000000>;
>                 };
>         };
>
> We could make the code identify that this is for the same physical
> reservation (maybe make it so that reg needs to match exactly for this
> to be recognized) but with different virtual allocations.
>
> On a side-note: do we really need to repeat the size? I'd think if we
> want mappings then we'd likely want them for the whole reservation.

Humm, I suppose not, but dropping it paints us into a corner if we
come up with wanting a different size later. You could have a carveout
for double/triple buffering your framebuffer, but the bootloader
framebuffer is only single buffered. So would you want actual size?

> I'd like to keep references to IOMMUs out of this because they would be
> duplicated. We will only use these nodes if they are referenced by a
> device node that also has an iommus property. Also, the IOMMU reference
> itself isn't enough. We'd also need to support the complete specifier
> because you can have things like SIDs in there to specify the exact
> address space that a device uses.
>
> Also, for some of these they may be reused independently of the IOMMU
> address space. For example the Tegra framebuffer identity mapping can
> be used by either of the 2-4 display controllers, each with (at least
> potentially) their own address space. But we don't want to have to
> describe the identity mapping separately for each display controller.

Okay, but I'd rather have to duplicate things in your case than not be
able to express some other case.

> Another thing to consider is that these nodes will often be added by
> firmware (e.g. firmware will allocate the framebuffer and set up the
> corresponding reserved memory region in DT). Wiring up references like
> this would get very complicated very quickly.

Yes.

The using 'iommus' property option below can be optional and doesn't
have to be defined/supported now. Just trying to think ahead and not
be stuck with something that can't be extended.

> > Or the other way to do this is reuse 'iommus' property to define the
> > mapping of each address entry to iommu.
> >
> > > Obviously we can fix that in Linux, but what about other operating
> > > systems? Currently "reg" is a required property for statically allocated
> > > regions (which all of these would be). Do you have an idea of how widely
> > > that's used? What about other OSes, or bootloaders, what if they
> > > encounter these nodes that don't have a "reg" property?
> >
> > Without 'reg', there must be a compatible that the client understands
> > or the node should be ignored.
> >
> > My suspicion is that /reserved-memory is abused for all sorts of
> > things downstream, but that's not really relevant here.
>
> Yeah, my only concern was that we might break users of this that are not
> sophisticated enough to handle the nuances that we'd introduce here. If
> we can assume that nodes without a reg property will be ignored, then I
> think that's good enough.

I'm pretty sure we should be okay, but check the code.

Rob


More information about the dri-devel mailing list