[RFC PATCH v2 00/11] Device Memory TCP

Willem de Bruijn willemdebruijn.kernel at gmail.com
Tue Aug 15 14:41:35 UTC 2023


On Tue, Aug 15, 2023 at 9:38 AM David Laight <David.Laight at aculab.com> wrote:
>
> From: Mina Almasry
> > Sent: 10 August 2023 02:58
> ...
> > * TL;DR:
> >
> > Device memory TCP (devmem TCP) is a proposal for transferring data to and/or
> > from device memory efficiently, without bouncing the data to a host memory
> > buffer.
>
> Doesn't that really require peer-to-peer PCIe transfers?
> IIRC these aren't supported by many root hubs and have
> fundamental flow control and/or TLP credit problems.
>
> I'd guess they are also pretty incompatible with IOMMU?

Yes, this is a form of PCI_P2PDMA and all the limitations of that apply.

> I can see how you might manage to transmit frames from
> some external memory (eg after encryption) but surely
> processing receive data that way needs the packets
> be filtered by both IP addresses and port numbers before
> being redirected to the (presumably limited) external
> memory.

This feature depends on NIC receive header split. The TCP/IP headers
are stored to host memory, the payload to device memory.

Optionally, on devices that do not support explicit header-split but
do support scatter-gather I/O, a fixed split offset can be used as a
weak substitute if the header size is constant and known. This has
additional caveats with respect to unexpected traffic whose payload
must be host visible (e.g., ICMP).
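
To make the fixed-offset fallback and its caveat concrete, here is a
rough userspace sketch. It is an illustration only, not how any driver
or NIC implements this: the 66-byte split point (Ethernet + IPv4
without options + TCP with 12 bytes of options), the helper names, and
the byte-slicing "buffers" are all assumptions made for the example.

```python
import struct

ETH_HLEN = 14    # Ethernet header, no VLAN tag (assumed)
IPV4_HLEN = 20   # IPv4 header, no options (assumed)
TCP_HLEN = 32    # TCP header + 12 bytes of options, assumed constant
FIXED_SPLIT = ETH_HLEN + IPV4_HLEN + TCP_HLEN  # hypothetical split offset

IPPROTO_ICMP = 1
IPPROTO_TCP = 6


def build_frame(proto: int, l4: bytes) -> bytes:
    """Build a minimal Ethernet + IPv4 frame around the given L4 bytes."""
    eth = b"\x00" * 12 + struct.pack("!H", 0x0800)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, IPV4_HLEN + len(l4),
                     0, 0, 64, proto, 0, bytes(4), bytes(4))
    return eth + ip + l4


def fixed_offset_split(frame: bytes, offset: int = FIXED_SPLIT):
    """Emulate a two-entry scatter-gather list: the first `offset` bytes
    land in the host buffer, everything after it in the device buffer."""
    return frame[:offset], frame[offset:]


def payload_hidden_from_host(frame: bytes) -> bool:
    """True if a non-TCP frame has bytes past the split point, i.e. data
    the host stack needs would end up in device memory."""
    _, device_part = fixed_offset_split(frame)
    proto = frame[ETH_HLEN + 9]  # IPv4 protocol field
    return proto != IPPROTO_TCP and len(device_part) > 0


tcp = build_frame(IPPROTO_TCP, bytes(TCP_HLEN) + b"payload")
icmp = build_frame(IPPROTO_ICMP, bytes(8) + b"x" * 64)

host_part, device_part = fixed_offset_split(tcp)
print(len(host_part), device_part)      # headers to host, payload to device
print(payload_hidden_from_host(icmp))   # ICMP data spills past the split
```

For the expected TCP flow the split falls exactly at the header/payload
boundary. For the ICMP frame the same offset cuts through the message,
which is the "unexpected traffic" caveat above.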

> OTOH isn't the kernel going to need to run code before
> the packet is actually sent and just after it is received?
> So all you might gain is a bit of latency?
> And a bit less utilisation of host memory??
> But if your system is really limited by cpu-memory bandwidth
> you need more cache :-)
>
>
> So how much benefit is there over efficient use of host
> memory bounce buffers??

Among other things, on a PCIe tree this makes it possible to load up
machines with many NICs + GPUs.

