[RFC PATCH v3 09/12] net: add support for skbs with unreadable frags

David Ahern dsahern at kernel.org
Mon Nov 6 19:34:52 UTC 2023


On 11/6/23 11:47 AM, Stanislav Fomichev wrote:
> On 11/05, Mina Almasry wrote:
>> For device memory TCP, we expect the skb headers to be available in host
>> memory for access, and we expect the skb frags to be in device memory
>> and inaccessible to the host. We expect there to be no mixing and
>> matching of device memory frags (inaccessible) with host memory frags
>> (accessible) in the same skb.
>>
>> Add a skb->devmem flag which indicates whether the frags in this skb
>> are device memory frags or not.
>>
>> __skb_fill_page_desc() now checks frags added to skbs for page_pool_iovs,
>> and marks the skb as skb->devmem accordingly.
>>
>> Add checks through the network stack to avoid accessing the frags of
>> devmem skbs and avoid coalescing devmem skbs with non-devmem skbs.
>>
>> Signed-off-by: Willem de Bruijn <willemb at google.com>
>> Signed-off-by: Kaiyuan Zhang <kaiyuanz at google.com>
>> Signed-off-by: Mina Almasry <almasrymina at google.com>
>>
>> ---
>>  include/linux/skbuff.h | 14 +++++++-
>>  include/net/tcp.h      |  5 +--
>>  net/core/datagram.c    |  6 ++++
>>  net/core/gro.c         |  5 ++-
>>  net/core/skbuff.c      | 77 ++++++++++++++++++++++++++++++++++++------
>>  net/ipv4/tcp.c         |  6 ++++
>>  net/ipv4/tcp_input.c   | 13 +++++--
>>  net/ipv4/tcp_output.c  |  5 ++-
>>  net/packet/af_packet.c |  4 +--
>>  9 files changed, 115 insertions(+), 20 deletions(-)
>>
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 1fae276c1353..8fb468ff8115 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -805,6 +805,8 @@ typedef unsigned char *sk_buff_data_t;
>>   *	@csum_level: indicates the number of consecutive checksums found in
>>   *		the packet minus one that have been verified as
>>   *		CHECKSUM_UNNECESSARY (max 3)
>> + *	@devmem: indicates that all the fragments in this skb are backed by
>> + *		device memory.
>>   *	@dst_pending_confirm: need to confirm neighbour
>>   *	@decrypted: Decrypted SKB
>>   *	@slow_gro: state present at GRO time, slower prepare step required
>> @@ -991,7 +993,7 @@ struct sk_buff {
>>  #if IS_ENABLED(CONFIG_IP_SCTP)
>>  	__u8			csum_not_inet:1;
>>  #endif
>> -
>> +	__u8			devmem:1;
>>  #if defined(CONFIG_NET_SCHED) || defined(CONFIG_NET_XGRESS)
>>  	__u16			tc_index;	/* traffic control index */
>>  #endif
>> @@ -1766,6 +1768,12 @@ static inline void skb_zcopy_downgrade_managed(struct sk_buff *skb)
>>  		__skb_zcopy_downgrade_managed(skb);
>>  }
>>  
>> +/* Return true if frags in this skb are not readable by the host. */
>> +static inline bool skb_frags_not_readable(const struct sk_buff *skb)
>> +{
>> +	return skb->devmem;
> 
> bikeshedding: should we also rename 'devmem' sk_buff flag to 'not_readable'?
> It better communicates the fact that the stack shouldn't dereference the
> frags (because it has 'devmem' fragments or for some other potential
> future reason).

+1.

Also, the flag on the skb is an optimization - a high-level signal that
one or more frags are in unreadable memory. There is no requirement that
all of the frags be in the same memory type.
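
To make the point above concrete, here is a rough userspace sketch of the
"flag as a summary bit" idea. It is not kernel code and not what the patch
does: struct model_skb, struct model_frag, MODEL_MAX_FRAGS and all of the
model_* helpers are hypothetical names made up for illustration, and the
'unreadable' name simply follows the rename suggested above. The skb-level
bit is the OR of the per-frag state, so a mixed skb is allowed and the fast
path only needs to test one bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model, not the kernel's struct sk_buff. */
#define MODEL_MAX_FRAGS 17

struct model_frag {
	bool unreadable;	/* frag lives in memory the host must not touch */
};

struct model_skb {
	struct model_frag frags[MODEL_MAX_FRAGS];
	size_t nr_frags;
	bool unreadable;	/* summary: at least one frag is unreadable */
};

/* Add a frag and fold its state into the skb-level summary bit. */
static inline void model_skb_add_frag(struct model_skb *skb,
				      bool frag_unreadable)
{
	assert(skb->nr_frags < MODEL_MAX_FRAGS);
	skb->frags[skb->nr_frags++].unreadable = frag_unreadable;
	skb->unreadable |= frag_unreadable;
}

/* Fast path: one bit tells the stack whether every frag may be read. */
static inline bool model_skb_frags_readable(const struct model_skb *skb)
{
	return !skb->unreadable;
}

/* Slow path: a per-frag check is still possible for mixed skbs. */
static inline bool model_frag_readable(const struct model_skb *skb, size_t i)
{
	assert(i < skb->nr_frags);
	return !skb->frags[i].unreadable;
}
```

With this shape, code that wants to touch frag data checks
model_skb_frags_readable() first and only walks the individual frags if it
has a reason to handle the mixed case.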

