[PATCH v2 RESEND 7/7] swiotlb: per-device flag if there are dynamically allocated buffers

Petr Tesařík petr at tesarici.cz
Mon May 15 10:47:22 UTC 2023


(restoring the Cc list that I accidentally removed in my previous reply.)

On Mon, 15 May 2023 10:39:11 +0100
Catalin Marinas <catalin.marinas at arm.com> wrote:

> Hi Petr,
> 
> On Mon, May 15, 2023 at 10:47:37AM +0200, Petr Tesařík wrote:
> > On Sun, 14 May 2023 19:54:27 +0100
> > Catalin Marinas <catalin.marinas at arm.com> wrote:  
> > > On Tue, May 09, 2023 at 11:18:19AM +0200, Petr Tesarik wrote:  
> > > > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> > > > index daa2064f2ede..8cbb0bebb0bc 100644
> > > > --- a/include/linux/swiotlb.h
> > > > +++ b/include/linux/swiotlb.h
> > > > @@ -152,7 +152,11 @@ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
> > > >  
> > > >  	return mem &&
> > > >  		(is_swiotlb_fixed(mem, paddr) ||
> > > > -		 (mem->allow_dyn && is_swiotlb_dyn(dev, paddr)));
> > > > +		 /* Pairs with smp_store_release() in swiotlb_dyn_map()
> > > > +		  * and swiotlb_dyn_unmap().
> > > > +		  */
> > > > +		 (smp_load_acquire(&dev->dma_io_tlb_have_dyn) &&
> > > > +		  is_swiotlb_dyn(dev, paddr)));
> > > >  }
> > > >  
> > > >  static inline bool is_swiotlb_force_bounce(struct device *dev)
> > > > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> > > > index 81eab1c72c50..e8be3ee50f18 100644
> > > > --- a/kernel/dma/swiotlb.c
> > > > +++ b/kernel/dma/swiotlb.c
> > > > @@ -642,6 +642,9 @@ static phys_addr_t swiotlb_dyn_map(struct device *dev, phys_addr_t orig_addr,
> > > >  
> > > >  	spin_lock_irqsave(&dev->dma_io_tlb_dyn_lock, flags);
> > > >  	list_add(&slot->node, &dev->dma_io_tlb_dyn_slots);
> > > > +	if (!dev->dma_io_tlb_have_dyn)
> > > > +		/* Pairs with smp_load_acquire() in is_swiotlb_buffer() */
> > > > +		smp_store_release(&dev->dma_io_tlb_have_dyn, true);
> > > >  	spin_unlock_irqrestore(&dev->dma_io_tlb_dyn_lock, flags);    
> > > 
> > > I'm not sure this works. What this seems to do is that if the caller of
> > > is_swiotlb_buffer() sees the flag set, it's guaranteed that something
> > > was added to the dma_io_tlb_dyn_slots list. But the reverse is not
> > > necessarily true. IOW, if something was added to the list, there is a
> > > brief window where the dma_io_tlb_have_dyn flag is still false. In the
> > > general case, I doubt any ordering between list_add() and the flag
> > > setting changes anything since neither of them may be visible to another
> > > CPU.  
> > 
> > Thank you for the review! This patch probably needs a bit more
> > explanation.
> > 
> > The goal is to avoid taking a spinlock in the most common case, i.e.
> > when the dynamic list is empty. The only required invariant is:
> > 
> >   When the flag is clear, it is safe to skip the list.
> > 
> > It's not a bug to walk an empty list, it's merely less efficient, so
> > such a race window would be acceptable. OTOH, if I understand you
> > correctly, that's not your concern.  
> 
> I got what the patch aims to do. What I meant is that your proposed
> invariant (flag == false => list_empty()) doesn't hold. The list becomes
> non-empty with list_add() but the flag is set after. For this to work,
> you'd need to set the flag prior to list_add().
> 
> However, even if you fix this invariant, I don't think it helps because
> the ordering isn't between a list update and the flag but rather a list
> update and the actual paddr visibility you want to look up in the list.
> 
> > > What you need is for a 'paddr' added to the dynamic list to be correctly
> > > identified by another CPU as dynamic swiotlb. That other CPU doesn't
> > > check random addresses but only those returned by the DMA API.  
> > 
> > Yes, that's correct.
> >   
> > > Such
> > > values would be normally passed through a memory location (e.g. driver
> > > local structures) and that's what you want to order against.  
> > 
> > This would certainly work, but I'm not sure I need it. My only goal is
> > that when the flag is set, the new value is observed by all CPUs on the
> > next call to is_swiotlb_buffer().  
> 
> Which value? paddr? That's not guaranteed with your current ordering.

I was thinking the flag value, but...

> Let's say P0 does:
> 
> 	list_add(paddr);
> 	smp_store_release(&flag, true);
> 	WRITE_ONCE(blah, paddr);	// using *_ONCE for clarity
> 
> On P1:
> 
> 	paddr = READ_ONCE(blah);
> 	if (smp_load_acquire(&flag)) {
> 		// check the list etc.
> 	}

... I see now: the problem is that the smp_store_release() on P0 may
happen after the smp_load_acquire() on P1, while paddr is already
visible to P1 through some other location (not through the list).

Thank you for your patience. I was too focused on the linked list.

>[...]
> > Wait, let me check that I understand you right. Do you suggest that I
> > convert dma_io_tlb_dyn_slots to a lockless list and get rid of the
> > spinlock?  
> 
> I don't mind the spinlock. It feels safer to keep it ;) (and you need
> it for list updates anyway). Using RCU gets too complicated TBH; I'd
> rather keep it simple, it's not on an extremely critical path.
> 
> What I believe is that you can get rid of the flag, and just rely on
> probing list_empty() locklessly. If that's false, you take the lock and
> search the list again. Basically dev->dma_io_tlb_dyn_slots.next is your
> flag, no need for an additional one (just the explicit smp_*mb()
> barriers as I mentioned above).

Since we don't control how drivers store the physical address (so we
cannot give those stores release semantics), the check cannot be
performed without a read memory barrier (a "dmb" on arm64). Okay...
still better than the spinlock.

Petr T


More information about the dri-devel mailing list