[PATCH v2 RESEND 7/7] swiotlb: per-device flag if there are dynamically allocated buffers

Petr Tesařík petr at tesarici.cz
Mon May 15 10:00:54 UTC 2023


On Mon, 15 May 2023 10:48:47 +0200
Petr Tesařík <petr at tesarici.cz> wrote:

> Hi Catalin,
> 
> On Sun, 14 May 2023 19:54:27 +0100
> Catalin Marinas <catalin.marinas at arm.com> wrote:
>[...]
> > Now, thinking about the list_head access and the flag ordering, since it
> > doesn't matter, you might as well not bother with the flag at all and
> > rely on list_add() and list_empty() ordering vs the hypothetical 'blah'
> > access. Both of these use READ/WRITE_ONCE() for setting
> > dma_io_tlb_dyn_slots.next. You only need an smp_wmb() after the
> > list_add() and an smp_rmb() before a list_empty() check in
                      ^^^^^^^^^
Got it, finally. Well, that's exactly something I don't want to do.
For example, on arm64 (seeing your email address), smp_rmb() translates
to a "dmb ishld" instruction (the "dsb ld" form is what the mandatory
rmb() uses). I would expect that this is more expensive than the "ldar"
generated by smp_load_acquire().

I mean, for devices that do not need swiotlb, the flag never changes
from zero, so reading it must be as cheap as possible.
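
To make the comparison concrete, here is a minimal user-space sketch
using C11 atomics as a rough stand-in for the kernel primitives. It is
not the swiotlb code: struct dev_state, has_dyn, slot and the helper
names are all hypothetical, an acquire load only approximates
smp_load_acquire(), and an acquire fence only approximates smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; not the real struct device layout. */
struct dev_state {
	atomic_bool has_dyn;	/* set once a dynamic buffer exists */
	_Atomic(int *) slot;	/* stand-in for the list of dynamic slots */
};

/* Writer: publish the slot first, then set the flag with release order. */
static void publish_slot(struct dev_state *d, int *slot)
{
	assert(slot != NULL);
	atomic_store_explicit(&d->slot, slot, memory_order_relaxed);
	atomic_store_explicit(&d->has_dyn, true, memory_order_release);
}

/* Variant A: one acquire load of the flag, nothing else on the fast path. */
static int *lookup_acquire(struct dev_state *d)
{
	if (!atomic_load_explicit(&d->has_dyn, memory_order_acquire))
		return NULL;	/* device never used a dynamic buffer */
	return atomic_load_explicit(&d->slot, memory_order_relaxed);
}

/* Variant B: no flag; a standalone read-side fence before the check. */
static int *lookup_fence(struct dev_state *d)
{
	/* The fence is executed on every call, even if slot stays NULL. */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&d->slot, memory_order_relaxed);
}
```

Both variants return the same result here; the point is only that
variant B executes a separate barrier on every lookup, while variant A
folds the ordering into the load of the flag.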

Petr T

> > is_swiotlb_buffer(), no dma_iotlb_have_dyn variable.  
> 
> Wait, let me check that I understand you right. Do you suggest that I
> convert dma_io_tlb_dyn_slots to a lockless list and get rid of the
> spinlock?
> 
> I'm sure it can be done for list_add() and list_del(). I'll have
> to think about list_move().

Hm, even the documentation of llist_empty() says that it is "not
guaranteed to be accurate or up to date". If it could be, I'm quite
sure the authors would have gladly implemented it as such.
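
As an illustration of why such a check can only be a snapshot, here is
a minimal user-space sketch of a push-only lock-free stack in C11. It is
not the kernel's llist implementation; struct lstack, lstack_push() and
lstack_empty() are hypothetical names used only for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
	struct node *next;
};

struct lstack {
	_Atomic(struct node *) first;
};

/* Push with a compare-and-swap loop; no lock is taken. */
static void lstack_push(struct lstack *s, struct node *n)
{
	struct node *old = atomic_load_explicit(&s->first,
						memory_order_relaxed);
	do {
		n->next = old;	/* old is refreshed when the CAS fails */
	} while (!atomic_compare_exchange_weak_explicit(&s->first, &old, n,
							memory_order_release,
							memory_order_relaxed));
}

/*
 * Emptiness check: a single load of the head pointer. Without a lock,
 * another thread may push right after this load, so the answer can be
 * stale by the time the caller acts on it.
 */
static bool lstack_empty(struct lstack *s)
{
	return atomic_load_explicit(&s->first, memory_order_relaxed) == NULL;
}
```

In a single thread the check is of course exact; the inaccuracy only
shows up when a concurrent push races with the load.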

Petr T

