[PATCH v2 RESEND 7/7] swiotlb: per-device flag if there are dynamically allocated buffers
Petr Tesařík
petr at tesarici.cz
Mon May 15 10:00:54 UTC 2023
On Mon, 15 May 2023 10:48:47 +0200
Petr Tesařík <petr at tesarici.cz> wrote:
> Hi Catalin,
>
> On Sun, 14 May 2023 19:54:27 +0100
> Catalin Marinas <catalin.marinas at arm.com> wrote:
>[...]
> > Now, thinking about the list_head access and the flag ordering, since it
> > doesn't matter, you might as well not bother with the flag at all and
> > rely on list_add() and list_empty() ordering vs the hypothetical 'blah'
> > access. Both of these use READ/WRITE_ONCE() for setting
> > dma_io_tlb_dyn_slots.next. You only need an smp_wmb() after the
> > list_add() and an smp_rmb() before a list_empty() check in
                      ^^^^^^^^^
Got it, finally. Well, that's exactly something I don't want to do.
For example, on arm64 (seeing your email address), smp_rmb() translates
to a "dmb ishld" instruction. I would expect that this is more expensive
than a "ldar", generated by smp_load_acquire().
I mean, for devices that do not need swiotlb, the flag never changes
from zero, so reading it must be as cheap as possible.
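
To illustrate the fast path I have in mind, here is a rough userspace
sketch in C11 atomics, not the actual kernel code. All names
(dev_sketch, slot, add_slot, is_dyn_buffer) are made up for this
example, a single writer is assumed (the spinlock in the real code), and
the acquire load merely stands in for smp_load_acquire():

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one dynamically allocated bounce buffer. */
struct slot {
	struct slot *next;
	const void *addr;
};

/* Hypothetical stand-in for the per-device state discussed here. */
struct dev_sketch {
	atomic_bool has_dyn;		/* the per-device flag */
	_Atomic(struct slot *) head;	/* simplified list of dynamic slots */
};

static void dev_sketch_init(struct dev_sketch *d)
{
	atomic_init(&d->has_dyn, false);
	atomic_init(&d->head, NULL);
}

/* Writer side; assumed to be serialized by a lock that is not shown. */
static void add_slot(struct dev_sketch *d, struct slot *s)
{
	s->next = atomic_load_explicit(&d->head, memory_order_relaxed);
	atomic_store_explicit(&d->head, s, memory_order_release);
	/* Release: the list update is visible before the flag reads true. */
	atomic_store_explicit(&d->has_dyn, true, memory_order_release);
}

/* Reader side; the common case is a single acquire load of the flag. */
static bool is_dyn_buffer(struct dev_sketch *d, const void *addr)
{
	struct slot *s;

	if (!atomic_load_explicit(&d->has_dyn, memory_order_acquire))
		return false;	/* device never used a dynamic buffer */

	for (s = atomic_load_explicit(&d->head, memory_order_acquire);
	     s; s = s->next)
		if (s->addr == addr)
			return true;
	return false;
}
```

A device that never needs swiotlb only ever executes the first load in
is_dyn_buffer() and never touches the list at all.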
Petr T
> > is_swiotlb_buffer(), no dma_iotlb_have_dyn variable.
>
> Wait, let me check that I understand you right. Do you suggest that I
> convert dma_io_tlb_dyn_slots to a lockless list and get rid of the
> spinlock?
>
> I'm sure it can be done for list_add() and list_del(). I'll have
> to think about list_move().
Hm, even the documentation of llist_empty() says that it is "not
guaranteed to be accurate or up to date". If it could be, I'm quite
sure the authors would have gladly implemented it as such.
Petr T