Phyr Starter
Matthew Wilcox
willy at infradead.org
Wed Jan 12 18:37:03 UTC 2022
On Tue, Jan 11, 2022 at 06:53:06PM -0400, Jason Gunthorpe wrote:
> IOMMU is not common in those cases, it is slow.
>
> So you end up with 16 bytes per entry then another 24 bytes in the
> entirely redundant scatter list. That is now 40 bytes/page for typical
> HPC case, and I can't see that being OK.
Ah, I didn't realise what case you wanted to optimise for.
So, how about this ...
Since you want to get to the same destination as I do (a
16-byte-per-entry dma_addr+dma_len struct), but need to get there sooner
than "make all sg users stop using it wrongly", let's introduce a
(hopefully temporary) "struct dma_range".
But let's go further than that (which only brings us to 32 bytes per
range). For the systems you care about which use an identity mapping,
and have sizeof(dma_addr_t) == sizeof(phys_addr_t), we can simply
point the dma_range pointer to the same memory as the phyr. We just
have to not free it too early. That gets us down to 16 bytes per range,
a saving of 33% relative to the 24-byte scatterlist entry.
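
To make the arithmetic concrete, here is a rough userspace sketch of the
idea. The field names, the union, and the helpers dma_view() and
bytes_per_range() are all made up for illustration; nothing here is an
existing kernel definition, and it assumes 64-bit phys_addr_t and
dma_addr_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: both address types are 64 bits wide. */
typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;

/* Hypothetical layout of a physical range: 16 bytes. */
struct phyr {
	phys_addr_t addr;
	uint64_t len;
};

/* Hypothetical layout of the proposed dma_range: also 16 bytes. */
struct dma_range {
	dma_addr_t addr;
	uint64_t len;
};

/* Sharing the memory only works if the two layouts match exactly. */
static_assert(sizeof(dma_addr_t) == sizeof(phys_addr_t), "address widths differ");
static_assert(sizeof(struct phyr) == 16, "phyr is not 16 bytes");
static_assert(sizeof(struct dma_range) == sizeof(struct phyr), "layouts differ");
static_assert(offsetof(struct dma_range, len) == offsetof(struct phyr, len),
	      "len fields are at different offsets");

/* One slot viewed either as a physical range or as a DMA range. */
union range_entry {
	struct phyr phys;
	struct dma_range dma;
};

/*
 * With an identity mapping the DMA view is the same memory as the phyr.
 * Otherwise this sketch returns NULL, meaning the caller would have to
 * allocate and fill a separate dma_range table.
 */
static struct dma_range *dma_view(union range_entry *e, int identity_mapped)
{
	return identity_mapped ? &e->dma : NULL;
}

/* Bytes of metadata per range: 16 when shared, 32 when kept separately. */
static size_t bytes_per_range(int identity_mapped)
{
	if (identity_mapped)
		return sizeof(struct phyr);
	return sizeof(struct phyr) + sizeof(struct dma_range);
}
```

The "not free it too early" constraint shows up here as a lifetime rule:
in the shared case the pointer returned by dma_view() is only valid for as
long as the phyr storage it aliases.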