[PATCH RFC 31/35] crypto: remove nth_page() usage within SG entry
Linus Torvalds
torvalds at linux-foundation.org
Thu Aug 21 20:40:13 UTC 2025
On Thu, Aug 21, 2025 at 4:29 PM David Hildenbrand <david at redhat.com> wrote:
> > Because doing a 64-bit shift on x86-32 is like three cycles. Doing a
> > 64-bit signed division by a simple constant is something like ten
> > strange instructions even if the end result is only 32-bit.
>
> I would have thought that the compiler is smart enough to optimize that?
> PAGE_SIZE is a constant.
Oh, the compiler optimizes things. But dividing a 64-bit signed value
by a constant is still quite complicated.
It doesn't generate a 'div' instruction, but it generates something like this:
	movl	%ebx, %edx
	sarl	$31, %edx
	movl	%edx, %eax
	xorl	%edx, %edx
	andl	$4095, %eax
	addl	%ecx, %eax
	adcl	%ebx, %edx
and that's certainly a lot faster than an actual 64-bit divide would be.
An unsigned divide - or a shift - results in just
	shrdl	$12, %ecx, %eax
which is still not the fastest instruction (I think shrdl gets split
into two uops), but it's certainly simpler and easier to read.
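For illustration, here is a minimal C sketch of why the signed case needs
the extra instructions. It is not taken from the kernel: the function names
are made up, PAGE_SHIFT is assumed to be 12 (4 KiB pages, matching the
$4095 and $12 constants above), and the manual version assumes that >> on a
negative signed value is an arithmetic shift, which is what gcc and clang
do but is implementation-defined in ISO C.

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <stdint.h>

/* Assumption: 4 KiB pages, as implied by the constants in the asm above. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1LL << PAGE_SHIFT)

/* Signed divide: C requires rounding toward zero, so the compiler cannot
 * emit a bare shift. It has to bias negative inputs first. */
static inline int64_t pages_signed_div(int64_t bytes)
{
	return bytes / PAGE_SIZE;
}

/* Roughly what the compiler does for the signed divide (hypothetical
 * hand-written equivalent): add PAGE_SIZE - 1 when the value is negative,
 * then shift. Assumes an arithmetic right shift for negative values. */
static inline int64_t pages_signed_manual(int64_t bytes)
{
	int64_t bias = bytes < 0 ? PAGE_SIZE - 1 : 0;

	return (bytes + bias) >> PAGE_SHIFT;
}

/* Unsigned divide by a power of two: no bias needed, this is just a shift. */
static inline uint64_t pages_unsigned_div(uint64_t bytes)
{
	return bytes / (uint64_t)PAGE_SIZE;
}

/* The explicit shift, equivalent to the unsigned divide. */
static inline uint64_t pages_shift(uint64_t bytes)
{
	return bytes >> PAGE_SHIFT;
}
```

The rounding difference shows up for negative inputs: pages_signed_div(-1)
is 0, whereas a plain arithmetic shift of -1 would give -1, so the bias step
is required for correctness. For unsigned values, pages_unsigned_div() and
pages_shift() always agree.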
Linus