[Liboil] yet another copy_u8
Adam D. Moss
adam at gimp.org
Wed Nov 16 11:23:59 PST 2005
David Schleef wrote:
> On Wed, Nov 16, 2005 at 01:09:44PM +0000, Adam D. Moss wrote:
>>I know I shouldn't be wasting time on the little C implementations,
>>but I was wondering if Duff's Device still held any merit - seems
>>it does (up to 30% improvement for non-tiny n here over
>>copy_u8_llints, which is itself up to around 30% faster than
>>copy_u8_ints - hope I didn't add an egregious bug).
> This technically isn't Duff's device, as the while block is outside
> of the switch block. But using uint64_t is indeed pretty fast
> on my CPU.
Yeah, didn't really want to press the point - the core of Duff's
Device as I understand it is this general loop unrolling via the
fallthrough, which I enjoyed in Z80 assembly before finding this way to
do it in C. If I ever understood the C semantic advantage of putting the
while inside the switch then I don't any more (willing to listen to an
explanation!).
Surprised to find that copy_u8_ints is the liboil copy_u8 of choice
on a friend's PIII, so I guess llints/llint_duff would be better
yet and thus have some real-world x86 merit, which I wasn't expecting.
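
For anyone following along, here is a rough sketch of the two shapes being
discussed. Neither function is the code from liboil or from my patch; the
names copy_u8_duff_classic and copy_u8_switch_then_loop are made up for
illustration, and both copy byte-by-byte rather than in uint64_t chunks.
The first puts the do/while inside the switch (the classic form), the
second handles the remainder in a fallthrough switch and then runs a
separate unrolled while loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name, not liboil code. Classic Duff's Device: the
 * do/while lives inside the switch, so the switch jumps into the
 * middle of the unrolled body to copy the n % 8 leftover bytes on
 * the first pass, then every later pass copies a full 8 bytes. */
static void copy_u8_duff_classic(uint8_t *dest, const uint8_t *src, size_t n)
{
    size_t passes;

    if (n == 0)
        return;                 /* the do/while would otherwise run once */

    passes = (n + 7) / 8;       /* number of trips through the loop body */
    switch (n % 8) {
    case 0: do { *dest++ = *src++;  /* fall through */
    case 7:      *dest++ = *src++;  /* fall through */
    case 6:      *dest++ = *src++;  /* fall through */
    case 5:      *dest++ = *src++;  /* fall through */
    case 4:      *dest++ = *src++;  /* fall through */
    case 3:      *dest++ = *src++;  /* fall through */
    case 2:      *dest++ = *src++;  /* fall through */
    case 1:      *dest++ = *src++;
            } while (--passes > 0);
    }
}

/* Hypothetical name, not liboil code. The variant with the while
 * outside the switch: the switch copies the n % 8 leftover bytes via
 * fallthrough, then a separate loop copies the rest 8 bytes at a time. */
static void copy_u8_switch_then_loop(uint8_t *dest, const uint8_t *src, size_t n)
{
    switch (n % 8) {
    case 7: *dest++ = *src++;  /* fall through */
    case 6: *dest++ = *src++;  /* fall through */
    case 5: *dest++ = *src++;  /* fall through */
    case 4: *dest++ = *src++;  /* fall through */
    case 3: *dest++ = *src++;  /* fall through */
    case 2: *dest++ = *src++;  /* fall through */
    case 1: *dest++ = *src++;  /* fall through */
    case 0: break;
    }

    n /= 8;                     /* full 8-byte blocks still to copy */
    while (n-- > 0) {
        *dest++ = *src++; *dest++ = *src++;
        *dest++ = *src++; *dest++ = *src++;
        *dest++ = *src++; *dest++ = *src++;
        *dest++ = *src++; *dest++ = *src++;
    }
}
```

The one difference I can see in the source is that the classic form reuses
a single copy of the unrolled statements for both the remainder and the
full passes, while the second form writes them out twice. Whether that
matters for speed on a given CPU is something only a benchmark will tell.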