[PATCH v2] fb: fix fast-path detection
Daniel Kurtz
djkurtz at google.com
Thu Mar 27 06:37:45 PDT 2014
On Tue, Mar 25, 2014 at 11:28 PM, Keith Packard <keithp at keithp.com> wrote:
> Daniel Kurtz <djkurtz at chromium.org> writes:
>
>> Fix both of these by converting width from bits to FbBits, and let the
>> pointer arithmetic convert to byte addresses.
>
> Ok, I'm doing a bit more careful review this morning, and I think the
> width parameter isn't quite right.
>
> (Note that we can ignore bpp as fbBlt doesn't use it for computation at
> all; it's purely a bit-copy function.). Imagine copying at 8-bpp two
> pixels one place right:
>
> dstLine = srcLine
> srcX = 24
> dstX = 32
> width = 16
>
> width >> FB_SHIFT = 0
>
> srcLine < dstLine TRUE
> &srcLine[width >> FB_SHIFT] > dstLine FALSE
>
> dstLine < srcLine FALSE
> &dstLine[width >> FB_SHIFT] > srcLine TRUE
>
> You actually need to check whether there are any overlapping bits in the
> copy. There are two ways to do this -- either add in the X offset when
> computing the last address or by converting to byte addresses earlier
> and using those. I think the byte address version is a lot simpler:
>
> byteSrc = (uint8_t *) srcLine + (srcX >> 3);
> byteDst = (uint8_t *) dstLine + (dstX >> 3);
> byteWidth = width >> 3;
>
> disjoint = (byteSrc + byteWidth <= byteDst) || (byteDst + byteWidth <= byteSrc);
Hi Keith,
Thank you very much for your detailed review. Your patch looks great.
Just one small detail.
Are src/dst_byte_stride still "FbStride" if they count bytes not
FbBits? Perhaps they are size_t?
Either way, for your attached patch:
Reviewed-by: Daniel Kurtz <djkurtz at chromium.org>
>
> The extra check for (bpp & 7) is not necessary as the memcpy case
> already checks for byte alignment of source, dest and width.
>
> For our test case above
>
> dstLine = srcLine
> srcX = 24
> dstX = 32
> width = 16
>
> byteSrc = srcLine + 3
> byteDst = dstLine + 4
> byteWidth = 2
>
> disjoint = (srcLine + 3 + 2 <= dstLine + 4) || (dstLine + 4 + 2 <= srcLine + 3)
>
> = (srcLine + 5 <= srcLine + 4) || (srcLine + 6 <= srcLine + 3)
>
> = FALSE || FALSE
>
> = FALSE
>
> However, if we copy two pixels right:
>
> dstLine = srcLine
> srcX = 24
> dstX = 40
> width = 16
>
> byteSrc = srcLine + 3
> byteDst = dstLine + 5
> byteWidth = 2
>
> disjoint = (srcLine + 3 + 2 <= dstLine + 5) || (dstLine + 5 + 2 <= srcLine + 3)
>
> = (srcLine + 5 <= srcLine + 5) || (srcLine + 7 <= srcLine + 3)
>
> = TRUE || FALSE
>
> = TRUE
>
> Or, if we copy one pixel left:
>
> dstLine = srcLine
> srcX = 32
> dstX = 24
> width = 16
>
> byteSrc = srcLine + 4
> byteDst = dstLine + 3
> byteWidth = 2
>
> disjoint = (srcLine + 4 + 2 <= dstLine + 3) || (dstLine + 3 + 2 <= srcLine + 4)
>
> = (srcLine + 6 <= srcLine + 3) || (srcLine + 5 <= srcLine + 4)
>
> = FALSE || FALSE
>
> = FALSE
>
> However, if we copy two pixels left:
>
> dstLine = srcLine
> srcX = 40
> dstX = 24
> width = 16
>
> byteSrc = srcLine + 5
> byteDst = dstLine + 3
> byteWidth = 2
>
> disjoint = (srcLine + 5 + 2 <= dstLine + 3) || (dstLine + 3 + 2 <= srcLine + 5)
>
> = (srcLine + 7 <= srcLine + 3) || (srcLine + 5 <= srcLine + 5)
>
> = FALSE || TRUE
>
> = TRUE
>
> Of course, we can only do this in bytes if the copy is byte aligned. The
> current code already checks that with
>
> !(srcX & 7) && !(dstX & 7) && !(width & 7)
>
> So, our final code can look like this:
>
>
>
> --
> keith.packard at intel.com
>
--
Daniel Kurtz | Software Engineer | djkurtz at google.com | 650.204.0722
More information about the xorg-devel
mailing list