[PATCH 1/3] r600: add span support for 2D tiling
Frieder Ferlemann
frieder.ferlemann at web.de
Thu May 27 16:01:24 PDT 2010
Hi,
On 28.05.2010 00:04, Conn Clark wrote:
> On Thu, May 27, 2010 at 8:51 AM, Brian Paul <brianp at vmware.com> wrote:
>
> This code could be written with a faster algorithm requiring just 13 operations
>
> + pixel_number |= ((x >> 0) & 1) << 0; // pn[0] = x[0]
> + pixel_number |= ((y >> 0) & 1) << 1; // pn[1] = y[0]
> + pixel_number |= ((x >> 1) & 1) << 2; // pn[2] = x[1]
> + pixel_number |= ((y >> 1) & 1) << 3; // pn[3] = y[1]
> + pixel_number |= ((x >> 2) & 1) << 4; // pn[4] = x[2]
> + pixel_number |= ((y >> 2) & 1) << 5; // pn[5] = y[2]
>
> /* suitable for all 16 bit or greater processors that can do an
> unsigned 16 bit or greater multiply */
> /* tested and verified */
>
> pixel_number = ((((x & 0x07) * 0x1111 & 0x8421) * 0x1249 >> 9) & 0x55 ) |
>                ((((y & 0x07) * 0x1111 & 0x8421) * 0x1249 >> 8) & 0xAA );
>
> Note if it is known that x and y are less than or equal to 7 it can be
> done in 11 operations.
Cool. How does it compare to:
const unsigned char /*int*/ spread_bits[8] = {
0x00, /* 0b000 to 0b00000 */
0x01, /* 0b001 to 0b00001 */
0x04, /* 0b010 to 0b00100 */
0x05, /* 0b011 to 0b00101 */
0x10, /* 0b100 to 0b10000 */
0x11, /* 0b101 to 0b10001 */
0x14, /* 0b110 to 0b10100 */
0x15, /* 0b111 to 0b10101 */
};
pixel_number |= spread_bits[x & 0x07];
pixel_number |= spread_bits[y & 0x07] << 1;
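
To compare the variants on correctness before timing them, here is a minimal sketch that wraps the bit-by-bit version from the patch, the multiply version, and the table lookup in functions and checks that they agree. The function names (interleave_bits, interleave_mul, interleave_lut, count_mismatches) are made up for illustration and are not part of the r600 code. It assumes unsigned int is at least 16 bits wide, and it says nothing about which variant is faster on a given CPU.

```c
/* Reference: bit-by-bit interleave, as in the quoted patch.
 * pn[0]=x[0], pn[1]=y[0], pn[2]=x[1], pn[3]=y[1], pn[4]=x[2], pn[5]=y[2] */
static unsigned interleave_bits(unsigned x, unsigned y)
{
    unsigned pn = 0;
    pn |= ((x >> 0) & 1) << 0;
    pn |= ((y >> 0) & 1) << 1;
    pn |= ((x >> 1) & 1) << 2;
    pn |= ((y >> 1) & 1) << 3;
    pn |= ((x >> 2) & 1) << 4;
    pn |= ((y >> 2) & 1) << 5;
    return pn;
}

/* Multiply variant from the quoted mail.
 * (v & 7) * 0x1111 places a copy of the 3 bits in each nibble;
 * & 0x8421 keeps bit 0 at position 0, bit 1 at position 5 and
 * bit 2 at position 10; * 0x1249 replicates that pattern at
 * offsets 0, 3, 6, 9 and 12 without carries, so the shift and
 * the final mask pick the bits out two positions apart. */
static unsigned interleave_mul(unsigned x, unsigned y)
{
    return ((((x & 0x07u) * 0x1111u & 0x8421u) * 0x1249u >> 9) & 0x55u) |
           ((((y & 0x07u) * 0x1111u & 0x8421u) * 0x1249u >> 8) & 0xAAu);
}

/* Table variant: each entry is the 3-bit index with a zero bit
 * inserted between neighbouring bits. */
static const unsigned char spread_bits[8] = {
    0x00, 0x01, 0x04, 0x05, 0x10, 0x11, 0x14, 0x15
};

static unsigned interleave_lut(unsigned x, unsigned y)
{
    return spread_bits[x & 0x07] | ((unsigned)spread_bits[y & 0x07] << 1);
}

/* Counts the (x, y) pairs on which the variants disagree.
 * Inputs above 7 are included to exercise the masking. */
static int count_mismatches(void)
{
    int bad = 0;
    unsigned x, y;

    for (y = 0; y < 64; y++) {
        for (x = 0; x < 64; x++) {
            unsigned ref = interleave_bits(x, y);
            if (interleave_mul(x, y) != ref || interleave_lut(x, y) != ref)
                bad++;
        }
    }
    return bad;
}
```

Calling count_mismatches() from a small main() should return 0 if the three variants agree. As a spot check, x = 5 and y = 3 give 27 (0b011011) in each of them.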
Greetings,
Frieder