[PATCH 1/3] r600: add span support for 2D tiling

Thu May 27 16:01:24 PDT 2010

Hi,

On 28.05.2010 00:04, Conn Clark wrote:
> On Thu, May 27, 2010 at 8:51 AM, Brian Paul <brianp at vmware.com> wrote:
> 
> This code could be written with a faster algorithm requiring just 13 operations:
> 
> +               pixel_number |= ((x >> 0) & 1) << 0; // pn[0] = x[0]
> +               pixel_number |= ((y >> 0) & 1) << 1; // pn[1] = y[0]
> +               pixel_number |= ((x >> 1) & 1) << 2; // pn[2] = x[1]
> +               pixel_number |= ((y >> 1) & 1) << 3; // pn[3] = y[1]
> +               pixel_number |= ((x >> 2) & 1) << 4; // pn[4] = x[2]
> +               pixel_number |= ((y >> 2) & 1) << 5; // pn[5] = y[2]
> 

> /* suitable for all 16-bit or wider processors that can do an
> unsigned 16-bit or wider multiply */
> /*  tested and verified  */
> 
> pixel_number = ((((x & 0x07) * 0x1111 & 0x8421) * 0x1249 >> 9) & 0x55) |
>                ((((y & 0x07) * 0x1111 & 0x8421) * 0x1249 >> 8) & 0xAA);
> 
> Note that if x and y are known to be less than or equal to 7, the two
> "& 0x07" masks can be dropped and it can be done in 11 operations.

Cool. How does it compare to a small lookup table:

        const unsigned char /*int*/ spread_bits[8] = {
                0x00,  /* 0b000 to 0b00000 */
                0x01,  /* 0b001 to 0b00001 */
                0x04,  /* 0b010 to 0b00100 */
                0x05,  /* 0b011 to 0b00101 */
                0x10,  /* 0b100 to 0b10000 */
                0x11,  /* 0b101 to 0b10001 */
                0x14,  /* 0b110 to 0b10100 */
                0x15,  /* 0b111 to 0b10101 */
        };

        pixel_number |= spread_bits[x & 0x07];
        pixel_number |= spread_bits[y & 0x07] << 1;
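
To see at least that the three variants agree, something like the following could be used. It is only a rough sketch, not part of the patch: the function names (interleave_bitwise, interleave_multiply, interleave_table, count_mismatches, check_variants) are made up for illustration, and it assumes pixel_number starts out as 0. It checks equivalence only and says nothing about which variant is faster on a given CPU.

```c
#include <assert.h>

/* Bit-by-bit version, as in the patch: pn[0]=x[0], pn[1]=y[0], ... */
unsigned interleave_bitwise(unsigned x, unsigned y)
{
    unsigned pn = 0;
    pn |= ((x >> 0) & 1u) << 0;
    pn |= ((y >> 0) & 1u) << 1;
    pn |= ((x >> 1) & 1u) << 2;
    pn |= ((y >> 1) & 1u) << 3;
    pn |= ((x >> 2) & 1u) << 4;
    pn |= ((y >> 2) & 1u) << 5;
    return pn;
}

/* Multiply-and-mask version from the quoted mail.
 * (v & 7) * 0x1111 makes four copies of the 3 bits, & 0x8421 keeps
 * v[0] at bit 0, v[1] at bit 5 and v[2] at bit 10; the multiply by
 * 0x1249 then replicates them so that the shift and mask pick the
 * bits out at every second position. */
unsigned interleave_multiply(unsigned x, unsigned y)
{
    return ((((x & 0x07u) * 0x1111u & 0x8421u) * 0x1249u >> 9) & 0x55u) |
           ((((y & 0x07u) * 0x1111u & 0x8421u) * 0x1249u >> 8) & 0xAAu);
}

/* Lookup-table version: spread the 3 low bits to bits 0, 2 and 4. */
unsigned interleave_table(unsigned x, unsigned y)
{
    static const unsigned char spread_bits[8] = {
        0x00, 0x01, 0x04, 0x05, 0x10, 0x11, 0x14, 0x15
    };
    return spread_bits[x & 0x07u] | ((unsigned)spread_bits[y & 0x07u] << 1);
}

/* Number of (x, y) pairs on which the variants disagree.
 * The range goes beyond 7 on purpose to exercise the masking. */
unsigned count_mismatches(void)
{
    unsigned x, y, bad = 0;
    for (y = 0; y < 64; y++) {
        for (x = 0; x < 64; x++) {
            unsigned ref = interleave_bitwise(x, y);
            if (interleave_multiply(x, y) != ref ||
                interleave_table(x, y) != ref)
                bad++;
        }
    }
    return bad;
}

/* Aborts if any variant disagrees with the bit-by-bit reference. */
void check_variants(void)
{
    assert(count_mismatches() == 0);
}
```

Calling check_variants() from a small main() is enough to run the comparison. For timing one would still have to benchmark inside the actual span loop, since the table costs a memory access and the multiply version depends on how fast the target does integer multiplies.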

Greetings,
Frieder