[Mesa-dev] [PATCH] intel: Improve teximage perf for Google Chrome paint rects

Chad Versace chad.versace at linux.intel.com
Thu Sep 20 12:54:32 PDT 2012


On 09/14/2012 07:01 PM, Paul Berry wrote:
> On 11 September 2012 12:04, Chad Versace <chad.versace at linux.intel.com>wrote:

>> +   if (!image->mt ||
>> +       image->mt->region->tiling != I915_TILING_X) {
>> +      /* The algorithm below is written only for X-tiled memory.
>> +       *
>> +       * Memcpy'ing to Y-tiled memory is likely faster when done the
>> +       * conventional way: that is, map the memory through a fence in
>> order to
>> +       * acquire a linear view of the memory, then memcpy one row at a
>> time.
>> +       * Due to the arrangement of Y-tiled memory, if we were to upload
>> +       * a Y-tiled buffer using the algorithm below, then we could only
>> copy
>> +       * one byte at a time.
>>
>
> Nit pick: actually Y tiling only rearranges octwords, so you could copy 16
> bytes at a time for y tiling.  Still might not be worth it but one of these
> days we should probably investigate.

In v3, I will remove the incorrect comments about Y-tiling, and thus reduce the
comment to just the first line: "The algorithm below is written only for X-tiled
memory."

>> +   /* In the tiling algorithm below, some variables are in units of
>> pixels,
>> +    * others are in units of bytes, and others (such as height) are
>> unitless.
>> +    * Each variable name is either prefixed or suffixed with its units.
>> +    */
>>
>
> Thank you for doing this--it makes the code a lot easier to follow.  It
> looks like everything uses suffixing except pixel_max_x and pixel_max_y.
> Could we just change those to max_x_pixels and max_y_pixels?  Also, for
> consistency it would be nice to rename tile_height to tile_height_pixels.

pixel_x, pixel_y, byte_offset_y, and byte_offset also used prefixing, just as
pixel_max_[xy]_pixels do. In v3, I will consistently use prefixing for all.

tile_height is unitless. For example,
  right: tile_size_bytes = tile_width_bytes * tile_height
  wrong: tile_size_bytes2 = tile_width_bytes * tile_height_bytes

>> +   uint8_t *map = bo->virtual
>> +               + image_offset_y * stride_bytes
>> +               + image_offset_x * cpp;
>>
>
> I don't think this will work.  The way the miplevel image offsets work is
> that the parts of the mipmap other than level/layer zero are offset within
> the existing X tiling pattern.  What you're doing is offsetting the origin
> of the map assuming no tiling, and then applying the X tiling pattern to
> the offset image.  In the best case you'll just wind up putting data at the
> wrong location.  In the worst case there's a danger of going outside the
> memory allocated for the texture.

Ouch! Thanks for catching that. In v3, I will simply restrict this function to
level 0 of BGRA 2D textures, thus avoiding image offsets altogether. Such  a
restriction won't diminish the benefit that this patch gives to Chrome OS
performance, and that's the main reason I wrote this patch. Sure, it would be
nice if this fastpath supported more texture image types, and we can add that
additional support in the future, but right now I just want to get concrete
improvements for Chrome OS commited to master.

>> +         intptr_t byte_offset = byte_offset_y + byte_offset_x;
>> +         if (intel->has_swizzling) {
>> +            bool bit6 = (byte_offset >> 6) & 1;
>> +            bool bit9 = (byte_offset >> 9) & 1;
>> +            bool bit10 = (byte_offset >> 10) & 1;
>> +
>> +            byte_offset += 64 * (bit9 ^ bit10) * (1 - 2 * bit6);
>>
>
> Would this be faster?
>
> byte_offset ^= ((byte_offset >> 3) ^ (byte_offset >> 4)) & (1 << 6);

Clever. I'll take it.

---

Thanks for the careful review.



More information about the mesa-dev mailing list