[Mesa-dev] st_TexSubImage: unaligned memcpy performance

Daniel Stone daniel at fooishbar.org
Wed Apr 8 03:24:08 PDT 2015


Hi,

On 8 April 2015 at 10:57, Vasilis Liaskovitis <vliaskov at gmail.com> wrote:
> I have an issue where st_TexSubImage causes very high CPU load in
> __memcpy_sse2_unaligned (Mesa 10.1.3, Xorg 1.15.1, radeon driver, HD 7870).
>
> Any obvious causes / tips for this? e.g. align textures or use different
> format/type? I've tried using GL_BGRA/GL_UNSIGNED_BYTE and
> GL_BGRA/GL_UNSIGNED_INT_8_8_8_8_REV.
>
> __memcpy_sse2_unaligned () at
> ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:85
> 85    ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: No such file or
> directory.
> (gdb) bt
> #0  __memcpy_sse2_unaligned () at
> ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:85
> #1  0x00007fffb572f154 in memcpy (__len=7680, __src=<optimized out>,
> __dest=0x7fff5835f800) at /usr/include/x86_64-linux-gnu/bits/string3.h:51
> #2  st_TexSubImage (ctx=0x1b91420, dims=<optimized out>, texImage=0x1f81710,
> xoffset=0, yoffset=0, zoffset=0, width=1920, height=1080, depth=1,
> format=32993, type=5121, pixels=0xdacf90, unpack=0x1bad590)
>     at ../../../../src/mesa/state_tracker/st_cb_texture.c:752

Your source (0xdacf90) is only aligned to a 16-byte boundary, not 32.
This will cause issues particularly on ARM, where natural alignment is
required (i.e. 32-byte load/stores must be on 32-byte boundaries). By
contrast, the destination is already aligned to a 128-byte boundary.
So fixing the caller, rather than Mesa, should take care of the
problem.
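
For reference, here is a minimal sketch of how the alignment of the two
addresses in the backtrace can be checked, and how the caller might
allocate its pixel buffer on a 32-byte boundary. The helper names
(alignment_of, is_aligned, alloc_pixels_aligned) are hypothetical and
not part of Mesa or GL, and using C11 aligned_alloc is an assumption
about what the application can use; posix_memalign would serve equally
well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Largest power-of-two boundary (in bytes) that addr sits on.
 * Isolates the lowest set bit of the address; returns 0 for addr == 0. */
uintptr_t alignment_of(uintptr_t addr)
{
    return addr & (~addr + 1);
}

/* Non-zero if addr is a multiple of boundary (boundary: power of two). */
int is_aligned(uintptr_t addr, uintptr_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (addr & (boundary - 1)) == 0;
}

/* Hypothetical caller-side helper: allocate a pixel buffer whose start
 * address is a multiple of `alignment` (power of two). The size is
 * rounded up because C11 aligned_alloc expects a multiple of the
 * alignment. Release the buffer with free(). */
void *alloc_pixels_aligned(size_t alignment, size_t size)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    size_t rounded = (size + alignment - 1) & ~(alignment - 1);
    return aligned_alloc(alignment, rounded);
}
```

With the values from the backtrace, alignment_of(0xdacf90) gives 16 and
alignment_of(0x7fff5835f800) gives 2048, so the destination satisfies
any boundary up to 2048 bytes while the source fails the 32-byte check.
Each row copied here is 7680 bytes (1920 pixels * 4 bytes), which is a
multiple of 32, so if the buffer returned by something like
alloc_pixels_aligned(32, 1920 * 1080 * 4) is tightly packed, every row
should start on a 32-byte boundary as well.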

Cheers,
Daniel

