Very low throughput for small uploads when using glamor (XPutImage, ShmPixmaps)

Clemens Eisserer linuxhippy at gmail.com
Mon Nov 7 10:02:57 UTC 2016


Hello,

Java performs antialiased rendering by rasterizing coverage masks (<=
64x64 pixels in size, PictStandardA8) on the client instead of sending
trapezoid lists to the X server. The masks are uploaded using XPutImage
and are later used for XRenderComposite.
On the plus side, this approach is *way* better than the
XGetImage + client-side fallback + XPutImage sequence that was needed
before XRender was used. The downside is the really high per-primitive
overhead: while SNA copes with this workload quite well, glamor does
not.

Here are some upload bandwidth results for 64x64 masks +
XRenderComposite (src=vram_pixmap, mask=uploaded_mask, dst=window):

Haswell Laptop:
XPutImage-Glamor: 80MB/s
ShmPixmap-Glamor: 177.5MB/s
XPutImage-SNA: 585MB/s
ShmPixmap-SNA: 4000MB/s

Mullins Netbook:
XPutImage-Glamor: 30MB/s
ShmPixmap-Glamor: 33MB/s
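
To put the numbers above in per-primitive terms, here is a rough
back-of-the-envelope sketch. It assumes that 1 MB means 10**6 bytes and
that every upload is a full 64x64 A8 mask (one byte per pixel); both
are assumptions of mine, and the function and variable names are only
illustrative.

```python
# One full-size coverage mask: 64x64 pixels, 1 byte per pixel (A8).
MASK_BYTES = 64 * 64

# Measured bandwidth in MB/s, copied from the results above.
RESULTS_MB_PER_S = {
    "Haswell XPutImage-Glamor": 80.0,
    "Haswell ShmPixmap-Glamor": 177.5,
    "Haswell XPutImage-SNA": 585.0,
    "Haswell ShmPixmap-SNA": 4000.0,
    "Mullins XPutImage-Glamor": 30.0,
    "Mullins ShmPixmap-Glamor": 33.0,
}


def masks_per_second(mb_per_s):
    """Number of full-size masks per second (assumes 1 MB = 10**6 bytes)."""
    return mb_per_s * 1_000_000 / MASK_BYTES


def microseconds_per_mask(mb_per_s):
    """Average time spent per mask, upload plus composite, in microseconds."""
    return 1_000_000 / masks_per_second(mb_per_s)


if __name__ == "__main__":
    for name, bandwidth in RESULTS_MB_PER_S.items():
        print(f"{name}: {masks_per_second(bandwidth):.0f} masks/s, "
              f"{microseconds_per_mask(bandwidth):.1f} us per mask")
```

Under these assumptions, 80MB/s is about 51 microseconds per mask, and
the Mullins figures come out above 120 microseconds per mask.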

My hope was that using ShmPixmaps would lower the driver overhead by
giving the driver a hint about how the data is intended to be used
(exactly once). Unfortunately, even the profile (at the end of this
email) doesn't contain any entries that look suspicious to me.

Any ideas on how to speed this up are highly welcome.
Should I file a bug against glamor about this?

Thank you in advance, Clemens


Profile (amd mullins, radeonsi):

    SELF CUMULATIVE    FUNCTION
[   0,00%] [  92,31%]    [/usr/libexec/Xorg]
[   0,00%] [  10,31%]      In file [heap]
[   0,18%] [   5,37%]      ioctl
[   0,00%] [   4,68%]      r600_get_name
[   2,37%] [   2,40%]      surf_drm_to_winsys
[   2,37%] [   2,39%]      __memcpy_sse2_unaligned
[   2,01%] [   2,06%]      _int_malloc
[   0,08%] [   1,55%]      clock_gettime
[   1,49%] [   1,50%]      free
[   1,39%] [   1,41%]      si_draw_vbo
[   0,00%] [   1,33%]      util_blitter_get_next_surface_layer
[   1,25%] [   1,26%]      radeon_drm_cs_add_buffer
[   1,22%] [   1,23%]      radeon_winsys_surface_best
[   1,19%] [   1,21%]      radeon_winsys_surface_init
[   0,95%] [   0,96%]      r600_texture_create_object
[   0,94%] [   0,94%]      __GI_memset
[   0,93%] [   0,94%]      pthread_mutex_unlock
[   0,83%] [   0,88%]      set_tex_parameteri
[   0,83%] [   0,85%]      _int_free
[   0,80%] [   0,81%]      glamor_composite_clipped_region
[   0,04%] [   0,79%]      _mesa_Uniform1i
[   0,73%] [   0,74%]      unbind_texobj_from_image_units
[   0,63%] [   0,64%]      si_update_shaders
[   0,61%] [   0,61%]      si_make_texture_descriptor
[   0,60%] [   0,61%]      si_emit_framebuffer_state
[   0,59%] [   0,60%]      si_shader_select
[   0,59%] [   0,59%]      r600_texture_create
[   0,06%] [   0,55%]      RegionCreate
[   0,54%] [   0,55%]      hash_table_search
[   0,55%] [   0,55%]      st_choose_format
[   0,51%] [   0,53%]      __pthread_mutex_lock
[   0,00%] [   0,53%]      epoxy_glTexParameterfv_global_rewrite_ptr
[   0,50%] [   0,50%]      util_format_description
[   0,50%] [   0,50%]      _mesa_base_tex_format

