Significant window rendering time of GPU-accelerated apps on xorg with Ryzen 7 Pro 3700U
Chen Chen
chenchen at fastmail.com
Thu Apr 9 23:50:30 UTC 2020
Hi all,
Greetings!
I'm using a laptop with Ryzen 7 Pro 3700U APU with xorg. However, I'm facing a high latency on those GPU-accelerated programs.
My system config:
---
Kernel: 5.6.2-arch1-2 x86_64 bits: 64 compiler: gcc v: 9.3.0 Desktop: i3 4.18 dm: startx
Distro: Arch Linux
CPU: Topology: Quad Core model: AMD Ryzen 7 PRO 3700U w/ Radeon Vega Mobile Gfx bits: 64 type: MT MCP arch: Zen+ rev: 1
L2 cache: 2048 KiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 36751
Speed: 1183 MHz min/max: 1400/2300 MHz Core speeds (MHz): 1: 1177 2: 1176 3: 1177 4: 1177 5: 1178 6: 1177 7: 2931
8: 2944
Graphics: Device-1: Advanced Micro Devices [AMD/ATI] Picasso vendor: Lenovo driver: amdgpu v: kernel bus ID: 05:00.0
chip ID: 1002:15d8
Display: server: X.Org 1.20.8 driver: amdgpu resolution: 3840x2160~60Hz
OpenGL: renderer: AMD RAVEN (DRM 3.36.0 5.6.2-arch1-2 LLVM 9.0.1) v: 4.6 Mesa 20.0.4 direct render: Yes
---
xorg config:
---
Section "Device"
Identifier "AMD"
Driver "amdgpu"
Option "TearFree" "true"
Option "VariableRefresh" "true"
EndSection
---
When I'm using GPU-accelerated terminal emulators (e.g. alacritty and kitty), the initial window rendering time is significantly high. For instance, `time alacritty -e false` gave 0.24s user 0.04s system 54% cpu 0.531 total, which means the window rendering costs ~500ms. This is weird for a GPU-accelerated application. In comparison, a laptop with Intel Core i7-1065G7 using the same config could render the window of this kind of applications within ~130ms, which is snappy.
For other programs, like the picom compositor, if I use "glx" as its backend, the whole rendering is extremely laggy. If I use "xrender" instead (which means no hw-acceleration I guess), it runs snappy.
After some basic profiling I thought this is a amdgpu-xorg related issue. I used callgrind to record the trace of `alacritty -e false` with xorg-amdgpu, xorg-modesetting and wayland and got different results (I will just show the first few lines, which is the hot spot in the program I think):
xorg-amdgpu:
---
280,109,580 ???:0x00000000008cd930 [/usr/lib/dri/radeonsi_dri.so]
107,423,760 ???:0x00000000008cd9e0 [/usr/lib/dri/radeonsi_dri.so]
99,890,145 ???:do_lookup_x [/usr/lib/ld-2.31.so]
71,160,050 ???:0x0000000000042210 [/usr/lib/libGLX_mesa.so.0.0.0]
54,633,060 ???:0x00000000008acca0 [/usr/lib/dri/radeonsi_dri.so]
50,114,882 ???:0x00000000000251c0 [/usr/lib/libfontconfig.so.1.12.0]
49,050,274 ???:0x000000000001d8b0 [/usr/lib/libfontconfig.so.1.12.0]
45,988,140 ???:0x00000000008a7600 [/usr/lib/dri/radeonsi_dri.so]
40,037,150 ???:0x000000000001d790 [/usr/lib/libfontconfig.so.1.12.0]
34,129,137 ???:__memcpy_avx_unaligned_erms [/usr/lib/libc-2.31.so]
30,595,302 ???:_dl_lookup_symbol_x [/usr/lib/ld-2.31.so]
---
xorg-modesetting (similar results):
---
281,058,300 ???:0x00000000008cd930 [/usr/lib/dri/radeonsi_dri.so]
107,787,600 ???:0x00000000008cd9e0 [/usr/lib/dri/radeonsi_dri.so]
99,890,145 ???:do_lookup_x [/usr/lib/ld-2.31.so]
71,160,050 ???:0x0000000000042210 [/usr/lib/libGLX_mesa.so.0.0.0]
54,818,100 ???:0x00000000008acca0 [/usr/lib/dri/radeonsi_dri.so]
50,112,204 ???:0x00000000000251c0 [/usr/lib/libfontconfig.so.1.12.0]
49,050,274 ???:0x000000000001d8b0 [/usr/lib/libfontconfig.so.1.12.0]
46,143,900 ???:0x00000000008a7600 [/usr/lib/dri/radeonsi_dri.so]
40,037,150 ???:0x000000000001d790 [/usr/lib/libfontconfig.so.1.12.0]
35,147,037 ???:__memcpy_avx_unaligned_erms [/usr/lib/libc-2.31.so]
30,595,302 ???:_dl_lookup_symbol_x [/usr/lib/ld-2.31.so]
---
wayland:
---
69,189,362 ???:do_lookup_x [/usr/lib/ld-2.31.so]
50,111,905 ???:0x00000000000251c0 [/usr/lib/libfontconfig.so.1.12.0]
49,050,274 ???:0x000000000001d8b0 [/usr/lib/libfontconfig.so.1.12.0]
40,946,389 ???:__memcpy_avx_unaligned_erms [/usr/lib/libc-2.31.so]
40,037,150 ???:0x000000000001d790 [/usr/lib/libfontconfig.so.1.12.0]
37,270,021 ???:fread [/usr/lib/libc-2.31.so]
30,019,946 ???:_dl_lookup_symbol_x [/usr/lib/ld-2.31.so]
23,285,006 /build/rust/src/rustc-1.42.0-src/src/libcore/iter/range.rs:core::iter::range::<impl core::iter::traits::iterator::Iterator for core::ops::range::Range<A>>::next [/home/chen/Downloads/alacritty-master/target/debug/alacritty]
22,951,556 ???:_IO_file_xsgetn [/usr/lib/libc-2.31.so]
21,423,383 ???:_int_malloc [/usr/lib/libc-2.31.so]
20,091,007 /build/rust/src/rustc-1.42.0-src/src/libcore/slice/mod.rs:<core::ops::range::Range<usize> as core::slice::SliceIndex<[T]>>::index [/home/chen/Downloads/alacritty-master/target/debug/alacritty]
19,161,509 ???:__strchr_avx2 [/usr/lib/libc-2.31.so]
...
2,936,304 ???:0x0000000000632910 [/usr/lib/dri/radeonsi_dri.so]
---
Sorry I am not using a source based distro so the debug symbols are not available. What I could see here is that when I'm running GPU-accelerated programs under xorg a same symbol in radeonsi_dri is overwhelmingly called. However, I do not know where the issue is. I did not see many similar issues so if this is related to xorg and amdgpu, this might be model-specific.
I don't know if this is a correct way to use this mailing list. If you have any advice, please let me know. I am also happy to provide more information if needed.
Thanks!
More information about the amd-gfx
mailing list