[virglrenderer-devel] A bit of performance analysis

Marc-André Lureau marcandre.lureau at gmail.com
Fri Sep 7 11:56:44 UTC 2018


Hi

On Fri, Sep 7, 2018 at 3:45 PM Gert Wollny <gert.wollny at collabora.com> wrote:
>
> Dear all,
>
> given that the deqp test suites are very close to passing without errors
> and a release is coming up, I think it is time to look a bit more closely
> at the performance numbers. To get a baseline I ran some benchmarks and
> compared the results obtained by running directly on the host, running
> within qemu, and via vtest (see end of this email).
>
> For benchmarks that use many textures and buffers, like Unigine Heaven
> and Unigine Valley, running within Qemu slows the application down by a
> factor of approximately six on r600 and 20 on the Intel Kabylake. On
> the other hand, synthetic benchmarks from Gputest are less penalized
> on r600, and on Intel they actually run on par with the host system or
> even faster. My assumption is that on Intel with these shader-heavy
> applications the different shader optimizations running on the guest
> and the host actually improve the final code.
>
> Profiling with perf on the r600 host while Unigine Valley runs in the
> guest doesn't reveal any specific hot spot on the host within qemu or
> virglrenderer. memcpy accounts for 6% of the total run time, but only
> one third of that results from calls from qemu or virglrenderer.
> Another 6% of the total run time goes to libpixman, apparently to
> update some cursor. The only notable function directly in virglrenderer
> is vrend_draw_bind_const_shader with 1.4%, and the memcpy calls
> triggered by IOV transfers account for approximately 1.2%.
>
> On the Intel host another hot spot seems to be vmx_vcpu_run (ca. 9%);
> this might point to some qemu configuration problem.
>
> The vtest results on Intel/Ubuntu are between running directly on the
> host and running in qemu, as one would expect. On the r600/Gentoo system
> the picture is completely different, and my assumption is that my kernel
> configuration might be off here.
>
> On the guest side things look a bit different. Here, for the Valley
> benchmark, more than 33% of the time is spent in and below
> entry_SYSCALL_64, mostly initiated by mesa map_buffer_range
> (glMapBufferRange) / unmap_buffer:
>
>  32.12% entry_SYSCALL_64
>     - 31.96% do_syscall_64
>       - 23.46% __x64_sys_ioctl
>         - 23.35% ksys_ioctl
>           - 22.35% do_vfs_ioctl
>             - 21.89% drm_ioctl
>               - 20.40% drm_ioctl_kernel
>                 + 7.47% virtio_gpu_wait_ioctl
>                 + 5.73% virtio_gpu_transfer_to_host_ioctl
>                 + 4.58% virtio_gpu_transfer_from_host_ioctl
>                   1.63% virtio_gpu_execbuffer_ioctl
>       + 5.06% __x64_sys_nanosleep
>       + 2.35% __x64_sys_futex
>
> Instrumenting on Intel/Ubuntu reveals another hot spot in the guest
> kernel's iowrite16 (self ca. 25%) that is not as prominent on the
> AMD/Gentoo system (self ca. 3%). In both cases the VM is an Ubuntu
> bionic with the latest Ubuntu (cosmic) 4.17.0 kernel.
>
> Some of this will likely be alleviated by coherent memory support or
> udmabuf. However, given that these data-transfer-related hot spots
> take such a big chunk of the run time, it is difficult to directly
> identify other hot spots where performance could be significantly
> improved. IOV linearization will help to cut down on memcpy, but the
> instrumentation seems to indicate that for the tested benchmark this
> is not in a hot code path. Another improvement might be to do more
> asynchronous data transfer: I'm not sure whether sending the command
> stream always results in the guest waiting for an ACK, but if this is
> so then there is certainly room for improvement.
>
> It would be interesting to know what benchmarking tools others are
> using. From Google I heard about glbench, but I'm unable to actually
> find it. Maybe this benchmark now uses a new name?
>
> best regards,
> Gert
>
>
> [1] https://gitlab.freedesktop.org/virgl/virglrenderer/issues/1
>
> -- Benchmark results:
>
> Host: Ubuntu 18.04, linux 4.15.0-33-generic
>
> CPU/GPU Intel Kabylake
> Driver: i965
> Mesa host/guest: git-19dbc7dd0f
> Virglrenderer: git-2766ae7e97
>
> ## Unigine Valley (1024x768, Q:High, AA:2x)
>
>  Driver      | FPS avrg (min, max) | Score |Score/host |     Remark
>  -------------------------------------------------------------------------
>  Virgl/qemu  |   1.0 (1.0, 1.5)    |  42   |   0.04    | Some artifacts
>  Virgl/vtest |  12.3 (8.4, 17.5)   |  515  |   0.40    | (Scenes 10-13)
>  Host        |  31.4 (17.9, 47.9)  | 1314  |   1       |
>
>
> ## Unigine Heaven (1024x768, Q:High, Tess: Normal, AA:2x)
>
>  Driver      | FPS avrg (min, max) | Score |Score/host
> --------------------------------------------------------
>  Virgl/qemu  |   2.1 (1.5, 3.9)    | 52   |   0.06
>  Virgl/vtest |  13.4 (5.8, 24.9)   | 337  |   0.36
>  Host        |  37.3 (8.3, 64.1)   | 940  |   1

You might be interested in the qemu series "[PATCH v4 00/29] vhost-user
for input & GPU".

Unigine Heaven 4.0 on Intel® HD Graphics 530 (Skylake GT2)

host is fps:31.1 / score:784

qemu-gtk/egl+virtio-gpu: fps:2.6/ score: 64
qemu-gtk/egl+vhost-user-gpu: fps:12.9 / score: 329

spice+virtio-gpu: fps:2.8 / score: 70
spice+vhost-user-gpu: fps:12.1 / score: 304
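
To make these numbers easier to compare with the Score/host column in the
quoted tables, the scores can be divided by the host score. The snippet
below is only a rough sketch of that arithmetic: the helper name
relative_to_host is made up for illustration, and the inputs are just the
scores listed above.

```python
# Scores quoted above for Unigine Heaven 4.0 on the Skylake GT2 host.
HOST_SCORE = 784

scores = {
    "qemu-gtk/egl+virtio-gpu": 64,
    "qemu-gtk/egl+vhost-user-gpu": 329,
    "spice+virtio-gpu": 70,
    "spice+vhost-user-gpu": 304,
}


def relative_to_host(score, host=HOST_SCORE):
    """Hypothetical helper: score as a fraction of the host score (2 decimals)."""
    return round(score / host, 2)


# Same normalisation as the Score/host column in the quoted tables.
ratios = {name: relative_to_host(value) for name, value in scores.items()}

for name, ratio in ratios.items():
    print(f"{name:30s} score/host = {ratio:.2f}")
```

With these inputs virtio-gpu ends up at roughly 0.08 to 0.09 of the host
score, and vhost-user-gpu at roughly 0.39 to 0.42.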


There is some work left to make it more acceptable (both in qemu &
libvirt), but hopefully this will happen some day.

>
>
> ## Gputest Furmark Windowed: 1024x640
>
> | Driver      | FPS  | Points | Points/host
> --------------------------------------------
> | Virgl/qemu  |  25  |  1554  |  1.12
> | Virgl/vtest |  24  |  1477  |  1.11
> | host        |  22  |  1329  |  1
>
> ## Gputest Pixmark Piano Windowed: 1024x640
>
> | Driver      | FPS  | Points | Points/host
> ---------------------------------------------
> | Virgl/qemu  |  6   |  416   |   0.96
> | Virgl/vtest |  6   |  418   |   0.96
> | Host        |  7   |  434   |   1
>
> ----------------------------------------------------------------------------
>
> Host: Gentoo 4.14.52-gentoo
> CPU: AMD FX-6300
> GPU: AMD 6870 HD
> Driver: r600  (MESA_GL_VERSION_OVERRIDE=4.4)
> Mesa: git-52caee70a4
> virglrenderer: git-76670ade
>
> ## Unigine Heaven (1024x768, Q:High, Tess: Normal, AA:2x)
>
>  Driver      | FPS avrg (min, max) | Score | Score/host |  Remark
>  Virgl/qemu  | 6.2 (3.4, 24.0)     |  156  | 0.40       |
>  Virgl/vtest | 1.2 (1.0, 2.6)      |   30  | 0.08       | Makes the system nearly unusable
>  Host        | 15.2 (4.3, 74.0)    |  382  | 1          |
>
> Since tessellation is very heavy on the shaders on r600, I also ran this
> benchmark without it:
>
> ## Unigine Heaven (1024x768, Q:High, Tess: Disabled, AA:2x)
>
>  Driver      | FPS avrg (min, max) | Score | Score/host
>  Virgl/qemu  | 12.1 (7.3, 28.2)    |  304  |   0.18
>  Host        | 67.5 (19.4, 118.9)  |  1701 |   1
>
> ## Unigine Valley (1024x768, Q:High, AA:2x)
>
>  Driver      | FPS avrg (min, max) | Score | Score/host | Remark
>  Virgl/qemu  |  8.4 (6.5, 11.6)    |  353  | 0.17       | Some artifacts (Scenes 10-13)
>  Virgl/vtest |  2.9 (2.3, 4.3)     |  123  | 0.07       | Slows the system down
>  Host        | 50.5 (22.7, 86.4)   | 2112  | 1          |
>
> ## Gputest Furmark Windowed: 1024x640
>
> | Driver      | FPS  | Points | Points/host
> | Virgl/qemu  | 23   | 1399   |  0.45
> | Virgl/vtest |  2   | 150    |  0.05
> | Host        | 52   | 3138   |  1
>
> ## Gputest Pixmark Piano Windowed: 1024x640
>
> | Driver      | FPS  | Points | Points/host
> | Virgl/qemu  | 11   | 672    |   0.68
> | Virgl/vtest | 0-1  | 39     |   0.04
> | Host        | 15   |  995   |   1
> _______________________________________________
> virglrenderer-devel mailing list
> virglrenderer-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/virglrenderer-devel



-- 
Marc-André Lureau

