[Mesa-dev] [Bug 93686] Performance improvement? Please consider hardware GPU rendering in llvmpipe

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Thu Jan 14 06:30:12 PST 2016


https://bugs.freedesktop.org/show_bug.cgi?id=93686

--- Comment #4 from Roland Scheidegger <sroland at vmware.com> ---
(In reply to ytrezq from comment #3)

> I don't think it's necessary to combine a five-year-old low-end 90nm gpu
> with a 14nm high-end cpu. For example (comparing the HD 2500 integrated
> graphics of my Ivy Bridge cpu against the cpu itself), glxgears (fullscreen
> in both cases) runs at 301 frames per second on the gpu and 221 frames per
> second with llvmpipe.
Don't use glxgears as a benchmark like that. It is usually limited by lots of
factors other than "gpu" performance: the rendering is way too simple and the
framerate way too high, so things don't just scale linearly down from
glxgears...
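
Looking at frame times instead of frame rates makes this clearer - a quick
back-of-the-envelope calculation using the numbers quoted above:

    301 fps -> 1000/301 ~ 3.3 ms per frame (HD 2500)
    221 fps -> 1000/221 ~ 4.5 ms per frame (llvmpipe)

That's a difference of roughly 1.2 ms per frame. At a game-like 11 fps one
frame takes ~91 ms, so 1.2 ms is noise - glxgears numbers tell you almost
nothing about how real rendering loads will behave.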

> In a gpu-intensive OpenGL 3D game (the game itself uses 3% cpu), I got 11
> frames per second with the gpu and 6 frames with llvmpipe.
llvmpipe may look good here, but I suspect that again has more to do with
something other than gpu performance. Maybe there's too much cpu<->gpu
synchronization going on (which typically kills performance on real hardware,
but is pretty much free for llvmpipe) or something similar.
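
To illustrate what such a sync point looks like, here's a minimal hypothetical
example (draw_scene() is just a stand-in for the app's rendering code, not
anything from the game in question):

#include <GL/gl.h>

extern void draw_scene(void);  /* assumed app rendering code */

/* Reading the framebuffer back every frame forces the driver to drain
 * the whole gpu pipeline before it can copy the pixels. On real
 * hardware that stall is expensive; with llvmpipe the "framebuffer"
 * already lives in system memory, so the readback is nearly free. */
void render_frame(int width, int height, unsigned char *pixels)
{
    draw_scene();
    glReadPixels(0, 0, width, height,
                 GL_RGBA, GL_UNSIGNED_BYTE, pixels);
}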

> 
> I'm also not that sure about the high-level api: Nvidia is able to perform
> completely automatic load balancing with their proprietary drivers (at the
> driver level) with both OpenGL and Direct3D. It works with all their gpus.
I'm not sure what exactly you mean here: they do "load balancing" with
multiple gpus as part of SLI, but they are just rendering one frame on one gpu
and the next one on the other, that's it, and it sometimes doesn't work well
either; the scaling isn't always decent. Now, something like that _could_
theoretically be done with llvmpipe and some gpu, I suppose, but it's not
going to happen. None of the open-source drivers have even deemed it
worthwhile for multiple identical gpus yet...
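
Conceptually, that kind of alternate-frame rendering boils down to the sketch
below (submit_frame() and struct device are hypothetical stand-ins; real
drivers additionally have to replicate resources across both gpus, which is
where the scaling problems come from):

struct device;  /* opaque, hypothetical */
extern void submit_frame(struct device *dev, int frame);

/* Alternate-frame rendering: even frames go to one gpu, odd frames
 * to the other. Any resource written by frame N and read by frame
 * N+1 has to be copied between the gpus first. */
void run_afr(struct device *gpu[2], int nframes)
{
    for (int frame = 0; frame < nframes; frame++)
        submit_frame(gpu[frame % 2], frame);
}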

> 
> There's also a well-known example of perfect load balancing between several
> gpus and several cpus: OpenCL. Though I agree rewriting some parts of
> llvmpipe in OpenCL might add more overhead than it removes. Plus, if it were
> possible, it might reduce some of the manpower required to maintain
> low-level hardware instructions.
With OpenCL (as well as d3d12 and Vulkan), multiple gpus are presented at the
api level. Thus the application can choose which adapter to use for what,
which is pretty much the only way this can work in a sane manner. There were
some demos for d3d12 which did (IIRC) all the rendering on the discrete gpu
and then ran some post-processing shader on the integrated graphics using
exactly that. So yes, theoretically, if we supported Vulkan, something like
that would be doable, but only if the app decided to make use of it.
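
For illustration, this is roughly how those adapters show up at the api level
in OpenCL, where the application, not the driver, decides which device runs
what (standard clGetDeviceIDs usage; first platform only, for brevity):

#include <CL/cl.h>
#include <stdio.h>

int main(void)
{
    cl_platform_id platform;
    cl_device_id devices[8];
    cl_uint ndev;
    char name[256];

    /* Every gpu (and cpu device) is enumerated explicitly... */
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 8, devices, &ndev);

    /* ...and the app picks one per workload. */
    for (cl_uint i = 0; i < ndev; i++) {
        clGetDeviceInfo(devices[i], CL_DEVICE_NAME,
                        sizeof(name), name, NULL);
        printf("device %u: %s\n", i, name);
    }
    return 0;
}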

In theory, some rendering loads could be split in other ways: for instance, I
know at least some i965 Windows drivers executed the vertex shader on the cpu
instead of the gpu (app-dependent, IIRC) because that was faster. But the IGP
was really weak back then; relative to the cpu, it has probably increased in
performance by more than a factor of 10 since. Plus, more modern graphics
can't be split that easily, since the fragment shader may use the same
resources as the vertex shader, meaning you have to synchronize your buffers
(which tends to kill performance).
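
A rough sketch of what such a split looks like and where the synchronization
cost comes in (cpu_transform_vertices() is a hypothetical stand-in for the
cpu-side vertex shading; GL 1.5+ buffer-object prototypes assumed):

#include <GL/gl.h>

extern void cpu_transform_vertices(float *dst, int nverts);  /* assumed */

/* Vertex shading on the cpu, rasterization on the gpu: the per-frame
 * upload is the sync point. If the gpu is still reading last frame's
 * data from this buffer, glBufferSubData() has to wait (or the driver
 * has to make a copy behind your back). */
void draw_with_cpu_vs(GLuint vbo, float *scratch, int nverts)
{
    cpu_transform_vertices(scratch, nverts);

    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferSubData(GL_ARRAY_BUFFER, 0,
                    nverts * 4 * sizeof(float), scratch);
    glDrawArrays(GL_TRIANGLES, 0, nverts);
}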
