[Intel-gfx] [RFC PATCH 0/8] Add host i915 support for vGPU

Thu Oct 23 10:56:26 CEST 2014

On Thu, Oct 23, 2014 at 11:13:51AM +0800, Jike Song wrote:
> On 10/22/2014 05:48 PM, Daniel Vetter wrote:
> >So on a very high level I don't understand this design. For the guest side
> >it's completely clear that we need a bunch of hooks over the driver to
> >make paravirtualization work.
> >
> >But on the host side I expect the driver to be in full control of the
> >hardware. I haven't really seen the other side but it looks like with vgt
> >we actually have two drivers fighting over the hardware, which requires
> >the various hooks to avoid disaster. The drm subsystem is littered with
> >such attempts, and they all didn't end up in a pretty way.
> >
> >So why can't we have the vgt support for guests sit on top of i915, using
> >the i915 functions to set up pagetables, contexts and reserve gtt areas
> >for the guests? Then we'd have just one driver in control of the hardware,
> >and vgt on the host side would just look like a really crazy interface
> >layer between virtual hosts and the low-level driver, similar to how the
> >execbuf ioctl is a really crazy interface between userspace and the
> >low-level driver.
> 
>  Yes we can do this, but that also means lots of pollution to existing i915
> code, only for virtualization purposes. Currently vgt has pretty abstractions
> for both host i915 and guest graphics drivers, mixing vgt and host i915 means
> breaking that.  I believe a vgt-integrated repository will be better to look
> at :)

The problem is that we need to have this integration anyway. Using the
current design it will be hidden behind some thin abstraction, but it will
be as invasive and fragile as if the interactions were made explicit.

Stuff like driver load/unload, suspend/resume, runtime pm and gpu reset are
already super-fragile as-is. Every time we change something in there, a
bunch of related things fall apart. With vgt we'll have even more
complexity in there, and I really think we need to make that complexity
explicit. Otherwise we'll always break vgt support for host systems by
accident when working on upstream. In my experience, being explicit about
these dependencies massively reduces maintenance headaches long term.

Of course for starting up the vgt effort it's better to have as much
separation as possible. But the balance between development and
maintainability does change quite a bit when merging code upstream.

Another benefit of the inverted design with vgt sitting on top of normal
i915 for host support is better automated testing. At least with kvm we
could then simply create a kvm guest without any special boot parameters
and run some really basic testcases in the guest to make sure vgt doesn't
get broken. That would fit rather nicely into i-g-t. Of course with xengt
we unfortunately can't do that, since we'd need to boot with the
hypervisor.

But I think without such smoketesting in the automated upstream test suite
we'll break vgt support constantly in upstream. So I think we really need
this too, at least long term.

I hope that explains a bit where I'm coming from. Note that this is just
about the host side, imo the guest side can be merged as soon as detailed
review has been completed.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch