[Intel-gfx] About the iGVT-g's requirement to pin guest contexts in VM

Zhiyuan Lv zhiyuan.lv at intel.com
Wed Aug 26 18:50:03 PDT 2015


Hi Daniel,

On Wed, Aug 26, 2015 at 10:56:00AM +0200, Daniel Vetter wrote:
> On Tue, Aug 25, 2015 at 08:17:05AM +0800, Zhiyuan Lv wrote:
> > Hi Chris,
> > 
> > On Mon, Aug 24, 2015 at 11:23:13AM +0100, Chris Wilson wrote:
> > > On Mon, Aug 24, 2015 at 06:04:28PM +0800, Zhiyuan Lv wrote:
> > > > Hi Chris,
> > > > 
> > > > On Thu, Aug 20, 2015 at 09:36:00AM +0100, Chris Wilson wrote:
> > > > > On Thu, Aug 20, 2015 at 03:45:21PM +0800, Zhiyuan Lv wrote:
> > > > > > Intel GVT-g will perform EXECLIST context shadowing and ring buffer
> > > > > > shadowing. The shadow copy is created when guest creates a context.
> > > > > > If a context changes its LRCA address, it is hard for the hypervisor to
> > > > > > know whether it is a new context or not. We always pin context objects
> > > > > > to the global GTT to make life easier.
> > > > > 
> > > > > Nak. Please explain why we need to workaround a bug in the host. We
> > > > > cannot pin the context as that breaks userspace (e.g. synmark) who can
> > > > > and will try to use more contexts than we have room.
> > > > 
> > > > Could you have a look at the reasons below and kindly give us your input?
> > > > 
> > > > 1, Due to the GGTT partitioning, the global graphics memory available
> > > > inside virtual machines is much smaller than in the native case. We cannot
> > > > support some graphics-memory-intensive workloads anyway. So it looks
> > > > affordable to just pin contexts, which do not take much GGTT.
> > > 
> > > Wrong. It exposes the guest to a trivial denial-of-service attack. A
> > 
> > Inside a VM, indeed.
> > 
> > > smaller GGTT does not actually limit clients (there is greater aperture
> > > pressure and some paths are less likely but an individual client will
> > > function just fine).
> > >  
> > > > > 2, Our hypervisor needs to change the i915 guest context in the shadow
> > > > > context implementation. That part will be tricky if the context is not
> > > > > always pinned. One scenario is that when a context finishes running,
> > > > > we need to copy the shadow context, which has been updated by hardware,
> > > > > to the guest context. The hypervisor learns that a context has finished
> > > > > from the context interrupt, but by that time the shrinker may have
> > > > > unpinned the context and its backing storage may have been swapped out.
> > > > > Such a copy may fail.
> > > 
> > > That is just a bug in your code. Firstly allowing swapout on an object
> > > you still are using, secondly not being able to swapin.
> > 
> > As Zhi replied in another email, we have no knowledge of the guest
> > driver's swap operations. If we cannot pin the context, we may have to ask
> > the guest driver not to swap out context pages. Do you think that would be
> > the right way to go? Thanks!
> 
> It doesn't matter at all - if the guest unpins the ctx and puts something
> else in there before the host tells it that the ctx is completed, that's a
> bug in the guest. Same with real hw, we guarantee that the context stays
> around for long enough.

You are right. Previously I did not realize that the shrinker checks
not only the seqno but also the "ACTIVE_TO_IDLE" context interrupt
before unpinning a context, which is why I had the above concern.
Thanks for the explanation!
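
To make the condition concrete, here is a minimal sketch in Python (used
only for brevity). The function name and parameters are hypothetical and
are not taken from the i915 driver; the sketch only models the rule that
both the seqno and the "ACTIVE_TO_IDLE" event must be seen before a
context may be unpinned.

```python
def can_unpin(ctx_seqno: int, completed_seqno: int, saw_active_to_idle: bool) -> bool:
    """Hypothetical check: both conditions must hold before unpinning."""
    # Assumption: seqnos increase monotonically; wraparound is ignored here.
    seqno_passed = completed_seqno >= ctx_seqno
    # The seqno alone is not enough; the context-switch event must also
    # have been observed, since hardware may still be saving the context.
    return seqno_passed and saw_active_to_idle
```

With this rule, a context whose seqno has passed but whose
"ACTIVE_TO_IDLE" event has not yet arrived stays pinned.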

> 
> Also you obviously have to complete the copying from shadow->guest ctx
> before you send the irq to the guest to signal ctx completion. Which means
> there's really no overall problem here from a design pov, the only thing

Right. We cannot control when the guest driver sees the seqno change,
but we can control when the guest sees context interrupts. The guest
CSB update and interrupt injection will happen only after we finish
writing the guest contexts.
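
The ordering can be sketched as follows. This is an illustrative Python
model, not GVT-g code: GuestView, complete_context, and the field names
are all hypothetical, and the "log" list exists only to make the order
of the three steps visible.

```python
from dataclasses import dataclass, field


@dataclass
class GuestView:
    """What the guest can observe (hypothetical model, not GVT-g code)."""
    context_image: dict = field(default_factory=dict)
    csb: list = field(default_factory=list)
    irqs: list = field(default_factory=list)


def complete_context(shadow_image: dict, guest: GuestView, ctx_id: int, log: list) -> None:
    # 1. Copy the hardware-updated shadow state back into the guest context.
    guest.context_image.update(shadow_image)
    log.append("copy_back")
    # 2. Only then make the completion visible in the guest's CSB.
    guest.csb.append((ctx_id, "ACTIVE_TO_IDLE"))
    log.append("csb_update")
    # 3. Finally inject the context interrupt into the guest.
    guest.irqs.append(ctx_id)
    log.append("inject_irq")


guest = GuestView(context_image={"ring_tail": 0})
log = []
complete_context({"ring_tail": 64}, guest, ctx_id=7, log=log)
```

Because the guest only learns about completion through steps 2 and 3,
it cannot legitimately unpin or reuse the context pages before the
copy in step 1 has finished.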

So right now we have two options for context shadowing: one is to track
the whole life cycle of the guest context, and the other is to do the
shadow work at context schedule-in/schedule-out time. Zhi has drawn a
nice picture of them.
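
A rough sketch of the two options, again in Python and with hypothetical
class and method names (none of this is from the actual patchset). It is
deliberately simplified: contexts are plain dicts, and option 1 does not
model how guest writes made between submissions reach the shadow.

```python
class LifecycleShadow:
    """Option 1: the shadow lives as long as the guest context does."""

    def __init__(self):
        self.shadows = {}

    def on_guest_create(self, ctx_id, guest_image):
        # Needs a notification from the guest when a context is created.
        self.shadows[ctx_id] = dict(guest_image)

    def on_guest_destroy(self, ctx_id):
        # Needs a notification from the guest when a context is destroyed.
        self.shadows.pop(ctx_id, None)

    def schedule_in(self, ctx_id, guest_image):
        # Shadow already exists; no full copy at submission time.
        return self.shadows[ctx_id]

    def schedule_out(self, ctx_id, guest_image):
        # Write the hardware-updated state back to the guest context.
        guest_image.update(self.shadows[ctx_id])


class ScheduleTimeShadow:
    """Option 2: the shadow exists only while the context is scheduled."""

    def __init__(self):
        self.shadows = {}

    def schedule_in(self, ctx_id, guest_image):
        # Full copy from the guest context on every submission.
        self.shadows[ctx_id] = dict(guest_image)
        return self.shadows[ctx_id]

    def schedule_out(self, ctx_id, guest_image):
        # Write back, then drop the shadow; no life-cycle tracking needed.
        guest_image.update(self.shadows.pop(ctx_id))
```

The trade-off in this model is that option 1 avoids a copy at
schedule-in but depends on create/destroy notifications, while option 2
needs no such notifications but copies the context on every submission.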

Currently we do not have a concrete performance comparison of the two
approaches. We will try them and see. As for this patchset, I will
remove the "context notification" part and send out an updated
version. Thanks!

> you have to do is fix up bugs in the host code (probably you should just
> write through the ggtt).

Sorry, could you elaborate a little more on this? The guest context may
not always be in the aperture, right?

> -Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

