[Intel-gfx] [PATCH 6/6] drm/i915: Migrate stolen objects before hibernation

Ankitprasad Sharma ankitprasad.r.sharma at intel.com
Thu Dec 10 05:17:50 PST 2015


On Thu, 2015-12-10 at 09:43 +0000, Tvrtko Ursulin wrote:
> Hi,
> 
> Two more comments below:
> 
> On 09/12/15 12:46, ankitprasad.r.sharma at intel.com wrote:
> > From: Chris Wilson <chris at chris-wilson.co.uk>
> >
> > Ville reminded us that stolen memory is not preserved across
> > hibernation, and a result of this was that context objects now being
> > allocated from stolen were being corrupted on S4 and promptly hanging
> > the GPU on resume.
> >
> > We want to utilise stolen for as much as possible (nothing else will use
> > that wasted memory otherwise), so we need a strategy for handling
> > general objects allocated from stolen and hibernation. A simple solution
> > is to do a CPU copy through the GTT of the stolen object into a fresh
> > shmemfs backing store and thenceforth treat it as a normal objects. This
> > can be refined in future to either use a GPU copy to avoid the slow
> > uncached reads (though it's hibernation!) and recreate stolen objects
> > upon resume/first-use. For now, a simple approach should suffice for
> > testing the object migration.
> 
> Mention of "testing" in the commit message and absence of a path to 
> migrate the objects back to stolen memory on resume makes me think this 
> is kind of half finished and note really ready for review / merge ?
> 
> Because I don't see how it is useful to migrate it one way and never 
> move back?
I think that this is not much of a problem, as the purpose here is to
keep the object intact, to avoid breaking anything.
So as far as objects are concerned they will be in shmem and can be used
without any issue, and the stolen memory will be free again for other
usage from the user.
> 
> >
> > v2:
> > Swap PTE for pinned bindings over to the shmemfs. This adds a
> > complicated dance, but is required as many stolen objects are likely to
> > be pinned for use by the hardware. Swapping the PTEs should not result
> > in externally visible behaviour, as each PTE update should be atomic and
> > the two pages identical. (danvet)
> >
> > safe-by-default, or the principle of least surprise. We need a new flag
> > to mark objects that we can wilfully discard and recreate across
> > hibernation. (danvet)
> >
> > Just use the global_list rather than invent a new stolen_list. This is
> > the slowpath hibernate and so adding a new list and the associated
> > complexity isn't worth it.
> >
> > v3: Rebased on drm-intel-nightly (Ankit)
> >
> > v4: Use insert_page to map stolen memory backed pages for migration to
> > shmem (Chris)
> >
> > v5: Acquire mutex lock while copying stolen buffer objects to shmem (Chris)
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma at intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_drv.c         |  17 ++-
> >   drivers/gpu/drm/i915/i915_drv.h         |   7 +
> >   drivers/gpu/drm/i915/i915_gem.c         | 232 ++++++++++++++++++++++++++++++--
> >   drivers/gpu/drm/i915/intel_display.c    |   3 +
> >   drivers/gpu/drm/i915/intel_fbdev.c      |   6 +
> >   drivers/gpu/drm/i915/intel_pm.c         |   2 +
> >   drivers/gpu/drm/i915/intel_ringbuffer.c |   6 +
> >   7 files changed, 261 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 9f55209..2bb9e9e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1036,6 +1036,21 @@ static int i915_pm_suspend(struct device *dev)
> >   	return i915_drm_suspend(drm_dev);
> >   }
> >
> > +static int i915_pm_freeze(struct device *dev)
> > +{
> > +	int ret;
> > +
> > +	ret = i915_gem_freeze(pci_get_drvdata(to_pci_dev(dev)));
> > +	if (ret)
> > +		return ret;
> 
> One of the first steps in idling GEM seems to be idling the GPU and 
> retiring requests.
> 
> Would it also make sense to do those steps before attempting to migrate 
> the stolen objects?
Here, we do that implicitly when trying to do a vma_unbind for the
object.

Thanks,
Ankit




More information about the Intel-gfx mailing list