[Intel-gfx] [PATCH 6/6] drm/i915: Migrate stolen objects before hibernation
Dave Gordon
david.s.gordon at intel.com
Wed Dec 9 11:35:26 PST 2015
On 09/12/15 12:46, ankitprasad.r.sharma at intel.com wrote:
> From: Chris Wilson <chris at chris-wilson.co.uk>
>
> Ville reminded us that stolen memory is not preserved across
> hibernation, and a result of this was that context objects now being
> allocated from stolen were being corrupted on S4 and promptly hanging
> the GPU on resume.
>
> We want to utilise stolen for as much as possible (nothing else will use
> that wasted memory otherwise), so we need a strategy for handling
> general objects allocated from stolen and hibernation. A simple solution
> is to do a CPU copy through the GTT of the stolen object into a fresh
> shmemfs backing store and thenceforth treat it as a normal object. This
> can be refined in future to either use a GPU copy to avoid the slow
> uncached reads (though it's hibernation!) or recreate stolen objects
> upon resume/first-use. For now, a simple approach should suffice for
> testing the object migration.
>
> v2:
> Swap PTE for pinned bindings over to the shmemfs. This adds a
> complicated dance, but is required as many stolen objects are likely to
> be pinned for use by the hardware. Swapping the PTEs should not result
> in externally visible behaviour, as each PTE update should be atomic and
> the two pages identical. (danvet)
>
> Safe-by-default, or the principle of least surprise: we need a new flag
> to mark objects that we can wilfully discard and recreate across
> hibernation. (danvet)
>
> Just use the global_list rather than invent a new stolen_list. Hibernation
> is a slow path, so adding a new list and the associated complexity
> isn't worth it.
>
> v3: Rebased on drm-intel-nightly (Ankit)
>
> v4: Use insert_page to map stolen memory backed pages for migration to
> shmem (Chris)
>
> v5: Acquire mutex lock while copying stolen buffer objects to shmem (Chris)
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma at intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.c | 17 ++-
> drivers/gpu/drm/i915/i915_drv.h | 7 +
> drivers/gpu/drm/i915/i915_gem.c | 232 ++++++++++++++++++++++++++++++--
> drivers/gpu/drm/i915/intel_display.c | 3 +
> drivers/gpu/drm/i915/intel_fbdev.c | 6 +
> drivers/gpu/drm/i915/intel_pm.c | 2 +
> drivers/gpu/drm/i915/intel_ringbuffer.c | 6 +
> 7 files changed, 261 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 9f55209..2bb9e9e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1036,6 +1036,21 @@ static int i915_pm_suspend(struct device *dev)
>  	return i915_drm_suspend(drm_dev);
>  }
>
> +static int i915_pm_freeze(struct device *dev)
> +{
> +	int ret;
> +
> +	ret = i915_gem_freeze(pci_get_drvdata(to_pci_dev(dev)));
> +	if (ret)
> +		return ret;
> +
> +	ret = i915_pm_suspend(dev);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}
> +
>  static int i915_pm_suspend_late(struct device *dev)
>  {
>  	struct drm_device *drm_dev = dev_to_i915(dev)->dev;
> @@ -1700,7 +1715,7 @@ static const struct dev_pm_ops i915_pm_ops = {
>  	 * @restore, @restore_early : called after rebooting and restoring the
>  	 * hibernation image [PMSG_RESTORE]
>  	 */
> -	.freeze = i915_pm_suspend,
> +	.freeze = i915_pm_freeze,
>  	.freeze_late = i915_pm_suspend_late,
>  	.thaw_early = i915_pm_resume_early,
>  	.thaw = i915_pm_resume,
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e0b09b0..0d18b07 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2080,6 +2080,12 @@ struct drm_i915_gem_object {
>  	 * Advice: are the backing pages purgeable?
>  	 */
>  	unsigned int madv:2;
> +	/**
> +	 * Whereas madv is for userspace, there are certain situations
> +	 * where we want I915_MADV_DONTNEED behaviour on internal objects
> +	 * without conflating the userspace setting.
> +	 */
> +	unsigned int internal_volatile:1;
Does this new flag need to be examined by other code that currently
checks 'madv', e.g. put_pages()? Or does it mean the object is not
really volatile in normal use, only across hibernation?
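
To make the two readings concrete, here is a minimal standalone sketch.
It is not driver code: the struct, the enum and all three helper names
are hypothetical stand-ins, and only the 'madv' and 'internal_volatile'
field names are taken from the patch above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the userspace advice values. */
enum madv_model { MADV_WILLNEED_MODEL = 0, MADV_DONTNEED_MODEL = 1 };

/* Hypothetical model of the two bitfields touched by the patch. */
struct obj_model {
	unsigned int madv:2;              /* advice set by userspace */
	unsigned int internal_volatile:1; /* kernel-internal hint */
};

/*
 * Reading 1 (assumption): the new flag behaves like DONTNEED at every
 * site that checks madv today, so a put_pages()-style path may discard
 * the backing pages.
 */
static bool discard_on_put_pages_like_madv(const struct obj_model *obj)
{
	return obj->madv == MADV_DONTNEED_MODEL || obj->internal_volatile;
}

/*
 * Reading 2 (assumption): existing madv checks are left alone and the
 * new flag is consulted only on the hibernation (freeze) path.
 */
static bool discard_on_put_pages_madv_only(const struct obj_model *obj)
{
	return obj->madv == MADV_DONTNEED_MODEL;
}

static bool discard_on_freeze(const struct obj_model *obj)
{
	return obj->internal_volatile;
}
```

Under the first reading every existing madv check would need auditing;
under the second only the freeze path would look at the new bit.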
.Dave.