[Intel-gfx] [PATCH] drm/i915: Migrate stolen objects before hibernation

Chris Wilson chris at chris-wilson.co.uk
Tue Jun 30 04:32:38 PDT 2015


On Tue, Jun 30, 2015 at 01:22:59PM +0200, Daniel Vetter wrote:
> On Tue, Jun 30, 2015 at 12:03:44PM +0100, Chris Wilson wrote:
> > On Tue, Jun 30, 2015 at 12:54:02PM +0200, Daniel Vetter wrote:
> > > > +	memset(&node, 0, sizeof(node));
> > > > +	ret = drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm,
> > > > +						  &node,
> > > > +						  4096, 0, I915_CACHE_NONE,
> > > > +						  0, i915->gtt.mappable_end,
> > > > +						  DRM_MM_SEARCH_DEFAULT,
> > > > +						  DRM_MM_CREATE_DEFAULT);
> > > 
> > > Hm, I think the plan with stolen is to mostly use it for giant scanout
> > > buffers where we never plan to access them with the gpu. Maybe go with a
> > > per-page loop here instead? You have a low-level pte writing call below
> > > anyway. Would mean we'd need a 1-entry onstack sg_table too, but that
> > > won't hurt.
> > 
> > I'm not understanding. This is a per-page loop (because we don't need to
> > bind the entire stolen vma into GGTT for copying with the CPU and
> > thereby increase the risk of failure). Speaking of failure, should
> > hibernation be interruptible? I guess it is usually called from an
> > interruptible process context.
> 
> I was blind and confused by the insert_entries we have in upstream, which
> takes a sg_table and hence can only map the full view without some jumping
> through hoops. Concern fully addressed already ;-)
> 
> Wrt uninterruptible: GPU should be idle already completely (and reset if
> something went wrong) so no need for interruptible.

Note that I put the migration loop before the suspend, i.e. before the
gpu_idle. Partly because I felt the migration has the biggest chance of
failure and so should go first, and the gpu idle in suspend is quite
convenient if we do use the GPU for blitting. But mainly because doing
GEM operations after i915_gem_suspend() feels very wrong (there is a
strong possibility that we kick off some work queue or other that must
be idle).
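
A rough userspace sketch of that ordering, not the patch itself: copy
the object out one page at a time through a single-page window, and only
then run the suspend step. Every name here (fake_dev, migrate_stolen,
gem_suspend, freeze, SKETCH_PAGE_SIZE) is a hypothetical stand-in, and
plain memcpy() stands in for the one-page GGTT binding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for driver state; not an i915 structure. */
struct fake_dev {
	int migrated;	/* set once the stolen contents were copied out */
	int suspended;	/* set once the suspend step has run */
};

/*
 * Copy src to dst one page at a time through a single-page window,
 * rather than mapping the whole object at once.
 */
int migrate_stolen(struct fake_dev *dev, const unsigned char *src,
		   unsigned char *dst, size_t size)
{
	unsigned char window[SKETCH_PAGE_SIZE];
	size_t off;

	assert(dev && src && dst);

	if (dev->suspended)
		return -1;	/* no GEM-style work after suspend */

	for (off = 0; off < size; off += SKETCH_PAGE_SIZE) {
		size_t len = size - off < SKETCH_PAGE_SIZE ?
			     size - off : SKETCH_PAGE_SIZE;

		memcpy(window, src + off, len);	/* "bind" one page and read it */
		memcpy(dst + off, window, len);	/* write it to the new backing */
	}

	dev->migrated = 1;
	return 0;
}

/* Stand-in for the suspend step; here it only records that it ran. */
int gem_suspend(struct fake_dev *dev)
{
	dev->suspended = 1;
	return 0;
}

/* Migration goes first, so a failure aborts before anything is idled. */
int freeze(struct fake_dev *dev, const unsigned char *src,
	   unsigned char *dst, size_t size)
{
	int ret = migrate_stolen(dev, src, dst, size);

	if (ret)
		return ret;

	return gem_suspend(dev);
}
```

Calling freeze() on a fresh fake_dev copies the buffer and then marks
the device suspended; calling migrate_stolen() after that is refused,
which is the "no GEM operations after suspend" rule in miniature.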

> Hm, thinking about
> this: Do we handle a gpu death only detected in gpu_idle? Nasty igt:
> - inject hang, but be very careful to not cause any wait at all
> - suspend
> 
> BOOM or not?

In my kernels, no boom. GPU hang waiting for idle is business as usual!
In upstream, we have seen suspend/hibernate fail due to an untimely hang
(iirc it usually worked on the second attempt, so the bug report in
question was about something else entirely; the logs just happened to
contain the hibernate failure).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

