[Intel-gfx] [PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted

Chris Wilson chris at chris-wilson.co.uk
Mon Nov 23 01:37:32 PST 2015


On Mon, Nov 23, 2015 at 02:48:24PM +0530, akash goel wrote:
> On Wed, Nov 11, 2015 at 2:37 PM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > On Tue, Nov 10, 2015 at 03:27:36PM -0800, yu.dai at intel.com wrote:
> >> From: Alex Dai <yu.dai at intel.com>
> >>
> >> We keep a copy of GuC fw in a GEM obj. However its content is lost
> >> if the GEM obj is swapped (igt/gem_evict_*). Therefore, the later
> >> fw loading during GPU reset will fail. Mark the obj dirty after
> >> copying data into the pages. So its content will be kept during
> >> swapped out.
> >>
> >> Signed-off-by: Alex Dai <yu.dai at intel.com>
> >> ---
> >>  drivers/gpu/drm/i915/i915_gem.c | 1 +
> >>  1 file changed, 1 insertion(+)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >> index f1e3fde..3b15167 100644
> >> --- a/drivers/gpu/drm/i915/i915_gem.c
> >> +++ b/drivers/gpu/drm/i915/i915_gem.c
> >> @@ -5199,6 +5199,7 @@ i915_gem_object_create_from_data(struct drm_device *dev,
> >>       i915_gem_object_pin_pages(obj);
> >>       sg = obj->pages;
> >>       bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size);
> >> +     obj->dirty = 1;
> >
> > That's one way of doing it, but atypical for our CPU access to the pages.
> > The root cause is still that sg_copy_from_buffer() isn't calling
> > set_page_dirty() and so there will be other places in the kernel that
> > fall foul of it. Also, not that we could have written this to not require
> > the whole object to be pinned at once as well.
> >
> > I would prefer this to be fixed in sg_copy_from_buffer() for the reason
> > that all callers are susceptible to this bug.
> > -Chris
> 
> Probably the sg_copy_from_buffer is written with an assumption that
> the pages it is supposed to write to,

Seems unlikely considering it is supposed to be a generic library
helper.

> would always be the kernel pages (GFP_KERNEL), which will always be
> resident in RAM. Thus no real need to mark the pages as dirty.
> Or the Clients/Users of  sg_copy_from_buffer function are expected to
> explicitly mark the pages as dirty.

In which case, we shouldn't be using it. So back to making the library
function correct, or using a better interface to write the shmemfs pages
directly (which would avoid the extra clears, or for more points an
internal object which we can write and avoid even the kmalloc through
the generic firmware loader).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list