[Intel-gfx] [PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted

Daniel Vetter daniel at ffwll.ch
Wed Nov 25 00:40:45 PST 2015


On Tue, Nov 24, 2015 at 11:01:25PM +0000, Chris Wilson wrote:
> On Tue, Nov 24, 2015 at 07:06:21PM +0100, Daniel Vetter wrote:
> > Just setting obj->dirty only works if you also have the pages.
> 
> Exactly. The CPU access has historically always been page-by-page. The
> style here is more or less to emulate the CPU mmap.
>  
> > But it's also not awesome that set_to_gtt_domain does this for callers.
> 
> Hmm, do you have an example where we want set-to-gtt(write), but not
> actually write through the backing storage? Internal use of set-to-gtt
> has never been ideal (e.g. context) but we haven't yet come up with a
> better semantic.

Just the inconsistency that Dave pointed out is a bit worrisome. At most
we can fix this with docs (which atm we have), which gives us a rather low
score on API design (still a positive one, since it's possible to get
right). I agree that I don't have better semantics either.

> > For lack of clear solutions I'd go with sprinkling obj->dirty or
> > page_set_dirty over callers. Aside: relocate_entry_cpu probably gets away
> > because of the unconditional obj->dirty we do later on, and that we redo
> > all relocs if a fault happens. Still would be good to fix it, just for
> > safety.
> 
> [copy_batch() isn't a bug as the contents are invalidated after use
> anyway]

Just for consistency adding the obj->dirty after get_pages won't hurt
though.
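
To make the failure mode concrete, here is a toy userspace model, not
kernel code. Every name in it (mock_obj, mock_get_pages, mock_put_pages,
write_survives_eviction) is hypothetical, and it assumes the simplified
rule that eviction writes pages back to backing storage only when the
object is flagged dirty. It sketches why a CPU write through the pages
needs the dirty flag set while the pages are held:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define OBJ_SIZE 16

/* Toy stand-in for a GEM object; every name here is hypothetical. */
struct mock_obj {
	char backing[OBJ_SIZE];	/* models the shmem/swap contents */
	char *pages;		/* models the held pages, NULL when evicted */
	int dirty;		/* models obj->dirty */
};

/* Models get_pages: bring the contents in from backing storage. */
static void mock_get_pages(struct mock_obj *obj)
{
	if (obj->pages)
		return;
	obj->pages = malloc(OBJ_SIZE);
	assert(obj->pages);
	memcpy(obj->pages, obj->backing, OBJ_SIZE);
}

/* Models put_pages on eviction: write back only if flagged dirty. */
static void mock_put_pages(struct mock_obj *obj)
{
	if (!obj->pages)
		return;
	if (obj->dirty)
		memcpy(obj->backing, obj->pages, OBJ_SIZE);
	free(obj->pages);
	obj->pages = NULL;
	obj->dirty = 0;
}

/* Returns 1 if a CPU write is still there after evict + reload. */
int write_survives_eviction(int mark_dirty)
{
	struct mock_obj obj = { .backing = {0}, .pages = NULL, .dirty = 0 };
	int survived;

	mock_get_pages(&obj);
	if (mark_dirty)
		obj.dirty = 1;		/* set only once the pages are held */
	memcpy(obj.pages, "firmware", 9);	/* CPU write via the pages */

	mock_put_pages(&obj);		/* evict */
	mock_get_pages(&obj);		/* fault back in */
	survived = strcmp(obj.pages, "firmware") == 0;
	mock_put_pages(&obj);
	return survived;
}
```

In this model write_survives_eviction(0) loses the write and
write_survives_eviction(1) keeps it. The real driver paths have more
moving parts, so treat this only as an illustration of the ordering.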

> relocate_entry_cpu() is a bug we never caught. Indeed we've papered over
> it to mask some other userspace issues, but just adding the set_page_dirty()
> as required isn't going to be a big hardship.

Yeah.

> We have tons of swapthrash tests to check persistency of GPU buffers,
> but we never tried to thrash the batches themselves out to swap and then
> reuse them.
> 
> I guess that it is because userspace doesn't reuse batches that we never
> had a report of the issue. Hibernating would be a good exercise of such.

Hm, it's not just batches but any object with relocs. Could this explain
the oddball libva/uxa hang? Stuff like "after playing $game for hours my
desktop looked funny", but not for tiling issues.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

