[Intel-gfx] [PATCH 03/19] drm/i915: Micro-optimise gen8_ppgtt_insert_entries()
Chris Wilson
chris at chris-wilson.co.uk
Fri Feb 3 08:34:52 UTC 2017
On Thu, Feb 02, 2017 at 05:17:57PM +0000, Tvrtko Ursulin wrote:
>
> On 02/02/2017 17:05, Chris Wilson wrote:
> >On Thu, Feb 02, 2017 at 04:39:49PM +0000, Tvrtko Ursulin wrote:
> >>
> >>On 02/02/2017 16:10, Chris Wilson wrote:
> >>>On Thu, Feb 02, 2017 at 03:57:43PM +0000, Tvrtko Ursulin wrote:
> >>>>
> >>>>On 02/02/2017 15:32, Chris Wilson wrote:
> >>>>>On Thu, Feb 02, 2017 at 03:02:32PM +0000, Chris Wilson wrote:
> >>>>>>Improve the sg iteration and in hte process eliminate a bug in
> >>>>>>miscomputing the pml4 length as orig_nents<<PAGE_SHIFT is no longer the
> >>>>>>full length of the sg table.
> >>>>>>
> >>>>>
> >>>>>which fixes a corner case of 0c40ce130e38
> >>>>>Fixes: 0c40ce130e38 ("drm/i915: Trim the object sg table")
> >>>>
> >>>>What do you mean? oring_nents is definitely the full length of the
> >>>>sg table, especially after i915_sg_trim. Before it orig_nents was
> >>>>often larger than the real length of the sg table.
> >>>
> >>>The code is using orig_nents as obj->base.size/vma->size (a page count,
> >>>not the sg count), if I read it correctly as it is computing the address
> >>>range.
> >>
> >>Oh right, I was misled by the commit message ("is not longer the
> >>full length of the sg table"). Nasty. That means it was broken for
> >>userptr objects as well.
> >
> >And partial, and... pretty everything at some point.
>
> Partials won't be in ppgtt. Neither the stolen ones, and internal
> ones are only gen7 right?
Yup. So far it looks like just get_pages_gtt and get_pages_userptr,
get_pages_dmabuf would be susceptible.
> >Hmm, maybe it is the recent spat of NULL deref for gen8 ppgtt. I'd been
> >assuming that they had a similar unknown cause to the gen6 which have
> >been around for yonks.
> >
> >Tart up the commit message and this could 4.10-rc material.
>
> Yeah, but how this actually manifests? It should be blowing up left
> right and center because sg trim in my testing managed to trim
> impressively. I don't get it. I'll leave the thinking for tomorrow.
Hmm. I was thinking that we didn't allocate our bookkeeping for the full
range and so would hit find a NULL in the tree - but we do that
allocation pass in allocate_va_range() beforehand, and that has the full
node size. So it would appear that we might stop early (leaving scratch
pages for the GPU to read) but I can't see an oops.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list