[Intel-gfx] [PATCH 43/46] drm/i915: Allocate a status page for each timeline

Chris Wilson chris at chris-wilson.co.uk
Wed Jan 16 21:15:06 UTC 2019


Quoting John Harrison (2019-01-16 21:06:36)
> On 1/15/2019 10:43, Chris Wilson wrote:
> > Quoting John Harrison (2019-01-15 18:17:21)
> >> On 1/15/2019 01:50, Chris Wilson wrote:
> >>> Quoting John Harrison (2019-01-15 00:56:13)
> >>>> On 1/7/2019 03:55, Chris Wilson wrote:
> >>>>> +static int alloc_hwsp(struct i915_timeline *timeline)
> >>>>> +{
> >>>>> +     struct drm_i915_private *i915 = timeline->i915;
> >>>>> +     struct i915_vma *vma;
> >>>>> +     int offset;
> >>>>> +
> >>>>> +     mutex_lock(&i915->gt.timeline_lock);
> >>>>> +
> >>>>> +restart:
> >>>>> +     offset = find_first_cacheline(i915);
> >>>>> +     if (offset == NBITS && i915->gt.timeline_hwsp) {
> >>>>> +             i915_vma_put(i915->gt.timeline_hwsp);
> >>>>> +             i915->gt.timeline_hwsp = NULL;
> >>>>> +     }
> >>>>> +
> >>>>> +     vma = i915->gt.timeline_hwsp;
> >>>>> +     if (!vma) {
> >>>>> +             struct drm_i915_gem_object *bo;
> >>>>> +
> >>>>> +             /* Drop the lock before allocations */
> >>>>> +             mutex_unlock(&i915->gt.timeline_lock);
> >>>>> +
> >>>>> +             BUILD_BUG_ON(NBITS * CACHELINE_BYTES > PAGE_SIZE);
> >>>>> +             bo = i915_gem_object_create_internal(i915, PAGE_SIZE);
> >>>>> +             if (IS_ERR(bo))
> >>>>> +                     return PTR_ERR(bo);
> >>>>> +
> >>>>> +             i915_gem_object_set_cache_level(bo, I915_CACHE_LLC);
> >>>>> +
> >>>>> +             vma = i915_vma_instance(bo, &i915->ggtt.vm, NULL);
> >>>>> +             if (IS_ERR(vma))
> >>>>> +                     return PTR_ERR(vma);
> >>>>> +
> >>>>> +             mutex_lock(&i915->gt.timeline_lock);
> >>>>> +             if (i915->gt.timeline_hwsp) {
> >>>>> +                     i915_gem_object_put(bo);
> >>>>> +                     goto restart;
> >>>>> +             }
> >>>>> +
> >>>>> +             i915->gt.timeline_hwsp = vma;
> >>>>> +             i915->gt.timeline_free = ~0ull;
> >>>>> +             offset = 0;
> >>>>> +     }
> >>>>> +
> >>>>> +     i915->gt.timeline_free &= ~BIT_ULL(offset);
> >>>>> +
> >>>>> +     timeline->hwsp_ggtt = i915_vma_get(vma);
> >>>>> +     timeline->hwsp_offset = offset * CACHELINE_BYTES;
> >>>>> +
> >>>>> +     mutex_unlock(&i915->gt.timeline_lock);
> >>>>> +
> >>>>> +     return 0;
> >>>>> +}
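[The interesting part of alloc_hwsp() above is the locking dance: drop timeline_lock around the blocking allocations, re-take it, and re-check whether another thread installed a page in the meantime. A minimal userspace model of that pattern, with illustrative names only (not i915 API):]

```c
/* Userspace model of alloc_hwsp()'s locking: drop the mutex around the
 * blocking allocation, re-take it, and re-check whether another thread
 * installed the shared page first.  Illustrative names, not i915 API. */
#include <pthread.h>
#include <stdlib.h>

struct page_cache {
	pthread_mutex_t lock;
	void *page;		/* shared, lazily allocated */
};

/* Return the shared page, allocating it outside the lock if missing. */
static void *get_shared_page(struct page_cache *c)
{
	void *page, *fresh;

	pthread_mutex_lock(&c->lock);
	page = c->page;
	pthread_mutex_unlock(&c->lock);
	if (page)
		return page;

	/* Allocation may block, so do it without holding the lock. */
	fresh = malloc(4096);
	if (!fresh)
		return NULL;

	pthread_mutex_lock(&c->lock);
	if (c->page) {
		/* Lost the race: someone else installed a page first. */
		free(fresh);
		page = c->page;
	} else {
		c->page = page = fresh;
	}
	pthread_mutex_unlock(&c->lock);

	return page;
}
```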
> >>>> If I'm reading this correctly then gt.timeline_hwsp/free is a cached
> >>>> copy of the most recently allocated but not yet filled bank of seqno
> >>>> locations. When it gets full, the i915->gt reference gets dropped and a
> >>>> new page is allocated and used up line by line. Meanwhile, each timeline
> >>>> has its own private reference to the page so dropping the i915->gt
> >>>> reference is safe. And once the last timeline using a given page is
> >>>> freed, the last reference to that page will be dropped and so the page
> >>>> itself will also be freed. If a timeline is freed before the currently
> >>>> cached page is filled, then that timeline's slot will be released and
> >>>> re-used by the next timeline to be created.
> >>>>
> >>>> But what about the scenario of a long running system with a small but
> >>>> growing number of persistent tasks interspersed with many short lived
> >>>> tasks? In that case, you would end up with many sparsely populated pages
> >>>> whose free slots will not get re-used. You could have a linked list
> >>>> of cached pages. When a page is filled, move it to a 'full' list. When a
> >>>> timeline is freed, if its page was on the 'full' list, clear the slot
> >>>> and move it back to the 'available' list.
> >>> Yes. My thinking was a plain slab cache was a quick-and-dirty
> >>> improvement over a page-per-timeline. And a freelist would be the next
> >>> step.
> >>>
> >>>> Or is the idea that a worst case of a single page vma allocation per
> >>>> timeline is the least of our worries if there is an ever growing number
> >>>> of timelines/contexts/users in the system?
> >>> Nah, it was just an attempt to quickly reduce the number of allocations,
> >>> where the worst case of one page+vma per timeline was the starting
> >>> point.
> >>>
> >>> We should break this patch down into 1) one-page-per-timeline, 2) slab
> >>> cache, 3) free list 4) profit.
> >>>
> >>> At other times we have been wanting to be able to suballocate pages,
> >>> something to keep in mind would be extending this to arbitrary cacheline
> >>> allocations.
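[For concreteness, the 'available'/'full' list scheme suggested above can be sketched in plain C. This is a userspace toy with hypothetical names and a single 64-bit bitmap per page, not the eventual i915 code:]

```c
/* Sketch of the 'available'/'full' list idea: each page carries a
 * 64-bit bitmap of free cachelines; a full page parks on a separate
 * list and returns to the available list when a cacheline is freed.
 * Hypothetical names, userspace model only. */
#include <stdint.h>
#include <stdlib.h>

#define NBITS 64
#define CACHELINE_BYTES 64

struct hwsp_page {
	struct hwsp_page *next;
	uint64_t free;			/* one bit per free cacheline */
};

struct hwsp_cache {
	struct hwsp_page *available;	/* pages with a free slot */
	struct hwsp_page *full;		/* fully allocated pages */
};

static void list_del(struct hwsp_page **head, struct hwsp_page *p)
{
	while (*head != p)
		head = &(*head)->next;
	*head = p->next;
}

/* Allocate one cacheline; returns its page and stores the byte offset. */
static struct hwsp_page *hwsp_alloc(struct hwsp_cache *c, int *offset)
{
	struct hwsp_page *p = c->available;
	int bit;

	if (!p) {
		p = calloc(1, sizeof(*p));
		if (!p)
			return NULL;
		p->free = ~0ull;
		c->available = p;
	}

	bit = __builtin_ctzll(p->free);
	p->free &= ~(1ull << bit);
	if (!p->free) {			/* page exhausted: park on 'full' */
		list_del(&c->available, p);
		p->next = c->full;
		c->full = p;
	}

	*offset = bit * CACHELINE_BYTES;
	return p;
}

/* Return a cacheline; a previously full page becomes available again. */
static void hwsp_free(struct hwsp_cache *c, struct hwsp_page *p, int offset)
{
	if (!p->free) {			/* was on the 'full' list */
		list_del(&c->full, p);
		p->next = c->available;
		c->available = p;
	}
	p->free |= 1ull << (offset / CACHELINE_BYTES);
}
```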
> >> The multi-stage approach sounds good. Keep things simple in this patch
> >> and then improve the situation later. One thing to be careful of with a
> >> cacheline allocator would be to make sure whatever is being converted
> >> wasn't using full pages for security reasons. I.e. a page can be private
> >> to a process, a cacheline will be shared by many. I guess that would
> >> only really apply to allocations being passed to user land as the kernel
> >> is considered secure? Or can a user batch buffer write to arbitrary
> >> locations within the ppHWSP and thereby splat someone else's seqno?
> > ppHWSP, yes. But for internal allocations, only accessible via the ring
> > + GGTT, should be no problem. I agree that we definitely don't want to
> > expose subpage sharing across the userspace boundary (all isolation
> > controls are only on pages and above).
> >
> > If userspace wants suballocations, it can (and does) do them for itself
> > and should regulate its own sharing.
> 
> I'm a little confused. Are you saying that a rogue batch buffer could 
> splat some other context's ppHWSP seqno or that it can't? It would be 
> bad if one dodgy user could cause hangchecks in another user's batch by 
> splatting their seqnos.

It can't access another context's ppHWSP; only the kernel can, doing its
bit in the ring between batches. So not without a kernel bug, as with a
great many things.

> >>>>> +     if (global_hwsp) {
> >>>>> +             timeline->hwsp_ggtt = i915_vma_get(global_hwsp);
> >>>>> +             timeline->hwsp_offset = I915_GEM_HWS_SEQNO_ADDR;
> >>>>> +     } else {
> >>>>> +             err = alloc_hwsp(timeline);
> >>>>> +             if (err)
> >>>>> +                     return err;
> >>>>> +     }
> >>>>> +
> >>>>> +     vaddr = i915_gem_object_pin_map(timeline->hwsp_ggtt->obj, I915_MAP_WB);
> >>>>> +     if (IS_ERR(vaddr)) { /* leak the cacheline, but will clean up later */
> >>>> Can you explain this comment more? Where/when is the later?
> >>> On failure here, the cacheline is still marked as allocated in the slab,
> >>> but the reference to the page is released. So the backing page will be
> >>> released when everyone else finally drops their reference.
> >>>
> >>> Just laziness, since we have the ability to return the cacheline later
> >>> on...
> >> Meaning the actual leak is the bit in 'i915->gt.timeline_free' saying
> >> this cacheline can or can't be used for the next allocation? Presumably
> >> you could do the bitmap munging in the case that 'global_hwsp' is null,
> >> but the code would certainly be messier for not a lot of gain.
> > Having been pointed out that I was being lazy, a bit of refactoring
> > later showed how lazy I was.
> Does that mean you are going to re-work this patch or follow it up with 
> a subsequent one?

I've new patches, but CI has been a little dead. First it didn't like
series with more than 10 patches, then it didn't like utf8 patches, and
now it is not responding...

https://patchwork.freedesktop.org/patch/277702/
https://patchwork.freedesktop.org/patch/277691/
https://patchwork.freedesktop.org/patch/277692/
-Chris
