[Intel-gfx] [RFC 4/4] drm/i915/gt: Pipelined page migration

Mon Nov 30 14:11:51 UTC 2020

Quoting Chris Wilson (2020-11-30 13:39:54)
> Quoting Tvrtko Ursulin (2020-11-30 13:12:55)
> > 
> > On 28/11/2020 18:40, Chris Wilson wrote:
> > > +struct i915_request *
> > > +intel_context_migrate_pages(struct intel_context *ce,
> > > +                         struct scatterlist *src,
> > > +                         struct scatterlist *dst)
> > > +{
> > > +     struct sgt_dma it_s = sg_sgt(src), it_d = sg_sgt(dst);
> > > +     u64 encode = ce->vm->pte_encode(0, I915_CACHE_LLC, 0); /* flags */
> > > +     struct i915_request *rq;
> > > +     int len;
> > > +     int err;
> > > +
> > > +     /* GEM_BUG_ON(ce->vm != migrate_vm); */
> > > +
> > > +     err = intel_context_pin(ce);
> > > +     if (err)
> > > +             return ERR_PTR(err);
> > > +
> > > +     GEM_BUG_ON(ce->ring->size < SZ_64K);
> > > +
> > > +     do {
> > > +             rq = i915_request_create(ce);
> > > +             if (IS_ERR(rq)) {
> > > +                     err = PTR_ERR(rq);
> > > +                     goto out_ce;
> > > +             }
> > > +
> > > +             len = emit_pte(rq, &it_s, encode, 0, CHUNK_SZ);
> > > +             if (len <= 0) {
> > > +                     err = len;
> > > +                     goto out_rq;
> > > +             }
> > > +
> > > +             if (emit_pte(rq, &it_d, encode, CHUNK_SZ, len) < len) {
> > > +                     err = -EINVAL;
> > > +                     goto out_rq;
> > > +             }
> > 
> > Source and destination PTEs into the reserved [0, sz * 2) area?
> 
> Yes.
> 
> > 
> > > +
> > > +             err = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
> > > +             if (err)
> > > +                     goto out_rq;
> > > +
> > > +             err = emit_copy(rq, len);
> > 
> > Right so copy can use fixed offsets.
> > 
> > > +             if (err)
> > > +                     goto out_rq;
> > > +
> > > +             if (!it_s.sg)
> > > +                     i915_request_get(rq);
> > > +out_rq:
> > > +             i915_request_add(rq);
> > > +             if (it_s.sg)
> > > +                     cond_resched();
> > 
> >  From what context does this run? No preemptible?
> 
> Has to be process context; numerous allocations, implicit waits (that we
> want to avoid in practice), and the timeline (per-context) mutex to
> guard access to the ringbuffer.

Another thought was to allocate new contexts on the fly; as we can only
copy about 500MiB before stalling using GPU PTE updates.

However, I thought reallocations would be worse for the code flow.
-Chris