[Intel-xe] [PATCH] drm/xe: Use large rings for compute contexts

Matthew Brost matthew.brost at intel.com
Mon Jun 5 17:04:11 UTC 2023


On Mon, Jun 05, 2023 at 10:33:29AM +0200, Thomas Hellström wrote:
> Hi, Niranjana,
> 
> On 6/2/23 20:07, Niranjana Vishwanathapura wrote:
> > Allow compute contexts to submit the maximal amount of work without
> > blocking userspace.
> > 
> > The original size for user LRC rings (SZ_16K) was chosen to minimise
> > memory consumption, without being so small as to frequently stall in the
> > middle of workloads. With the main consumers being GL / media pipelines
> > of 2 or 3 batches per frame, we want to support ~10 requests in flight
> > to allow the application to control throttling without stalling
> > within a frame.
> > 
> > Signed-off-by: Chris Wilson <chris.p.wilson at intel.com>
> > Signed-off-by: Tejas Upadhyay <tejas.upadhyay at intel.com>
> > Signed-off-by: Andi Shyti <andi.shyti at linux.intel.com>
> > Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura at intel.com>
> 
> With the DRM scheduler, jobs are written to the ring from a drm scheduler
> notifier when all dependencies are satisfied and there is ring space
> available, and I believe that's the case even after Matthew Brost's changes
> to xe / drm-scheduler on how compute jobs are handled.
> 
> So this patch shouldn't be needed for xe.
> 
> /Thomas

On tip of Xe we just hold jobs in the DRM scheduler if there isn't room
in the ring and write to the ring in run_job once space frees up. But I
guess we can't do this for long running jobs (compute or fault mode
VMs) because of the dma-fencing rules, so after my update we return
-EWOULDBLOCK in the exec IOCTL [1]. So yes, this is needed if the user
will regularly submit more than ~85 jobs (16K ring / 192 bytes per job)
at a time on long running VMs.

[1] https://patchwork.freedesktop.org/patch/538665/?series=118091&rev=3
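To illustrate the behaviour described above, a rough sketch only (not
the actual hunk from [1]; xe_vm_no_dma_fences(), xe_lrc_ring_space()
and MAX_JOB_SIZE_BYTES are assumed names based on the xe tree of that
era):

	/*
	 * Sketch: a long running VM (compute or fault mode) must not
	 * block on dma-fences waiting for ring space, so the exec
	 * IOCTL fails fast and userspace retries the submission.
	 * All identifiers here are assumptions, not the merged code.
	 */
	if (xe_vm_no_dma_fences(vm) &&
	    xe_lrc_ring_space(e->lrc) < MAX_JOB_SIZE_BYTES)
		return -EWOULDBLOCK;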

> 
> > ---
> >   drivers/gpu/drm/xe/xe_engine.c | 4 +++-
> >   1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_engine.c b/drivers/gpu/drm/xe/xe_engine.c
> > index b3036c4a8ec3..47c9f119fb29 100644
> > --- a/drivers/gpu/drm/xe/xe_engine.c
> > +++ b/drivers/gpu/drm/xe/xe_engine.c
> > @@ -29,6 +29,7 @@ static struct xe_engine *__xe_engine_create(struct xe_device *xe,
> >   {
> >   	struct xe_engine *e;
> >   	struct xe_gt *gt = hwe->gt;
> > +	u32 ring_size;
> >   	int err;
> >   	int i;
> > @@ -65,8 +66,9 @@ static struct xe_engine *__xe_engine_create(struct xe_device *xe,
> >   		e->bind.fence_seqno = XE_FENCE_INITIAL_SEQNO;
> >   	}
> > +	ring_size = (e->class == XE_ENGINE_CLASS_COMPUTE) ? SZ_512K : SZ_16K;

SZ_512K seems like overkill. Also we'd want to check whether the VM is
long running, as Thomas is correct that this isn't needed for normal
VMs.
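
i.e. something along these lines, as a sketch only (the
xe_vm_no_dma_fences() helper and the SZ_64K value are my assumptions,
not a tested size):

	/*
	 * Sketch of the suggested tweak: only bump the ring size for
	 * long running VMs (compute / fault mode), and use something
	 * smaller than SZ_512K.
	 */
	ring_size = (vm && xe_vm_no_dma_fences(vm)) ? SZ_64K : SZ_16K;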

Matt

> >   	for (i = 0; i < width; ++i) {
> > -		err = xe_lrc_init(e->lrc + i, hwe, e, vm, SZ_16K);
> > +		err = xe_lrc_init(e->lrc + i, hwe, e, vm, ring_size);
> >   		if (err)
> >   			goto err_lrc;
> >   	}

