[Mesa-dev] [PATCH 9/9] i965/drm: Add stall warnings when mapping or waiting on BOs.

Kenneth Graunke kenneth at whitecape.org
Mon Apr 10 17:29:50 UTC 2017


On Monday, April 10, 2017 1:31:11 AM PDT Chris Wilson wrote:
> On Mon, Apr 10, 2017 at 10:09:17AM +0200, Daniel Vetter wrote:
> > On Mon, Apr 10, 2017 at 12:18:54AM -0700, Kenneth Graunke wrote:
> > > diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > > index 8ccc5a276b9..6e4b55cf9ec 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > > +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > > @@ -100,7 +100,7 @@ intel_batchbuffer_reset(struct intel_batchbuffer *batch,
> > >  
> > >     batch->bo = brw_bo_alloc(bufmgr, "batchbuffer", BATCH_SZ, 4096);
> > >     if (has_llc) {
> > > -      brw_bo_map(batch->bo, true);
> > > +      brw_bo_map(NULL, batch->bo, true);
> > 
> > Why NULL here? Mapping a fresh buffer might incur a clflush, which isn't
> > cheap. I think for atom tuning you want to hear about those.
> 
> I thought it was because there is no brw pointer at this point.

Chris is right - there's no brw pointer so we can't report anything.
We could easily plumb one through, but I was lazy.  I figured this
already gives a ton more coverage than we used to have, and it wasn't
that interesting of a case.
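
To make the NULL case concrete, here is a rough, self-contained sketch of
the pattern. It is not the code from the patch: struct fake_ctx is a
made-up stand-in for the driver context, and report_stall(), timed_op()
and sleep_100us() are hypothetical names. The reporting helper simply
declines to say anything when it has no context to report to.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define STALL_THRESHOLD_NS 10000ull /* 10us, the threshold from this thread */

/* Hypothetical stand-in for the driver context; NOT the real brw_context. */
struct fake_ctx {
   bool perf_debug;      /* whether stall reports are wanted */
   unsigned stall_count; /* number of stalls reported so far */
};

static uint64_t
now_ns(void)
{
   struct timespec ts;
   clock_gettime(CLOCK_MONOTONIC, &ts);
   return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Emit a warning if the operation took longer than the threshold.
 * With ctx == NULL there is nowhere to report, so stay silent.
 * Returns true if a warning was emitted.
 */
static bool
report_stall(struct fake_ctx *ctx, const char *bo_name,
             const char *action, uint64_t elapsed_ns)
{
   if (ctx == NULL || !ctx->perf_debug)
      return false;
   if (elapsed_ns <= STALL_THRESHOLD_NS)
      return false;

   ctx->stall_count++;
   fprintf(stderr, "%s a \"%s\" BO stalled and took %.03f ms.\n",
           action, bo_name, elapsed_ns / 1e6);
   return true;
}

/* Time a blocking operation (a callback standing in for the map or wait)
 * and report it if it stalled.
 */
static bool
timed_op(struct fake_ctx *ctx, const char *bo_name, const char *action,
         void (*op)(void *), void *data)
{
   assert(op != NULL);
   uint64_t start = now_ns();
   op(data);
   return report_stall(ctx, bo_name, action, now_ns() - start);
}

/* Example "slow" operation: sleeps 100us, well over the threshold. */
static void
sleep_100us(void *data)
{
   (void)data;
   struct timespec req = { .tv_sec = 0, .tv_nsec = 100000 };
   nanosleep(&req, NULL);
}
```

Calling timed_op(NULL, "batchbuffer", "Mapping", sleep_100us, NULL) stays
quiet even though the operation is slow, while the same call with a
context that has perf_debug set prints a warning.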

> For !llc, please do a WB mapping of the batch on first use, then a WC
> mapping thereafter. The clflush at execbuf is "free" - or rather it is
> done asynchronously, after taking advantage of the WB for any fixups
> required. Afterwards, you want to avoid clflushing which is where the
> pwrite was useful but now you can use the WC mmap to avoid the penalty
> of performing a copy and avoiding the WB/clflush 2-pass.
> 
> In general, does 10us resolution require compensation for clock_gettime()
> overhead and checking against clock_getres()?

FWIW, I copied the 10us threshold from your brw-batch series.  I'm happy
to adjust it.

On my system, clock_getres() reports a resolution of 1 nanosecond for
both CLOCK_MONOTONIC and CLOCK_MONOTONIC_RAW, so against a
10us = 10000ns threshold I doubt we need to compensate for clock
resolution.
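
A quick way to check both numbers on a given machine is sketched below.
The helper names are made up for illustration, and the "slop" estimate
(resolution plus the two timestamps a measurement needs) is just one
plausible way to compare against the threshold, not something from the
patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for CLOCK_MONOTONIC_RAW on Linux */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define STALL_THRESHOLD_NS 10000ull /* 10us */

static uint64_t
timespec_to_ns(const struct timespec *ts)
{
   return (uint64_t)ts->tv_sec * 1000000000ull + (uint64_t)ts->tv_nsec;
}

/* Resolution reported for a clock, in ns (0 if the query fails). */
static uint64_t
clock_resolution_ns(clockid_t id)
{
   struct timespec res;
   if (clock_getres(id, &res) != 0)
      return 0;
   return timespec_to_ns(&res);
}

/* Mean cost in ns of one clock_gettime() call.  Between the two
 * timestamps we make iters scratch calls plus the final call itself,
 * hence the division by iters + 1.
 */
static uint64_t
clock_gettime_overhead_ns(clockid_t id, unsigned iters)
{
   struct timespec start, end, scratch;
   assert(iters > 0);

   clock_gettime(id, &start);
   for (unsigned i = 0; i < iters; i++)
      clock_gettime(id, &scratch);
   clock_gettime(id, &end);

   return (timespec_to_ns(&end) - timespec_to_ns(&start)) / (iters + 1);
}

/* Rough upper bound on the error of one start/end measurement:
 * the clock resolution plus two clock_gettime() calls.
 */
static uint64_t
measurement_slop_ns(clockid_t id)
{
   return clock_resolution_ns(id) + 2 * clock_gettime_overhead_ns(id, 1000);
}
```

If measurement_slop_ns(CLOCK_MONOTONIC_RAW) comes out far below
STALL_THRESHOLD_NS, compensation is probably not worth the trouble.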

> (I hope getime is using MONOTONIC_RAW!)

It isn't.  It should.  I'll send a patch.
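
Roughly what I have in mind, as a sketch only: get_time_raw() is a
hypothetical name, and the real helper's signature may differ.
CLOCK_MONOTONIC_RAW is Linux-specific and is not subject to NTP
adjustments, so the sketch falls back to CLOCK_MONOTONIC where it is not
defined.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for CLOCK_MONOTONIC_RAW on Linux */
#endif
#include <assert.h>
#include <time.h>

/* Hypothetical helper: current time in seconds from a monotonic clock,
 * preferring the raw clock when the platform provides it.
 */
static double
get_time_raw(void)
{
   struct timespec ts;
#ifdef CLOCK_MONOTONIC_RAW
   clockid_t id = CLOCK_MONOTONIC_RAW;
#else
   clockid_t id = CLOCK_MONOTONIC;
#endif
   int ret = clock_gettime(id, &ts);
   assert(ret == 0);
   (void)ret;

   return ts.tv_sec + ts.tv_nsec * 1e-9;
}
```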

> Longer term feeding the callsite down to set-domain is
> useful to make diagnosing the problem easier. I tried to give the name
> as being the closest GL entry point along with the function/line of the
> culprit.

Yeah.  bo->name is often enough to find the offender, but it'd
definitely be nicer to pass all that through.  Lots of "miptree" BOs
and lots of ways those can go wrong.
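
One possible shape for passing the callsite down, purely as an
illustration: struct callsite, CALLSITE(), format_stall() and
example_upload_path() are all hypothetical names, and nothing like this
is in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record of who asked for the stalling operation. */
struct callsite {
   const char *gl_entry; /* closest GL entry point, named by the caller */
   const char *func;     /* C function, from __func__ */
   int line;             /* source line, from __LINE__ */
};

/* Capture the current function and line along with a GL entry point. */
#define CALLSITE(gl_entry_name) \
   ((struct callsite){ (gl_entry_name), __func__, __LINE__ })

/* Format a stall message that names both the BO and the culprit.
 * Returns what snprintf() returns.
 */
static int
format_stall(char *buf, size_t len, const char *bo_name,
             double ms, struct callsite cs)
{
   assert(buf != NULL);
   return snprintf(buf, len, "%s: stalled %.3f ms on \"%s\" BO (%s:%d)",
                   cs.gl_entry, ms, bo_name, cs.func, cs.line);
}

/* Example caller: the callsite records this function's name and line. */
static struct callsite
example_upload_path(void)
{
   return CALLSITE("glTexSubImage2D");
}
```

With that, two stalls on "miptree" BOs coming from different paths would
at least be distinguishable in the output.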