[Mesa-dev] [PATCH 4/4] i965: Drop non-LLC lunacy in the program cache code.

Wed Jul 12 12:00:43 UTC 2017

Quoting Chris Wilson (2017-07-12 10:40:43)
> Quoting Kenneth Graunke (2017-07-12 08:22:25)
> > The non-LLC story was a horror show.  We uploaded data via pwrite
> > (drm_intel_bo_subdata), which would stall if the cache BO was in
> > use (being read) by the GPU.  Obviously, we wanted to avoid that.
> > So, we tried to detect whether the buffer was busy, and if so, we'd
> > allocate a new BO, map the old one read-only (hopefully not stalling),
> > copy all shaders compiled since the dawn of time to the new buffer,
> > upload our new one, toss the old BO, and let the state upload code
> > know that our program cache BO changed.  This was a lot of extra data
> > copying, and flagging BRW_NEW_PROGRAM_CACHE would also cause a new
> > STATE_BASE_ADDRESS to be emitted, stalling the entire pipeline.
> > 
> > Not only that, but our rudimentary busy tracking consisted of a flag
> > set at execbuf time, and not cleared until we threw out the program
> > cache BO.  So, the first shader upload after any drawing would hit this
> > "abandon the cache and start over" copying path.
> > 
> > None of this is necessary - it's just ancient crufty code.  We can
> > use the same persistent mapping paths on all platforms.  On non-LLC
> > platforms, this should use a write combining map, which should be
> > decently fast.  (On ancient kernels, this will fall through to an
> > uncached GTT map, which will be less efficient, but...upgrade your
> > kernel, seriously...)
> > 
> > This is not only better, but the code is significantly simpler.
> 
> Another on the insta-kill list is the handling of !llc batches.

Ah, it looks like some of the older gens may be using readback from the
batch when constructing their brw_state_batch(). So some work may be
required first.
-Chris
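
A minimal sketch of the difference the quoted commit message describes,
assuming a toy model rather than the real driver: "fake_bo", "fake_cache",
"upload_old" and "upload_persistent" are hypothetical names, the "map" is
plain heap memory, and the sticky busy flag stands in for the one set at
execbuf time. It only counts how many bytes each scheme moves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a buffer object; not a real driver type. */
struct fake_bo {
    uint8_t *map;   /* models a persistent CPU mapping */
    size_t size;
    int busy;       /* sticky flag: "the GPU may be reading this BO" */
};

struct fake_cache {
    struct fake_bo bo;
    size_t next_offset;        /* where the next shader is appended */
    size_t bytes_copied;       /* bookkeeping: bytes moved by uploads */
    unsigned bo_replacements;  /* how often the BO was thrown away */
};

void cache_init(struct fake_cache *c, size_t size)
{
    memset(c, 0, sizeof(*c));
    c->bo.map = calloc(1, size);
    assert(c->bo.map != NULL);
    c->bo.size = size;
}

/* Old scheme as described: if the BO is flagged busy, allocate a new
 * one, copy every shader uploaded so far, then add the new shader. */
size_t upload_old(struct fake_cache *c, const void *data, size_t len)
{
    assert(c->next_offset + len <= c->bo.size);
    if (c->bo.busy) {
        uint8_t *fresh = calloc(1, c->bo.size);
        assert(fresh != NULL);
        memcpy(fresh, c->bo.map, c->next_offset);
        c->bytes_copied += c->next_offset;
        free(c->bo.map);
        c->bo.map = fresh;
        c->bo.busy = 0;
        c->bo_replacements++;
    }
    size_t offset = c->next_offset;
    memcpy(c->bo.map + offset, data, len);
    c->bytes_copied += len;
    c->next_offset += len;
    return offset;
}

/* New scheme as described: append through the persistent mapping.
 * Earlier offsets are never rewritten, so the busy flag is ignored. */
size_t upload_persistent(struct fake_cache *c, const void *data, size_t len)
{
    assert(c->next_offset + len <= c->bo.size);
    size_t offset = c->next_offset;
    memcpy(c->bo.map + offset, data, len);
    c->bytes_copied += len;
    c->next_offset += len;
    return offset;
}
```

In this model, setting the busy flag before every upload makes the old
path re-copy the whole cache each time, while the persistent path only
ever writes the new shader. The model says nothing about map types
(write-combining vs. GTT) or about STATE_BASE_ADDRESS re-emission.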