[Intel-gfx] [PATCH 00/68] Broadwell 48b addressing and prelocations (no relocs)

Jesse Barnes jbarnes at virtuousgeek.org
Tue Aug 26 00:42:39 CEST 2014


On Fri, 22 Aug 2014 22:38:22 +0200
Daniel Vetter <daniel at ffwll.ch> wrote:

> On Fri, Aug 22, 2014 at 3:38 PM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > On Fri, Aug 22, 2014 at 03:30:12PM +0200, Daniel Vetter wrote:
> >> On Fri, Aug 22, 2014 at 9:03 AM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> >> >> > > If a GPU
> >> >> > > client uses only prelocations, the relocation process can be entirely
> >> >> > > skipped. This sounds like a big win initially,
> >> >> >
> >> >> > Close to zero if the client uses existing interfaces.
> >> >> > -Chris
> >> >>
> >> >> Chris,
> >> >>
> >> >> I don't know if you've seen Ben's libdrm and Mesa patches, but with a few patches to libdrm and virtually zero Mesa changes, he's apparently eliminated our need to do any relocations for the 3D driver.  It wasn't invasive at all---I was surprised.
> >> >
> >> > Indeed, you could do everything inside libdrm with the code I posted 2
> >> > years ago.
> >>
> >> I915_EXEC_NO_RELOC can be used to tell the kernel that it doesn't need
> >> to walk all the reloc tables (if nothing moved) because userspace
> >> didn't go insane and reuse reloc trees. So you'd need to implement a
> >> flag + a libdrm function to set that (iirc mesa has been non-stupid
> >> for years). And yeah I kinda expect any new reloc-less thing to get
> >> benchmarked against an implementation using that, since the 48bit
> >> specific thing proposed looks like a fairly short-lived stop-gap, and
> >> since the current no-reloc we already have would work everywhere. And
> >> yeah I've been poking people to look at this for years, too.
> >
> > Here, I was referring to soft-pinning. The API here is essentially
> > comprised of two parts:
> >
> > 1: a pin into the vm upon creation
> > 2: implicit no-relocation upon execbuffer
> >
> > By making those two steps independent, the API, as I see it, is more
> > flexible and powerful.
> 
> Well I admit to not having read the patches over the terrible wifi
> here, but I presumed Ben's patches did implement softpin. I guess I've
> made a mess of all of this now. In any case I still want to see
> relative improvements over what we have since the prelocated stuff
> looks like a gen8 oneshot. And we still can't do relocation-less
> execbuf because the gpu can't fault, so I'm not sure at all whether
> this is actually useful for opencl 2.0.

It is.  OCL 2.0 has two modes of operation: bufferless and buffered.
Both modes require CPU/GPU pointer sharing, but in the latter case,
for us, the kernel GPU driver will be involved in all allocations.

I'm not sure whether this is BDW only either, so don't shoot it down or
discount it based on that.
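
As an aside, the I915_EXEC_NO_RELOC fast path Daniel describes above can
be modeled roughly as below. This is only a sketch: the struct and
function names are made up for illustration and are not the i915 uAPI,
and falling back to a full walk as soon as any buffer has moved is a
simplifying assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one execbuffer object entry (not the real uAPI struct). */
struct model_exec_object {
	uint64_t presumed_offset; /* where userspace believes the buffer is bound */
	uint64_t actual_offset;   /* where the kernel actually has it bound */
	size_t reloc_count;       /* entries in this object's relocation table */
};

/*
 * Return how many relocation entries the kernel would have to walk.
 * With no_reloc set and every presumed offset still valid, the walk is
 * skipped entirely; otherwise this model walks every table.
 */
size_t model_relocs_to_walk(const struct model_exec_object *objs, size_t n,
			    bool no_reloc)
{
	size_t total = 0;
	bool moved = false;

	for (size_t i = 0; i < n; i++) {
		total += objs[i].reloc_count;
		if (objs[i].presumed_offset != objs[i].actual_offset)
			moved = true;
	}

	if (no_reloc && !moved)
		return 0; /* nothing moved: no tables to walk */

	return total;
}
```

In this model the flag only pays off when userspace keeps its presumed
offsets accurate, which is why it makes a fair baseline to benchmark any
new reloc-less scheme against.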

-- 
Jesse Barnes, Intel Open Source Technology Center


