[Intel-gfx] [PATCH] intel: New libdrm interface to create unbound wc user mappings for objects

Chris Wilson chris at chris-wilson.co.uk
Sun Oct 26 09:41:20 CET 2014


On Sun, Oct 26, 2014 at 02:06:57PM +0530, Akash Goel wrote:
> On Sat, 2014-10-25 at 13:45 +0100, Damien Lespiau wrote:
> > On Sat, Oct 25, 2014 at 05:21:53PM +0530, akash.goel at intel.com wrote:
> > > From: Akash Goel <akash.goel at intel.com>
> > > 
> > > A new libdrm interface 'drm_intel_gem_bo_map_wc' is provided by this
> > > patch. Through this interface Gfx clients can create write combining
> > > virtual mappings of the Gem object. It will provide the same funtionality
> > > of 'mmap_gtt' interface without the constraints of limited aperture space,
> > > but provided clients handles the linear to tile conversion on their own.
> > > This patch is intended for improving the CPU write operation performance,
> > > as with such mapping, writes are almost 50% faster than with mmap_gtt.
> > > Also it avoids the Cache flush after update from CPU side, when object is
> > > passed on to GPU, which will be the case if regular mmap interface is used.
> > > This type of mapping is specially useful in case of sub-region
> > > update, i.e. when only a portion of the object is to be updated.
> > > Also there is a support for the unsynchronized version of this interface
> > > named 'drm_intel_gem_bo_map_wc_unsynchronized', through which wc virtual
> > > mappings, but unsynchronized one, can be created of the Gem object.
> > > To ensure the cache coherency, before using this mapping, the GTT domain has
> > > been reused here. This provides the required Cache flush if the object is in
> > > CPU domain or synchronization against the concurrent rendering
> > > 
> > > The drm_i915_gem_mmap structure (for the DRM_I915_GEM_MMAP_IOCTL) has been
> > > extended with a new flags field (defaulting to 0 for existent users). In
> > > order for userspace to detect the extended ioctl, a new parameter
> > > I915_PARAM_HAS_EXT_MMAP has been added for versioning the ioctl interface.
> > > 
> > > v2: Aligned with the v2 of the corresponding kernel patch (Chris)
> > > 
> > > Signed-off-by: Akash Goel <akash.goel at intel.com>
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>

I don't get s-o-b for header contributions ;-)

> > From a quick glance at the patch:
> > 
> > It looks like you copy/pasted the map() implementation and changed a few
> > things here and there instead of adding flag to drm_intel_gem_bo_map()
> > and reusing the current code.
> > 
> > Can we expose another version of map that takes flags (_WC,
> > _UNSYNCHRONIZED, ...) instead of starting to have every single
> > combination possible?
> > 
> 
> Thanks for the review. Yes its almost a copy-paste of the 'mmap_gtt'
> implementation.
> Yes adding a new flag parameter could be an alternative.
> 
> > Do we really need a separate mem_wc_virtual? Using _map() or _map_wc()
> > in an exclusively way makes some sense to me.
> > 
> 
> Initially made _map() & _map_wc() exclusive to each other. Had a
> discussion with Chris, he suggested to maintain all 3 mappings i.e allow
> them to co-exist.  

For example, in the ddx we may want to use either GTT or WC mapping
depending on the actual path. WC if we know we can use manual detiling,
otherwise we use GTT and benefit from the linear view (or we have to
wrap all tiled access with much slower accessors). We may even mix
WC/CPU mappings - e.g. first write through the CPU mapping whilst we
know the object is still in the CPU domain, and then switch to updating
the object when it is on the GPU using WC. (Again see examples in the
ddx). I can foresee that we can benefit from keeping each of the
mmappings cached.

> > With the introduction of mem_wc_virtual, you left aside the vma cache
> > handling and so we'll never end up unmapping it, drm_intel_gem_bo_free()
> > doesn't unmap it either, ...
> > 
> Sorry I completely missed this. Thanks for spotting this. Will do the
> needful. 

Also, before trying to use the mmap(wc), you should check that it
exists! Otherwise you just return an ordinary mmap(wb) to the user and
corruption ensues.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre



More information about the Intel-gfx mailing list