[Intel-gfx] [PATCH] drm/i915: Support creation of unbound wc user mappings for objects

Daniel Vetter daniel at ffwll.ch
Thu Nov 6 15:50:58 CET 2014


On Wed, Nov 05, 2014 at 12:48:35PM +0000, Chris Wilson wrote:
> On Thu, Oct 23, 2014 at 05:55:47PM +0100, Chris Wilson wrote:
> > From: Akash Goel <akash.goel at intel.com>
> > 
> > This patch provides support to create write-combining virtual mappings of
> > GEM objects. It intends to provide the same functionality as the 'mmap_gtt'
> > interface without the constraints and contention of a limited aperture
> > space, but requires that clients handle the linear to tile conversion on their
> > own. This is for improving the CPU write operation performance, as with such
> > mapping, writes and reads are almost 50% faster than with mmap_gtt. Similar
> > to the GTT mmapping, unlike the regular CPU mmapping, it avoids the cache
> > flush after update from CPU side, when object is passed onto GPU.  This
> > type of mapping is especially useful in case of sub-region update,
> > i.e. when only a portion of the object is to be updated. Using a CPU mmap
> > in such cases would normally incur a clflush of the whole object, and
> > using a GTT mmapping would likely require eviction of an active object or
> > fence and thus stall. The write-combining CPU mmap avoids both.
> > 
> > To ensure the cache coherency, before using this mapping, the GTT domain
> > has been reused here. This provides the required cache flush if the object
> > is in CPU domain or synchronization against the concurrent rendering.
> > Although the access through an uncached mmap should automatically
> > invalidate the cache lines, this may not be true for non-temporal write
> > instructions and also not all pages of the object may be updated at any
> > given point of time through this mapping.  Having a call to get_pages in
> > set_to_gtt_domain function, as added in the earlier patch 'drm/i915:
> > Broaden application of set-domain(GTT)', would guarantee the clflush and
> > so there will be no cachelines holding the data for the object before it
> > is accessed through this map.
> > 
> > The drm_i915_gem_mmap structure (for the DRM_I915_GEM_MMAP_IOCTL) has been
> > extended with a new flags field (defaulting to 0 for existing users). In
> > order for userspace to detect the extended ioctl, a new parameter
> > I915_PARAM_HAS_EXT_MMAP has been added for versioning the ioctl interface.
> > 
> > v2: Fix error handling, invalid flag detection, renaming (ickle)
> > 
> > The new mmapping is exercised by igt/gem_mmap_wc,
> > igt/gem_concurrent_blit and igt/gem_gtt_speed.
> > 
> > Change-Id: Ie883942f9e689525f72fe9a8d3780c3a9faa769a
> > Signed-off-by: Akash Goel <akash.goel at intel.com>
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> 
> Daniel, since both Akash and myself developed this, we need a third body
> for the review. Tag.

Oh, patch itself looks good imo. It's the testcases that need review
(which Akash could do since you've written them), plus the missing bits
(like nasty invalid args checking for the mmap ioctl), which also Akash or
you could supply. Once that's done (plus the review from Akash on your
patch) I'll vacuum up all the kernel parts.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
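
For illustration only, the kind of invalid-argument check on the new flags
field discussed above might look roughly like the sketch below. This is not
the code from the patch: the struct is a simplified userspace stand-in for
the extended drm_i915_gem_mmap, and every sketch_/SKETCH_ name, including
the flag bit and its value, is an assumption made for this example.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag bit requesting a write-combining mapping.
 * The name and value are assumptions for this sketch. */
#define SKETCH_MMAP_WC     (UINT64_C(1) << 0)
/* Mask of every flag bit this sketch understands. */
#define SKETCH_MMAP_KNOWN  (SKETCH_MMAP_WC)

/* Simplified stand-in for the extended ioctl argument. */
struct sketch_gem_mmap {
	uint32_t handle;
	uint32_t pad;
	uint64_t offset;
	uint64_t size;
	uint64_t addr_ptr;
	uint64_t flags;    /* new field; 0 for existing users */
};

/* Reject any flag bit outside the known mask, so that unknown bits
 * stay available for future extensions of the interface. */
static int sketch_check_mmap_args(const struct sketch_gem_mmap *arg)
{
	if (arg->flags & ~SKETCH_MMAP_KNOWN)
		return -EINVAL;
	return 0;
}

/* Non-zero when the caller asked for a write-combining mapping. */
static int sketch_wants_wc(const struct sketch_gem_mmap *arg)
{
	return (arg->flags & SKETCH_MMAP_WC) != 0;
}
```

With flags left at 0 the check passes and the legacy cached mapping is
implied; setting any bit outside the known mask yields -EINVAL, which is
the behaviour an invalid-args test for the ioctl would want to exercise.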


