[Intel-gfx] [PATCH 06/10] drm/i915: Stop using AGP layer for GEN6+

Thu Oct 25 22:54:29 CEST 2012

On Tue, 23 Oct 2012 07:57:12 -0700
Ben Widawsky <ben at bwidawsk.net> wrote:

> On 2012-10-23 02:59, Chris Wilson wrote:
> > On Mon, 22 Oct 2012 18:34:11 -0700, Ben Widawsky <ben at bwidawsk.net> wrote:
> >> +/*
> >> + * Binds an object into the global gtt with the specified cache level. The object
> >> + * will be accessible to the GPU via commands whose operands reference offsets
> >> + * within the global GTT as well as accessible by the GPU through the GMADR
> >> + * mapped BAR (dev_priv->mm.gtt->gtt).
> >> + */
> >> +static void gen6_ggtt_bind_object(struct drm_i915_gem_object *obj,
> >> +				  enum i915_cache_level level)
> >> +{
> >> +	struct drm_device *dev = obj->base.dev;
> >> +	struct drm_i915_private *dev_priv = dev->dev_private;
> >> +	struct sg_table *st = obj->pages;
> >> +	struct scatterlist *sg = st->sgl;
> >> +	const int first_entry = obj->gtt_space->start >> PAGE_SHIFT;
> >> +	const int max_entries = dev_priv->mm.gtt->gtt_total_entries - first_entry;
> >> +	gtt_pte_t __iomem *gtt_entries = dev_priv->mm.gtt->gtt + first_entry;
> >> +	int unused, i = 0;
> >> +	unsigned int len, m = 0;
> >> +
> >> +	for_each_sg(st->sgl, sg, st->nents, unused) {
> >> +		len = sg_dma_len(sg) >> PAGE_SHIFT;
> >> +		for (m = 0; m < len; m++) {
> >> +			dma_addr_t addr = sg_dma_address(sg) + (m << PAGE_SHIFT);
> >> +			gtt_entries[i] = pte_encode(dev, addr, level);
> >> +			i++;
> >> +			if (WARN_ON(i > max_entries))
> >> +				goto out;
> >> +		}
> >> +	}
> >> +
> >> +out:
> >> +	/* XXX: This serves as a posting read preserving the way the old code
> >> +	 * works. It's not clear if this is strictly necessary or just voodoo
> >> +	 * based on what I've tried to gather from the docs.
> >> +	 */
> >> +	readl(&gtt_entries[i-1]);
> >
> > It will be required until we replace the voodoo with more explicit mb().
> > -Chris
> 
> Actually, after we introduce the FLSH_CNTL patch from Jesse/me later in 
> the series, I think we just want a POSTING_READ on that register. It is 
> technically "required" by our desire to some day WC the registers, and 
> should synchronize everything else for us.
> 
> After a quick read of memory_barriers.txt (again), I think mmiowb is 
> actually what we might want in addition to the POSTING_READ I'd add.

On a big NUMA system maybe (i.e. on nothing we run on yet), but on x86
mmiowb doesn't do anything other than act as a compiler optimization
barrier.
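
To make the distinction concrete, here is a minimal userspace sketch, not
kernel code and not the patch itself. It assumes GCC/Clang inline asm syntax,
and every sketch_* name, the 4 KiB page shift, and the PTE encoding are
hypothetical. A plain volatile array stands in for the ioremapped GTT, so the
read-back only illustrates where a posting read would sit; it cannot show real
posted-write flushing. The compiler_barrier() macro is the kind of
compiler-only barrier described above: it emits no instruction at all.

```c
#include <stddef.h>
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang syntax): emits no instruction, but stops
 * the compiler from reordering or caching memory accesses across it. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

#define SKETCH_PAGE_SHIFT 12   /* hypothetical: 4 KiB pages */
#define SKETCH_PTE_VALID  0x1u /* hypothetical valid bit */

/* Hypothetical encoding: page-aligned address plus a valid bit. */
static uint32_t sketch_pte_encode(uint32_t addr)
{
	return (addr & ~((1u << SKETCH_PAGE_SHIFT) - 1)) | SKETCH_PTE_VALID;
}

/*
 * Write one entry per page into `table`, never exceeding `max_entries`.
 * `table` is a plain volatile array standing in for the mapped GTT.
 * Returns the number of entries written.
 */
static size_t sketch_fill_ptes(volatile uint32_t *table, size_t max_entries,
			       uint32_t base_addr, size_t npages)
{
	size_t i;

	/* The bound is checked before each store, not after it. */
	for (i = 0; i < npages && i < max_entries; i++)
		table[i] = sketch_pte_encode(base_addr +
					     ((uint32_t)i << SKETCH_PAGE_SHIFT));

	/* Keep the compiler from moving later accesses above the stores. */
	compiler_barrier();

	/* Read back the last entry written, mimicking the readl() in the
	 * patch; skipped when nothing was written so table[-1] is never
	 * touched. */
	if (i > 0)
		(void)table[i - 1];

	return i;
}
```

Calling sketch_fill_ptes(table, 4, 0x10000, 2) on a zeroed four-entry table
writes two entries and leaves the rest untouched; asking for more pages than
max_entries simply stops at the bound.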

-- 
Jesse Barnes, Intel Open Source Technology Center