[Intel-gfx] [PATCH] drm/i915: Setup all page directories for gen8

Ville Syrjälä ville.syrjala at linux.intel.com
Wed Mar 4 11:14:39 PST 2015


On Wed, Mar 04, 2015 at 02:55:17PM +0200, Mika Kuoppala wrote:
> If the requested size is less than what the full range
> of pdps can address, we end up setting pdps for only the
> requested area.
> 
> The logical context however needs all pdp entries to be valid.
> Prior to commit 06fda602dbca ("drm/i915: Create page table allocators")
> we have been writing pdp entries with dma address of zero instead
> of valid pdps. This is supposedly bad even if those pdps are not
> addressed.
> 
> As commit 06fda602dbca ("drm/i915: Create page table allocators")
> introduced more dynamic structure for pdps, we ended up oopsing
> when we populated the lrc context. Analyzing this oops revealed
> the fact that we have not been writing valid pdps with bsw, as
> it is doing the ppgtt init with 2GB limit in some cases.
> 
> We should do the right thing and setup the non addressable part
> pdps/pde/pte to scratch page through the minimal structure by
> having just pdp with pde entries pointing to same page with
> pte entries pointing to scratch page.
> 
> But instead of going through that trouble, setup all the pdps
> through individual pd pages and pt entries, even for non
> addressable parts. And let the clear range point them to scratch
> page. This way we populate the lrc with valid pdps and wait
> for dynamic page allocation work to land, and do the heavy lifting
> for truncating page table tree according to usage.
> 
> The regression of oopsing in init was introduced by
> commit 06fda602dbca ("drm/i915: Create page table allocators")
> 
> v2: Clear the range for the unused part also (Ville)
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89350
> Cc: Michel Thierry <michel.thierry at intel.com>
> Cc: Ben Widawsky <benjamin.widawsky at intel.com>
> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> Tested-by: Valtteri Rantala <valtteri.rantala at intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>

Seems sane enough assuming we're willing to pay the cost.
Reviewed-by: Ville Syrjälä <ville.syrjala at linux.intel.com>

And then we could also throw another patch on top to actually increase the
PPGTT size to 4GiB since there's no extra cost anymore :) Well, assuming
everything else is 4GiB safe. But I guess it ought to be if BDW machines
have already been running with 4GiB GGTT/PPGTT.

> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index bd95776..1aa2a2c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -716,15 +716,19 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
>  	if (size % (1<<30))
>  		DRM_INFO("Pages will be wasted unless GTT size (%llu) is divisible by 1GB\n", size);
>  
> -	/* 1. Do all our allocations for page directories and page tables. */
> -	ret = gen8_ppgtt_alloc(ppgtt, max_pdp);
> +	/* 1. Do all our allocations for page directories and page tables.
> +	 * We allocate more than was asked so that we can point the unused parts
> +	 * to valid entries that point to scratch page. Dynamic page tables
> +	 * will fix this eventually.
> +	 */
> +	ret = gen8_ppgtt_alloc(ppgtt, GEN8_LEGACY_PDPES);
>  	if (ret)
>  		return ret;
>  
>  	/*
>  	 * 2. Create DMA mappings for the page directories and page tables.
>  	 */
> -	for (i = 0; i < max_pdp; i++) {
> +	for (i = 0; i < GEN8_LEGACY_PDPES; i++) {
>  		ret = gen8_ppgtt_setup_page_directories(ppgtt, i);
>  		if (ret)
>  			goto bail;
> @@ -744,7 +748,7 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
>  	 * plugged in correctly. So we do that now/here. For aliasing PPGTT, we
>  	 * will never need to touch the PDEs again.
>  	 */
> -	for (i = 0; i < max_pdp; i++) {
> +	for (i = 0; i < GEN8_LEGACY_PDPES; i++) {
>  		struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[i];
>  		gen8_ppgtt_pde_t *pd_vaddr;
>  		pd_vaddr = kmap_atomic(ppgtt->pdp.page_directory[i]->page);
> @@ -764,9 +768,14 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
>  	ppgtt->base.insert_entries = gen8_ppgtt_insert_entries;
>  	ppgtt->base.cleanup = gen8_ppgtt_cleanup;
>  	ppgtt->base.start = 0;
> -	ppgtt->base.total = ppgtt->num_pd_entries * GEN8_PTES_PER_PAGE * PAGE_SIZE;
>  
> -	ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true);
> +	/* This is the area that we advertise as usable for the caller */
> +	ppgtt->base.total = max_pdp * GEN8_PDES_PER_PAGE * GEN8_PTES_PER_PAGE * PAGE_SIZE;
> +
> +	/* Set all ptes to a valid scratch page. Also above requested space */
> +	ppgtt->base.clear_range(&ppgtt->base, 0,
> +				ppgtt->num_pd_pages * GEN8_PTES_PER_PAGE * PAGE_SIZE,
> +				true);
>  
>  	DRM_DEBUG_DRIVER("Allocated %d pages for page directories (%d wasted)\n",
>  			 ppgtt->num_pd_pages, ppgtt->num_pd_pages - max_pdp);
> -- 
> 1.9.1

-- 
Ville Syrjälä
Intel OTC


More information about the Intel-gfx mailing list