[Intel-gfx] [PATCH] drm/i915: Setup all page directories for gen8
Ville Syrjälä
ville.syrjala at linux.intel.com
Wed Mar 4 11:14:39 PST 2015
On Wed, Mar 04, 2015 at 02:55:17PM +0200, Mika Kuoppala wrote:
> If the requested size is less than what the full range
> of pdps can address, we end up setting pdps for only the
> requested area.
>
> The logical context however needs all pdp entries to be valid.
> Prior to commit 06fda602dbca ("drm/i915: Create page table allocators")
> we have been writing pdp entries with dma address of zero instead
> of valid pdps. This is supposedly bad even if those pdps are not
> addressed.
>
> As commit 06fda602dbca ("drm/i915: Create page table allocators")
> introduced more dynamic structure for pdps, we ended up oopsing
> when we populated the lrc context. Analyzing this oops revealed
> the fact that we have not been writing valid pdps with bsw, as
> it is doing the ppgtt init with 2GB limit in some cases.
>
> We should do the right thing and setup the non addressable part
> pdps/pde/pte to scratch page through the minimal structure by
> having just pdp with pde entries pointing to same page with
> pte entries pointing to scratch page.
>
> But instead of going through that trouble, setup all the pdps
> through individual pd pages and pt entries, even for non
> addressable parts. And let the clear range point them to scratch
> page. This way we populate the lrc with valid pdps and wait
> for dynamic page allocation work to land, and do the heavy lifting
> for truncating page table tree according to usage.
>
> The regression of oopsing in init was introduced by
> commit 06fda602dbca ("drm/i915: Create page table allocators")
>
> v2: Clear the range for the unused part also (Ville)
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89350
> Cc: Michel Thierry <michel.thierry at intel.com>
> Cc: Ben Widawsky <benjamin.widawsky at intel.com>
> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> Tested-by: Valtteri Rantala <valtteri.rantala at intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
Seems sane enough assuming we're willing to pay the cost.
Reviewed-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
And then we could also throw another patch on top to actually increase the
PPGTT size to 4GiB since there's no extra cost anymore :) Well, assuming
everything else is 4GiB safe. But I guess it ought to be if BDW machines
have already been running with 4GiB GGTT/PPGTT.
> ---
> drivers/gpu/drm/i915/i915_gem_gtt.c | 21 +++++++++++++++------
> 1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index bd95776..1aa2a2c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -716,15 +716,19 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
> if (size % (1<<30))
> DRM_INFO("Pages will be wasted unless GTT size (%llu) is divisible by 1GB\n", size);
>
> - /* 1. Do all our allocations for page directories and page tables. */
> - ret = gen8_ppgtt_alloc(ppgtt, max_pdp);
> + /* 1. Do all our allocations for page directories and page tables.
> + * We allocate more than was asked so that we can point the unused parts
> + * to valid entries that point to scratch page. Dynamic page tables
> + * will fix this eventually.
> + */
> + ret = gen8_ppgtt_alloc(ppgtt, GEN8_LEGACY_PDPES);
> if (ret)
> return ret;
>
> /*
> * 2. Create DMA mappings for the page directories and page tables.
> */
> - for (i = 0; i < max_pdp; i++) {
> + for (i = 0; i < GEN8_LEGACY_PDPES; i++) {
> ret = gen8_ppgtt_setup_page_directories(ppgtt, i);
> if (ret)
> goto bail;
> @@ -744,7 +748,7 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
> * plugged in correctly. So we do that now/here. For aliasing PPGTT, we
> * will never need to touch the PDEs again.
> */
> - for (i = 0; i < max_pdp; i++) {
> + for (i = 0; i < GEN8_LEGACY_PDPES; i++) {
> struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[i];
> gen8_ppgtt_pde_t *pd_vaddr;
> pd_vaddr = kmap_atomic(ppgtt->pdp.page_directory[i]->page);
> @@ -764,9 +768,14 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
> ppgtt->base.insert_entries = gen8_ppgtt_insert_entries;
> ppgtt->base.cleanup = gen8_ppgtt_cleanup;
> ppgtt->base.start = 0;
> - ppgtt->base.total = ppgtt->num_pd_entries * GEN8_PTES_PER_PAGE * PAGE_SIZE;
>
> - ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true);
> + /* This is the area that we advertise as usable for the caller */
> + ppgtt->base.total = max_pdp * GEN8_PDES_PER_PAGE * GEN8_PTES_PER_PAGE * PAGE_SIZE;
> +
> + /* Set all ptes to a valid scratch page. Also above requested space */
> + ppgtt->base.clear_range(&ppgtt->base, 0,
> + ppgtt->num_pd_pages * GEN8_PTES_PER_PAGE * PAGE_SIZE,
> + true);
>
> DRM_DEBUG_DRIVER("Allocated %d pages for page directories (%d wasted)\n",
> ppgtt->num_pd_pages, ppgtt->num_pd_pages - max_pdp);
> --
> 1.9.1
--
Ville Syrjälä
Intel OTC
More information about the Intel-gfx
mailing list