[Intel-gfx] [PATCH 1/7] drm/i915: preallocate pdps for 32 bit vgpu
Joonas Lahtinen
joonas.lahtinen at linux.intel.com
Thu Aug 20 03:56:20 PDT 2015
Hi,
I've also added Michel and Dave to CC so that they notice this, since
the patch lists them as CC.
On Thu, 2015-08-20 at 15:45 +0800, Zhiyuan Lv wrote:
> This is based on Mika Kuoppala's patch below:
> http://article.gmane.org/gmane.comp.freedesktop.xorg.drivers.intel/61104/match=workaround+hw+preload
>
> The patch preallocates the page directories for 32-bit PPGTT when
> i915 runs inside a virtual machine with Intel GVT-g. With this change,
> the root pointers in the EXECLIST context always stay the same.
>
> The change is needed for vGPU because Intel GVT-g does page table
> shadowing and needs to track all page table changes made by the guest
> i915 driver. However, if the guest PPGTT is modified through GPU
> commands like LRI, it is not possible to trap the operations at the
> right time, so it is hard to make the shadow PPGTT work correctly.
>
> Shadow PPGTT becomes much simpler with this change. In addition, the
> hypervisor can simply prohibit any attempt to modify the PPGTT
> through GPU commands, for security.
>
> The function gen8_preallocate_top_level_pdps() in the patch is from
> Mika, with only one change: it sets "used_pdpes" to avoid duplicate
> allocation later.
>
> Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> Cc: Dave Gordon <david.s.gordon at intel.com>
> Cc: Michel Thierry <michel.thierry at intel.com>
> Signed-off-by: Zhiyuan Lv <zhiyuan.lv at intel.com>
> Signed-off-by: Zhi Wang <zhi.a.wang at intel.com>
>
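To illustrate the idea in the commit message, here is a minimal user-space
sketch (not i915 code) of preallocating every top-level entry and recording
the result in a "used" bitmap, so that a later on-demand allocation finds
nothing left to do and the root pointers never change. The names toy_ppgtt,
toy_alloc_range and toy_preallocate are hypothetical; only the four PDPs of
a 32-bit PPGTT and the used_pdpes idea come from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_PDPES     4   /* a 32-bit PPGTT has four top-level entries */
#define TOY_PDP_SHIFT 30  /* each top-level entry covers 1 GiB */

/* Hypothetical stand-in for a PPGTT; not the i915 structure. */
struct toy_ppgtt {
	void *pd[TOY_PDPES];       /* stand-ins for page directories */
	unsigned long used_pdpes;  /* bit n set => pd[n] is allocated */
};

/*
 * On-demand path: allocate only the directories that are not yet marked
 * as used. Returns the number of newly allocated directories, or -1 if
 * an allocation fails.
 */
static int toy_alloc_range(struct toy_ppgtt *p, uint64_t start, uint64_t len)
{
	int fresh = 0;

	assert(p != NULL);
	if (len == 0)
		return 0;

	uint64_t first = start >> TOY_PDP_SHIFT;
	uint64_t last = (start + len - 1) >> TOY_PDP_SHIFT;

	for (uint64_t i = first; i <= last && i < TOY_PDPES; i++) {
		if (p->used_pdpes & (1UL << i))
			continue;  /* already allocated: pointer stays the same */
		p->pd[i] = calloc(1, 4096);
		if (!p->pd[i])
			return -1;
		p->used_pdpes |= 1UL << i;
		fresh++;
	}
	return fresh;
}

/* Init-time path: populate the whole 4 GiB range up front. */
static int toy_preallocate(struct toy_ppgtt *p)
{
	return toy_alloc_range(p, 0, 1ULL << 32);
}
```

After toy_preallocate() has run, any later toy_alloc_range() call inside the
4 GiB range allocates nothing new, which is the property a shadowing
hypervisor would rely on in this simplified model.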
Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
I'm just wondering whether it's worth keeping the LRI method of updating
the PDPs at all, for the sake of a couple of KBs per PPGTT, now that we
have an occasional need to make them static. So this patch is R-b'd,
but I'd suggest discussing removal of the LRI update method in favor of
always using static PDPs for 32-bit.
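For a rough sense of scale, the arithmetic can be sketched as below. It
assumes the usual gen8 legacy layout (4 KiB pages, 512 eight-byte entries
per table) and counts only the top-level page directories, not the page
tables beneath them. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES        4096ULL
#define ENTRIES_PER_TABLE 512ULL

/* A 4 KiB table holds 512 eight-byte entries. */
static_assert(PAGE_BYTES / sizeof(uint64_t) == ENTRIES_PER_TABLE,
	      "table must fill exactly one page");

/* Address space covered by one page directory (one PDP entry): 1 GiB. */
static uint64_t bytes_per_pdp(void)
{
	return ENTRIES_PER_TABLE * ENTRIES_PER_TABLE * PAGE_BYTES;
}

/* Number of PDP entries needed to cover a 32-bit address space. */
static uint64_t pdps_for_32bit(void)
{
	return (1ULL << 32) / bytes_per_pdp();
}

/* Memory held by always keeping every page directory allocated. */
static uint64_t static_pd_bytes(void)
{
	return pdps_for_32bit() * PAGE_BYTES;
}
```

Under these assumptions, static_pd_bytes() evaluates to 16384, i.e. four
page directories of 4 KiB each per 32-bit PPGTT.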
Regards, Joonas
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 33 +++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_lrc.c    |  3 ++-
>  2 files changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 4a76807..ed10e77 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1441,6 +1441,33 @@ static void gen8_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
> }
> }
>
> +static int gen8_preallocate_top_level_pdps(struct i915_hw_ppgtt *ppgtt)
> +{
> +	unsigned long *new_page_dirs, **new_page_tables;
> +	uint32_t pdpes = I915_PDPES_PER_PDP(dev);
> +	int ret;
> +
> +	/* We allocate temp bitmap for page tables for no gain
> +	 * but as this is for init only, lets keep the things simple
> +	 */
> +	ret = alloc_gen8_temp_bitmaps(&new_page_dirs, &new_page_tables, pdpes);
> +	if (ret)
> +		return ret;
> +
> +	/* Allocate for all pdps regardless of how the ppgtt
> +	 * was defined.
> +	 */
> +	ret = gen8_ppgtt_alloc_page_directories(&ppgtt->base, &ppgtt->pdp,
> +						0, 1ULL << 32,
> +						new_page_dirs);
> +	if (!ret)
> +		*ppgtt->pdp.used_pdpes = *new_page_dirs;
> +
> +	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
> +
> +	return ret;
> +}
> +
>  /*
>   * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
>   * with a net effect resembling a 2-level page table in normal x86 terms. Each
> @@ -1484,6 +1511,12 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  		trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base,
>  							      0, 0,
>  							      GEN8_PML4E_SHIFT);
> +
> +		if (intel_vgpu_active(ppgtt->base.dev)) {
> +			ret = gen8_preallocate_top_level_pdps(ppgtt);
> +			if (ret)
> +				goto free_scratch;
> +		}
>  	}
>
>  	return 0;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index e77b6b0..2dc8709 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1540,7 +1540,8 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
>  	 * not needed in 48-bit.*/
>  	if (req->ctx->ppgtt &&
>  	    (intel_ring_flag(req->ring) & req->ctx->ppgtt->pd_dirty_rings)) {
> -		if (!USES_FULL_48BIT_PPGTT(req->i915)) {
> +		if (!USES_FULL_48BIT_PPGTT(req->i915) &&
> +		    !intel_vgpu_active(req->i915->dev)) {
>  			ret = intel_logical_ring_emit_pdps(req);
>  			if (ret)
>  				return ret;