[Intel-gfx] [PATCH 02/18] drm/i915: introduce drm_i915_gem_object page_size members

Wed Apr 5 06:49:17 UTC 2017

On Tue, Apr 04, 2017 at 11:11:12PM +0100, Matthew Auld wrote:
> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c        | 5 +++++
>  drivers/gpu/drm/i915/i915_gem_object.h | 3 +++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 4ca88f2539c0..cbf97f4bbb72 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2441,6 +2441,8 @@ static int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
>  	struct sg_table *pages;
>  
>  	GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj));
> +	GEM_BUG_ON(!is_valid_gtt_page_size(obj->page_size));
> +	GEM_BUG_ON(!is_valid_gtt_page_size(obj->gtt_page_size));
>  
>  	if (unlikely(obj->mm.madv != I915_MADV_WILLNEED)) {
>  		DRM_DEBUG("Attempting to obtain a purgeable object\n");
> @@ -4159,6 +4161,9 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
>  
>  	obj->ops = ops;
>  
> +	obj->page_size = PAGE_SIZE;
> +	obj->gtt_page_size = I915_GTT_PAGE_SIZE;
> +
>  	reservation_object_init(&obj->__builtin_resv);
>  	obj->resv = &obj->__builtin_resv;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
> index 174cf923c236..b1dacbfe5173 100644
> --- a/drivers/gpu/drm/i915/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/i915_gem_object.h
> @@ -107,6 +107,9 @@ struct drm_i915_gem_object {
>  	unsigned int cache_level:3;
>  	unsigned int cache_dirty:1;
>  
> +	unsigned int page_size; /* CPU pov - 4K(default), 2M, 1G */
> +	unsigned int gtt_page_size; /* GPU pov - 4K(default), 64K, 2M, 1G */

Just kind of an architecture review, with a long-term view: Is the plan to
eventually become more flexible here, i.e. allow mixed mode? We can of
course ask shmem to try really hard to give us huge pages, but in the end
it might not be able to give us a huge page (if the obj size isn't rounded
to 2M), and there's no hw reason not to map everything else as huge pages.
Through sg table coalescing we can cope with that, and we can check fairly
cheaply whether an entry is big enough to be eligible for huge page
mapping.

That also means in the pte functions we'd not make a top-level decision
whether to use huge entries or not, but do that at each level by looking
at the sg table. This should also make it easier for stolen, which is
always contiguous but rather often not size-rounded.
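
To make the per-entry decision a bit more concrete, here is a rough userspace
sketch, not driver code. struct chunk, struct map_stats and map_chunks() are
made-up names standing in for a coalesced sg entry and the pte insertion
loop, and it only considers 2M and 4K entries. The assumption is that a 2M
entry needs both the DMA address and the GTT address 2M-aligned, with at
least 2M left in the chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4K (1ULL << 12)
#define SZ_2M (1ULL << 21)

/* Hypothetical stand-in for one coalesced sg entry. */
struct chunk {
	uint64_t dma_addr;	/* start of the contiguous range */
	uint64_t len;		/* length in bytes, multiple of 4K */
};

/* How many entries of each size the walk would have written. */
struct map_stats {
	uint64_t n_2m;
	uint64_t n_4k;
};

/*
 * Walk the chunks and pick the entry size at every step instead of
 * deciding once for the whole object.
 */
static struct map_stats map_chunks(const struct chunk *chunks, size_t n,
				   uint64_t gtt_addr)
{
	struct map_stats st = { 0, 0 };
	size_t i;

	assert(gtt_addr % SZ_4K == 0);

	for (i = 0; i < n; i++) {
		uint64_t addr = chunks[i].dma_addr;
		uint64_t rem = chunks[i].len;

		assert(addr % SZ_4K == 0 && rem % SZ_4K == 0);

		while (rem) {
			uint64_t sz = SZ_4K;

			/* The cheap eligibility check: size and alignment. */
			if (rem >= SZ_2M && addr % SZ_2M == 0 &&
			    gtt_addr % SZ_2M == 0)
				sz = SZ_2M;

			if (sz == SZ_2M)
				st.n_2m++;
			else
				st.n_4k++;

			addr += sz;
			gtt_addr += sz;
			rem -= sz;
		}
	}

	return st;
}
```

With a single contiguous chunk that isn't size-rounded (the stolen case), the
bulk still ends up in 2M entries and only the tail falls back to 4K.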

It's a bit more tricky for 64K pages, but I think those can only be used
for an object which already has huge pages/is contiguous, but where the
size is only rounded to 64K and not 2M (because 2M would waste too much
space). Then we can map the partial 2M using 64K entries.
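
Just the arithmetic for that case, as a sketch: split_contiguous() and struct
split are invented for illustration, and this assumes the object is one
contiguous range that starts 2M-aligned. It ignores any further hw
constraints on where 64K entries may live.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_64K (1ULL << 16)
#define SZ_2M  (1ULL << 21)

/* Entry counts for a contiguous object rounded to 64K (hypothetical). */
struct split {
	uint64_t n_2m;	/* full 2M entries */
	uint64_t n_64k;	/* 64K entries covering the partial 2M tail */
};

/*
 * Returns false if the size isn't a non-zero multiple of 64K, in which
 * case the tail can't be covered with 64K entries alone.
 */
static bool split_contiguous(uint64_t len, struct split *out)
{
	assert(out != NULL);

	if (len == 0 || len % SZ_64K)
		return false;

	out->n_2m = len / SZ_2M;
	out->n_64k = (len % SZ_2M) / SZ_64K;
	return true;
}
```

E.g. a 2M + 192K object would get one 2M entry plus three 64K entries,
instead of being rounded up to 4M.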

Just some long-term thoughts on where I expect things will head
eventually.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch