[Intel-gfx] [PATCH 5/7] drm/i915/vlv: Increase the utilization of stolen memory on VLV.
Daniel Vetter
daniel at ffwll.ch
Thu Jan 9 08:46:33 CET 2014
One thing I've forgotten: The Threading in your patch-bomb seems
busted, the patches are somehow not linked up to the cover letter. If
you use a recent git and format-patch/send-email with default settings
it should all work out ok though.
With proper in-reply linking patch bomb threads tend to get splattered
all over the place in mail clients making it hard to keep an overview
of the discussions.
-Daniel
On Thu, Jan 9, 2014 at 8:27 AM, Daniel Vetter <daniel at ffwll.ch> wrote:
> On Thu, Jan 09, 2014 at 11:01:02AM +0530, akash.goel at intel.com wrote:
>> From: Akash Goel <akash.goel at intel.com>
>>
>> On VLV, 64MB of system memory was being reserved for stolen
>> area, but ~8MB of it was being utilized.
>> Increased the utilization of Stolen area by allocating
>> User created Frame buffers(only X Tiled) from it.
>> User Frame buffers are suitable for allocation from stolen area,
>> as its very unlikely that they are not accessed from CPU side.
>> And its even more unlikely that the Tiled(X) buffers will be
>> accessed directly from the CPU side. And any allocation
>> from stolen area is not directly CPU accessible, but
>> accessible only through the aperture space.
>> With 1080p sized frame buffers (32 bpp), the allocation of 6-7
>> frame buffers itself gives almost the full utilization of
>> stolen area.
>>
>> Signed-off-by: Akash Goel <akash.goel at intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_drv.h | 1 +
>> drivers/gpu/drm/i915/i915_gem.c | 21 ++++++++
>> drivers/gpu/drm/i915/i915_gem_stolen.c | 93 ++++++++++++++++++++++++++++++++++
>> 3 files changed, 115 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 1bcc543..db29537 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2376,6 +2376,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>> u32 stolen_offset,
>> u32 gtt_offset,
>> u32 size);
>> +void i915_gem_object_move_to_stolen(struct drm_i915_gem_object *obj);
>> void i915_gem_object_release_stolen(struct drm_i915_gem_object *obj);
>>
>> /* i915_gem_tiling.c */
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 656406d..24f93ef 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -3915,6 +3915,27 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>> }
>> }
>>
>> + /* Try to allocate the physical space for the GEM object,
>> + * representing the User frame buffer, from the stolen area.
>> + * But if there is no sufficient free space left in stolen
>> + * area, will fallback to shmem.
>> + */
>> + if (obj->user_fb == 1) {
>> + if (obj->pages == NULL) {
>> + if (obj->tiling_mode == I915_TILING_X) {
>> + /* Tiled(X) Scanout buffers are more suitable
>> + * for allocation from stolen area, as its very
>> + * unlikely that they will be accessed directly
>> + * from the CPU side and any allocation from
>> + * stolen area is not directly CPU accessible,
>> + * but accessible only through the aperture
>> + * space.
>> + */
>> + i915_gem_object_move_to_stolen(obj);
>> + }
>> + }
>> + }
>
> Neat hack ;-) But I think for upstream we need to address a few more
> concerns. Random comments:
>
> - The vlv patch subject is a bit misleading here since this is about
> stolen memory usage in general across all platforms.
>
> - object_pin is the wrong place to change the backing storage since
> someone else could already have pinned the it (e.g. through dma-buf). We
> need to do this earlier before calling down into ->get_pages.
>
> - If we allow opportunistical placement of objects into stolen memory I
> think we should also fully support purgeable objects so that we can
> dynamically scale to workloads. Or at least to be able to kick out stuff
> in case the kernel needs the contiguous memory for something more
> important. So if we couldn't find a suitable stolen block we'd scan
> through all objects with stolen memory backing and check whether any
> purgeable objects could be kicked out.
>
> - For your usecase, have you tried to simply reduce the stolen area as
> much as possible? Our friendly competition on ARM SoCs seems to have
> mostly moved away from gfx reserved chunks towards more flexible
> approaches like CMA. Giving stolen back to the linux page allocator
> should be possible, but I haven't really looked into that yet all that
> much ...
>
> - For upstream I think changing the personality of buffer objects behind
> userspace's back is a bit too frisky wrt breaking something. I prefer if
> userspace opts-in explicitly by passing a flag to the create ioctl
> stating that stolen memory is allowed/preferred for this allocation.
> Chris Wilson posted an rfc patch a while back to add a create2 ioctl
> (which at the same time also allows us to set the caching, tiling and
> any other object parameters).
>
> - We've had some "fun" last time around we've tried to use stolen more
> seriously with scanout buffers being strangely offset/corrupted.
> Hopefully this is fixed by your sg->offset patch, but I can't find the
> reports nor recall the details right now.
>
> Cheers, Daniel
>> +
>> if (!i915_gem_obj_bound(obj, vm)) {
>> ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
>> map_and_fenceable,
>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
>> index 5cf97d6..29c22f9 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
>> @@ -29,6 +29,7 @@
>> #include <drm/drmP.h>
>> #include <drm/i915_drm.h>
>> #include "i915_drv.h"
>> +#include <linux/shmem_fs.h>
>>
>> /*
>> * The BIOS typically reserves some of the system's memory for the exclusive
>> @@ -370,6 +371,98 @@ i915_gem_object_create_stolen(struct drm_device *dev, u32 size)
>> return NULL;
>> }
>>
>> +void
>> +i915_gem_object_move_to_stolen(struct drm_i915_gem_object *obj)
>> +{
>> + struct drm_device *dev = obj->base.dev;
>> + struct drm_i915_private *dev_priv = dev->dev_private;
>> + struct drm_mm_node *stolen;
>> + u32 size = obj->base.size;
>> + int ret = 0;
>> +
>> + if (!IS_VALLEYVIEW(dev)) {
>> + return;
>> + }
>> +
>> + if (obj->stolen) {
>> + BUG_ON(obj->pages == NULL);
>> + return;
>> + }
>> +
>> + if (!drm_mm_initialized(&dev_priv->mm.stolen))
>> + return;
>> +
>> + if (size == 0)
>> + return;
>> +
>> + /* Check if already shmem space has been allocated for the object
>> + * or not. We cannot rely upon on the value of 'pages' field for this.
>> + * As even though if the 'pages' field is NULL, it does not actually
>> + * indicate that the backing physical space (shmem) is currently not
>> + * reserved for the object, as the object may not get purged/truncated
>> + * on the calll to 'put_pages_gtt'.
>> + */
>> + if (obj->base.filp) {
>> + struct inode *inode = file_inode(obj->base.filp);
>> + struct shmem_inode_info *info = SHMEM_I(inode);
>> + if (!inode)
>> + return;
>> + spin_lock(&info->lock);
>> + /* The alloced field stores how many data pages are
>> + * allocated to the file.
>> + */
>> + ret = info->alloced;
>> + spin_unlock(&info->lock);
>> + if (ret > 0) {
>> + DRM_DEBUG_DRIVER(
>> + "Already shmem space alloced, %d pges\n", ret);
>> + return;
>> + }
>> + }
>> +
>> + stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
>> + if (!stolen)
>> + return;
>> +
>> + ret = drm_mm_insert_node(&dev_priv->mm.stolen, stolen, size,
>> + 4096, DRM_MM_SEARCH_DEFAULT);
>> + if (ret) {
>> + kfree(stolen);
>> + DRM_DEBUG_DRIVER("ran out of stolen space\n");
>> + return;
>> + }
>> +
>> + /* Set up the object to use the stolen memory,
>> + * backing store no longer managed by shmem layer */
>> + drm_gem_object_release(&(obj->base));
>> + obj->base.filp = NULL;
>> + obj->ops = &i915_gem_object_stolen_ops;
>> +
>> + obj->pages = i915_pages_create_for_stolen(dev,
>> + stolen->start, stolen->size);
>> + if (obj->pages == NULL)
>> + goto cleanup;
>> +
>> + i915_gem_object_pin_pages(obj);
>> + list_add_tail(&obj->global_list, &dev_priv->mm.unbound_list);
>> + obj->has_dma_mapping = true;
>> + obj->stolen = stolen;
>> +
>> + DRM_DEBUG_DRIVER("Obj moved to stolen, ptr = %p, size = %x\n",
>> + obj, size);
>> +
>> + obj->base.read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
>> + obj->cache_level = HAS_LLC(dev) ? I915_CACHE_LLC : I915_CACHE_NONE;
>> +
>> + /* No zeroing-out of buffers allocated from stolen area */
>> + return;
>> +
>> +cleanup:
>> + drm_mm_remove_node(stolen);
>> + kfree(stolen);
>> + return;
>> +}
>> +
>> struct drm_i915_gem_object *
>> i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>> u32 stolen_offset,
>> --
>> 1.8.5.2
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
More information about the Intel-gfx
mailing list