[Intel-gfx] [PATCH 5/7] drm/i915/vlv: Increase the utilization of stolen memory on VLV.

Daniel Vetter daniel at ffwll.ch
Thu Jan 9 08:46:33 CET 2014


One thing I've forgotten: The Threading in your patch-bomb seems
busted, the patches are somehow not linked up to the cover letter. If
you use a recent git and format-patch/send-email with default settings
it should all work out ok though.

With proper in-reply linking patch bomb threads tend to get splattered
all over the place in mail clients making it hard to keep an overview
of the discussions.
-Daniel

On Thu, Jan 9, 2014 at 8:27 AM, Daniel Vetter <daniel at ffwll.ch> wrote:
> On Thu, Jan 09, 2014 at 11:01:02AM +0530, akash.goel at intel.com wrote:
>> From: Akash Goel <akash.goel at intel.com>
>>
>> On VLV, 64MB of system memory was being reserved for stolen
>> area, but ~8MB of it was being utilized.
>> Increased the utilization of Stolen area by allocating
>> User created Frame buffers(only X Tiled) from it.
>> User Frame buffers are suitable for allocation from stolen area,
>> as its very unlikely that they are not accessed from CPU side.
>> And its even more unlikely that the Tiled(X) buffers will be
>> accessed directly from the CPU side. And any allocation
>> from stolen area is not directly CPU accessible, but
>> accessible only through the aperture space.
>> With 1080p sized frame buffers (32 bpp), the allocation of 6-7
>> frame buffers itself gives almost the full utilization of
>> stolen area.
>>
>> Signed-off-by: Akash Goel <akash.goel at intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_drv.h        |  1 +
>>  drivers/gpu/drm/i915/i915_gem.c        | 21 ++++++++
>>  drivers/gpu/drm/i915/i915_gem_stolen.c | 93 ++++++++++++++++++++++++++++++++++
>>  3 files changed, 115 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 1bcc543..db29537 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2376,6 +2376,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>>                                              u32 stolen_offset,
>>                                              u32 gtt_offset,
>>                                              u32 size);
>> +void i915_gem_object_move_to_stolen(struct drm_i915_gem_object *obj);
>>  void i915_gem_object_release_stolen(struct drm_i915_gem_object *obj);
>>
>>  /* i915_gem_tiling.c */
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 656406d..24f93ef 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -3915,6 +3915,27 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>>               }
>>       }
>>
>> +     /* Try to allocate the physical space for the GEM object,
>> +      * representing the User frame buffer, from the stolen area.
>> +      * But if there is no sufficient free space left in stolen
>> +      * area, will fallback to shmem.
>> +      */
>> +     if (obj->user_fb == 1) {
>> +             if (obj->pages == NULL) {
>> +                     if (obj->tiling_mode == I915_TILING_X) {
>> +                             /* Tiled(X) Scanout buffers are more suitable
>> +                              * for allocation from stolen area, as its very
>> +                              * unlikely that they will be accessed directly
>> +                              * from the CPU side and any allocation from
>> +                              * stolen area is not directly CPU accessible,
>> +                              * but accessible only through the aperture
>> +                              * space.
>> +                              */
>> +                             i915_gem_object_move_to_stolen(obj);
>> +                     }
>> +             }
>> +     }
>
> Neat hack ;-) But I think for upstream we need to address a few more
> concerns. Random comments:
>
> - The vlv patch subject is a bit misleading here since this is about
>   stolen memory usage in general across all platforms.
>
> - object_pin is the wrong place to change the backing storage since
>   someone else could already have pinned the it (e.g. through dma-buf). We
>   need to do this earlier before calling down into ->get_pages.
>
> - If we allow opportunistical placement of objects into stolen memory I
>   think we should also fully support purgeable objects so that we can
>   dynamically scale to workloads. Or at least to be able to kick out stuff
>   in case the kernel needs the contiguous memory for something more
>   important. So if we couldn't find a suitable stolen block we'd scan
>   through all objects with stolen memory backing and check whether any
>   purgeable objects could be kicked out.
>
> - For your usecase, have you tried to simply reduce the stolen area as
>   much as possible? Our friendly competition on ARM SoCs seems to have
>   mostly moved away from gfx reserved chunks towards more flexible
>   approaches like CMA. Giving stolen back to the linux page allocator
>   should be possible, but I haven't really looked into that yet all that
>   much ...
>
> - For upstream I think changing the personality of buffer objects behind
>   userspace's back is a bit too frisky wrt breaking something. I prefer if
>   userspace opts-in explicitly by passing a flag to the create ioctl
>   stating that stolen memory is allowed/preferred for this allocation.
>   Chris Wilson posted an rfc patch a while back to add a create2 ioctl
>   (which at the same time also allows us to set the caching, tiling and
>   any other object parameters).
>
> - We've had some "fun" last time around we've tried to use stolen more
>   seriously with scanout buffers being strangely offset/corrupted.
>   Hopefully this is fixed by your sg->offset patch, but I can't find the
>   reports nor recall the details right now.
>
> Cheers, Daniel
>> +
>>       if (!i915_gem_obj_bound(obj, vm)) {
>>               ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
>>                                                map_and_fenceable,
>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
>> index 5cf97d6..29c22f9 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
>> @@ -29,6 +29,7 @@
>>  #include <drm/drmP.h>
>>  #include <drm/i915_drm.h>
>>  #include "i915_drv.h"
>> +#include <linux/shmem_fs.h>
>>
>>  /*
>>   * The BIOS typically reserves some of the system's memory for the exclusive
>> @@ -370,6 +371,98 @@ i915_gem_object_create_stolen(struct drm_device *dev, u32 size)
>>       return NULL;
>>  }
>>
>> +void
>> +i915_gem_object_move_to_stolen(struct drm_i915_gem_object *obj)
>> +{
>> +     struct drm_device *dev = obj->base.dev;
>> +     struct drm_i915_private *dev_priv = dev->dev_private;
>> +     struct drm_mm_node *stolen;
>> +     u32 size = obj->base.size;
>> +     int ret = 0;
>> +
>> +     if (!IS_VALLEYVIEW(dev)) {
>> +             return;
>> +     }
>> +
>> +     if (obj->stolen) {
>> +             BUG_ON(obj->pages == NULL);
>> +             return;
>> +     }
>> +
>> +     if (!drm_mm_initialized(&dev_priv->mm.stolen))
>> +             return;
>> +
>> +     if (size == 0)
>> +             return;
>> +
>> +     /* Check if already shmem space has been allocated for the object
>> +      * or not. We cannot rely upon on the value of 'pages' field for this.
>> +      * As even though if the 'pages' field is NULL, it does not actually
>> +      * indicate that the backing physical space (shmem) is currently not
>> +      * reserved for the object, as the object may not get purged/truncated
>> +      * on the calll to 'put_pages_gtt'.
>> +      */
>> +     if (obj->base.filp) {
>> +             struct inode *inode = file_inode(obj->base.filp);
>> +             struct shmem_inode_info *info = SHMEM_I(inode);
>> +             if (!inode)
>> +                     return;
>> +             spin_lock(&info->lock);
>> +             /* The alloced field stores how many data pages are
>> +              * allocated to the file.
>> +              */
>> +             ret = info->alloced;
>> +             spin_unlock(&info->lock);
>> +             if (ret > 0) {
>> +                     DRM_DEBUG_DRIVER(
>> +                             "Already shmem space alloced, %d pges\n", ret);
>> +                     return;
>> +             }
>> +     }
>> +
>> +     stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
>> +     if (!stolen)
>> +             return;
>> +
>> +     ret = drm_mm_insert_node(&dev_priv->mm.stolen, stolen, size,
>> +                              4096, DRM_MM_SEARCH_DEFAULT);
>> +     if (ret) {
>> +             kfree(stolen);
>> +             DRM_DEBUG_DRIVER("ran out of stolen space\n");
>> +             return;
>> +     }
>> +
>> +     /* Set up the object to use the stolen memory,
>> +      * backing store no longer managed by shmem layer */
>> +     drm_gem_object_release(&(obj->base));
>> +     obj->base.filp = NULL;
>> +     obj->ops = &i915_gem_object_stolen_ops;
>> +
>> +     obj->pages = i915_pages_create_for_stolen(dev,
>> +                                             stolen->start, stolen->size);
>> +     if (obj->pages == NULL)
>> +             goto cleanup;
>> +
>> +     i915_gem_object_pin_pages(obj);
>> +     list_add_tail(&obj->global_list, &dev_priv->mm.unbound_list);
>> +     obj->has_dma_mapping = true;
>> +     obj->stolen = stolen;
>> +
>> +     DRM_DEBUG_DRIVER("Obj moved to stolen, ptr = %p, size = %x\n",
>> +                      obj, size);
>> +
>> +     obj->base.read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
>> +     obj->cache_level = HAS_LLC(dev) ? I915_CACHE_LLC : I915_CACHE_NONE;
>> +
>> +     /* No zeroing-out of buffers allocated from stolen area */
>> +     return;
>> +
>> +cleanup:
>> +     drm_mm_remove_node(stolen);
>> +     kfree(stolen);
>> +     return;
>> +}
>> +
>>  struct drm_i915_gem_object *
>>  i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>>                                              u32 stolen_offset,
>> --
>> 1.8.5.2
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch



More information about the Intel-gfx mailing list