[Intel-gfx] [PATCH v2] drm/i915: Introduce an internal allocator for disposable private objects
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Mon Oct 10 08:25:08 UTC 2016
On 10/10/2016 09:19, Chris Wilson wrote:
> On Mon, Oct 10, 2016 at 09:11:49AM +0100, Tvrtko Ursulin wrote:
>> On 08/10/2016 09:34, Chris Wilson wrote:
>>> Quite a few of our objects used for internal hardware programming do not
>>> benefit from being swappable or from being zero initialised. As such
>>> they do not benefit from using a shmemfs backing storage and since they
>>> are internal and never directly exposed to the user, we do not need to
>>> worry about providing a filp. For these we can use an
>>> drm_i915_gem_object wrapper around a sg_table of plain struct page. They
>>> are not swap backed and not automatically pinned. If they are reaped
>>> by the shrinker, the pages are released and the contents discarded. For
>>> the internal use case, this is fine as for example, ringbuffers are
>>> pinned from being written by a request to be read by the hardware. Once
>>> they are idle, they can be discarded entirely. As such they are a good
>>> match for execlist ringbuffers and a small variety of other internal
>>> objects.
>>>
>>> In the first iteration, this is limited to the scratch batch buffers we
>>> use (for command parsing and state initialisation).
>>>
>>> v2: Allocate physically contiguous pages, where possible.
>> Since the allocator will be used constantly at runtime, my
>> recollection is that higher order allocations can easily become next
>> to impossible, so I am wondering why? Also, on your last reply you
>> did not write why you think this is interesting to try?
>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
>>> ---
>>> drivers/gpu/drm/i915/Makefile | 1 +
>>> drivers/gpu/drm/i915/i915_drv.h | 5 +
>>> drivers/gpu/drm/i915/i915_gem_batch_pool.c | 28 ++---
>>> drivers/gpu/drm/i915/i915_gem_internal.c | 162 +++++++++++++++++++++++++++
>>> drivers/gpu/drm/i915/i915_gem_render_state.c | 2 +-
>>> drivers/gpu/drm/i915/intel_engine_cs.c | 2 +-
>>> drivers/gpu/drm/i915/intel_ringbuffer.c | 14 ++-
>>> 7 files changed, 190 insertions(+), 24 deletions(-)
>>> create mode 100644 drivers/gpu/drm/i915/i915_gem_internal.c
>>>
>>> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
>>> index a998c2bce70a..b94a90f34d2d 100644
>>> --- a/drivers/gpu/drm/i915/Makefile
>>> +++ b/drivers/gpu/drm/i915/Makefile
>>> @@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
>>> i915_gem_execbuffer.o \
>>> i915_gem_fence.o \
>>> i915_gem_gtt.o \
>>> + i915_gem_internal.o \
>>> i915_gem.o \
>>> i915_gem_render_state.o \
>>> i915_gem_request.o \
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index fee5cc92e2f2..bad97f1e5265 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -3538,6 +3538,11 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>>> u32 gtt_offset,
>>> u32 size);
>>> +/* i915_gem_internal.c */
>>> +struct drm_i915_gem_object *
>>> +i915_gem_object_create_internal(struct drm_device *dev,
>>> + unsigned int size);
>>> +
>>> /* i915_gem_shrinker.c */
>>> unsigned long i915_gem_shrink(struct drm_i915_private *dev_priv,
>>> unsigned long target,
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.c b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
>>> index cb25cad3318c..3934c9103cf2 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_batch_pool.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
>>> @@ -97,9 +97,9 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool,
>>> size_t size)
>>> {
>>> struct drm_i915_gem_object *obj = NULL;
>>> - struct drm_i915_gem_object *tmp, *next;
>>> + struct drm_i915_gem_object *tmp;
>>> struct list_head *list;
>>> - int n;
>>> + int n, ret;
>>> lockdep_assert_held(&pool->engine->i915->drm.struct_mutex);
>>> @@ -112,19 +112,12 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool,
>>> n = ARRAY_SIZE(pool->cache_list) - 1;
>>> list = &pool->cache_list[n];
>>> - list_for_each_entry_safe(tmp, next, list, batch_pool_link) {
>>> + list_for_each_entry(tmp, list, batch_pool_link) {
>>> /* The batches are strictly LRU ordered */
>>> if (!i915_gem_active_is_idle(&tmp->last_read[pool->engine->id],
>>> &tmp->base.dev->struct_mutex))
>>> break;
>>> - /* While we're looping, do some clean up */
>>> - if (tmp->madv == __I915_MADV_PURGED) {
>>> - list_del(&tmp->batch_pool_link);
>>> - i915_gem_object_put(tmp);
>>> - continue;
>>> - }
>>> -
>>> if (tmp->base.size >= size) {
>>> obj = tmp;
>>> break;
>>> @@ -132,19 +125,16 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool,
>>> }
>>> if (obj == NULL) {
>>> - int ret;
>>> -
>>> - obj = i915_gem_object_create(&pool->engine->i915->drm, size);
>>> + obj = i915_gem_object_create_internal(&pool->engine->i915->drm,
>>> + size);
>>> if (IS_ERR(obj))
>>> return obj;
>>> -
>>> - ret = i915_gem_object_get_pages(obj);
>>> - if (ret)
>>> - return ERR_PTR(ret);
>>> -
>>> - obj->madv = I915_MADV_DONTNEED;
>>> }
>>> + ret = i915_gem_object_get_pages(obj);
>>> + if (ret)
>>> + return ERR_PTR(ret);
>>> +
>>> list_move_tail(&obj->batch_pool_link, list);
>>> i915_gem_object_pin_pages(obj);
>>> return obj;
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_internal.c b/drivers/gpu/drm/i915/i915_gem_internal.c
>>> new file mode 100644
>>> index 000000000000..255b9232b8ef
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/i915/i915_gem_internal.c
>>> @@ -0,0 +1,162 @@
>>> +/*
>>> + * Copyright © 2014-2016 Intel Corporation
>>> + *
>>> + * Permission is hereby granted, free of charge, to any person obtaining a
>>> + * copy of this software and associated documentation files (the "Software"),
>>> + * to deal in the Software without restriction, including without limitation
>>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>>> + * and/or sell copies of the Software, and to permit persons to whom the
>>> + * Software is furnished to do so, subject to the following conditions:
>>> + *
>>> + * The above copyright notice and this permission notice (including the next
>>> + * paragraph) shall be included in all copies or substantial portions of the
>>> + * Software.
>>> + *
>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>>> + * IN THE SOFTWARE.
>>> + *
>>> + */
>>> +
>>> +#include <drm/drmP.h>
>>> +#include <drm/i915_drm.h>
>>> +#include "i915_drv.h"
>>> +
>>> +#define QUIET (__GFP_NORETRY | __GFP_NOWARN)
>>> +
>>> +static void internal_free_pages(struct sg_table *st)
>>> +{
>>> + struct scatterlist *sg;
>>> +
>>> + for (sg = st->sgl; sg; sg = __sg_next(sg))
>>> + __free_pages(sg_page(sg), get_order(sg->length));
>> Fragile since wont work together with coalescing, which I am sad to
>> see not implemented. For some reason it makes me feel real good when
>> it is there.
> We do the coalescing.
Where? It is not in this patch? One allocation == one st->nents++. If
the next allocation is contiguous there is no coalescing.
>>> + do {
>>> + page = alloc_pages(gfp | (order ? QUIET : 0), order);
>>> + if (page)
>>> + break;
>>> + if (!order--)
>>> + goto err;
>>> + } while (1);
>> Feels like it could hammer hard on a fragmented system, I have big
>> concerns about this.
> You did notice that these were marked as temporary allocations (because
> they are reclaimed under memory pressure), and on an already fragemented
> system we do the right thing anyway?
You try smaller and then for the next allocation start from the initial
order which did not work previously.
Regards,
Tvrtko
More information about the Intel-gfx
mailing list