[PATCH v3 1/5] drm/i915/gem: Implement object migration
Ruhl, Michael J
michael.j.ruhl at intel.com
Mon Jun 28 19:50:18 UTC 2021
>-----Original Message-----
>From: Thomas Hellström <thomas.hellstrom at linux.intel.com>
>Sent: Monday, June 28, 2021 3:03 PM
>To: Ruhl, Michael J <michael.j.ruhl at intel.com>; intel-
>gfx at lists.freedesktop.org; dri-devel at lists.freedesktop.org
>Cc: Auld, Matthew <matthew.auld at intel.com>; lkp <lkp at intel.com>
>Subject: Re: [PATCH v3 1/5] drm/i915/gem: Implement object migration
>
>
>On 6/28/21 8:11 PM, Ruhl, Michael J wrote:
>>> -----Original Message-----
>>> From: dri-devel <dri-devel-bounces at lists.freedesktop.org> On Behalf Of
>>> Thomas Hellström
>>> Sent: Monday, June 28, 2021 10:46 AM
>>> To: intel-gfx at lists.freedesktop.org; dri-devel at lists.freedesktop.org
>>> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>; Auld,
>Matthew
>>> <matthew.auld at intel.com>; lkp <lkp at intel.com>
>>> Subject: [PATCH v3 1/5] drm/i915/gem: Implement object migration
>>>
>>> Introduce an interface to migrate objects between regions.
>>> This is primarily intended to migrate objects to LMEM for display and
>>> to SYSTEM for dma-buf, but might be reused in one form or another for
>>> performance-based migration.
>>>
>>> v2:
>>> - Verify that the memory region given as an id really exists.
>>> (Reported by Matthew Auld)
>>> - Call i915_gem_object_{init,release}_memory_region() when switching
>>> region
>>> to handle also switching region lists. (Reported by Matthew Auld)
>>> v3:
>>> - Fix i915_gem_object_can_migrate() to return true if object is already in
>>> the correct region, even if the object ops doesn't have a migrate()
>>> callback.
>>> - Update typo in commit message.
>>> - Fix kerneldoc of i915_gem_object_wait_migration().
>>>
>>> Reported-by: kernel test robot <lkp at intel.com>
>>> Signed-off-by: Thomas Hellström <thomas.hellstrom at linux.intel.com>
>>> ---
>>> drivers/gpu/drm/i915/gem/i915_gem_object.c | 96
>>> +++++++++++++++++++
>>> drivers/gpu/drm/i915/gem/i915_gem_object.h | 12 +++
>>> .../gpu/drm/i915/gem/i915_gem_object_types.h | 9 ++
>>> drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 69 +++++++++----
>>> drivers/gpu/drm/i915/gem/i915_gem_wait.c | 19 ++++
>>> 5 files changed, 188 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c
>>> b/drivers/gpu/drm/i915/gem/i915_gem_object.c
>>> index 07e8ff9a8aae..1c18be067b58 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
>>> @@ -513,6 +513,102 @@ bool i915_gem_object_has_iomem(const struct
>>> drm_i915_gem_object *obj)
>>> return obj->mem_flags & I915_BO_FLAG_IOMEM;
>>> }
>>>
>>> +/**
>>> + * i915_gem_object_can_migrate - Whether an object likely can be
>migrated
>>> + *
>>> + * @obj: The object to migrate
>>> + * @id: The region intended to migrate to
>>> + *
>>> + * Check whether the object backend supports migration to the
>>> + * given region. Note that pinning may affect the ability to migrate.
>>> + *
>>> + * Return: true if migration is possible, false otherwise.
>>> + */
>>> +bool i915_gem_object_can_migrate(struct drm_i915_gem_object *obj,
>>> + enum intel_region_id id)
>>> +{
>>> + struct drm_i915_private *i915 = to_i915(obj->base.dev);
>>> + unsigned int num_allowed = obj->mm.n_placements;
>>> + struct intel_memory_region *mr;
>>> + unsigned int i;
>>> +
>>> + GEM_BUG_ON(id >= INTEL_REGION_UNKNOWN);
>>> + GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
>>> +
>>> + mr = i915->mm.regions[id];
>>> + if (!mr)
>>> + return false;
>>> +
>>> + if (obj->mm.region == mr)
>>> + return true;
>>> +
>>> + if (!i915_gem_object_evictable(obj))
>>> + return false;
>>> +
>>> + if (!obj->ops->migrate)
>>> + return false;
>>> +
>>> + if (!(obj->flags & I915_BO_ALLOC_USER))
>>> + return true;
>>> +
>>> + if (num_allowed == 0)
>>> + return false;
>>> +
>>> + for (i = 0; i < num_allowed; ++i) {
>>> + if (mr == obj->mm.placements[i])
>>> + return true;
>>> + }
>> Hi Thomas,
>>
>> I am a little confused over the difference in checks between this function
>> and i915_gem_object_migrate().
>>
>> Why is the lack of an mr a BUG_ON in _object_migrate(), but here it just
>> false?
>>
>> So that means that under certain circumstances, you could not have a mr?
>>
>> If that is the case, when?
>>
>> Would that be when the I915_BO_ALLOC_USER is set?
>>
>> If so, should there be a check for "non" user vs user?
>>
>> Or maybe document this function pointing out why there are differences
>> and why?
>
>Yes, I'll give it some more documentation. The basic idea is that the
>above function also could be
>used to validate user supplied data, whereas there might be cases where
>we want to use the gem_object_migrate() function and override the above.
>
>
>>
>>> + return false;
>>> +}
>>> +
>>> +/**
>>> + * i915_gem_object_migrate - Migrate an object to the desired region id
>>> + * @obj: The object to migrate.
>>> + * @ww: An optional struct i915_gem_ww_ctx. If NULL, the backend may
>>> + * not be successful in evicting other objects to make room for this object.
>> Is the ww for future consideration? (I don't see any use of it in the patch).
>
>Yes, but it will remain optional.
>
>
>>
>>> + * @id: The region id to migrate to.
>>> + *
>>> + * Attempt to migrate the object to the desired memory region. The
>>> + * object backend must support migration and the object may not be
>>> + * pinned, (explicitly pinned pages or pinned vmas). The object must
>>> + * be locked.
>>> + * On successful completion, the object will have pages pointing to
>>> + * memory in the new region, but an async migration task may not have
>>> + * completed yet, and to accomplish that,
>>> i915_gem_object_wait_migration()
>>> + * must be called.
>>> + *
>>> + * Return: 0 on success. Negative error code on failure. In particular may
>>> + * return -ENXIO on lack of region space, -EDEADLK for deadlock
>avoidance
>>> + * if @ww is set, -EINTR or -ERESTARTSYS if signal pending, and
>>> + * -EBUSY if the object is pinned.
>>> + */
>>> +int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
>>> + struct i915_gem_ww_ctx *ww,
>>> + enum intel_region_id id)
>>> +{
>>> + struct drm_i915_private *i915 = to_i915(obj->base.dev);
>>> + struct intel_memory_region *mr;
>>> +
>>> + GEM_BUG_ON(id >= INTEL_REGION_UNKNOWN);
>>> + GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
>>> + assert_object_held(obj);
>>> +
>>> + mr = i915->mm.regions[id];
>>> + GEM_BUG_ON(!mr);
>>> +
>>> + if (obj->mm.region == mr)
>>> + return 0;
>>> +
>>> + if (!i915_gem_object_evictable(obj))
>>> + return -EBUSY;
>>> +
>>> + if (!obj->ops->migrate)
>>> + return -EOPNOTSUPP;
>> Why aren't you using _can_migrate here?
>
>It's just in case we want to override. I'll add some more comments about
>this.
>
>>
>>> + return obj->ops->migrate(obj, mr);
>>> +}
>>> +
>>> void i915_gem_init__objects(struct drm_i915_private *i915)
>>> {
>>> INIT_WORK(&i915->mm.free_work, __i915_gem_free_work);
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h
>>> b/drivers/gpu/drm/i915/gem/i915_gem_object.h
>>> index ea3224a480c4..8cbd7a5334e2 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
>>> @@ -17,6 +17,8 @@
>>> #include "i915_gem_ww.h"
>>> #include "i915_vma_types.h"
>>>
>>> +enum intel_region_id;
>>> +
>>> /*
>>> * XXX: There is a prevalence of the assumption that we fit the
>>> * object's page count inside a 32bit _signed_ variable. Let's document
>>> @@ -597,6 +599,16 @@ bool i915_gem_object_migratable(struct
>>> drm_i915_gem_object *obj);
>>>
>>> bool i915_gem_object_validates_to_lmem(struct drm_i915_gem_object
>>> *obj);
>>>
>>> +int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
>>> + struct i915_gem_ww_ctx *ww,
>>> + enum intel_region_id id);
>>> +
>>> +bool i915_gem_object_can_migrate(struct drm_i915_gem_object *obj,
>>> + enum intel_region_id id);
>>> +
>>> +int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj,
>>> + unsigned int flags);
>>> +
>>> #ifdef CONFIG_MMU_NOTIFIER
>>> static inline bool
>>> i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>>> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>>> index 441f913c87e6..ef3de2ae9723 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>>> @@ -18,6 +18,7 @@
>>>
>>> struct drm_i915_gem_object;
>>> struct intel_fronbuffer;
>>> +struct intel_memory_region;
>>>
>>> /*
>>> * struct i915_lut_handle tracks the fast lookups from handle to vma used
>>> @@ -77,6 +78,14 @@ struct drm_i915_gem_object_ops {
>>> * delayed_free - Override the default delayed free implementation
>>> */
>>> void (*delayed_free)(struct drm_i915_gem_object *obj);
>>> +
>>> + /**
>>> + * migrate - Migrate object to a different region either for
>>> + * pinning or for as long as the object lock is held.
>>> + */
>>> + int (*migrate)(struct drm_i915_gem_object *obj,
>>> + struct intel_memory_region *mr);
>>> +
>>> void (*release)(struct drm_i915_gem_object *obj);
>>>
>>> const struct vm_operations_struct *mmap_ops;
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> index c39d982c4fa6..8f89185b6507 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> @@ -617,7 +617,8 @@ struct ttm_device_funcs *i915_ttm_driver(void)
>>> return &i915_ttm_bo_driver;
>>> }
>>>
>>> -static int i915_ttm_get_pages(struct drm_i915_gem_object *obj)
>>> +static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
>>> + struct ttm_placement *placement)
>>> {
>>> struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
>>> struct ttm_operation_ctx ctx = {
>>> @@ -625,19 +626,12 @@ static int i915_ttm_get_pages(struct
>>> drm_i915_gem_object *obj)
>>> .no_wait_gpu = false,
>>> };
>>> struct sg_table *st;
>>> - struct ttm_place requested, busy[I915_TTM_MAX_PLACEMENTS];
>>> - struct ttm_placement placement;
>>> int real_num_busy;
>>> int ret;
>>>
>>> - GEM_BUG_ON(obj->mm.n_placements >
>>> I915_TTM_MAX_PLACEMENTS);
>>> -
>>> - /* Move to the requested placement. */
>>> - i915_ttm_placement_from_obj(obj, &requested, busy, &placement);
>>> -
>>> /* First try only the requested placement. No eviction. */
>>> - real_num_busy =
>>> fetch_and_zero(&placement.num_busy_placement);
>>> - ret = ttm_bo_validate(bo, &placement, &ctx);
>>> + real_num_busy = fetch_and_zero(&placement-
>>>> num_busy_placement);
>>> + ret = ttm_bo_validate(bo, placement, &ctx);
>>> if (ret) {
>>> ret = i915_ttm_err_to_gem(ret);
>>> /*
>>> @@ -652,8 +646,8 @@ static int i915_ttm_get_pages(struct
>>> drm_i915_gem_object *obj)
>>> * If the initial attempt fails, allow all accepted placements,
>>> * evicting if necessary.
>>> */
>>> - placement.num_busy_placement = real_num_busy;
>>> - ret = ttm_bo_validate(bo, &placement, &ctx);
>>> + placement->num_busy_placement = real_num_busy;
>>> + ret = ttm_bo_validate(bo, placement, &ctx);
>>> if (ret)
>>> return i915_ttm_err_to_gem(ret);
>>> }
>>> @@ -668,16 +662,56 @@ static int i915_ttm_get_pages(struct
>>> drm_i915_gem_object *obj)
>>> i915_ttm_adjust_gem_after_move(obj);
>>> }
>>>
>>> - /* Object either has a page vector or is an iomem object */
>>> - st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st;
>>> - if (IS_ERR(st))
>>> - return PTR_ERR(st);
>>> + if (!obj->mm.pages) {
>>> + /* Object either has a page vector or is an iomem object */
>>> + st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj-
>>>> ttm.cached_io_st;
>>> + if (IS_ERR(st))
>>> + return PTR_ERR(st);
>>>
>>> - __i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));
>>> + __i915_gem_object_set_pages(obj, st,
>>> i915_sg_dma_sizes(st->sgl));
>>> + }
>>>
>>> return ret;
>>> }
>>>
>>> +static int i915_ttm_get_pages(struct drm_i915_gem_object *obj)
>>> +{
>>> + struct ttm_place requested, busy[I915_TTM_MAX_PLACEMENTS];
>>> + struct ttm_placement placement;
>>> +
>>> + GEM_BUG_ON(obj->mm.n_placements >
>>> I915_TTM_MAX_PLACEMENTS);
>>> +
>>> + /* Move to the requested placement. */
>>> + i915_ttm_placement_from_obj(obj, &requested, busy, &placement);
>>> +
>>> + return __i915_ttm_get_pages(obj, &placement);
>>> +}
>>> +
>>> +static int i915_ttm_migrate(struct drm_i915_gem_object *obj,
>>> + struct intel_memory_region *mr)
>>> +{
>>> + struct ttm_place requested;
>>> + struct ttm_placement placement;
>>> + int ret;
>>> +
>>> + i915_ttm_place_from_region(mr, &requested, obj->flags);
>>> + placement.num_placement = 1;
>>> + placement.num_busy_placement = 1;
>>> + placement.placement = &requested;
>>> + placement.busy_placement = &requested;
>>> +
>>> + ret = __i915_ttm_get_pages(obj, &placement);
>>> + if (ret)
>>> + return ret;
>>> +
>>> + if (obj->mm.region != mr) {
>>> + i915_gem_object_release_memory_region(obj);
>>> + i915_gem_object_init_memory_region(obj, mr);
>>> + }
>> Perhaps a minor nit:
>>
>> Doing this after we have done the _get_pages() just doesn't seem right.
>>
>> I.e. we do work on the object, and then we init some portion of it.
>>
>> Do we need to do this incase the migration/placement fails? If so,
>> maybe a comment to that effect?
>
>This is simply switching memory region under the lock, and to also move
>to another memory region list. Is it the naming _release_ and _init_
>that is confusing?
Hmm, re-reading my question, I am wondering if I was unclear.
My "real" question was, can the release/init occur before the _get_pages()?
But looking at this some more, I can see answer is probably no.
I was going to suggest calling _init_ _set_, but when I looked at it, it was
doing init things as well as setting things
Maybe just a comment like:
/* Complete the migration by updating the memory region info. */
if (object->mm...)
?
M
>/Thomas
>
>
>>
>> Thanks,
>>
>> Mike
>>
>>> + return 0;
>>> +}
>>> +
>>> static void i915_ttm_put_pages(struct drm_i915_gem_object *obj,
>>> struct sg_table *st)
>>> {
>>> @@ -814,6 +848,7 @@ static const struct drm_i915_gem_object_ops
>>> i915_gem_ttm_obj_ops = {
>>> .truncate = i915_ttm_purge,
>>> .adjust_lru = i915_ttm_adjust_lru,
>>> .delayed_free = i915_ttm_delayed_free,
>>> + .migrate = i915_ttm_migrate,
>>> .mmap_offset = i915_ttm_mmap_offset,
>>> .mmap_ops = &vm_ops_ttm,
>>> };
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
>>> b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
>>> index 1070d3afdce7..f909aaa09d9c 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
>>> @@ -290,3 +290,22 @@ i915_gem_wait_ioctl(struct drm_device *dev, void
>>> *data, struct drm_file *file)
>>> i915_gem_object_put(obj);
>>> return ret;
>>> }
>>> +
>>> +/**
>>> + * i915_gem_object_wait_migration - Sync an accelerated migration
>>> operation
>>> + * @obj: The migrating object.
>>> + * @flags: waiting flags. Currently supports only
>I915_WAIT_INTERRUPTIBLE.
>>> + *
>>> + * Wait for any pending async migration operation on the object,
>>> + * whether it's explicitly (i915_gem_object_migrate()) or implicitly
>>> + * (swapin, initial clearing) initiated.
>>> + *
>>> + * Return: 0 if successful, -ERESTARTSYS if a signal was hit during waiting.
>>> + */
>>> +int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj,
>>> + unsigned int flags)
>>> +{
>>> + might_sleep();
>>> + /* NOP for now. */
>>> + return 0;
>>> +}
>>> --
>>> 2.31.1
More information about the dri-devel
mailing list