[Intel-gfx] [PATCH 05/18] drm/i915: Move GEM activity tracking into a common struct reservation_object
Chris Wilson
chris at chris-wilson.co.uk
Fri Sep 16 11:40:14 UTC 2016
On Wed, Sep 14, 2016 at 07:52:37AM +0100, Chris Wilson wrote:
> In preparation to support many distinct timelines, we need to expand the
> activity tracking on the GEM object to handle more than just a request
> per engine. We already use the struct reservation_object on the dma-buf
> to handle many fence contexts, so integrating that into the GEM object
> itself is the preferred solution. (For example, we can now share the same
> reservation_object between every consumer/producer using this buffer and
> skip the manual import/export via dma-buf.)
>
> Caveats:
>
> * busy-ioctl: the modern interface is completely thrown away,
> regressing us back to before
>
> commit e9808edd98679680804dfbc42c5ee8f1aa91f617 [v3.10]
> Author: Chris Wilson <chris at chris-wilson.co.uk>
> Date: Wed Jul 4 12:25:08 2012 +0100
>
> drm/i915: Return a mask of the active rings in the high word of busy_ioctl
There is an alternative here, as we can walk through the obj->resv and
convert *i915* fences to their respective read/write engines.
However, this means that busy-ioctl and wait-ioctl(.timeout=0) are no
longer equivalent as that busy-ioctl may report an object as idle but
still has external rendering (and so wait-ioctl, set-to-domain-ioctl,
pread, pwrite may stall).
Walking the resv object is also more expensive than doing a simple test.
Though that can be mitigated later by opencoding the test (once we can
do the whole busy-ioctl under only RCU protection that makes more
sense). But it makes busy-ioctl slower than ideal, and busy-ioctl is
high frequency (so long as we keep avoiding struct_mutex, the difference
is likely only visible in the microbenchmarks).
Pro: we keep the finese of being able to report which engines are busy
Con: not equivalent to wait-ioctl, polling the dmabuf, etc.
That we have alternative mechanisms to check may be considered a good
thing, i.e. we can migrate to using wait-ioctl for the general am I
going to stall test, and using busy-ioctl for engine selection. I think
I am leaning towards keeping the ABI more or less intact for the time
being. Though we are getting pretty close to its limit on NUM_ENGINES.
int
i915_gem_busy_ioctl(struct drm_device *dev, void *data,
struct drm_file *file)
{
struct drm_i915_gem_busy *args = data;
struct drm_i915_gem_object *obj;
struct fence **shared, *excl;
unsigned int count, i;
int ret;
ret = -ENOENT;
obj = i915_gem_object_lookup(file, args->handle);
if (obj) {
ret = reservation_object_get_fences_rcu(obj->resv,
&excl, &count, &shared);
i915_gem_object_put_unlocked(obj);
}
if (unlikely(ret))
return ret;
/* A discrepancy here is that we do not report the status of
* non-i915 fences, i.e. even though we may report the object as idle,
* a call to set-domain may still stall waiting for foreign rendering.
* This also means that wait-ioctl may report an object as busy,
* where busy-ioctl considers it idle.
*
* We trade the ability to warn of foreign fences to report on which
* i915 engines are active for the object.
*
* Alternatively, we can trade that extra information on read/write
* activity with
* args->busy =
* !reservation_object_test_signaled_rcu(obj->resv, true);
* to report the overall busyness. This is what the wait-ioctl does.
*/
args->busy = 0;
/* Translate shared fences to READ set of engines */
for (i = 0; i < count; i++) {
args->busy |= busy_check_reader(shared[i]);
fence_put(shared[i]);
}
kfree(shared);
/* Translate the exclusive fence to the READ *and* WRITE engine */
if (excl) {
args->busy |= busy_check_writer(excl);
fence_put(excl);
}
return 0;
}
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list