[Intel-gfx] [PATCH 50/64] drm/i915: Prepare i915_gem_active for annotations

Wed Jul 13 15:58:50 UTC 2016

On Wed, Jul 13, 2016 at 04:40:03PM +0100, Tvrtko Ursulin wrote:
> 
> On 07/07/16 09:41, Chris Wilson wrote:
> >In the future, we will want to add annotations to the i915_gem_active
> >struct. The API is thus expanded to hide direct access to the contents
> >of i915_gem_active and mediated instead through a number of helpers.
> >
> >Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >---
> >diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >index 5f302faf86e7..9c371e84b1bb 100644
> >--- a/drivers/gpu/drm/i915/i915_gem.c
> >+++ b/drivers/gpu/drm/i915/i915_gem.c
> >@@ -1349,27 +1349,30 @@ int
> >  i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
> >  			       bool readonly)
> >  {
> >+	struct drm_i915_gem_request *request;
> >  	struct reservation_object *resv;
> >  	int ret, i;
> >
> >  	if (readonly) {
> >-		if (obj->last_write.request) {
> >-			ret = i915_wait_request(obj->last_write.request);
> >+		request = i915_gem_active_peek(&obj->last_write);
> 
> Why not get_request since you have get_engine? Or later there will
> be get_request with different semantics?

It wasn't meant to be adding the reference...

> >@@ -1428,20 +1431,20 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
> >  	if (readonly) {
> >  		struct drm_i915_gem_request *req;
> >
> >-		req = obj->last_write.request;
> >+		req = i915_gem_active_peek(&obj->last_write);
> >  		if (req == NULL)
> >  			return 0;
> >
> >-		requests[n++] = i915_gem_request_get(req);
> >+		requests[n++] = req;
> 
> It used to take a reference and now it doesn't.

and this should still take the reference...
This patch wasn't meant to be making such changes, only translating
between functions.

> >  	} else {
> >  		for (i = 0; i < I915_NUM_ENGINES; i++) {
> >  			struct drm_i915_gem_request *req;
> >
> >-			req = obj->last_read[i].request;
> >+			req = i915_gem_active_peek(&obj->last_read[i]);
> >  			if (req == NULL)
> >  				continue;
> >
> >-			requests[n++] = i915_gem_request_get(req);
> >+			requests[n++] = req;
> >  		}
> >  	}
> >
> >@@ -2383,25 +2386,27 @@ void i915_vma_move_to_active(struct i915_vma *vma,
> >  static void
> >  i915_gem_object_retire__write(struct drm_i915_gem_object *obj)
> >  {
> >-	GEM_BUG_ON(!obj->last_write.request);
> >-	GEM_BUG_ON(!(obj->active & intel_engine_flag(obj->last_write.request->engine)));
> >+	GEM_BUG_ON(!__i915_gem_active_is_busy(&obj->last_write));
> >+	GEM_BUG_ON(!(obj->active & intel_engine_flag(i915_gem_active_get_engine(&obj->last_write))));
> >
> >-	i915_gem_request_assign(&obj->last_write.request, NULL);
> >+	i915_gem_active_set(&obj->last_write, NULL);
> 
> Aha!

Drat. Didn't think I did that...

Oh well, no excuses now but to go back in time and make the change
earlier. It does get removed eventually!

> >@@ -2822,20 +2827,21 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> >  	i915_gem_object_put(obj);
> >
> >  	for (i = 0; i < I915_NUM_ENGINES; i++) {
> >-		if (!obj->last_read[i].request)
> >-			continue;
> >+		struct drm_i915_gem_request *req;
> >
> >-		req[n++] = i915_gem_request_get(obj->last_read[i].request);
> >+		req = i915_gem_active_get(&obj->last_read[i]);
> 
> Oh right there is a get request one, this time preserving the
> reference taking behaviour.

Yup, you spotted my mix up earlier.

> >+/**
> >+ * i915_gem_active_get - return a reference to the active request
> >+ * @active - the active tracker
> >+ *
> >+ * i915_gem_active_get() returns a reference to the active request, or NULL
> >+ * if the active tracker is idle. The caller must hold struct_mutex.
> >+ */
> >+static inline struct drm_i915_gem_request *
> >+i915_gem_active_get(const struct i915_gem_active *active)
> >+{
> >+	struct drm_i915_gem_request *request;
> >+
> >+	request = i915_gem_active_peek(active);
> >+	if (!request || i915_gem_request_completed(request))
> 
> The check for request_completed feels like a hack - why it is needed?

Because retirement is lazy and the GPU runs asynchronously. We want to
whilst building an execbuf check whether or not the wait is required. If
we detect a wait we have to backout of an atomic section inside execbuf.
Whilst execbuf is more demanding, the shortcircuit applies equally in
many places. This becomes especially apparent when looking at the
request locklessly, and I wanted to keep this path equivalent.

> >+/**
> >+ * i915_gem_active_retire - waits until the request is retired
> >+ * @active - the active request on which to wait
> >+ *
> >+ * Unlike i915_gem_active_eait(), this i915_gem_active_retire() will
> 
> s/eait/wait/
> 
> >+ * make sure the request is retired before returning.
> >+ */
> >+static inline int __must_check
> >+i915_gem_active_retire(const struct i915_gem_active *active)
> >+{
> >+	return i915_gem_active_wait(active);
> >+}
> 
> But how does it ensure anything different than i915_gem_active_wait
> when it calls the very function? Maybe in future patches there will
> be a difference?

Right. It changes in the future, at the moment it is just trying to
illustrate the semantic difference - in some places the retirement is
required, but only of this active tracker, in others we need to retire
all requests up to this point, others simply need to know the request is
passed. 
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre