[Intel-gfx] [RFC 2/8] drm/i915: Add mechanism for forwarding the timestamp data through perf

Wed Aug 5 02:55:06 PDT 2015

On Wed, Aug 05, 2015 at 11:25:38AM +0530, sourab.gupta at intel.com wrote:
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 66f9ee9..08235582 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1676,6 +1676,13 @@ struct i915_oa_rcs_node {
>  	u32 tag;
>  };
>  
> +struct i915_gen_pmu_node {
> +	struct list_head head;

This not the head of the list, but a node.

> +	struct drm_i915_gem_request *req;

Use request since this is a less often used member name and brevity
isn't saving thousands of keystrokes.

> +	u32 offset;
> +	u32 ctx_id;
> +};

> +static int i915_gen_pmu_wait_gpu(struct drm_i915_private *dev_priv)
> +{
> +	struct i915_gen_pmu_node *last_entry = NULL;
> +	int ret;
> +
> +	/*
> +	 * Wait for the last scheduled request to complete. This would
> +	 * implicitly wait for the prior submitted requests. The refcount
> +	 * of the requests is not decremented here.
> +	 */
> +	spin_lock(&dev_priv->gen_pmu.lock);
> +
> +	if (!list_empty(&dev_priv->gen_pmu.node_list)) {
> +		last_entry = list_last_entry(&dev_priv->gen_pmu.node_list,
> +			struct i915_gen_pmu_node, head);
> +	}
> +	spin_unlock(&dev_priv->gen_pmu.lock);

Because you issue requests on all rings that are not actually
serialised, the order of writes and retirements are also not serialised,
i.e. this does not do a complete wait for all activity on the object.

>  static void forward_gen_pmu_snapshots(struct drm_i915_private *dev_priv)
>  {
> -	WARN_ON(!dev_priv->gen_pmu.buffer.addr);
> +	struct i915_gen_pmu_node *entry, *next;
> +	LIST_HEAD(deferred_list_free);
> +	int ret;
>  
> -	/* TODO: routine for forwarding snapshots to userspace */
> +	list_for_each_entry_safe
> +		(entry, next, &dev_priv->gen_pmu.node_list, head) {
> +		if (!i915_gem_request_completed(entry->req, true))
> +			break;

Again the list is not actually in retirement order since you combine
multiple rings into one list.

These problems magically disappear with a list per-ring and a page
per-ring. You also need to be more careful with overwriting unflushed
entries. A dynamic approach to page allocation overcomes that.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre