[Intel-gfx] [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset.

Daniel Vetter daniel at ffwll.ch
Wed Jun 17 04:51:14 PDT 2015


On Tue, Jun 16, 2015 at 04:54:14PM +0100, Tomas Elf wrote:
> On 16/06/2015 14:49, Daniel Vetter wrote:
> >On Mon, Jun 08, 2015 at 06:03:21PM +0100, Tomas Elf wrote:
> >>In preparation for per-engine reset, add a way to set context reset stats.
> >>
> >>OPEN QUESTIONS:
> >>1. How do we deal with get_reset_stats and the GL robustness interface when
> >>introducing per-engine resets?
> >>
> >>	a. Do we mark contexts that cause per-engine resets as guilty? If so, how
> >>	does this affect context banning?
> >>
> >>	b. Do we extend the publicly available reset stats to also contain
> >>	per-engine reset statistics? If so, would this break the ABI?
> >>
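For reference, question (b) is about this uapi structure, which is what
DRM_IOCTL_I915_GET_RESET_STATS currently fills in (from
include/uapi/drm/i915_drm.h):

	struct drm_i915_reset_stats {
		__u32 ctx_id;
		__u32 flags;

		/* All resets since boot/module reload, for all contexts */
		__u32 reset_count;

		/* Number of batches lost when active in GPU, for this context */
		__u32 batch_active;

		/* Number of batches lost pending for execution, for this context */
		__u32 batch_pending;

		__u32 pad;
	};

Since drm_ioctl() zero-fills the kernel copy when userspace passes a
struct smaller than the one the driver registered, appending per-engine
counters after pad should in principle keep old binaries working - but
treat that as an assumption to verify against the GL robustness users,
not a settled answer.
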
> >>Signed-off-by: Tomas Elf <tomas.elf at intel.com>
> >>---
> >>  drivers/gpu/drm/i915/i915_drv.h |    2 ++
> >>  drivers/gpu/drm/i915/i915_gem.c |   14 ++++++++++++++
> >>  2 files changed, 16 insertions(+)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>index 47be4a5..ab5dfdc 100644
> >>--- a/drivers/gpu/drm/i915/i915_drv.h
> >>+++ b/drivers/gpu/drm/i915/i915_drv.h
> >>@@ -2781,6 +2781,8 @@ static inline bool i915_stop_ring_allow_warn(struct drm_i915_private *dev_priv)
> >>  }
> >>
> >>  void i915_gem_reset(struct drm_device *dev);
> >>+void i915_gem_reset_engine(struct drm_i915_private *dev_priv,
> >>+			   struct intel_engine_cs *engine);
> >>  bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
> >>  int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
> >>  int __must_check i915_gem_init(struct drm_device *dev);
> >>diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >>index 8ce363a..4c88e5c 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>@@ -2676,6 +2676,20 @@ void i915_gem_reset(struct drm_device *dev)
> >>  	i915_gem_restore_fences(dev);
> >>  }
> >>
> >>+void i915_gem_reset_engine(struct drm_i915_private *dev_priv,
> >>+			   struct intel_engine_cs *engine)
> >>+{
> >>+	u32 completed_seqno;
> >>+	struct drm_i915_gem_request *req;
> >>+
> >>+	completed_seqno = engine->get_seqno(engine, false);
> >>+
> >>+	/* Find pending batch buffers and mark them as such */
> >>+	list_for_each_entry(req, &engine->request_list, list)
> >>+		if (!i915_seqno_passed(completed_seqno, req->seqno))
> >>+			i915_set_reset_status(dev_priv, req->ctx, false);
> >>+}
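
For the record, the kind of caller this helper is waiting for might look
roughly like this - every name except i915_gem_reset_engine() is an
illustrative assumption, not code from this series:

	/* Hypothetical per-engine hang handler, sketched only to show
	 * where i915_gem_reset_engine() would slot in. */
	static void i915_handle_engine_hang(struct drm_i915_private *dev_priv,
					    struct intel_engine_cs *engine)
	{
		/* Mark every context with an incomplete request on this
		 * engine; guilty == false here, matching the helper above,
		 * which leaves the guilty/banning policy as open question 1a. */
		i915_gem_reset_engine(dev_priv, engine);

		/* Reset just the hung engine instead of the whole GPU;
		 * intel_engine_reset() stands in for whatever the full TDR
		 * patch ends up providing. */
		intel_engine_reset(engine);
	}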
> >
> >Please don't add dead code since it makes it impossible to review the
> >patch without peeking ahead. And that makes the split-up useless - the
> >point of splitting patches is to make review easier by presenting
> >logically self-contained small changes, not to evenly spread out changes
> >across a lot of mails.
> >-Daniel
> 
> I actually split this out from the main TDR patch (drm/i915: Adding TDR
> / per-engine reset support for gen8) by mistake. But since this is an RFC
> series, which I understood to be material for a design discussion rather
> than an actual code submission, I didn't spend too much time on splitting
> the series into sensible chunks. If that were a problem, I would expect
> people to take issue with the fact that e.g. the main TDR patch is a huge,
> monolithic chunk of code spanning more than 2000 lines. Obviously, that
> will be subdivided sensibly at a later stage, and the code in this patch
> mail will be properly associated with its calling code.
> 
> Is it OK if we leave the patch-subdivision discussion until after the
> initial RFC stage, or how do these things typically work at this point in
> the process?

No need to resend, that was just a boilerplate comment from your maintainer.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

