[Intel-gfx] [RFCv2 03/12] drm/i915: Adding TDR / per-engine reset support for gen8.

Tomas Elf tomas.elf at intel.com
Tue Jul 21 06:58:46 PDT 2015


This change introduces support for TDR-style per-engine reset as an initial,
less intrusive hang recovery option to be attempted before falling back to the
legacy full GPU reset recovery mode if necessary. Initially we're only
supporting gen8 but adding support for gen7 is straightforward since we've
already established an extensible framework where gen7 support can be plugged
in (add corresponding versions of intel_ring_enable, intel_ring_disable,
intel_ring_save, intel_ring_restore, etc.).

1. Per-engine recovery vs. Full GPU recovery

To capture the state of a single engine being detected as hung there is now a
new flag for every engine that can be set once the decision has been made to
schedule hang recovery for that particular engine.

The following algorithm is used to determine when to use which recovery mode:

	a. Once the hang check score reaches level HUNG, hang recovery is
	scheduled as usual. The hang checker aggregates all engines currently
	detected as hung into a single engine flag mask and passes that to the
	error handler, which allows us to schedule hang recovery for all
	currently hung engines in a single call.

	b. The error handler checks all engines that have been marked as hung
	by the hang checker and - more specifically - checks how long ago it
	was since it last attempted to do per-engine hang recovery for each
	respective, currently hung engine. If the measured time period is
	within a certain time window, i.e. the last per-engine hang recovery
	was done too recently, it is determined that per-engine hang recovery
	is ineffective and the step is taken to promote a full GPU reset.

	c. If the error handler determines that no currently hung engine has
	recently had hang recovery, a per-engine hang recovery is scheduled.

	d. Additionally, if the hang checker detects that the hang check score
	has grown too high (currently defined as twice the HUNG level) it
	determines that previous hang recovery attempts have failed for
	whatever reason and it will bypass the error handler's full GPU reset
	promotion logic. One case where this is important is if the hang
	checker and error handler think that per-engine hang recovery is a
	suitable option and several such attempts are made - infrequently
	enough - but no effective reset is done, perhaps due to inconsistent
	context submission status, which is described further below.

NOTE: Gen 7 and earlier will always promote to full GPU reset since there is
currently no per-engine reset support for these gens.

2. Context Submission Status Consistency

Per-engine hang recovery on gen8 relies on the basic concept of context
submission status consistency. What this means is that we make sure that the
status of the hardware and the driver when it comes to the submission status of
the current context on any engine is consistent. For example, when submitting a
context to the corresponding ELSP port of an engine we expect the owning
request of that context to be at the head of the corresponding execution list
queue. Likewise, as long as the context is executing on the GPU we expect the
EXECLIST_STATUS register and the context status buffer to reflect this. Thus,
if the context submission status is consistent the ID of the currently
executing context should be in EXECLIST_STATUS and it should be consistent
with the context of the head request element in the execution list queue
corresponding to that engine.

The reason why this is important for per-engine hang recovery on gen8 is
because this recovery mode relies on context resubmission to resume execution
following the recovery. If a context has been determined to be hung and the
per-engine hang recovery mode is engaged, leading to the resubmission of that
context, it's important that the hardware is in fact neither busy doing
something else nor idle, since a resubmission in either state would cause
unforeseen side-effects such as unexpected preemptions.

There are rare, although consistently reproducible, situations that have shown
up in practice where the driver and hardware are no longer consistent with each
other, e.g. due to lost context completion interrupts, after which the hardware
is idle but the driver still thinks that a context is active.
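
As a rough illustration of the consistency check described in this section,
the sketch below compares the context at the head of an execution list queue
with what the hardware reports. All names are hypothetical stand-ins (the
actual check is done by intel_execlists_TDR_get_current_request(), which is
not shown here), and the hardware status is modelled as an already decoded
"active" flag plus context ID rather than as a raw register value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Modelled on the status values named in this patch, minus UNDEFINED. */
enum sketch_submission_status {
	SKETCH_STATUS_OK = 0,
	SKETCH_STATUS_INCONSISTENT,
	SKETCH_STATUS_NONE_SUBMITTED,
};

/* Hypothetical stand-in for the head request of an execution list queue. */
struct sketch_request {
	uint32_t ctx_id;
};

/* Hypothetical decoded view of what EXECLIST_STATUS reports. */
struct sketch_hw_status {
	bool active;		/* hardware is executing a context */
	uint32_t ctx_id;	/* ID of that context, if active */
};

enum sketch_submission_status
sketch_get_submission_status(const struct sketch_request *head,
			     const struct sketch_hw_status *hw)
{
	assert(hw != NULL);

	/* Nothing queued and nothing running: consistent, but idle. */
	if (head == NULL && !hw->active)
		return SKETCH_STATUS_NONE_SUBMITTED;

	/*
	 * Only one side thinks work is in flight, e.g. after a lost
	 * context completion interrupt.
	 */
	if (head == NULL || !hw->active)
		return SKETCH_STATUS_INCONSISTENT;

	/* Both sides are busy: they must agree on which context. */
	if (head->ctx_id != hw->ctx_id)
		return SKETCH_STATUS_INCONSISTENT;

	return SKETCH_STATUS_OK;
}
```

In this model, only SKETCH_STATUS_OK would allow per-engine recovery to
proceed to context resubmission.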

3. There is a new reset path for engine reset alongside the legacy full GPU
reset path. This path does the following:

	1) Check for context submission consistency to make sure that the
	context that the hardware is currently stuck on is actually what the
	driver is working on. If not then clearly we're not in a consistently
	hung state and we bail out early.

	2) Disable/idle the engine. This is done through reset handshaking on
	gen8+ unlike earlier gens where this was done by clearing the ring
	valid bits in MI_MODE and ring control registers, which are no longer
	supported on gen8+. Reset handshaking translates to setting the reset
	request bit in the reset control register.

	3) Save the current engine state.

	What this translates to on gen8 is simply to read the current value of
	the head register and nudge it so that it points to the next valid
	instruction in the ring buffer. Since we assume that the execution is
	currently stuck in a batch buffer the effect of this is that the
	batch buffer start instruction of the hung batch buffer is skipped so
	that when execution resumes, following the hang recovery completion, it
	resumes immediately following the batch buffer.

	This effectively means that we're forcefully terminating the currently
	active, hung batch buffer. Obviously, the outcome of this intervention
	is potentially undefined but there are not many good options in this
	scenario. It's better than resetting the entire GPU in the vast
	majority of cases.

	Save the nudged head value to be applied later.

	4) Reset the engine.

	5) Apply the nudged head value to the head register.

	6) Reenable the engine. For gen8 this means resubmitting the fixed-up
	context, allowing execution to resume. In order to resubmit a context
	without relying on the currently hung execution list queues we use a
	privileged API that is dedicated for TDR use only. This submission API
	bypasses any currently queued work and gets exclusive access to the
	ELSP ports.

	7) If the engine hang recovery procedure fails at any point in between
	disablement and reenablement of the engine there is a back-off
	procedure: For gen8 it's possible to back out of the reset handshake by
	clearing the reset request bit in the reset control register.

NOTE:
It's possible that some of Ben Widawsky's original per-engine reset patches
from 3 years ago are in this commit but since this work has gone through the
hands of at least 3 people already any kind of ownership tracking has been lost
a long time ago. If you think that you should be on the sob list just let me
know.

* v2: (Chris Wilson / Daniel Vetter)
- Simply use the previously private function i915_gem_reset_ring_status() from
  the engine hang recovery path to set active/pending context status. This
  replicates the same behaviour as in full GPU reset but for a single,
  targeted engine.

- Remove all additional uevents for both full GPU reset and per-engine reset.
  Adapted uevent behaviour to the new per-engine hang recovery mode in that it
  will only send one uevent regardless of which form of recovery is employed.
  If a per-engine reset is attempted first then one uevent will be dispatched.
  If that recovery mode fails and the hang is promoted to a full GPU reset no
  further uevents will be dispatched at that point.

- Removed the 2*HUNG hang threshold from i915_hangcheck_elapsed in order to not
  make the hang detection algorithm too complicated. This threshold was
  introduced to compensate for the possibility that hang recovery might be
  delayed due to inconsistent context submission status that would prevent
  per-engine hang recovery from happening. In a later patch we introduce faked
  context event interrupts and inconsistency rectification at the onset of
  per-engine hang recovery instead of relying on the hang checker to do this
  for us. Therefore, since we do not delay and defer to future hang detections,
  we never allow hangs to go unaddressed beyond the HUNG threshold and
  therefore there is no need for any further thresholds.

- Tidied up the TDR context resubmission path in intel_lrc.c . Reduced the
  amount of duplication by relying entirely on the normal unqueue function.
  Added a new parameter to the unqueue function that takes into consideration
  if the unqueue call is for a first-time context submission or a resubmission
  and adapts the handling of elsp_submitted accordingly. The reason for this is
  that for context resubmission we don't expect any further interrupts for the
  submission or the following context completion. A more elegant way of
  handling this would be to phase out elsp_submitted altogether, however that's
  part of a LRC/execlist cleanup effort that is happening independently of this
  RFC. For now we make this change as simple as possible with as few
  non-TDR-related side-effects as possible.

Signed-off-by: Tomas Elf <tomas.elf at intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery at intel.com>
Signed-off-by: Ian Lister <ian.lister at intel.com>
Cc: Chris Wilson <chris at chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |   2 +-
 drivers/gpu/drm/i915/i915_dma.c         |  18 ++
 drivers/gpu/drm/i915/i915_drv.c         | 198 ++++++++++++
 drivers/gpu/drm/i915/i915_drv.h         |  63 +++-
 drivers/gpu/drm/i915/i915_irq.c         | 199 ++++++++++--
 drivers/gpu/drm/i915/i915_params.c      |  10 +
 drivers/gpu/drm/i915/i915_reg.h         |   6 +
 drivers/gpu/drm/i915/intel_lrc.c        | 556 ++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_lrc.h        |  14 +
 drivers/gpu/drm/i915/intel_lrc_tdr.h    |  36 +++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  84 ++++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  64 ++++
 drivers/gpu/drm/i915/intel_uncore.c     | 199 ++++++++++++
 13 files changed, 1400 insertions(+), 49 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_lrc_tdr.h

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 8446ef4..e33e105 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4183,7 +4183,7 @@ i915_wedged_set(void *data, u64 val)
 
 	intel_runtime_pm_get(dev_priv);
 
-	i915_handle_error(dev, val,
+	i915_handle_error(dev, 0x0, val,
 			  "Manually setting wedged to %llu", val);
 
 	intel_runtime_pm_put(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index e44116f..cf01e84 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -776,6 +776,22 @@ static void intel_device_info_runtime_init(struct drm_device *dev)
 			 info->has_eu_pg ? "y" : "n");
 }
 
+static void
+i915_hangcheck_init(struct drm_device *dev)
+{
+	int i;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	for (i = 0; i < I915_NUM_RINGS; i++) {
+		struct intel_engine_cs *engine = &dev_priv->ring[i];
+		struct intel_ring_hangcheck *hc = &engine->hangcheck;
+
+		i915_hangcheck_reinit(engine);
+		hc->reset_count = 0;
+		hc->tdr_count = 0;
+	}
+}
+
 /**
  * i915_driver_load - setup chip and create an initial config
  * @dev: DRM device
@@ -956,6 +972,8 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 
 	i915_gem_load(dev);
 
+	i915_hangcheck_init(dev);
+
 	/* On the 945G/GM, the chipset reports the MSI capability on the
 	 * integrated graphics even though the support isn't actually there
 	 * according to the published specs.  It doesn't appear to function
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c3fdbb0..c7ba64e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -34,6 +34,7 @@
 #include "i915_drv.h"
 #include "i915_trace.h"
 #include "intel_drv.h"
+#include "intel_lrc_tdr.h"
 
 #include <linux/console.h>
 #include <linux/module.h>
@@ -581,6 +582,7 @@ static int i915_drm_suspend(struct drm_device *dev)
 	struct drm_crtc *crtc;
 	pci_power_t opregion_target_state;
 	int error;
+	int i;
 
 	/* ignore lid events during suspend */
 	mutex_lock(&dev_priv->modeset_restore_lock);
@@ -602,6 +604,16 @@ static int i915_drm_suspend(struct drm_device *dev)
 		return error;
 	}
 
+	/*
+	 * Clear any pending reset requests. They should be picked up
+	 * after resume when new work is submitted
+	 */
+	for (i = 0; i < I915_NUM_RINGS; i++)
+		atomic_set(&dev_priv->ring[i].hangcheck.flags, 0);
+
+	atomic_clear_mask(I915_RESET_IN_PROGRESS_FLAG,
+		&dev_priv->gpu_error.reset_counter);
+
 	intel_suspend_gt_powersave(dev);
 
 	/*
@@ -905,6 +917,192 @@ int i915_reset(struct drm_device *dev)
 	return 0;
 }
 
+/**
+ * i915_reset_engine - reset GPU engine after a hang
+ * @engine: engine to reset
+ *
+ * Reset a specific GPU engine. Useful if a hang is detected. Returns zero on successful
+ * reset or otherwise an error code.
+ *
+ * Procedure is fairly simple:
+ *
+ *	- Force engine to idle.
+ *
+ *	- Save current head register value and nudge it past the point of the hang in the
+ *	  ring buffer, which is typically the BB_START instruction of the hung batch buffer,
+ *	  on to the following instruction.
+ *
+ *	- Reset engine.
+ *
+ *	- Restore the previously saved, nudged head register value.
+ *
+ *	- Re-enable engine to resume running. On gen8 this requires the previously hung
+ *	  context to be resubmitted to ELSP via the dedicated TDR-execlists interface.
+ *
+ */
+int i915_reset_engine(struct intel_engine_cs *engine)
+{
+	struct drm_device *dev = engine->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_request *current_request = NULL;
+	uint32_t head;
+	bool force_advance = false;
+	int ret = 0;
+	int err_ret = 0;
+
+	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
+
+	/* Take wake lock to prevent power saving mode */
+	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+
+	i915_gem_reset_ring_status(dev_priv, engine);
+
+	if (i915.enable_execlists) {
+		enum context_submission_status status =
+			intel_execlists_TDR_get_current_request(engine, NULL);
+
+		/*
+		 * If the hardware and driver states do not coincide
+		 * or if there for some reason is no current context
+		 * in the process of being submitted then bail out and
+		 * try again. Do not proceed unless we have reliable
+		 * current context state information.
+		 */
+		if (status != CONTEXT_SUBMISSION_STATUS_OK) {
+			ret = -EAGAIN;
+			goto reset_engine_error;
+		}
+	}
+
+	ret = intel_ring_disable(engine);
+	if (ret != 0) {
+		DRM_ERROR("Failed to disable %s\n", engine->name);
+		goto reset_engine_error;
+	}
+
+	if (i915.enable_execlists) {
+		enum context_submission_status status;
+		bool inconsistent;
+
+		status = intel_execlists_TDR_get_current_request(engine,
+				&current_request);
+
+		inconsistent = (status != CONTEXT_SUBMISSION_STATUS_OK);
+		if (inconsistent) {
+			/*
+			 * If we somehow have reached this point with
+			 * an inconsistent context submission status then
+			 * back out of the previously requested reset and
+			 * retry later.
+			 */
+			WARN(inconsistent,
+			     "Inconsistent context status on %s: %u\n",
+			     engine->name, status);
+
+			ret = -EAGAIN;
+			goto reenable_reset_engine_error;
+		}
+	}
+
+	/* Sample the current ring head position */
+	head = I915_READ_HEAD(engine) & HEAD_ADDR;
+
+	if (head == engine->hangcheck.last_head) {
+		/*
+		 * The engine has not advanced since the last
+		 * time it hung so force it to advance to the
+		 * next QWORD. In most cases the engine head
+		 * pointer will automatically advance to the
+		 * next instruction as soon as it has read the
+		 * current instruction, without waiting for it
+		 * to complete. This seems to be the default
+		 * behaviour, however an MBOX wait inserted
+		 * directly to the VCS/BCS engines does not behave
+		 * in the same way, instead the head pointer
+		 * will still be pointing at the MBOX instruction
+		 * until it completes.
+		 */
+		force_advance = true;
+	}
+
+	engine->hangcheck.last_head = head;
+
+	ret = intel_ring_save(engine, current_request, force_advance);
+	if (ret) {
+		DRM_ERROR("Failed to save %s engine state\n", engine->name);
+		goto reenable_reset_engine_error;
+	}
+
+	ret = intel_gpu_engine_reset(engine);
+	if (ret) {
+		DRM_ERROR("Failed to reset %s\n", engine->name);
+		goto reenable_reset_engine_error;
+	}
+
+	ret = intel_ring_restore(engine, current_request);
+	if (ret) {
+		DRM_ERROR("Failed to restore %s engine state\n", engine->name);
+		goto reenable_reset_engine_error;
+	}
+
+	/* Correct driver state */
+	intel_gpu_engine_reset_resample(engine, current_request);
+
+	/*
+	 * Reenable engine
+	 *
+	 * In execlist mode on gen8+ this is implicit by simply resubmitting
+	 * the previously hung context. In ring buffer submission mode on gen7
+	 * and earlier we need to actively turn on the engine first.
+	 */
+	if (i915.enable_execlists)
+		intel_execlists_TDR_context_resubmission(engine);
+	else
+		ret = intel_ring_enable(engine);
+
+	if (ret) {
+		DRM_ERROR("Failed to enable %s again after reset\n",
+			engine->name);
+
+		goto reset_engine_error;
+	}
+
+	/* Clear reset flags to allow future hangchecks */
+	atomic_set(&engine->hangcheck.flags, 0);
+
+	/* Wake up anything waiting on this engine's queue */
+	wake_up_all(&engine->irq_queue);
+
+	if (i915.enable_execlists && current_request)
+		i915_gem_request_unreference(current_request);
+
+	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+
+	return ret;
+
+reenable_reset_engine_error:
+
+	err_ret = intel_ring_enable(engine);
+	if (err_ret)
+		DRM_ERROR("Failed to reenable %s following error during reset (%d)\n",
+			engine->name, err_ret);
+
+reset_engine_error:
+
+	/* Clear reset flags to allow future hangchecks */
+	atomic_set(&engine->hangcheck.flags, 0);
+
+	/* Wake up anything waiting on this engine's queue */
+	wake_up_all(&engine->irq_queue);
+
+	if (i915.enable_execlists && current_request)
+		i915_gem_request_unreference(current_request);
+
+	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+
+	return ret;
+}
+
 static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	struct intel_device_info *intel_info =
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c32c502..be4c95c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2280,6 +2280,48 @@ struct drm_i915_cmd_table {
 	int count;
 };
 
+/*
+ * Context submission status
+ *
+ * CONTEXT_SUBMISSION_STATUS_OK:
+ *	Context submitted to ELSP and state of execlist queue is the same as
+ *	the state of EXECLIST_STATUS register. Software and hardware states
+ *	are consistent and can be trusted.
+ *
+ * CONTEXT_SUBMISSION_STATUS_INCONSISTENT:
+ *	Context has been submitted to the execlist queue but the state of the
+ *	EXECLIST_STATUS register is different from the execlist queue state.
+ *	This could mean any of the following:
+ *
+ *		1. The context is in the head position of the execlist queue
+ *		   but has not yet been submitted to ELSP.
+ *
+ *		2. The hardware just recently completed the context but the
+ *		   context is pending removal from the execlist queue.
+ *
+ *		3. The driver has lost a context state transition interrupt.
+ *		   Typically what this means is that hardware has completed and
+ *		   is now idle but the driver thinks the hardware is still
+ *		   busy.
+ *
+ *	Overall what this means is that the context submission status is
+ *	currently in transition and cannot be trusted until it settles down.
+ *
+ * CONTEXT_SUBMISSION_STATUS_NONE_SUBMITTED:
+ *	No context submitted to the execlist queue and the EXECLIST_STATUS
+ *	register shows no context being processed.
+ *
+ * CONTEXT_SUBMISSION_STATUS_UNDEFINED:
+ *	Initial state before submission status has been determined.
+ *
+ */
+enum context_submission_status {
+	CONTEXT_SUBMISSION_STATUS_OK = 0,
+	CONTEXT_SUBMISSION_STATUS_INCONSISTENT,
+	CONTEXT_SUBMISSION_STATUS_NONE_SUBMITTED,
+	CONTEXT_SUBMISSION_STATUS_UNDEFINED
+};
+
 /* Note that the (struct drm_i915_private *) cast is just to shut up gcc. */
 #define __I915__(p) ({ \
 	struct drm_i915_private *__p; \
@@ -2478,6 +2520,7 @@ struct i915_params {
 	int enable_ips;
 	int invert_brightness;
 	int enable_cmd_parser;
+	unsigned int gpu_reset_promotion_time;
 	/* leave bools at the end to not create holes */
 	bool enable_hangcheck;
 	bool fastboot;
@@ -2508,18 +2551,34 @@ extern long i915_compat_ioctl(struct file *filp, unsigned int cmd,
 			      unsigned long arg);
 #endif
 extern int intel_gpu_reset(struct drm_device *dev);
+extern int intel_gpu_engine_reset(struct intel_engine_cs *engine);
+extern int intel_request_gpu_engine_reset(struct intel_engine_cs *engine);
+extern int intel_unrequest_gpu_engine_reset(struct intel_engine_cs *engine);
 extern int i915_reset(struct drm_device *dev);
+extern int i915_reset_engine(struct intel_engine_cs *engine);
 extern unsigned long i915_chipset_val(struct drm_i915_private *dev_priv);
 extern unsigned long i915_mch_val(struct drm_i915_private *dev_priv);
 extern unsigned long i915_gfx_val(struct drm_i915_private *dev_priv);
 extern void i915_update_gfx_val(struct drm_i915_private *dev_priv);
 int vlv_force_gfx_clock(struct drm_i915_private *dev_priv, bool on);
 void intel_hpd_cancel_work(struct drm_i915_private *dev_priv);
+static inline void i915_hangcheck_reinit(struct intel_engine_cs *engine)
+{
+	struct intel_ring_hangcheck *hc = &engine->hangcheck;
+
+	hc->acthd = 0;
+	hc->max_acthd = 0;
+	hc->seqno = 0;
+	hc->score = 0;
+	hc->action = HANGCHECK_IDLE;
+	hc->deadlock = 0;
+}
+
 
 /* i915_irq.c */
 void i915_queue_hangcheck(struct drm_device *dev);
-__printf(3, 4)
-void i915_handle_error(struct drm_device *dev, bool wedged,
+__printf(4, 5)
+void i915_handle_error(struct drm_device *dev, u32 engine_mask, bool wedged,
 		       const char *fmt, ...);
 
 extern void intel_irq_init(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 4e8e722..e869823 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2312,10 +2312,70 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
 	char *error_event[] = { I915_ERROR_UEVENT "=1", NULL };
 	char *reset_event[] = { I915_RESET_UEVENT "=1", NULL };
 	char *reset_done_event[] = { I915_ERROR_UEVENT "=0", NULL };
-	int ret;
+	bool reset_complete = false;
+	struct intel_engine_cs *ring;
+	int ret = 0;
+	int i;
+
+	mutex_lock(&dev->struct_mutex);
 
 	kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, error_event);
 
+	for_each_ring(ring, dev_priv, i) {
+
+		/*
+		 * Skip further individual engine reset requests if full GPU
+		 * reset requested.
+		 */
+		if (i915_reset_in_progress(error))
+			break;
+
+		if (atomic_read(&ring->hangcheck.flags) &
+			I915_ENGINE_RESET_IN_PROGRESS) {
+
+			if (!reset_complete)
+				kobject_uevent_env(&dev->primary->kdev->kobj,
+						   KOBJ_CHANGE,
+						   reset_event);
+
+			reset_complete = true;
+
+			ret = i915_reset_engine(ring);
+
+			/*
+			 * Execlist mode only:
+			 *
+			 * -EAGAIN means that between detecting a hang (and
+			 * also determining that the currently submitted
+			 * context is stable and valid) and trying to recover
+			 * from the hang the current context changed state.
+			 * This means that we are probably not completely hung
+			 * after all. Just fail and retry by exiting all the
+			 * way back and wait for the next hang detection. If we
+			 * have a true hang on our hands then we will detect it
+			 * again, otherwise we will continue like nothing
+			 * happened.
+			 */
+			if (ret == -EAGAIN) {
+				DRM_ERROR("Reset of %s aborted due to "
+					  "change in context submission "
+					  "state - retrying!\n", ring->name);
+				ret = 0;
+			}
+
+			if (ret) {
+				DRM_ERROR("Reset of %s failed! (%d)\n", ring->name, ret);
+
+				atomic_set_mask(I915_RESET_IN_PROGRESS_FLAG,
+						&dev_priv->gpu_error.reset_counter);
+				break;
+			}
+		}
+	}
+
+	/* The full GPU reset will grab the struct_mutex when it needs it */
+	mutex_unlock(&dev->struct_mutex);
+
 	/*
 	 * Note that there's only one work item which does gpu resets, so we
 	 * need not worry about concurrent gpu resets potentially incrementing
@@ -2328,8 +2388,13 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
 	 */
 	if (i915_reset_in_progress(error) && !i915_terminally_wedged(error)) {
 		DRM_DEBUG_DRIVER("resetting chip\n");
-		kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE,
-				   reset_event);
+
+		if (!reset_complete)
+			kobject_uevent_env(&dev->primary->kdev->kobj,
+					   KOBJ_CHANGE,
+					   reset_event);
+
+		reset_complete = true;
 
 		/*
 		 * In most cases it's guaranteed that we get here with an RPM
@@ -2362,23 +2427,36 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
 			 *
 			 * Since unlock operations are a one-sided barrier only,
 			 * we need to insert a barrier here to order any seqno
-			 * updates before
-			 * the counter increment.
+			 * updates before the counter increment.
+			 *
+			 * The increment clears I915_RESET_IN_PROGRESS_FLAG.
 			 */
 			smp_mb__before_atomic();
 			atomic_inc(&dev_priv->gpu_error.reset_counter);
 
-			kobject_uevent_env(&dev->primary->kdev->kobj,
-					   KOBJ_CHANGE, reset_done_event);
+			/*
+			 * If any per-engine resets were promoted to full GPU
+			 * reset don't forget to clear those reset flags.
+			 */
+			for_each_ring(ring, dev_priv, i)
+				atomic_set(&ring->hangcheck.flags, 0);
 		} else {
+			/* Terminal wedge condition */
+			WARN(1, "i915_reset failed, declaring GPU as wedged!\n");
 			atomic_set_mask(I915_WEDGED, &error->reset_counter);
 		}
+	}
 
-		/*
-		 * Note: The wake_up also serves as a memory barrier so that
-		 * waiters see the update value of the reset counter atomic_t.
-		 */
+	/*
+	 * Note: The wake_up also serves as a memory barrier so that
+	 * waiters see the update value of the reset counter atomic_t.
+	 */
+	if (reset_complete) {
 		i915_error_wake_up(dev_priv, true);
+
+		if (ret == 0)
+			kobject_uevent_env(&dev->primary->kdev->kobj,
+					   KOBJ_CHANGE, reset_done_event);
 	}
 }
 
@@ -2476,21 +2554,42 @@ static void i915_report_and_clear_eir(struct drm_device *dev)
 
 /**
  * i915_handle_error - handle a gpu error
- * @dev: drm device
  *
- * Do some basic checking of regsiter state at error time and
+ * @dev: 		drm device
+ *
+ * @engine_mask: 	Bit mask containing the engine flags of all engines
+ *			associated with one or more detected errors.
+ *			May be 0x0.
+ *
+ *			If wedged is set to true this implies that one or more
+ *			engine hangs were detected. In this case we will
+ *			attempt to reset all engines that have been detected
+ *			as hung.
+ *
+ *			If a previous engine reset was attempted too recently
+ *			or if one of the current engine resets fails we fall
+ *			back to legacy full GPU reset.
+ *
+ * @wedged: 		true = Hang detected, invoke hang recovery.
+ * @fmt, ...: 		Error message describing reason for error.
+ *
+ * Do some basic checking of register state at error time and
  * dump it to the syslog.  Also call i915_capture_error_state() to make
  * sure we get a record and make it available in debugfs.  Fire a uevent
  * so userspace knows something bad happened (should trigger collection
- * of a ring dump etc.).
+ * of a ring dump etc.). If a hang was detected (wedged = true) try to
+ * reset the associated engine. Failing that, try to fall back to legacy
+ * full GPU reset recovery mode.
  */
-void i915_handle_error(struct drm_device *dev, bool wedged,
+void i915_handle_error(struct drm_device *dev, u32 engine_mask, bool wedged,
 		       const char *fmt, ...)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	va_list args;
 	char error_msg[80];
 
+	struct intel_engine_cs *engine;
+
 	va_start(args, fmt);
 	vscnprintf(error_msg, sizeof(error_msg), fmt, args);
 	va_end(args);
@@ -2499,8 +2598,59 @@ void i915_handle_error(struct drm_device *dev, bool wedged,
 	i915_report_and_clear_eir(dev);
 
 	if (wedged) {
-		atomic_set_mask(I915_RESET_IN_PROGRESS_FLAG,
-				&dev_priv->gpu_error.reset_counter);
+		/*
+		 * Defer to full GPU reset if any of the following is true:
+		 * 	1. The caller did not ask for per-engine reset.
+		 *	2. The hardware does not support it (pre-gen8).
+		 *	3. We already tried per-engine reset recently.
+		 */
+		bool full_reset = true;
+
+		/*
+		 * TBD: We currently only support per-engine reset for gen8+.
+		 * Implement support for gen7.
+		 */
+		if (engine_mask && (INTEL_INFO(dev)->gen >= 8)) {
+			u32 i;
+
+			for_each_ring(engine, dev_priv, i) {
+				u32 now, last_engine_reset_timediff;
+
+				if (!(intel_ring_flag(engine) & engine_mask))
+					continue;
+
+				/* Measure the time since this engine was last reset */
+				now = get_seconds();
+				last_engine_reset_timediff =
+					now - engine->hangcheck.last_engine_reset_time;
+
+				full_reset = last_engine_reset_timediff <
+					i915.gpu_reset_promotion_time;
+
+				engine->hangcheck.last_engine_reset_time = now;
+
+				/*
+				 * This engine was not reset too recently - go ahead
+				 * with engine reset instead of falling back to full
+				 * GPU reset.
+				 *
+				 * Flag that we want to try and reset this engine.
+				 * This can still be overridden by a global
+				 * reset e.g. if per-engine reset fails.
+				 */
+				if (!full_reset)
+					atomic_set_mask(I915_ENGINE_RESET_IN_PROGRESS,
+						&engine->hangcheck.flags);
+				else
+					break;
+
+			} /* for_each_ring */
+		}
+
+		if (full_reset) {
+			atomic_set_mask(I915_RESET_IN_PROGRESS_FLAG,
+					&dev_priv->gpu_error.reset_counter);
+		}
 
 		/*
 		 * Wakeup waiting processes so that the reset function
@@ -2823,7 +2973,7 @@ ring_stuck(struct intel_engine_cs *ring, u64 acthd)
 	 */
 	tmp = I915_READ_CTL(ring);
 	if (tmp & RING_WAIT) {
-		i915_handle_error(dev, false,
+		i915_handle_error(dev, intel_ring_flag(ring), false,
 				  "Kicking stuck wait on %s",
 				  ring->name);
 		I915_WRITE_CTL(ring, tmp);
@@ -2835,7 +2985,7 @@ ring_stuck(struct intel_engine_cs *ring, u64 acthd)
 		default:
 			return HANGCHECK_HUNG;
 		case 1:
-			i915_handle_error(dev, false,
+			i915_handle_error(dev, intel_ring_flag(ring), false,
 					  "Kicking stuck semaphore on %s",
 					  ring->name);
 			I915_WRITE_CTL(ring, tmp);
@@ -2864,7 +3014,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 	struct drm_device *dev = dev_priv->dev;
 	struct intel_engine_cs *ring;
 	int i;
-	int busy_count = 0, rings_hung = 0;
+	u32 engine_mask = 0;
+	int busy_count = 0;
 	bool stuck[I915_NUM_RINGS] = { 0 };
 #define BUSY 1
 #define KICK 5
@@ -2960,12 +3111,14 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 			DRM_INFO("%s on %s\n",
 				 stuck[i] ? "stuck" : "no progress",
 				 ring->name);
-			rings_hung++;
+
+			engine_mask |= intel_ring_flag(ring);
+			ring->hangcheck.tdr_count++;
 		}
 	}
 
-	if (rings_hung)
-		return i915_handle_error(dev, true, "Ring hung");
+	if (engine_mask)
+		i915_handle_error(dev, engine_mask, true, "Ring hung (0x%02x)", engine_mask);
 
 	if (busy_count)
 		/* Reset timer case chip hangs without another request
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index bb64415..9cea004 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -50,6 +50,7 @@ struct i915_params i915 __read_mostly = {
 	.enable_cmd_parser = 1,
 	.disable_vtd_wa = 0,
 	.use_mmio_flip = 0,
+	.gpu_reset_promotion_time = 0,
 	.mmio_debug = 0,
 	.verbose_state_checks = 1,
 	.nuclear_pageflip = 0,
@@ -172,6 +173,15 @@ module_param_named(use_mmio_flip, i915.use_mmio_flip, int, 0600);
 MODULE_PARM_DESC(use_mmio_flip,
 		 "use MMIO flips (-1=never, 0=driver discretion [default], 1=always)");
 
+module_param_named(gpu_reset_promotion_time,
+		   i915.gpu_reset_promotion_time, int, 0644);
+MODULE_PARM_DESC(gpu_reset_promotion_time,
+		 "Catch excessive engine resets. Each engine maintains a "
+		 "timestamp of the last time it was reset. If it hangs again "
+		 "within this period then fall back to full GPU reset to try and "
+		 "recover from the hang. "
+		 "default=0 seconds (disabled)");
+
 module_param_named(mmio_debug, i915.mmio_debug, int, 0600);
 MODULE_PARM_DESC(mmio_debug,
 	"Enable the MMIO debug code for the first N failures (default: off). "
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 9c97842..af9f0ad 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -100,6 +100,10 @@
 #define  GRDOM_RESET_STATUS (1<<1)
 #define  GRDOM_RESET_ENABLE (1<<0)
 
+#define RING_RESET_CTL(ring)	((ring)->mmio_base+0xd0)
+#define  READY_FOR_RESET	0x2
+#define  REQUEST_RESET		0x1
+
 #define ILK_GDSR 0x2ca4 /* MCHBAR offset */
 #define  ILK_GRDOM_FULL		(0<<1)
 #define  ILK_GRDOM_RENDER	(1<<1)
@@ -130,6 +134,8 @@
 #define  GEN6_GRDOM_RENDER		(1 << 1)
 #define  GEN6_GRDOM_MEDIA		(1 << 2)
 #define  GEN6_GRDOM_BLT			(1 << 3)
+#define  GEN6_GRDOM_VECS		(1 << 4)
+#define  GEN8_GRDOM_MEDIA2		(1 << 7)
 
 #define RING_PP_DIR_BASE(ring)		((ring)->mmio_base+0x228)
 #define RING_PP_DIR_BASE_READ(ring)	((ring)->mmio_base+0x518)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0fc35dd..e02abec 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -135,6 +135,7 @@
 #include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
+#include "intel_lrc_tdr.h"
 
 #define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
 #define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
@@ -330,6 +331,164 @@ static void execlists_elsp_write(struct intel_engine_cs *ring,
 	spin_unlock(&dev_priv->uncore.lock);
 }
 
+/**
+ * execlist_get_context_reg_page() - Get memory page for context object
+ * @engine: engine
+ * @ctx: context running on engine
+ * @page: returned page
+ *
+ * Return: 0 if successful, otherwise propagates error codes.
+ */
+static inline int execlist_get_context_reg_page(struct intel_engine_cs *engine,
+		struct intel_context *ctx,
+		struct page **page)
+{
+	struct drm_i915_gem_object *ctx_obj;
+
+	if (!page)
+		return -EINVAL;
+
+	if (!ctx)
+		ctx = engine->default_context;
+
+	ctx_obj = ctx->engine[engine->id].state;
+
+	if (WARN(!ctx_obj, "Context object not set up!\n"))
+		return -EINVAL;
+
+	WARN(!i915_gem_obj_is_pinned(ctx_obj),
+	     "Context object is not pinned!\n");
+
+	*page = i915_gem_object_get_page(ctx_obj, 1);
+
+	if (WARN(!*page, "Context object page could not be resolved!\n"))
+		return -EINVAL;
+
+	return 0;
+}
+
+/**
+ * execlists_write_context_reg() - Write value to context register
+ * @engine: engine
+ * @ctx: context running on engine
+ * @ctx_reg: Index into context image pointing to register location
+ * @mmio_reg_addr: MMIO register address
+ * @val: Value to be written
+ *
+ * Return: 0 if successful, otherwise propagates error codes.
+ */
+static inline int execlists_write_context_reg(struct intel_engine_cs *engine,
+		struct intel_context *ctx, u32 ctx_reg, u32 mmio_reg_addr,
+		u32 val)
+{
+	struct page *page = NULL;
+	uint32_t *reg_state;
+
+	int ret = execlist_get_context_reg_page(engine, ctx, &page);
+	if (WARN(ret, "Failed to write %u to register %u for %s!\n",
+		 (unsigned int) val, (unsigned int) ctx_reg, engine->name))
+		return ret;
+
+	reg_state = kmap_atomic(page);
+
+	WARN(reg_state[ctx_reg] != mmio_reg_addr,
+	     "Context register address (%x) != MMIO register address (%x)!\n",
+	     (unsigned int) reg_state[ctx_reg], (unsigned int) mmio_reg_addr);
+
+	reg_state[ctx_reg+1] = val;
+	kunmap_atomic(reg_state);
+
+	return ret;
+}
+
+/**
+ * execlists_read_context_reg() - Read value from context register
+ * @engine: engine
+ * @ctx: context running on engine
+ * @ctx_reg: Index into context image pointing to register location
+ * @mmio_reg_addr: MMIO register address
+ * @val: Output parameter returning register value
+ *
+ * Return: 0 if successful, otherwise propagates error codes.
+ */
+static inline int execlists_read_context_reg(struct intel_engine_cs *engine,
+		struct intel_context *ctx, u32 ctx_reg, u32 mmio_reg_addr,
+		u32 *val)
+{
+	struct page *page = NULL;
+	uint32_t *reg_state;
+	int ret = 0;
+
+	if (!val)
+		return -EINVAL;
+
+	ret = execlist_get_context_reg_page(engine, ctx, &page);
+	if (WARN(ret, "Failed to read from register %u for %s!\n",
+		 (unsigned int) ctx_reg, engine->name))
+		return ret;
+
+	reg_state = kmap_atomic(page);
+
+	WARN(reg_state[ctx_reg] != mmio_reg_addr,
+	     "Context register address (%x) != MMIO register address (%x)!\n",
+	     (unsigned int) reg_state[ctx_reg], (unsigned int) mmio_reg_addr);
+
+	*val = reg_state[ctx_reg+1];
+	kunmap_atomic(reg_state);
+
+	return ret;
+}
+
+/*
+ * Generic macros for generating function implementation for context register
+ * read/write functions.
+ *
+ * Macro parameters
+ * ----------------
+ * reg_name: Designated name of context register (e.g. tail, head, buffer_ctl)
+ *
+ * reg_def: Context register macro definition (e.g. CTX_RING_TAIL)
+ *
+ * mmio_reg_def: Name of macro function used to determine the address
+ *		 of the corresponding MMIO register (e.g. RING_TAIL, RING_HEAD).
+ *		 This macro function is assumed to be defined in the form of:
+ *
+ *			#define mmio_reg_def(base) (base+register_offset)
+ *
+ *		 Where "base" is the MMIO base address of the respective ring
+ *		 and "register_offset" is the offset relative to "base".
+ *
+ * Function parameters
+ * -------------------
+ * engine: The engine that the context is running on
+ * ctx: The context of the register that is to be accessed
+ * reg_name: Value to be written/read to/from the register.
+ */
+#define INTEL_EXECLISTS_WRITE_REG(reg_name, reg_def, mmio_reg_def) \
+	int intel_execlists_write_##reg_name(struct intel_engine_cs *engine, \
+					     struct intel_context *ctx, \
+					     u32 reg_name) \
+{ \
+	return execlists_write_context_reg(engine, ctx, (reg_def), \
+			mmio_reg_def(engine->mmio_base), (reg_name)); \
+}
+
+#define INTEL_EXECLISTS_READ_REG(reg_name, reg_def, mmio_reg_def) \
+	int intel_execlists_read_##reg_name(struct intel_engine_cs *engine, \
+					    struct intel_context *ctx, \
+					    u32 *reg_name) \
+{ \
+	return execlists_read_context_reg(engine, ctx, (reg_def), \
+			mmio_reg_def(engine->mmio_base), (reg_name)); \
+}
+
+INTEL_EXECLISTS_READ_REG(tail, CTX_RING_TAIL, RING_TAIL)
+INTEL_EXECLISTS_WRITE_REG(head, CTX_RING_HEAD, RING_HEAD)
+INTEL_EXECLISTS_READ_REG(head, CTX_RING_HEAD, RING_HEAD)
+
+#undef INTEL_EXECLISTS_READ_REG
+#undef INTEL_EXECLISTS_WRITE_REG
+
 static int execlists_update_context(struct drm_i915_gem_object *ctx_obj,
 				    struct drm_i915_gem_object *ring_obj,
 				    struct i915_hw_ppgtt *ppgtt,
@@ -387,44 +546,93 @@ static void execlists_submit_contexts(struct intel_engine_cs *ring,
 	execlists_elsp_write(ring, ctx_obj0, ctx_obj1);
 }
 
-static void execlists_context_unqueue(struct intel_engine_cs *ring)
+static void execlists_fetch_requests(struct intel_engine_cs *ring,
+			struct drm_i915_gem_request **req0,
+			struct drm_i915_gem_request **req1)
 {
-	struct drm_i915_gem_request *req0 = NULL, *req1 = NULL;
 	struct drm_i915_gem_request *cursor = NULL, *tmp = NULL;
 
-	assert_spin_locked(&ring->execlist_lock);
-
-	if (list_empty(&ring->execlist_queue))
+	if (!req0)
 		return;
 
+	*req0 = NULL;
+
+	if (req1)
+		*req1 = NULL;
+
 	/* Try to read in pairs */
 	list_for_each_entry_safe(cursor, tmp, &ring->execlist_queue,
 				 execlist_link) {
-		if (!req0) {
-			req0 = cursor;
-		} else if (req0->ctx == cursor->ctx) {
-			/* Same ctx: ignore first request, as second request
-			 * will update tail past first request's workload */
-			cursor->elsp_submitted = req0->elsp_submitted;
-			list_del(&req0->execlist_link);
-			list_add_tail(&req0->execlist_link,
+		if (!(*req0)) {
+			*req0 = cursor;
+		} else if ((*req0)->ctx == cursor->ctx) {
+			/*
+			 * Same ctx: ignore first request, as second request
+			 * will update tail past first request's workload
+			 */
+			cursor->elsp_submitted = (*req0)->elsp_submitted;
+			list_del(&(*req0)->execlist_link);
+			list_add_tail(&(*req0)->execlist_link,
 				&ring->execlist_retired_req_list);
-			req0 = cursor;
+			*req0 = cursor;
 		} else {
-			req1 = cursor;
+			if (req1)
+				*req1 = cursor;
 			break;
 		}
 	}
+}
 
-	WARN_ON(req1 && req1->elsp_submitted);
+static void execlists_context_unqueue(struct intel_engine_cs *ring, bool tdr_resubmission)
+{
+	struct drm_i915_gem_request *req0 = NULL, *req1 = NULL;
+
+	assert_spin_locked(&ring->execlist_lock);
+	if (list_empty(&ring->execlist_queue))
+		return;
+
+	execlists_fetch_requests(ring, &req0, &req1);
+
+	if (tdr_resubmission && req1 && !req1->elsp_submitted)
+		req1 = NULL;
+
+	WARN_ON(req1 && req1->elsp_submitted && !tdr_resubmission);
 
 	execlists_submit_contexts(ring, req0->ctx, req0->tail,
 				  req1 ? req1->ctx : NULL,
 				  req1 ? req1->tail : 0);
 
-	req0->elsp_submitted++;
-	if (req1)
-		req1->elsp_submitted++;
+	if (!tdr_resubmission) {
+		req0->elsp_submitted++;
+		if (req1)
+			req1->elsp_submitted++;
+	}
+}
+
+/**
+ * intel_execlists_TDR_context_resubmission() - ELSP context resubmission
+ * bypassing queue.
+ *
+ * Context submission mechanism exclusively used by TDR that bypasses the
+ * execlist queue. This is necessary since at the point of TDR hang recovery
+ * the hardware will be hung and resubmitting a fixed context (the context that
+ * the TDR has identified as hung and fixed up in order to move past the
+ * blocking batch buffer) to a hung execlist queue will lock up the TDR.
+ * Instead, opt for direct ELSP submission without depending on the rest of the
+ * driver.
+ *
+ * @ring: engine to do resubmission for.
+ *
+ */
+void intel_execlists_TDR_context_resubmission(struct intel_engine_cs *ring)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&ring->execlist_lock, flags);
+	WARN_ON(list_empty(&ring->execlist_queue));
+
+	execlists_context_unqueue(ring, true);
+	spin_unlock_irqrestore(&ring->execlist_lock, flags);
 }
 
 static bool execlists_check_remove_request(struct intel_engine_cs *ring,
@@ -506,7 +714,7 @@ void intel_lrc_irq_handler(struct intel_engine_cs *ring)
 	}
 
 	if (submit_contexts != 0)
-		execlists_context_unqueue(ring);
+		execlists_context_unqueue(ring, false);
 
 	spin_unlock(&ring->execlist_lock);
 
@@ -570,7 +778,7 @@ static int execlists_context_queue(struct intel_engine_cs *ring,
 
 	list_add_tail(&request->execlist_link, &ring->execlist_queue);
 	if (num_elements == 0)
-		execlists_context_unqueue(ring);
+		execlists_context_unqueue(ring, false);
 
 	spin_unlock_irq(&ring->execlist_lock);
 
@@ -1066,7 +1274,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *ring)
 	ring->next_context_status_buffer = 0;
 	DRM_DEBUG_DRIVER("Execlists enabled for %s\n", ring->name);
 
-	memset(&ring->hangcheck, 0, sizeof(ring->hangcheck));
+	i915_hangcheck_reinit(ring);
 
 	return 0;
 }
@@ -1314,6 +1522,173 @@ out:
 	return ret;
 }
 
+static int
+gen8_ring_disable(struct intel_engine_cs *ring)
+{
+	intel_request_gpu_engine_reset(ring);
+	return 0;
+}
+
+static int
+gen8_ring_enable(struct intel_engine_cs *ring)
+{
+	intel_unrequest_gpu_engine_reset(ring);
+	return 0;
+}
+
+/*
+ * gen8_ring_save()
+ *
+ * Saves the head MMIO register to scratch memory while engine is reset and
+ * reinitialized. Before saving the head register we nudge the head position to
+ * be correctly aligned with a QWORD boundary, which brings it up to the next
+ * presumably valid instruction. Typically, at the point of hang recovery the
+ * head register will be pointing to the last DWORD of the BB_START
+ * instruction, which is followed by a padding MI_NOOP inserted by the
+ * driver.
+ *
+ * ring: engine to be reset
+ * req: request containing the context currently running on engine
+ * force_advance: indicates whether we should nudge the head forward
+ *		  to the next QWORD boundary even if it is already aligned
+ */
+static int
+gen8_ring_save(struct intel_engine_cs *ring, struct drm_i915_gem_request *req,
+		bool force_advance)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_ringbuffer *ringbuf = NULL;
+	struct intel_context *ctx;
+	int ret = 0;
+	int clamp_to_tail = 0;
+	uint32_t head;
+	uint32_t tail;
+	uint32_t head_addr;
+	uint32_t tail_addr;
+
+	if (WARN_ON(!req))
+		return -EINVAL;
+
+	ctx = req->ctx;
+	ringbuf = ctx->engine[ring->id].ringbuf;
+
+	/*
+	 * Read head from MMIO register since it contains the
+	 * most up to date value of head at this point.
+	 */
+	head = I915_READ_HEAD(ring);
+
+	/*
+	 * Read tail from the context because the execlist queue
+	 * updates the tail value there first during submission.
+	 * The MMIO tail register is not updated until the actual
+	 * ring submission completes.
+	 */
+	ret = I915_READ_TAIL_CTX(ring, ctx, tail);
+	if (ret)
+		return ret;
+
+	/*
+	 * head_addr and tail_addr are the head and tail values
+	 * excluding ring wrapping information and aligned to DWORD
+	 * boundary
+	 */
+	head_addr = head & HEAD_ADDR;
+	tail_addr = tail & TAIL_ADDR;
+
+	/*
+	 * The head must always chase the tail.
+	 * If the tail is beyond the head then do not allow
+	 * the head to overtake it. If the tail is less than
+	 * the head then the tail has already wrapped and
+	 * there is no problem in advancing the head or even
+	 * wrapping the head back to 0, since in the worst case
+	 * it will become equal to the tail.
+	 */
+	if (head_addr <= tail_addr)
+		clamp_to_tail = 1;
+
+	if (force_advance) {
+
+		/* Force head pointer to next QWORD boundary */
+		head_addr &= ~0x7;
+		head_addr += 8;
+
+	} else if (head & 0x7) {
+
+		/* Ensure head pointer is pointing to a QWORD boundary */
+		head += 0x7;
+		head &= ~0x7;
+		head_addr = head & HEAD_ADDR;
+	}
+
+	if (clamp_to_tail && (head_addr > tail_addr)) {
+		head_addr = tail_addr;
+	} else if (head_addr >= ringbuf->size) {
+		/* Wrap head back to start if it exceeds ring size */
+		head_addr = 0;
+	}
+
+	head &= ~HEAD_ADDR;
+	head |= (head_addr & HEAD_ADDR);
+	ring->saved_head = head;
+
+	return 0;
+}
+
+static int
+gen8_ring_restore(struct intel_engine_cs *ring, struct drm_i915_gem_request *req)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_context *ctx;
+
+	if (WARN_ON(!req))
+		return -EINVAL;
+
+	ctx = req->ctx;
+
+	/* Re-initialize ring */
+	if (ring->init_hw) {
+		int ret = ring->init_hw(ring);
+		if (ret != 0) {
+			DRM_ERROR("Failed to re-initialize %s\n",
+					ring->name);
+			return ret;
+		}
+	} else {
+		DRM_ERROR("ring init function pointer not set up\n");
+		return -EINVAL;
+	}
+
+	if (ring->id == RCS) {
+		/*
+		 * These register reinitializations are only located here
+		 * temporarily until they are moved out of the
+		 * init_clock_gating function to some function we can
+		 * call from here.
+		 */
+
+		/* WaVSRefCountFullforceMissDisable:chv */
+		/* WaDSRefCountFullforceMissDisable:chv */
+		I915_WRITE(GEN7_FF_THREAD_MODE,
+			   I915_READ(GEN7_FF_THREAD_MODE) &
+			   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
+
+		I915_WRITE(_3D_CHICKEN3,
+			   _3D_CHICKEN_SDE_LIMIT_FIFO_POLY_DEPTH(2));
+
+		/* WaSwitchSolVfFArbitrationPriority:bdw */
+		I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
+	}
+
+	/* Restore head */
+
+	I915_WRITE_HEAD(ring, ring->saved_head);
+	I915_WRITE_HEAD_CTX(ring, ctx, ring->saved_head);
+
+	return 0;
+}
+
 static int gen8_init_rcs_context(struct intel_engine_cs *ring,
 		       struct intel_context *ctx)
 {
@@ -1412,6 +1787,10 @@ static int logical_render_ring_init(struct drm_device *dev)
 	ring->irq_get = gen8_logical_ring_get_irq;
 	ring->irq_put = gen8_logical_ring_put_irq;
 	ring->emit_bb_start = gen8_emit_bb_start;
+	ring->enable = gen8_ring_enable;
+	ring->disable = gen8_ring_disable;
+	ring->save = gen8_ring_save;
+	ring->restore = gen8_ring_restore;
 
 	ring->dev = dev;
 	ret = logical_ring_init(dev, ring);
@@ -1442,6 +1821,10 @@ static int logical_bsd_ring_init(struct drm_device *dev)
 	ring->irq_get = gen8_logical_ring_get_irq;
 	ring->irq_put = gen8_logical_ring_put_irq;
 	ring->emit_bb_start = gen8_emit_bb_start;
+	ring->enable = gen8_ring_enable;
+	ring->disable = gen8_ring_disable;
+	ring->save = gen8_ring_save;
+	ring->restore = gen8_ring_restore;
 
 	return logical_ring_init(dev, ring);
 }
@@ -1467,6 +1850,10 @@ static int logical_bsd2_ring_init(struct drm_device *dev)
 	ring->irq_get = gen8_logical_ring_get_irq;
 	ring->irq_put = gen8_logical_ring_put_irq;
 	ring->emit_bb_start = gen8_emit_bb_start;
+	ring->enable = gen8_ring_enable;
+	ring->disable = gen8_ring_disable;
+	ring->save = gen8_ring_save;
+	ring->restore = gen8_ring_restore;
 
 	return logical_ring_init(dev, ring);
 }
@@ -1492,6 +1879,10 @@ static int logical_blt_ring_init(struct drm_device *dev)
 	ring->irq_get = gen8_logical_ring_get_irq;
 	ring->irq_put = gen8_logical_ring_put_irq;
 	ring->emit_bb_start = gen8_emit_bb_start;
+	ring->enable = gen8_ring_enable;
+	ring->disable = gen8_ring_disable;
+	ring->save = gen8_ring_save;
+	ring->restore = gen8_ring_restore;
 
 	return logical_ring_init(dev, ring);
 }
@@ -1517,6 +1908,10 @@ static int logical_vebox_ring_init(struct drm_device *dev)
 	ring->irq_get = gen8_logical_ring_get_irq;
 	ring->irq_put = gen8_logical_ring_put_irq;
 	ring->emit_bb_start = gen8_emit_bb_start;
+	ring->enable = gen8_ring_enable;
+	ring->disable = gen8_ring_disable;
+	ring->save = gen8_ring_save;
+	ring->restore = gen8_ring_restore;
 
 	return logical_ring_init(dev, ring);
 }
@@ -1974,3 +2369,120 @@ void intel_lr_context_reset(struct drm_device *dev,
 		ringbuf->tail = 0;
 	}
 }
+
+/**
+ * intel_execlists_TDR_get_current_request() - return request currently
+ * processed by engine
+ *
+ * @ring: Engine currently running context to be returned.
+ *
+ * @req:  Output parameter containing the current request (the request at the
+ *	  head of execlist queue corresponding to the given ring). May be NULL
+ *	  if no request has been submitted to the execlist queue of this
+ *	  engine. If the req parameter passed in to the function is not NULL
+ *	  and a request is found and returned the request is referenced before
+ *	  it is returned. It is the responsibility of the caller to release
+ *	  that reference at the end of its life cycle.
+ *
+ * Return:
+ *	CONTEXT_SUBMISSION_STATUS_OK if request is found to be submitted and its
+ *	context is currently running on engine.
+ *
+ *	CONTEXT_SUBMISSION_STATUS_INCONSISTENT if request is found to be submitted
+ *	but its context is not in a state that is consistent with current
+ *	hardware state for the given engine. This has been observed in three cases:
+ *
+ *		1. Before the engine has switched to this context after it has
+ *		been submitted to the execlist queue.
+ *
+ *		2. After the engine has switched away from this context but
+ *		before the context has been removed from the execlist queue.
+ *
+ *		3. The driver has lost an interrupt. Typically the hardware has
+ *		gone to idle but the driver still thinks the context belonging to
+ *		the request at the head of the queue is still executing.
+ *
+ *	CONTEXT_SUBMISSION_STATUS_NONE_SUBMITTED if no context has been found
+ *	to be submitted to the execlist queue and if the hardware is idle.
+ */
+enum context_submission_status
+intel_execlists_TDR_get_current_request(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request **req)
+{
+	struct drm_i915_private *dev_priv;
+	unsigned long flags;
+	struct drm_i915_gem_request *tmpreq = NULL;
+	struct intel_context *tmpctx = NULL;
+	unsigned hw_context = 0;
+	bool hw_active = false;
+	enum context_submission_status status =
+			CONTEXT_SUBMISSION_STATUS_UNDEFINED;
+
+	if (WARN_ON(!ring))
+		return status;
+
+	dev_priv = ring->dev->dev_private;
+
+	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+	spin_lock_irqsave(&ring->execlist_lock, flags);
+	hw_context = I915_READ(RING_EXECLIST_STATUS_CTX_ID(ring));
+
+	hw_active = (I915_READ(RING_EXECLIST_STATUS(ring)) &
+		EXECLIST_STATUS_CURRENT_ACTIVE_ELEMENT_STATUS) ? true : false;
+
+	tmpreq = list_first_entry_or_null(&ring->execlist_queue,
+		struct drm_i915_gem_request, execlist_link);
+
+	if (tmpreq) {
+		/*
+		 * If the caller has not passed a non-NULL req parameter then
+		 * it is not interested in getting a request reference back.
+		 * Don't temporarily grab a reference since holding the execlist
+		 * lock is enough to ensure that the execlist code will hold its
+		 * reference all throughout this function. As long as that reference
+		 * is kept there is no need for us to take yet another reference.
+		 * The reason why this is of interest is because certain callers, such
+		 * as the TDR hang checker, cannot grab struct_mutex before calling
+		 * and because of that we cannot unreference any requests (DRM might
+		 * assert if we do). Just rely on the execlist code to provide
+		 * indirect protection.
+		 */
+		if (req)
+			i915_gem_request_reference(tmpreq);
+
+
+		if (tmpreq->ctx)
+			tmpctx = tmpreq->ctx;
+
+		WARN(!tmpctx, "No context in request %p\n", tmpreq);
+	}
+
+	if (tmpctx) {
+		unsigned sw_context =
+			intel_execlists_ctx_id((tmpctx)->engine[ring->id].state);
+
+		status = ((hw_context == sw_context) && hw_active) ?
+				CONTEXT_SUBMISSION_STATUS_OK :
+				CONTEXT_SUBMISSION_STATUS_INCONSISTENT;
+	} else {
+		/*
+		 * If we don't have any queue entries and the
+		 * EXECLIST_STATUS register points to zero we are
+		 * clearly not processing any context right now
+		 */
+		WARN((hw_context || hw_active), "hw_context=%x, hardware %s!\n",
+			hw_context, hw_active ? "not idle" : "idle");
+
+		status = (hw_context || hw_active) ?
+			CONTEXT_SUBMISSION_STATUS_INCONSISTENT :
+			CONTEXT_SUBMISSION_STATUS_NONE_SUBMITTED;
+	}
+
+	if (req)
+		*req = tmpreq;
+
+	spin_unlock_irqrestore(&ring->execlist_lock, flags);
+	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+
+	return status;
+}
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 04d3a6d..d2f497c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -29,6 +29,8 @@
 /* Execlists regs */
 #define RING_ELSP(ring)			((ring)->mmio_base+0x230)
 #define RING_EXECLIST_STATUS(ring)	((ring)->mmio_base+0x234)
+#define	  EXECLIST_STATUS_CURRENT_ACTIVE_ELEMENT_STATUS	(0x3 << 14)
+#define RING_EXECLIST_STATUS_CTX_ID(ring)	(RING_EXECLIST_STATUS(ring)+4)
 #define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
 #define	  CTX_CTRL_INHIBIT_SYN_CTX_SWITCH	(1 << 3)
 #define	  CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT	(1 << 0)
@@ -89,4 +91,16 @@ u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj);
 void intel_lrc_irq_handler(struct intel_engine_cs *ring);
 void intel_execlists_retire_requests(struct intel_engine_cs *ring);
 
+int intel_execlists_read_tail(struct intel_engine_cs *ring,
+			 struct intel_context *ctx,
+			 u32 *tail);
+
+int intel_execlists_write_head(struct intel_engine_cs *ring,
+			  struct intel_context *ctx,
+			  u32 head);
+
+int intel_execlists_read_head(struct intel_engine_cs *ring,
+			 struct intel_context *ctx,
+			 u32 *head);
+
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/intel_lrc_tdr.h b/drivers/gpu/drm/i915/intel_lrc_tdr.h
new file mode 100644
index 0000000..4520753
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_lrc_tdr.h
@@ -0,0 +1,36 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef _INTEL_LRC_TDR_H_
+#define _INTEL_LRC_TDR_H_
+
+/* Privileged execlist API used exclusively by TDR */
+
+void intel_execlists_TDR_context_resubmission(struct intel_engine_cs *ring);
+
+enum context_submission_status
+intel_execlists_TDR_get_current_request(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request **req);
+
+#endif /* _INTEL_LRC_TDR_H_ */
+
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index f949583..0fdf983 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -442,6 +442,88 @@ static void ring_write_tail(struct intel_engine_cs *ring,
 	I915_WRITE_TAIL(ring, value);
 }
 
+int intel_ring_disable(struct intel_engine_cs *ring)
+{
+	if (WARN_ON(!ring))
+		return -EINVAL;
+
+	if (ring->disable)
+		return ring->disable(ring);
+
+	DRM_ERROR("Ring disable not supported on %s\n", ring->name);
+	return -EINVAL;
+}
+
+int intel_ring_enable(struct intel_engine_cs *ring)
+{
+	if (WARN_ON(!ring))
+		return -EINVAL;
+
+	if (ring->enable)
+		return ring->enable(ring);
+
+	DRM_ERROR("Ring enable not supported on %s\n", ring->name);
+	return -EINVAL;
+}
+
+int intel_ring_save(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request *req,
+		bool force_advance)
+{
+	if (WARN_ON(!ring))
+		return -EINVAL;
+
+	if (ring->save)
+		return ring->save(ring, req, force_advance);
+
+	DRM_ERROR("Ring save not supported on %s\n", ring->name);
+	return -EINVAL;
+}
+
+int intel_ring_restore(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request *req)
+{
+	if (WARN_ON(!ring))
+		return -EINVAL;
+
+	if (ring->restore)
+		return ring->restore(ring, req);
+
+	DRM_ERROR("Ring restore not supported on %s\n", ring->name);
+	return -EINVAL;
+}
+
+void intel_gpu_engine_reset_resample(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request *req)
+{
+	struct intel_ringbuffer *ringbuf;
+	struct drm_i915_private *dev_priv;
+
+	if (WARN_ON(!ring))
+		return;
+
+	dev_priv = ring->dev->dev_private;
+
+	if (i915.enable_execlists) {
+		struct intel_context *ctx;
+
+		if (WARN_ON(!req))
+			return;
+
+		ctx = req->ctx;
+		ringbuf = ctx->engine[ring->id].ringbuf;
+
+		/*
+		 * In gen8+ context head is restored during reset and
+		 * we can use it as a reference to set up the new
+		 * driver state.
+		 */
+		I915_READ_HEAD_CTX(ring, ctx, ringbuf->head);
+		ringbuf->last_retired_head = -1;
+		intel_ring_update_space(ringbuf);
+	}
+}
+
 u64 intel_ring_get_active_head(struct intel_engine_cs *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
@@ -637,7 +719,7 @@ static int init_ring_common(struct intel_engine_cs *ring)
 	ringbuf->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
 	intel_ring_update_space(ringbuf);
 
-	memset(&ring->hangcheck, 0, sizeof(ring->hangcheck));
+	i915_hangcheck_reinit(ring);
 
 out:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 39f6dfc..35360a4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -48,6 +48,22 @@ struct  intel_hw_status_page {
 #define I915_READ_MODE(ring) I915_READ(RING_MI_MODE((ring)->mmio_base))
 #define I915_WRITE_MODE(ring, val) I915_WRITE(RING_MI_MODE((ring)->mmio_base), val)
 
+
+#define I915_READ_TAIL_CTX(engine, ctx, outval) \
+	intel_execlists_read_tail((engine), \
+				(ctx), \
+				&(outval))
+
+#define I915_READ_HEAD_CTX(engine, ctx, outval) \
+	intel_execlists_read_head((engine), \
+				(ctx), \
+				&(outval))
+
+#define I915_WRITE_HEAD_CTX(engine, ctx, val) \
+	intel_execlists_write_head((engine), \
+				(ctx), \
+				(val))
+
 /* seqno size is actually only a uint32, but since we plan to use MI_FLUSH_DW to
  * do the writes, and that must have qw aligned offsets, simply pretend it's 8b.
  */
@@ -92,6 +108,34 @@ struct intel_ring_hangcheck {
 	int score;
 	enum intel_ring_hangcheck_action action;
 	int deadlock;
+
+	/*
+	 * Last recorded ring head index.
+	 * This is only ever a ring index, whereas the active
+	 * head may be a graphics address in a ring buffer.
+	 */
+	u32 last_head;
+
+	/* Flag to indicate if engine reset required */
+	atomic_t flags;
+
+	/* Indicates request to reset this engine */
+#define I915_ENGINE_RESET_IN_PROGRESS (1<<0)
+
+	/*
+	 * Timestamp (seconds) of the last time
+	 * this engine was reset.
+	 */
+	u32 last_engine_reset_time;
+
+	/*
+	 * Number of times this engine has been
+	 * reset since boot
+	 */
+	u32 reset_count;
+
+	/* Number of TDR hang detections */
+	u32 tdr_count;
 };
 
 struct intel_ringbuffer {
@@ -177,6 +221,14 @@ struct  intel_engine_cs {
 #define I915_DISPATCH_PINNED 0x2
 	void		(*cleanup)(struct intel_engine_cs *ring);
 
+	int (*enable)(struct intel_engine_cs *ring);
+	int (*disable)(struct intel_engine_cs *ring);
+	int (*save)(struct intel_engine_cs *ring,
+		    struct drm_i915_gem_request *req,
+		    bool force_advance);
+	int (*restore)(struct intel_engine_cs *ring,
+		       struct drm_i915_gem_request *req);
+
 	/* GEN8 signal/wait table - never trust comments!
 	 *	  signal to	signal to    signal to   signal to      signal to
 	 *	    RCS		   VCS          BCS        VECS		 VCS2
@@ -283,6 +335,9 @@ struct  intel_engine_cs {
 
 	struct intel_ring_hangcheck hangcheck;
 
+	/* Saved head value to be restored after reset */
+	u32 saved_head;
+
 	struct {
 		struct drm_i915_gem_object *obj;
 		u32 gtt_offset;
@@ -420,6 +475,15 @@ int intel_ring_space(struct intel_ringbuffer *ringbuf);
 bool intel_ring_stopped(struct intel_engine_cs *ring);
 void __intel_ring_advance(struct intel_engine_cs *ring);
 
+void intel_gpu_engine_reset_resample(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request *req);
+int intel_ring_disable(struct intel_engine_cs *ring);
+int intel_ring_enable(struct intel_engine_cs *ring);
+int intel_ring_save(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request *req, bool force_advance);
+int intel_ring_restore(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request *req);
+
 int __must_check intel_ring_idle(struct intel_engine_cs *ring);
 void intel_ring_init_seqno(struct intel_engine_cs *ring, u32 seqno);
 int intel_ring_flush_all_caches(struct intel_engine_cs *ring);
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index d96d15f..91427ac 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1463,6 +1463,205 @@ int intel_gpu_reset(struct drm_device *dev)
 		return -ENODEV;
 }
 
+static inline int wait_for_engine_reset(struct drm_i915_private *dev_priv,
+		unsigned int grdom)
+{
+#define _CND ((__raw_i915_read32(dev_priv, GEN6_GDRST) & grdom) == 0)
+
+	/*
+	 * Spin waiting for the device to ack the reset request.
+	 * Times out after 500 us.
+	 */
+	return wait_for_atomic_us(_CND, 500);
+
+#undef _CND
+}
+
+static int do_engine_reset_nolock(struct intel_engine_cs *engine)
+{
+	int ret = -ENODEV;
+	struct drm_i915_private *dev_priv = engine->dev->dev_private;
+
+	assert_spin_locked(&dev_priv->uncore.lock);
+
+	switch (engine->id) {
+	case RCS:
+		__raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_RENDER);
+		engine->hangcheck.reset_count++;
+		ret = wait_for_engine_reset(dev_priv, GEN6_GRDOM_RENDER);
+		break;
+
+	case BCS:
+		__raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_BLT);
+		engine->hangcheck.reset_count++;
+		ret = wait_for_engine_reset(dev_priv, GEN6_GRDOM_BLT);
+		break;
+
+	case VCS:
+		__raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_MEDIA);
+		engine->hangcheck.reset_count++;
+		ret = wait_for_engine_reset(dev_priv, GEN6_GRDOM_MEDIA);
+		break;
+
+	case VECS:
+		__raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_VECS);
+		engine->hangcheck.reset_count++;
+		ret = wait_for_engine_reset(dev_priv, GEN6_GRDOM_VECS);
+		break;
+
+	case VCS2:
+		__raw_i915_write32(dev_priv, GEN6_GDRST, GEN8_GRDOM_MEDIA2);
+		engine->hangcheck.reset_count++;
+		ret = wait_for_engine_reset(dev_priv, GEN8_GRDOM_MEDIA2);
+		break;
+
+	default:
+		DRM_ERROR("Unexpected engine: %d\n", engine->id);
+		break;
+	}
+
+	return ret;
+}
+
+static int gen8_do_engine_reset(struct intel_engine_cs *engine)
+{
+	struct drm_device *dev = engine->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int ret = -ENODEV;
+	unsigned long irqflags;
+
+	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	ret = do_engine_reset_nolock(engine);
+	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+
+	if (!ret) {
+		u32 reset_ctl = 0;
+
+		/*
+		 * Confirm that the reset control register is back to normal
+		 * following the reset.
+		 */
+		reset_ctl = I915_READ(RING_RESET_CTL(engine));
+		WARN(reset_ctl & 0x3, "Reset control still active after reset! (0x%08x)\n",
+			reset_ctl);
+	} else {
+		DRM_ERROR("Engine reset failed! (%d)\n", ret);
+	}
+
+	return ret;
+}
+
+int intel_gpu_engine_reset(struct intel_engine_cs *engine)
+{
+	/* Reset an individual engine */
+	int ret = -ENODEV;
+	struct drm_device *dev = engine->dev;
+
+	switch (INTEL_INFO(dev)->gen) {
+	case 8:
+		ret = gen8_do_engine_reset(engine);
+		break;
+	default:
+		DRM_ERROR("Per Engine Reset not supported on Gen%d\n",
+			  INTEL_INFO(dev)->gen);
+		ret = -ENODEV;
+		break;
+	}
+
+	return ret;
+}
+
+static int gen8_request_engine_reset(struct intel_engine_cs *engine)
+{
+	int ret = 0;
+	unsigned long irqflags;
+	u32 reset_ctl = 0;
+	struct drm_i915_private *dev_priv = engine->dev->dev_private;
+
+	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+
+	/*
+	 * Initiate reset handshake by requesting reset from the
+	 * reset control register.
+	 */
+	__raw_i915_write32(dev_priv, RING_RESET_CTL(engine),
+		_MASKED_BIT_ENABLE(REQUEST_RESET));
+
+	/*
+	 * Wait for ready to reset ack.
+	 */
+	ret = wait_for_atomic_us((__raw_i915_read32(dev_priv,
+		RING_RESET_CTL(engine)) & READY_FOR_RESET) ==
+			READY_FOR_RESET, 500);
+
+	reset_ctl = __raw_i915_read32(dev_priv, RING_RESET_CTL(engine));
+
+	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+
+	WARN(ret, "Reset request failed! (err=%d, reset control=0x%08x)\n",
+		ret, reset_ctl);
+
+	return ret;
+}
+
+static int gen8_unrequest_engine_reset(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->dev->dev_private;
+
+	I915_WRITE(RING_RESET_CTL(engine), _MASKED_BIT_DISABLE(REQUEST_RESET));
+	return 0;
+}
+
+/*
+ * On gen8+ a reset request has to be issued via the reset control register
+ * before a GPU engine can be reset in order to stop the command streamer
+ * and idle the engine. This replaces the legacy way of stopping an engine
+ * by writing to the stop ring bit in the MI_MODE register.
+ */
+int intel_request_gpu_engine_reset(struct intel_engine_cs *engine)
+{
+	/* Request reset for an individual engine */
+	int ret = -ENODEV;
+	struct drm_device *dev;
+
+	if (WARN_ON(!engine))
+		return -EINVAL;
+
+	dev = engine->dev;
+
+	if (INTEL_INFO(dev)->gen >= 8)
+		ret = gen8_request_engine_reset(engine);
+	else
+		DRM_ERROR("Reset request not supported on Gen%d\n",
+			  INTEL_INFO(dev)->gen);
+
+	return ret;
+}
+
+/*
+ * It is possible to back off from a previously issued reset request by simply
+ * clearing the reset request bit in the reset control register.
+ */
+int intel_unrequest_gpu_engine_reset(struct intel_engine_cs *engine)
+{
+	/* Cancel the reset request for an individual engine */
+	int ret = -ENODEV;
+	struct drm_device *dev;
+
+	if (WARN_ON(!engine))
+		return -EINVAL;
+
+	dev = engine->dev;
+
+	if (INTEL_INFO(dev)->gen >= 8)
+		ret = gen8_unrequest_engine_reset(engine);
+	else
+		DRM_ERROR("Reset unrequest not supported on Gen%d\n",
+			  INTEL_INFO(dev)->gen);
+
+	return ret;
+}
+
 void intel_uncore_check_errors(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-- 
1.9.1


