[Intel-gfx] [PATCH 2/3] drm/i915: s/seqno/request/ tracking inside objects
John Harrison
John.C.Harrison at Intel.com
Tue Sep 9 16:14:29 CEST 2014
I pulled a fresh tree on Monday and applied this set of patches. There
were two conflicts. It looks like nightly does not have
'i915_gem_context_setparam_ioctl' yet but the tree the patches came from
does. Also, my tree has 'DRM_I915_CTX_BAN_PERIOD' instead of
'ctx->hang_stats.ban_period_seconds'.
However, I can only boot if I have both execlists and PPGTT disabled.
With just PPGTT enabled, I get continuous GPU hangs and nothing ever
gets rendered. With execlists enabled, I get a null pointer dereference
in 'execlists_get_ring'. With both disabled, I can boot to an Ubuntu
desktop but shortly afterwards it goes pop with 'BUG_ON(obj->active == 0)' in
'i915_gem_object_retire__read'.
This is running on BDW.
Am I missing some critical earlier patches?
Thanks,
John.
On 06/09/2014 10:28, Chris Wilson wrote:
> At the heart of this change is that the seqno is too low-level an
> abstraction to handle the growing complexities of command tracking, both
> with the introduction of multiple command queues with execlists and the
> potential for reordering with a scheduler. On top of the seqno we have
> the request. Conceptually this is just a fence, but it also has
> substantial bookkeeping of its own in order to track the context and
> batch in flight, for example. It is the central structure upon which we
> can extend with dependency tracking et al.
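
For anyone else trying to follow the diff below, this is roughly the shape
the request ends up taking. It is pieced together from the fields the patch
actually touches (rq->engine, rq->ctx, rq->seqno, rq->tail, rq->i915, the
engine_list, plus the kref and reset_counter mentioned in this message), so
treat it as a sketch rather than the exact definition in i915_gem_request.c:

struct i915_gem_request {
        struct kref kref;               /* unlocked waiters/probes hold a ref */
        struct drm_i915_private *i915;
        struct intel_engine_cs *engine; /* engine this request executes on */
        struct intel_context *ctx;      /* context it was built under */
        u32 seqno;                      /* breadcrumb written on completion */
        u32 head, tail;                 /* span of the request in the ring */
        unsigned reset_counter;         /* sampled at creation, see below */
        unsigned long emitted_jiffies;
        struct list_head engine_list;   /* on engine->requests, retire order */
};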
>
> As regards the objects, they were using the seqno as a simple fence,
> upon which we check or even wait for command completion. This patch
> exchanges that seqno/ring pair with the request itself. For the
> majority, the lifetime of the request is ordered by how we retire
> objects and then requests. However, both the unlocked waits and probing
> elsewhere do not tie into the normal request lifetimes and so we need to
> introduce a kref. Extending the objects to use the request as the fence
> naturally extends to segregating read/write fence tracking. This is
> significant as it reduces the number of semaphores we need to emit,
> reducing the likelihood of #54226 and improving performance overall.
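
To make the semaphore claim concrete, here is a minimal sketch of the rule
that split read/write tracking enables; sketch_wait_request() stands in for
whichever wait/semaphore helper the patch actually uses, so both function
names here are illustrative:

/* Hypothetical stand-in for the inter-request wait/semaphore helper. */
static int sketch_wait_request(struct i915_gem_request *to,
                               struct i915_gem_request *from);

/* A reader only needs to serialise against the last writer, whereas a
 * writer must serialise against every outstanding reader as well.
 * With a single obj->last_read_seqno that distinction is lost and we
 * end up emitting semaphores for readers that do not need them. */
static int sketch_object_sync(struct drm_i915_gem_object *obj,
                              struct i915_gem_request *to,
                              bool write)
{
        int i, ret;

        if (!write)
                return sketch_wait_request(to, obj->last_write.request);

        for (i = 0; i < I915_NUM_ENGINES; i++) {
                ret = sketch_wait_request(to, obj->last_read[i].request);
                if (ret)
                        return ret;
        }

        return sketch_wait_request(to, obj->last_write.request);
}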
>
> v2: Rebase and split out the orthogonal tweaks.
>
> Something silly happened with this patch. It seemed to nullify our earlier
> seqno-vs-interrupt w/a. I could not spot why, but gen6+ started to fail
> with missed interrupts (a good test of our robustness handling). So I
> ripped out the existing ACTHD read and replaced it with a RING_HEAD to
> manually check whether the request is complete. That also had the nice
> consequence of forcing __wait_request() to be the central arbiter of
> request completion. Note that during testing, it was not enough to
> re-enable the old workaround of keeping a forcewake reference whilst
> waiting upon the interrupt+seqno.
>
> The keener-eyed reviewer will also spot that the reset_counter is moved
> into the request, simplifying __wait_request() callsites and reducing the
> number of atomic reads by virtue of moving the check for a pending GPU
> reset to the endpoints of GPU access.
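
A minimal sketch of what that buys us, assuming the request carries the
reset_counter sampled at creation (the field name comes from the sketch
earlier in this mail, not from the patch itself):

/* Detect a GPU reset that occurred after this request was constructed.
 * Because the counter travels with the request, only the endpoints of
 * GPU access (request construction and __wait_request()) need to look
 * at gpu_error, instead of every callsite re-reading the atomic. */
static int sketch_check_reset(struct i915_gem_request *rq)
{
        unsigned reset = atomic_read(&rq->i915->gpu_error.reset_counter);

        if (reset != rq->reset_counter)
                return -EAGAIN; /* a reset fired; the caller must restart */

        return 0;
}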
>
> v3: Implement the grand plan
>
> Since execlist landed with its upside-down abstraction, unveil the power
> of the request to remove all the duplication. To gain access to a ring,
> you must allocate a request. To allocate a request you must specify the
> context. Ergo all ring commands are carefully tracked by individual
> requests (which demarcate a single complete transaction with the GPU) in
> a known context (logical partitioning of the GPU with its own set of
> registers and rings - which may be shared with other partitions for
> backwards compatibility).
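
The excerpt below does not show the new allocation API itself, so the
function names in this sketch are placeholders; the point is only the flow
the paragraph above describes - context in, request out, commands emitted
against the request, then the whole transaction committed (or cancelled) as
a single unit:

static int sketch_emit_transaction(struct intel_engine_cs *engine,
                                   struct intel_context *ctx)
{
        struct i915_gem_request *rq;
        int ret;

        /* No context, no request; no request, no ring access. */
        rq = sketch_request_alloc(engine, ctx);         /* hypothetical */
        if (IS_ERR(rq))
                return PTR_ERR(rq);

        ret = sketch_emit_commands(rq);                 /* hypothetical */
        if (ret) {
                sketch_request_cancel(rq);              /* hypothetical */
                return ret;
        }

        /* Flushes, the seqno breadcrumb and bookkeeping happen here. */
        sketch_request_commit(rq);                      /* hypothetical */
        return 0;
}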
>
> v4:
>
> Tweak locking around execlist submission and request lists and
> remove duplicated execlist code and the peppering of execlist
> specific code throughout the core.
>
> To simplify rebasing, I pulled in the s/ring/engine/ rename; it adds
> a fair amount of noise of little significance that is very easy to tune out.
>
> The patch itself consists of 3 heavily intertwined parts:
>
> 0. Rename ring and engine variables to be consistent with their usage.
> 1. Change the ring access API to require the context under which we
> are operating. This generates a request which we use to build up a
> ring transaction. The request tracks the required flushes and
> serialisation with the GPU caches, other requests and the CPU.
> 2. Reorder initialisation such that we have a clearly defined context
> and engines for the early ring access on module load, resume and
> reset.
> 3. Convert the seqno tracking over to using requests (a la explicit
> fencing).
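
For the last item, the bottom of the abstraction stays simple: a request is
complete once the engine's breadcrumb has passed its seqno.
__i915_seqno_passed() below is lifted from the diff; the completion helper
wrapped around it is illustrative:

static inline bool
__i915_seqno_passed(uint32_t seq1, uint32_t seq2)
{
        return (int32_t)(seq1 - seq2) >= 0;
}

/* Illustrative: engine->get_seqno(engine) and rq->seqno are used exactly
 * like this in the debugfs hunks below. */
static bool sketch_request_complete(struct i915_gem_request *rq)
{
        return __i915_seqno_passed(rq->engine->get_seqno(rq->engine),
                                   rq->seqno);
}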
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Jesse Barnes <jbarnes at virtuousgeek.org>
> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> Cc: Damien Lespiau <damien.lespiau at intel.com>
> Cc: Oscar Mateo <oscar.mateo at intel.com>
> Cc: Brad Volkin <bradley.d.volkin at intel.com>
> Cc: "Kukanova, Svetlana" <svetlana.kukanova at intel.com>
> Cc: Akash Goel <akash.goel at intel.com>
> Cc: "Daniel, Thomas" <thomas.daniel at intel.com>
> Cc: "Siluvery, Arun" <arun.siluvery at linux.intel.com>
> Cc: John Harrison <John.C.Harrison at Intel.com>
> ---
> drivers/gpu/drm/i915/Makefile | 4 +-
> drivers/gpu/drm/i915/i915_cmd_parser.c | 150 +-
> drivers/gpu/drm/i915/i915_debugfs.c | 388 ++--
> drivers/gpu/drm/i915/i915_dma.c | 18 +-
> drivers/gpu/drm/i915/i915_drv.c | 46 +-
> drivers/gpu/drm/i915/i915_drv.h | 406 ++--
> drivers/gpu/drm/i915/i915_gem.c | 1759 +++++---------
> drivers/gpu/drm/i915/i915_gem_context.c | 508 +++--
> drivers/gpu/drm/i915/i915_gem_debug.c | 118 -
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 517 ++---
> drivers/gpu/drm/i915/i915_gem_gtt.c | 140 +-
> drivers/gpu/drm/i915/i915_gem_gtt.h | 4 +-
> drivers/gpu/drm/i915/i915_gem_render_state.c | 69 +-
> drivers/gpu/drm/i915/i915_gem_render_state.h | 47 -
> drivers/gpu/drm/i915/i915_gem_request.c | 651 ++++++
> drivers/gpu/drm/i915/i915_gem_tiling.c | 2 +-
> drivers/gpu/drm/i915/i915_gpu_error.c | 396 ++--
> drivers/gpu/drm/i915/i915_irq.c | 341 +--
> drivers/gpu/drm/i915/i915_reg.h | 3 +-
> drivers/gpu/drm/i915/i915_trace.h | 215 +-
> drivers/gpu/drm/i915/intel_display.c | 355 ++-
> drivers/gpu/drm/i915/intel_drv.h | 14 +-
> drivers/gpu/drm/i915/intel_lrc.c | 1689 +++-----------
> drivers/gpu/drm/i915/intel_lrc.h | 80 +-
> drivers/gpu/drm/i915/intel_overlay.c | 200 +-
> drivers/gpu/drm/i915/intel_pm.c | 90 +-
> drivers/gpu/drm/i915/intel_renderstate.h | 8 +-
> drivers/gpu/drm/i915/intel_ringbuffer.c | 3171 +++++++++++++-------------
> drivers/gpu/drm/i915/intel_ringbuffer.h | 391 ++--
> 29 files changed, 5397 insertions(+), 6383 deletions(-)
> delete mode 100644 drivers/gpu/drm/i915/i915_gem_debug.c
> delete mode 100644 drivers/gpu/drm/i915/i915_gem_render_state.h
> create mode 100644 drivers/gpu/drm/i915/i915_gem_request.c
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index c1dd485aeb6c..225e8a8206b2 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -17,14 +17,14 @@ i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o
>
> # GEM code
> i915-y += i915_cmd_parser.o \
> + i915_gem.o \
> i915_gem_context.o \
> i915_gem_render_state.o \
> - i915_gem_debug.o \
> i915_gem_dmabuf.o \
> i915_gem_evict.o \
> i915_gem_execbuffer.o \
> i915_gem_gtt.o \
> - i915_gem.o \
> + i915_gem_request.o \
> i915_gem_stolen.o \
> i915_gem_tiling.o \
> i915_gem_userptr.o \
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index c45856bcc8b9..408e0bdba48c 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -501,7 +501,7 @@ static u32 gen7_blt_get_cmd_length_mask(u32 cmd_header)
> return 0;
> }
>
> -static bool validate_cmds_sorted(struct intel_engine_cs *ring,
> +static bool validate_cmds_sorted(struct intel_engine_cs *engine,
> const struct drm_i915_cmd_table *cmd_tables,
> int cmd_table_count)
> {
> @@ -523,7 +523,7 @@ static bool validate_cmds_sorted(struct intel_engine_cs *ring,
>
> if (curr < previous) {
> DRM_ERROR("CMD: table not sorted ring=%d table=%d entry=%d cmd=0x%08X prev=0x%08X\n",
> - ring->id, i, j, curr, previous);
> + engine->id, i, j, curr, previous);
> ret = false;
> }
>
> @@ -555,11 +555,11 @@ static bool check_sorted(int ring_id, const u32 *reg_table, int reg_count)
> return ret;
> }
>
> -static bool validate_regs_sorted(struct intel_engine_cs *ring)
> +static bool validate_regs_sorted(struct intel_engine_cs *engine)
> {
> - return check_sorted(ring->id, ring->reg_table, ring->reg_count) &&
> - check_sorted(ring->id, ring->master_reg_table,
> - ring->master_reg_count);
> + return check_sorted(engine->id, engine->reg_table, engine->reg_count) &&
> + check_sorted(engine->id, engine->master_reg_table,
> + engine->master_reg_count);
> }
>
> struct cmd_node {
> @@ -583,13 +583,13 @@ struct cmd_node {
> */
> #define CMD_HASH_MASK STD_MI_OPCODE_MASK
>
> -static int init_hash_table(struct intel_engine_cs *ring,
> +static int init_hash_table(struct intel_engine_cs *engine,
> const struct drm_i915_cmd_table *cmd_tables,
> int cmd_table_count)
> {
> int i, j;
>
> - hash_init(ring->cmd_hash);
> + hash_init(engine->cmd_hash);
>
> for (i = 0; i < cmd_table_count; i++) {
> const struct drm_i915_cmd_table *table = &cmd_tables[i];
> @@ -604,7 +604,7 @@ static int init_hash_table(struct intel_engine_cs *ring,
> return -ENOMEM;
>
> desc_node->desc = desc;
> - hash_add(ring->cmd_hash, &desc_node->node,
> + hash_add(engine->cmd_hash, &desc_node->node,
> desc->cmd.value & CMD_HASH_MASK);
> }
> }
> @@ -612,21 +612,21 @@ static int init_hash_table(struct intel_engine_cs *ring,
> return 0;
> }
>
> -static void fini_hash_table(struct intel_engine_cs *ring)
> +static void fini_hash_table(struct intel_engine_cs *engine)
> {
> struct hlist_node *tmp;
> struct cmd_node *desc_node;
> int i;
>
> - hash_for_each_safe(ring->cmd_hash, i, tmp, desc_node, node) {
> + hash_for_each_safe(engine->cmd_hash, i, tmp, desc_node, node) {
> hash_del(&desc_node->node);
> kfree(desc_node);
> }
> }
>
> /**
> - * i915_cmd_parser_init_ring() - set cmd parser related fields for a ringbuffer
> - * @ring: the ringbuffer to initialize
> + * i915_cmd_parser_init_engine() - set cmd parser related fields for a ringbuffer
> + * @engine: the ringbuffer to initialize
> *
> * Optionally initializes fields related to batch buffer command parsing in the
> * struct intel_engine_cs based on whether the platform requires software
> @@ -634,18 +634,18 @@ static void fini_hash_table(struct intel_engine_cs *ring)
> *
> * Return: non-zero if initialization fails
> */
> -int i915_cmd_parser_init_ring(struct intel_engine_cs *ring)
> +int i915_cmd_parser_init_engine(struct intel_engine_cs *engine)
> {
> const struct drm_i915_cmd_table *cmd_tables;
> int cmd_table_count;
> int ret;
>
> - if (!IS_GEN7(ring->dev))
> + if (!IS_GEN7(engine->i915))
> return 0;
>
> - switch (ring->id) {
> + switch (engine->id) {
> case RCS:
> - if (IS_HASWELL(ring->dev)) {
> + if (IS_HASWELL(engine->i915)) {
> cmd_tables = hsw_render_ring_cmds;
> cmd_table_count =
> ARRAY_SIZE(hsw_render_ring_cmds);
> @@ -654,26 +654,26 @@ int i915_cmd_parser_init_ring(struct intel_engine_cs *ring)
> cmd_table_count = ARRAY_SIZE(gen7_render_cmds);
> }
>
> - ring->reg_table = gen7_render_regs;
> - ring->reg_count = ARRAY_SIZE(gen7_render_regs);
> + engine->reg_table = gen7_render_regs;
> + engine->reg_count = ARRAY_SIZE(gen7_render_regs);
>
> - if (IS_HASWELL(ring->dev)) {
> - ring->master_reg_table = hsw_master_regs;
> - ring->master_reg_count = ARRAY_SIZE(hsw_master_regs);
> + if (IS_HASWELL(engine->i915)) {
> + engine->master_reg_table = hsw_master_regs;
> + engine->master_reg_count = ARRAY_SIZE(hsw_master_regs);
> } else {
> - ring->master_reg_table = ivb_master_regs;
> - ring->master_reg_count = ARRAY_SIZE(ivb_master_regs);
> + engine->master_reg_table = ivb_master_regs;
> + engine->master_reg_count = ARRAY_SIZE(ivb_master_regs);
> }
>
> - ring->get_cmd_length_mask = gen7_render_get_cmd_length_mask;
> + engine->get_cmd_length_mask = gen7_render_get_cmd_length_mask;
> break;
> case VCS:
> cmd_tables = gen7_video_cmds;
> cmd_table_count = ARRAY_SIZE(gen7_video_cmds);
> - ring->get_cmd_length_mask = gen7_bsd_get_cmd_length_mask;
> + engine->get_cmd_length_mask = gen7_bsd_get_cmd_length_mask;
> break;
> case BCS:
> - if (IS_HASWELL(ring->dev)) {
> + if (IS_HASWELL(engine->i915)) {
> cmd_tables = hsw_blt_ring_cmds;
> cmd_table_count = ARRAY_SIZE(hsw_blt_ring_cmds);
> } else {
> @@ -681,68 +681,68 @@ int i915_cmd_parser_init_ring(struct intel_engine_cs *ring)
> cmd_table_count = ARRAY_SIZE(gen7_blt_cmds);
> }
>
> - ring->reg_table = gen7_blt_regs;
> - ring->reg_count = ARRAY_SIZE(gen7_blt_regs);
> + engine->reg_table = gen7_blt_regs;
> + engine->reg_count = ARRAY_SIZE(gen7_blt_regs);
>
> - if (IS_HASWELL(ring->dev)) {
> - ring->master_reg_table = hsw_master_regs;
> - ring->master_reg_count = ARRAY_SIZE(hsw_master_regs);
> + if (IS_HASWELL(engine->i915)) {
> + engine->master_reg_table = hsw_master_regs;
> + engine->master_reg_count = ARRAY_SIZE(hsw_master_regs);
> } else {
> - ring->master_reg_table = ivb_master_regs;
> - ring->master_reg_count = ARRAY_SIZE(ivb_master_regs);
> + engine->master_reg_table = ivb_master_regs;
> + engine->master_reg_count = ARRAY_SIZE(ivb_master_regs);
> }
>
> - ring->get_cmd_length_mask = gen7_blt_get_cmd_length_mask;
> + engine->get_cmd_length_mask = gen7_blt_get_cmd_length_mask;
> break;
> case VECS:
> cmd_tables = hsw_vebox_cmds;
> cmd_table_count = ARRAY_SIZE(hsw_vebox_cmds);
> /* VECS can use the same length_mask function as VCS */
> - ring->get_cmd_length_mask = gen7_bsd_get_cmd_length_mask;
> + engine->get_cmd_length_mask = gen7_bsd_get_cmd_length_mask;
> break;
> default:
> - DRM_ERROR("CMD: cmd_parser_init with unknown ring: %d\n",
> - ring->id);
> + DRM_ERROR("CMD: cmd_parser_init with unknown engine: %d\n",
> + engine->id);
> BUG();
> }
>
> - BUG_ON(!validate_cmds_sorted(ring, cmd_tables, cmd_table_count));
> - BUG_ON(!validate_regs_sorted(ring));
> + BUG_ON(!validate_cmds_sorted(engine, cmd_tables, cmd_table_count));
> + BUG_ON(!validate_regs_sorted(engine));
>
> - ret = init_hash_table(ring, cmd_tables, cmd_table_count);
> + ret = init_hash_table(engine, cmd_tables, cmd_table_count);
> if (ret) {
> DRM_ERROR("CMD: cmd_parser_init failed!\n");
> - fini_hash_table(ring);
> + fini_hash_table(engine);
> return ret;
> }
>
> - ring->needs_cmd_parser = true;
> + engine->needs_cmd_parser = true;
>
> return 0;
> }
>
> /**
> - * i915_cmd_parser_fini_ring() - clean up cmd parser related fields
> - * @ring: the ringbuffer to clean up
> + * i915_cmd_parser_fini_engine() - clean up cmd parser related fields
> + * @engine: the ringbuffer to clean up
> *
> * Releases any resources related to command parsing that may have been
> - * initialized for the specified ring.
> + * initialized for the specified engine.
> */
> -void i915_cmd_parser_fini_ring(struct intel_engine_cs *ring)
> +void i915_cmd_parser_fini_engine(struct intel_engine_cs *engine)
> {
> - if (!ring->needs_cmd_parser)
> + if (!engine->needs_cmd_parser)
> return;
>
> - fini_hash_table(ring);
> + fini_hash_table(engine);
> }
>
> static const struct drm_i915_cmd_descriptor*
> -find_cmd_in_table(struct intel_engine_cs *ring,
> +find_cmd_in_table(struct intel_engine_cs *engine,
> u32 cmd_header)
> {
> struct cmd_node *desc_node;
>
> - hash_for_each_possible(ring->cmd_hash, desc_node, node,
> + hash_for_each_possible(engine->cmd_hash, desc_node, node,
> cmd_header & CMD_HASH_MASK) {
> const struct drm_i915_cmd_descriptor *desc = desc_node->desc;
> u32 masked_cmd = desc->cmd.mask & cmd_header;
> @@ -759,23 +759,23 @@ find_cmd_in_table(struct intel_engine_cs *ring,
> * Returns a pointer to a descriptor for the command specified by cmd_header.
> *
> * The caller must supply space for a default descriptor via the default_desc
> - * parameter. If no descriptor for the specified command exists in the ring's
> + * parameter. If no descriptor for the specified command exists in the engine's
> * command parser tables, this function fills in default_desc based on the
> - * ring's default length encoding and returns default_desc.
> + * engine's default length encoding and returns default_desc.
> */
> static const struct drm_i915_cmd_descriptor*
> -find_cmd(struct intel_engine_cs *ring,
> +find_cmd(struct intel_engine_cs *engine,
> u32 cmd_header,
> struct drm_i915_cmd_descriptor *default_desc)
> {
> const struct drm_i915_cmd_descriptor *desc;
> u32 mask;
>
> - desc = find_cmd_in_table(ring, cmd_header);
> + desc = find_cmd_in_table(engine, cmd_header);
> if (desc)
> return desc;
>
> - mask = ring->get_cmd_length_mask(cmd_header);
> + mask = engine->get_cmd_length_mask(cmd_header);
> if (!mask)
> return NULL;
>
> @@ -832,17 +832,17 @@ finish:
> }
>
> /**
> - * i915_needs_cmd_parser() - should a given ring use software command parsing?
> - * @ring: the ring in question
> + * i915_needs_cmd_parser() - should a given engine use software command parsing?
> + * @engine: the engine in question
> *
> * Only certain platforms require software batch buffer command parsing, and
> * only when enabled via module parameter.
> *
> - * Return: true if the ring requires software command parsing
> + * Return: true if the engine requires software command parsing
> */
> -bool i915_needs_cmd_parser(struct intel_engine_cs *ring)
> +bool i915_needs_cmd_parser(struct intel_engine_cs *engine)
> {
> - if (!ring->needs_cmd_parser)
> + if (!engine->needs_cmd_parser)
> return false;
>
> /*
> @@ -850,13 +850,13 @@ bool i915_needs_cmd_parser(struct intel_engine_cs *ring)
> * disabled. That will cause all of the parser's PPGTT checks to
> * fail. For now, disable parsing when PPGTT is off.
> */
> - if (USES_PPGTT(ring->dev))
> + if (USES_PPGTT(engine->dev))
> return false;
>
> return (i915.enable_cmd_parser == 1);
> }
>
> -static bool check_cmd(const struct intel_engine_cs *ring,
> +static bool check_cmd(const struct intel_engine_cs *engine,
> const struct drm_i915_cmd_descriptor *desc,
> const u32 *cmd,
> const bool is_master,
> @@ -893,16 +893,16 @@ static bool check_cmd(const struct intel_engine_cs *ring,
> *oacontrol_set = (cmd[2] != 0);
> }
>
> - if (!valid_reg(ring->reg_table,
> - ring->reg_count, reg_addr)) {
> + if (!valid_reg(engine->reg_table,
> + engine->reg_count, reg_addr)) {
> if (!is_master ||
> - !valid_reg(ring->master_reg_table,
> - ring->master_reg_count,
> + !valid_reg(engine->master_reg_table,
> + engine->master_reg_count,
> reg_addr)) {
> - DRM_DEBUG_DRIVER("CMD: Rejected register 0x%08X in command: 0x%08X (ring=%d)\n",
> + DRM_DEBUG_DRIVER("CMD: Rejected register 0x%08X in command: 0x%08X (engine=%d)\n",
> reg_addr,
> *cmd,
> - ring->id);
> + engine->id);
> return false;
> }
> }
> @@ -931,11 +931,11 @@ static bool check_cmd(const struct intel_engine_cs *ring,
> desc->bits[i].mask;
>
> if (dword != desc->bits[i].expected) {
> - DRM_DEBUG_DRIVER("CMD: Rejected command 0x%08X for bitmask 0x%08X (exp=0x%08X act=0x%08X) (ring=%d)\n",
> + DRM_DEBUG_DRIVER("CMD: Rejected command 0x%08X for bitmask 0x%08X (exp=0x%08X act=0x%08X) (engine=%d)\n",
> *cmd,
> desc->bits[i].mask,
> desc->bits[i].expected,
> - dword, ring->id);
> + dword, engine->id);
> return false;
> }
> }
> @@ -948,7 +948,7 @@ static bool check_cmd(const struct intel_engine_cs *ring,
>
> /**
> * i915_parse_cmds() - parse a submitted batch buffer for privilege violations
> - * @ring: the ring on which the batch is to execute
> + * @engine: the engine on which the batch is to execute
> * @batch_obj: the batch buffer in question
> * @batch_start_offset: byte offset in the batch at which execution starts
> * @is_master: is the submitting process the drm master?
> @@ -958,7 +958,7 @@ static bool check_cmd(const struct intel_engine_cs *ring,
> *
> * Return: non-zero if the parser finds violations or otherwise fails
> */
> -int i915_parse_cmds(struct intel_engine_cs *ring,
> +int i915_parse_cmds(struct intel_engine_cs *engine,
> struct drm_i915_gem_object *batch_obj,
> u32 batch_start_offset,
> bool is_master)
> @@ -995,7 +995,7 @@ int i915_parse_cmds(struct intel_engine_cs *ring,
> if (*cmd == MI_BATCH_BUFFER_END)
> break;
>
> - desc = find_cmd(ring, *cmd, &default_desc);
> + desc = find_cmd(engine, *cmd, &default_desc);
> if (!desc) {
> DRM_DEBUG_DRIVER("CMD: Unrecognized command: 0x%08X\n",
> *cmd);
> @@ -1017,7 +1017,7 @@ int i915_parse_cmds(struct intel_engine_cs *ring,
> break;
> }
>
> - if (!check_cmd(ring, desc, cmd, is_master, &oacontrol_set)) {
> + if (!check_cmd(engine, desc, cmd, is_master, &oacontrol_set)) {
> ret = -EINVAL;
> break;
> }
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 2cbc85f3b237..4d0b5cff5291 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -123,19 +123,22 @@ static void
> describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
> {
> struct i915_vma *vma;
> - int pin_count = 0;
> + int pin_count = 0, n;
>
> - seq_printf(m, "%pK: %s%s%s %8zdKiB %02x %02x %u %u %u%s%s%s",
> + seq_printf(m, "%pK: %s%s%s %8zdKiB %02x %02x [",
> &obj->base,
> get_pin_flag(obj),
> get_tiling_flag(obj),
> get_global_flag(obj),
> obj->base.size / 1024,
> obj->base.read_domains,
> - obj->base.write_domain,
> - obj->last_read_seqno,
> - obj->last_write_seqno,
> - obj->last_fenced_seqno,
> + obj->base.write_domain);
> + for (n = 0; n < ARRAY_SIZE(obj->last_read); n++)
> + seq_printf(m, " %x",
> + i915_request_seqno(obj->last_read[n].request));
> + seq_printf(m, " ] %x %x%s%s%s",
> + i915_request_seqno(obj->last_write.request),
> + i915_request_seqno(obj->last_fence.request),
> i915_cache_level_str(to_i915(obj->base.dev), obj->cache_level),
> obj->dirty ? " dirty" : "",
> obj->madv == I915_MADV_DONTNEED ? " purgeable" : "");
> @@ -168,15 +171,15 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
> *t = '\0';
> seq_printf(m, " (%s mappable)", s);
> }
> - if (obj->ring != NULL)
> - seq_printf(m, " (%s)", obj->ring->name);
> + if (obj->last_write.request)
> + seq_printf(m, " (%s)", obj->last_write.request->engine->name);
> if (obj->frontbuffer_bits)
> seq_printf(m, " (frontbuffer: 0x%03x)", obj->frontbuffer_bits);
> }
>
> static void describe_ctx(struct seq_file *m, struct intel_context *ctx)
> {
> - seq_putc(m, ctx->legacy_hw_ctx.initialized ? 'I' : 'i');
> + seq_putc(m, ctx->ring[RCS].initialized ? 'I' : 'i');
> seq_putc(m, ctx->remap_slice ? 'R' : 'r');
> seq_putc(m, ' ');
> }
> @@ -336,7 +339,7 @@ static int per_file_stats(int id, void *ptr, void *data)
> if (ppgtt->file_priv != stats->file_priv)
> continue;
>
> - if (obj->ring) /* XXX per-vma statistic */
> + if (obj->active) /* XXX per-vma statistic */
> stats->active += obj->base.size;
> else
> stats->inactive += obj->base.size;
> @@ -346,7 +349,7 @@ static int per_file_stats(int id, void *ptr, void *data)
> } else {
> if (i915_gem_obj_ggtt_bound(obj)) {
> stats->global += obj->base.size;
> - if (obj->ring)
> + if (obj->active)
> stats->active += obj->base.size;
> else
> stats->inactive += obj->base.size;
> @@ -544,14 +547,14 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data)
> seq_printf(m, "Flip pending (waiting for vsync) on pipe %c (plane %c)\n",
> pipe, plane);
> }
> - if (work->flip_queued_ring) {
> + if (work->flip_queued_request) {
> + struct i915_gem_request *rq =
> + work->flip_queued_request;
> seq_printf(m, "Flip queued on %s at seqno %u, next seqno %u [current breadcrumb %u], completed? %d\n",
> - work->flip_queued_ring->name,
> - work->flip_queued_seqno,
> - dev_priv->next_seqno,
> - work->flip_queued_ring->get_seqno(work->flip_queued_ring, true),
> - i915_seqno_passed(work->flip_queued_ring->get_seqno(work->flip_queued_ring, true),
> - work->flip_queued_seqno));
> + rq->engine->name,
> + rq->seqno, rq->i915->next_seqno,
> + rq->engine->get_seqno(rq->engine),
> + __i915_request_complete__wa(rq));
> } else
> seq_printf(m, "Flip not associated with any ring\n");
> seq_printf(m, "Flip queued on frame %d, (was ready on frame %d), now %d\n",
> @@ -588,8 +591,8 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
> struct drm_info_node *node = m->private;
> struct drm_device *dev = node->minor->dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> - struct drm_i915_gem_request *gem_request;
> + struct intel_engine_cs *engine;
> + struct i915_gem_request *rq;
> int ret, count, i;
>
> ret = mutex_lock_interruptible(&dev->struct_mutex);
> @@ -597,17 +600,15 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
> return ret;
>
> count = 0;
> - for_each_ring(ring, dev_priv, i) {
> - if (list_empty(&ring->request_list))
> + for_each_engine(engine, dev_priv, i) {
> + if (list_empty(&engine->requests))
> continue;
>
> - seq_printf(m, "%s requests:\n", ring->name);
> - list_for_each_entry(gem_request,
> - &ring->request_list,
> - list) {
> + seq_printf(m, "%s requests:\n", engine->name);
> + list_for_each_entry(rq, &engine->requests, engine_list) {
> seq_printf(m, " %d @ %d\n",
> - gem_request->seqno,
> - (int) (jiffies - gem_request->emitted_jiffies));
> + rq->seqno,
> + (int)(jiffies - rq->emitted_jiffies));
> }
> count++;
> }
> @@ -619,13 +620,17 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
> return 0;
> }
>
> -static void i915_ring_seqno_info(struct seq_file *m,
> - struct intel_engine_cs *ring)
> +static void i915_engine_seqno_info(struct seq_file *m,
> + struct intel_engine_cs *engine)
> {
> - if (ring->get_seqno) {
> - seq_printf(m, "Current sequence (%s): %u\n",
> - ring->name, ring->get_seqno(ring, false));
> - }
> + seq_printf(m, "Current sequence (%s): seqno=%u, tag=%u [last breadcrumb %u, last request %u], next seqno=%u, next tag=%u\n",
> + engine->name,
> + engine->get_seqno(engine),
> + engine->tag,
> + engine->breadcrumb[engine->id],
> + engine->last_request ? engine->last_request->seqno : 0,
> + engine->i915->next_seqno,
> + engine->next_tag);
> }
>
> static int i915_gem_seqno_info(struct seq_file *m, void *data)
> @@ -633,7 +638,7 @@ static int i915_gem_seqno_info(struct seq_file *m, void *data)
> struct drm_info_node *node = m->private;
> struct drm_device *dev = node->minor->dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> int ret, i;
>
> ret = mutex_lock_interruptible(&dev->struct_mutex);
> @@ -641,8 +646,8 @@ static int i915_gem_seqno_info(struct seq_file *m, void *data)
> return ret;
> intel_runtime_pm_get(dev_priv);
>
> - for_each_ring(ring, dev_priv, i)
> - i915_ring_seqno_info(m, ring);
> + for_each_engine(engine, dev_priv, i)
> + i915_engine_seqno_info(m, engine);
>
> intel_runtime_pm_put(dev_priv);
> mutex_unlock(&dev->struct_mutex);
> @@ -656,7 +661,7 @@ static int i915_interrupt_info(struct seq_file *m, void *data)
> struct drm_info_node *node = m->private;
> struct drm_device *dev = node->minor->dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> int ret, i, pipe;
>
> ret = mutex_lock_interruptible(&dev->struct_mutex);
> @@ -823,13 +828,13 @@ static int i915_interrupt_info(struct seq_file *m, void *data)
> seq_printf(m, "Graphics Interrupt mask: %08x\n",
> I915_READ(GTIMR));
> }
> - for_each_ring(ring, dev_priv, i) {
> + for_each_engine(engine, dev_priv, i) {
> if (INTEL_INFO(dev)->gen >= 6) {
> seq_printf(m,
> "Graphics Interrupt mask (%s): %08x\n",
> - ring->name, I915_READ_IMR(ring));
> + engine->name, I915_READ_IMR(engine));
> }
> - i915_ring_seqno_info(m, ring);
> + i915_engine_seqno_info(m, engine);
> }
> intel_runtime_pm_put(dev_priv);
> mutex_unlock(&dev->struct_mutex);
> @@ -871,12 +876,12 @@ static int i915_hws_info(struct seq_file *m, void *data)
> struct drm_info_node *node = m->private;
> struct drm_device *dev = node->minor->dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> const u32 *hws;
> int i;
>
> - ring = &dev_priv->ring[(uintptr_t)node->info_ent->data];
> - hws = ring->status_page.page_addr;
> + engine = &dev_priv->engine[(uintptr_t)node->info_ent->data];
> + hws = engine->status_page.page_addr;
> if (hws == NULL)
> return 0;
>
> @@ -1000,7 +1005,7 @@ i915_next_seqno_set(void *data, u64 val)
> struct drm_device *dev = data;
> int ret;
>
> - ret = mutex_lock_interruptible(&dev->struct_mutex);
> + ret = i915_mutex_lock_interruptible(dev);
> if (ret)
> return ret;
>
> @@ -1701,12 +1706,10 @@ static int i915_gem_framebuffer_info(struct seq_file *m, void *data)
> return 0;
> }
>
> -static void describe_ctx_ringbuf(struct seq_file *m,
> - struct intel_ringbuffer *ringbuf)
> +static void describe_ring(struct seq_file *m, struct intel_ringbuffer *ring)
> {
> seq_printf(m, " (ringbuffer, space: %d, head: %u, tail: %u, last head: %d)",
> - ringbuf->space, ringbuf->head, ringbuf->tail,
> - ringbuf->last_retired_head);
> + ring->space, ring->head, ring->tail, ring->retired_head);
> }
>
> static int i915_context_status(struct seq_file *m, void *unused)
> @@ -1714,7 +1717,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
> struct drm_info_node *node = m->private;
> struct drm_device *dev = node->minor->dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> struct intel_context *ctx;
> int ret, i;
>
> @@ -1728,42 +1731,26 @@ static int i915_context_status(struct seq_file *m, void *unused)
> seq_putc(m, '\n');
> }
>
> - if (dev_priv->ips.renderctx) {
> - seq_puts(m, "render context ");
> - describe_obj(m, dev_priv->ips.renderctx);
> - seq_putc(m, '\n');
> - }
> -
> list_for_each_entry(ctx, &dev_priv->context_list, link) {
> - if (!i915.enable_execlists &&
> - ctx->legacy_hw_ctx.rcs_state == NULL)
> - continue;
> -
> seq_puts(m, "HW context ");
> describe_ctx(m, ctx);
> - for_each_ring(ring, dev_priv, i) {
> - if (ring->default_context == ctx)
> + for_each_engine(engine, dev_priv, i) {
> + if (engine->default_context == ctx)
> seq_printf(m, "(default context %s) ",
> - ring->name);
> + engine->name);
> }
>
> - if (i915.enable_execlists) {
> + seq_putc(m, '\n');
> + for_each_engine(engine, dev_priv, i) {
> + struct drm_i915_gem_object *obj = ctx->ring[i].state;
> + struct intel_ringbuffer *ring = ctx->ring[i].ring;
> +
> + seq_printf(m, "%s: ", engine->name);
> + if (obj)
> + describe_obj(m, obj);
> + if (ring)
> + describe_ring(m, ring);
> seq_putc(m, '\n');
> - for_each_ring(ring, dev_priv, i) {
> - struct drm_i915_gem_object *ctx_obj =
> - ctx->engine[i].state;
> - struct intel_ringbuffer *ringbuf =
> - ctx->engine[i].ringbuf;
> -
> - seq_printf(m, "%s: ", ring->name);
> - if (ctx_obj)
> - describe_obj(m, ctx_obj);
> - if (ringbuf)
> - describe_ctx_ringbuf(m, ringbuf);
> - seq_putc(m, '\n');
> - }
> - } else {
> - describe_obj(m, ctx->legacy_hw_ctx.rcs_state);
> }
>
> seq_putc(m, '\n');
> @@ -1778,45 +1765,50 @@ static int i915_dump_lrc(struct seq_file *m, void *unused)
> {
> struct drm_info_node *node = (struct drm_info_node *) m->private;
> struct drm_device *dev = node->minor->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> - struct intel_context *ctx;
> + struct intel_engine_cs *engine;
> int ret, i;
>
> - if (!i915.enable_execlists) {
> - seq_printf(m, "Logical Ring Contexts are disabled\n");
> - return 0;
> - }
> -
> ret = mutex_lock_interruptible(&dev->struct_mutex);
> if (ret)
> return ret;
>
> - list_for_each_entry(ctx, &dev_priv->context_list, link) {
> - for_each_ring(ring, dev_priv, i) {
> - struct drm_i915_gem_object *ctx_obj = ctx->engine[i].state;
> + for_each_engine(engine, to_i915(dev), i) {
> + struct intel_ringbuffer *ring;
>
> - if (ring->default_context == ctx)
> - continue;
> + list_for_each_entry(ring, &engine->rings, engine_list) {
> + struct intel_context *ctx = ring->ctx;
> + struct task_struct *task;
> +
> + seq_printf(m, "CONTEXT: %s", engine->name);
> +
> + rcu_read_lock();
> + task = ctx->file_priv ? pid_task(ctx->file_priv->file->pid, PIDTYPE_PID) : NULL;
> + seq_printf(m, " %d:%d\n", task ? task->pid : 0, ctx->file_priv ? ctx->user_handle : 0);
> + rcu_read_unlock();
>
> - if (ctx_obj) {
> - struct page *page = i915_gem_object_get_page(ctx_obj, 1);
> - uint32_t *reg_state = kmap_atomic(page);
> + if (engine->execlists_enabled &&
> + ctx->ring[engine->id].state) {
> + struct drm_i915_gem_object *obj;
> + struct page *page;
> + uint32_t *reg_state;
> int j;
>
> - seq_printf(m, "CONTEXT: %s %u\n", ring->name,
> - intel_execlists_ctx_id(ctx_obj));
> + obj = ctx->ring[engine->id].state;
> + page = i915_gem_object_get_page(obj, 1);
> + reg_state = kmap_atomic(page);
>
> + seq_printf(m, "\tLRCA:\n");
> for (j = 0; j < 0x600 / sizeof(u32) / 4; j += 4) {
> seq_printf(m, "\t[0x%08lx] 0x%08x 0x%08x 0x%08x 0x%08x\n",
> - i915_gem_obj_ggtt_offset(ctx_obj) + 4096 + (j * 4),
> + i915_gem_obj_ggtt_offset(obj) + 4096 + (j * 4),
> reg_state[j], reg_state[j + 1],
> reg_state[j + 2], reg_state[j + 3]);
> }
> kunmap_atomic(reg_state);
>
> seq_putc(m, '\n');
> - }
> + } else
> + seq_puts(m, "\tLogical Ring Contexts are disabled\n");
> }
> }
>
> @@ -1830,7 +1822,7 @@ static int i915_execlists(struct seq_file *m, void *data)
> struct drm_info_node *node = (struct drm_info_node *)m->private;
> struct drm_device *dev = node->minor->dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> u32 status_pointer;
> u8 read_pointer;
> u8 write_pointer;
> @@ -1840,31 +1832,31 @@ static int i915_execlists(struct seq_file *m, void *data)
> int ring_id, i;
> int ret;
>
> - if (!i915.enable_execlists) {
> - seq_puts(m, "Logical Ring Contexts are disabled\n");
> - return 0;
> - }
> -
> ret = mutex_lock_interruptible(&dev->struct_mutex);
> if (ret)
> return ret;
>
> - for_each_ring(ring, dev_priv, ring_id) {
> - struct intel_ctx_submit_request *head_req = NULL;
> + for_each_engine(engine, dev_priv, ring_id) {
> + struct i915_gem_request *rq = NULL;
> int count = 0;
> unsigned long flags;
>
> - seq_printf(m, "%s\n", ring->name);
> + seq_printf(m, "%s\n", engine->name);
> +
> + if (!engine->execlists_enabled) {
> + seq_puts(m, "\tExeclists are disabled\n");
> + continue;
> + }
>
> - status = I915_READ(RING_EXECLIST_STATUS(ring));
> - ctx_id = I915_READ(RING_EXECLIST_STATUS(ring) + 4);
> + status = I915_READ(RING_EXECLIST_STATUS(engine));
> + ctx_id = I915_READ(RING_EXECLIST_STATUS(engine) + 4);
> seq_printf(m, "\tExeclist status: 0x%08X, context: %u\n",
> status, ctx_id);
>
> - status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring));
> + status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(engine));
> seq_printf(m, "\tStatus pointer: 0x%08X\n", status_pointer);
>
> - read_pointer = ring->next_context_status_buffer;
> + read_pointer = engine->next_context_status_buffer;
> write_pointer = status_pointer & 0x07;
> if (read_pointer > write_pointer)
> write_pointer += 6;
> @@ -1872,29 +1864,33 @@ static int i915_execlists(struct seq_file *m, void *data)
> read_pointer, write_pointer);
>
> for (i = 0; i < 6; i++) {
> - status = I915_READ(RING_CONTEXT_STATUS_BUF(ring) + 8*i);
> - ctx_id = I915_READ(RING_CONTEXT_STATUS_BUF(ring) + 8*i + 4);
> + status = I915_READ(RING_CONTEXT_STATUS_BUF(engine) + 8*i);
> + ctx_id = I915_READ(RING_CONTEXT_STATUS_BUF(engine) + 8*i + 4);
>
> seq_printf(m, "\tStatus buffer %d: 0x%08X, context: %u\n",
> i, status, ctx_id);
> }
>
> - spin_lock_irqsave(&ring->execlist_lock, flags);
> - list_for_each(cursor, &ring->execlist_queue)
> + spin_lock_irqsave(&engine->irqlock, flags);
> + list_for_each(cursor, &engine->pending)
> count++;
> - head_req = list_first_entry_or_null(&ring->execlist_queue,
> - struct intel_ctx_submit_request, execlist_link);
> - spin_unlock_irqrestore(&ring->execlist_lock, flags);
> + rq = list_first_entry_or_null(&engine->pending, typeof(*rq), engine_list);
> + spin_unlock_irqrestore(&engine->irqlock, flags);
>
> seq_printf(m, "\t%d requests in queue\n", count);
> - if (head_req) {
> - struct drm_i915_gem_object *ctx_obj;
> -
> - ctx_obj = head_req->ctx->engine[ring_id].state;
> - seq_printf(m, "\tHead request id: %u\n",
> - intel_execlists_ctx_id(ctx_obj));
> - seq_printf(m, "\tHead request tail: %u\n",
> - head_req->tail);
> + if (rq) {
> + struct intel_context *ctx = rq->ctx;
> + struct task_struct *task;
> +
> + seq_printf(m, "\tHead request ctx:");
> +
> + rcu_read_lock();
> + task = ctx->file_priv ? pid_task(ctx->file_priv->file->pid, PIDTYPE_PID) : NULL;
> + seq_printf(m, " %d:%d\n", task ? task->pid : 0, ctx->file_priv ? ctx->user_handle : 0);
> + rcu_read_unlock();
> +
> + seq_printf(m, "\tHead request tail: %u\n", rq->tail);
> + seq_printf(m, "\tHead request seqno: %d\n", rq->seqno);
> }
>
> seq_putc(m, '\n');
> @@ -2025,7 +2021,7 @@ static int per_file_ctx(int id, void *ptr, void *data)
> static void gen8_ppgtt_info(struct seq_file *m, struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
> int unused, i;
>
> @@ -2034,13 +2030,13 @@ static void gen8_ppgtt_info(struct seq_file *m, struct drm_device *dev)
>
> seq_printf(m, "Page directories: %d\n", ppgtt->num_pd_pages);
> seq_printf(m, "Page tables: %d\n", ppgtt->num_pd_entries);
> - for_each_ring(ring, dev_priv, unused) {
> - seq_printf(m, "%s\n", ring->name);
> + for_each_engine(engine, dev_priv, unused) {
> + seq_printf(m, "%s\n", engine->name);
> for (i = 0; i < 4; i++) {
> u32 offset = 0x270 + i * 8;
> - u64 pdp = I915_READ(ring->mmio_base + offset + 4);
> + u64 pdp = I915_READ(engine->mmio_base + offset + 4);
> pdp <<= 32;
> - pdp |= I915_READ(ring->mmio_base + offset);
> + pdp |= I915_READ(engine->mmio_base + offset);
> seq_printf(m, "\tPDP%d 0x%016llx\n", i, pdp);
> }
> }
> @@ -2049,20 +2045,20 @@ static void gen8_ppgtt_info(struct seq_file *m, struct drm_device *dev)
> static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> struct drm_file *file;
> int i;
>
> if (INTEL_INFO(dev)->gen == 6)
> seq_printf(m, "GFX_MODE: 0x%08x\n", I915_READ(GFX_MODE));
>
> - for_each_ring(ring, dev_priv, i) {
> - seq_printf(m, "%s\n", ring->name);
> + for_each_engine(engine, dev_priv, i) {
> + seq_printf(m, "%s\n", engine->name);
> if (INTEL_INFO(dev)->gen == 7)
> - seq_printf(m, "GFX_MODE: 0x%08x\n", I915_READ(RING_MODE_GEN7(ring)));
> - seq_printf(m, "PP_DIR_BASE: 0x%08x\n", I915_READ(RING_PP_DIR_BASE(ring)));
> - seq_printf(m, "PP_DIR_BASE_READ: 0x%08x\n", I915_READ(RING_PP_DIR_BASE_READ(ring)));
> - seq_printf(m, "PP_DIR_DCLV: 0x%08x\n", I915_READ(RING_PP_DIR_DCLV(ring)));
> + seq_printf(m, "GFX_MODE: 0x%08x\n", I915_READ(RING_MODE_GEN7(engine)));
> + seq_printf(m, "PP_DIR_BASE: 0x%08x\n", I915_READ(RING_PP_DIR_BASE(engine)));
> + seq_printf(m, "PP_DIR_BASE_READ: 0x%08x\n", I915_READ(RING_PP_DIR_BASE_READ(engine)));
> + seq_printf(m, "PP_DIR_DCLV: 0x%08x\n", I915_READ(RING_PP_DIR_DCLV(engine)));
> }
> if (dev_priv->mm.aliasing_ppgtt) {
> struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
> @@ -2549,67 +2545,62 @@ static int i915_semaphore_status(struct seq_file *m, void *unused)
> struct drm_info_node *node = (struct drm_info_node *) m->private;
> struct drm_device *dev = node->minor->dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> int num_rings = hweight32(INTEL_INFO(dev)->ring_mask);
> int i, j, ret;
>
> - if (!i915_semaphore_is_enabled(dev)) {
> - seq_puts(m, "Semaphores are disabled\n");
> - return 0;
> - }
> -
> ret = mutex_lock_interruptible(&dev->struct_mutex);
> if (ret)
> return ret;
> intel_runtime_pm_get(dev_priv);
>
> - if (IS_BROADWELL(dev)) {
> - struct page *page;
> - uint64_t *seqno;
> + seq_puts(m, " Last breadcrumb:");
> + for_each_engine(engine, dev_priv, i)
> + for (j = 0; j < num_rings; j++)
> + seq_printf(m, "0x%08x\n",
> + engine->breadcrumb[j]);
> + seq_putc(m, '\n');
>
> - page = i915_gem_object_get_page(dev_priv->semaphore_obj, 0);
> + if (engine->semaphore.wait) {
> + if (IS_BROADWELL(dev)) {
> + struct page *page;
> + uint64_t *seqno;
>
> - seqno = (uint64_t *)kmap_atomic(page);
> - for_each_ring(ring, dev_priv, i) {
> - uint64_t offset;
> + page = i915_gem_object_get_page(dev_priv->semaphore_obj, 0);
>
> - seq_printf(m, "%s\n", ring->name);
> + seqno = (uint64_t *)kmap_atomic(page);
> + for_each_engine(engine, dev_priv, i) {
> + uint64_t offset;
>
> - seq_puts(m, " Last signal:");
> - for (j = 0; j < num_rings; j++) {
> - offset = i * I915_NUM_RINGS + j;
> - seq_printf(m, "0x%08llx (0x%02llx) ",
> - seqno[offset], offset * 8);
> - }
> - seq_putc(m, '\n');
> + seq_printf(m, "%s\n", engine->name);
>
> - seq_puts(m, " Last wait: ");
> - for (j = 0; j < num_rings; j++) {
> - offset = i + (j * I915_NUM_RINGS);
> - seq_printf(m, "0x%08llx (0x%02llx) ",
> - seqno[offset], offset * 8);
> - }
> - seq_putc(m, '\n');
> + seq_puts(m, " Last signal:");
> + for (j = 0; j < num_rings; j++) {
> + offset = i * I915_NUM_ENGINES + j;
> + seq_printf(m, "0x%08llx (0x%02llx) ",
> + seqno[offset], offset * 8);
> + }
> + seq_putc(m, '\n');
>
> - }
> - kunmap_atomic(seqno);
> - } else {
> - seq_puts(m, " Last signal:");
> - for_each_ring(ring, dev_priv, i)
> - for (j = 0; j < num_rings; j++)
> - seq_printf(m, "0x%08x\n",
> - I915_READ(ring->semaphore.mbox.signal[j]));
> - seq_putc(m, '\n');
> - }
> + seq_puts(m, " Last wait: ");
> + for (j = 0; j < num_rings; j++) {
> + offset = i + (j * I915_NUM_ENGINES);
> + seq_printf(m, "0x%08llx (0x%02llx) ",
> + seqno[offset], offset * 8);
> + }
> + seq_putc(m, '\n');
>
> - seq_puts(m, "\nSync seqno:\n");
> - for_each_ring(ring, dev_priv, i) {
> - for (j = 0; j < num_rings; j++) {
> - seq_printf(m, " 0x%08x ", ring->semaphore.sync_seqno[j]);
> + }
> + kunmap_atomic(seqno);
> + } else {
> + seq_puts(m, " Last signal:");
> + for_each_engine(engine, dev_priv, i)
> + for (j = 0; j < num_rings; j++)
> + seq_printf(m, "0x%08x\n",
> + I915_READ(engine->semaphore.mbox.signal[j]));
> + seq_putc(m, '\n');
> }
> - seq_putc(m, '\n');
> }
> - seq_putc(m, '\n');
>
> intel_runtime_pm_put(dev_priv);
> mutex_unlock(&dev->struct_mutex);
> @@ -3826,7 +3817,6 @@ i915_drop_caches_set(void *data, u64 val)
> {
> struct drm_device *dev = data;
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct drm_i915_gem_object *obj, *next;
> int ret;
>
> DRM_DEBUG("Dropping caches: 0x%08llx\n", val);
> @@ -3847,10 +3837,18 @@ i915_drop_caches_set(void *data, u64 val)
> i915_gem_retire_requests(dev);
>
> if (val & DROP_BOUND) {
> - list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
> - global_list) {
> + struct list_head still_in_list;
> +
> + INIT_LIST_HEAD(&still_in_list);
> + while (!list_empty(&dev_priv->mm.bound_list)) {
> + struct drm_i915_gem_object *obj;
> struct i915_vma *vma, *v;
>
> + obj = list_first_entry(&dev_priv->mm.bound_list,
> + typeof(*obj), global_list);
> +
> + list_move_tail(&obj->global_list, &still_in_list);
> +
> ret = 0;
> drm_gem_object_reference(&obj->base);
> list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link) {
> @@ -3865,16 +3863,30 @@ i915_drop_caches_set(void *data, u64 val)
> if (ret)
> goto unlock;
> }
> +
> + list_splice(&still_in_list, &dev_priv->mm.bound_list);
> }
>
> if (val & DROP_UNBOUND) {
> - list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list,
> - global_list)
> + struct list_head still_in_list;
> +
> + INIT_LIST_HEAD(&still_in_list);
> + while (!list_empty(&dev_priv->mm.unbound_list)) {
> + struct drm_i915_gem_object *obj;
> +
> + obj = list_first_entry(&dev_priv->mm.unbound_list,
> + typeof(*obj), global_list);
> +
> + list_move_tail(&obj->global_list, &still_in_list);
> +
> if (obj->pages_pin_count == 0) {
> ret = i915_gem_object_put_pages(obj);
> if (ret)
> goto unlock;
> }
> + }
> +
> + list_splice(&still_in_list, &dev_priv->mm.unbound_list);
> }
>
> unlock:
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index a729721595b0..681e7416702c 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -142,13 +142,13 @@ static int i915_getparam(struct drm_device *dev, void *data,
> value = 1;
> break;
> case I915_PARAM_HAS_BSD:
> - value = intel_ring_initialized(&dev_priv->ring[VCS]);
> + value = intel_engine_initialized(&dev_priv->engine[VCS]);
> break;
> case I915_PARAM_HAS_BLT:
> - value = intel_ring_initialized(&dev_priv->ring[BCS]);
> + value = intel_engine_initialized(&dev_priv->engine[BCS]);
> break;
> case I915_PARAM_HAS_VEBOX:
> - value = intel_ring_initialized(&dev_priv->ring[VECS]);
> + value = intel_engine_initialized(&dev_priv->engine[VECS]);
> break;
> case I915_PARAM_HAS_RELAXED_FENCING:
> value = 1;
> @@ -178,7 +178,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
> value = 1;
> break;
> case I915_PARAM_HAS_SEMAPHORES:
> - value = i915_semaphore_is_enabled(dev);
> + value = RCS_ENGINE(dev_priv)->semaphore.wait != NULL;
> break;
> case I915_PARAM_HAS_PRIME_VMAP_FLUSH:
> value = 1;
> @@ -512,8 +512,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
>
> cleanup_gem:
> mutex_lock(&dev->struct_mutex);
> - i915_gem_cleanup_ringbuffer(dev);
> - i915_gem_context_fini(dev);
> + i915_gem_fini(dev);
> mutex_unlock(&dev->struct_mutex);
> cleanup_irq:
> drm_irq_uninstall(dev);
> @@ -698,6 +697,8 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> if (!drm_core_check_feature(dev, DRIVER_MODESET) && !dev->agp)
> return -EINVAL;
>
> + BUILD_BUG_ON(I915_NUM_ENGINES >= (1 << I915_NUM_ENGINE_BITS));
> +
> dev_priv = kzalloc(sizeof(*dev_priv), GFP_KERNEL);
> if (dev_priv == NULL)
> return -ENOMEM;
> @@ -997,8 +998,7 @@ int i915_driver_unload(struct drm_device *dev)
> flush_workqueue(dev_priv->wq);
>
> mutex_lock(&dev->struct_mutex);
> - i915_gem_cleanup_ringbuffer(dev);
> - i915_gem_context_fini(dev);
> + i915_gem_fini(dev);
> mutex_unlock(&dev->struct_mutex);
> i915_gem_cleanup_stolen(dev);
> }
> @@ -1084,8 +1084,6 @@ void i915_driver_postclose(struct drm_device *dev, struct drm_file *file)
> {
> struct drm_i915_file_private *file_priv = file->driver_priv;
>
> - if (file_priv && file_priv->bsd_ring)
> - file_priv->bsd_ring = NULL;
> kfree(file_priv);
> }
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 4f9c2478aba1..ab504ecc848e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -473,30 +473,6 @@ void intel_detect_pch(struct drm_device *dev)
> pci_dev_put(pch);
> }
>
> -bool i915_semaphore_is_enabled(struct drm_device *dev)
> -{
> - if (INTEL_INFO(dev)->gen < 6)
> - return false;
> -
> - if (i915.semaphores >= 0)
> - return i915.semaphores;
> -
> - /* TODO: make semaphores and Execlists play nicely together */
> - if (i915.enable_execlists)
> - return false;
> -
> - /* Until we get further testing... */
> - if (IS_GEN8(dev))
> - return false;
> -
> -#ifdef CONFIG_INTEL_IOMMU
> - /* Enable semaphores on SNB when IO remapping is off */
> - if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped)
> - return false;
> -#endif
> -
> - return true;
> -}
>
> void intel_hpd_cancel_work(struct drm_i915_private *dev_priv)
> {
> @@ -795,7 +771,6 @@ static int i915_resume_legacy(struct drm_device *dev)
> int i915_reset(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - bool simulated;
> int ret;
>
> if (!i915.reset)
> @@ -803,14 +778,16 @@ int i915_reset(struct drm_device *dev)
>
> mutex_lock(&dev->struct_mutex);
>
> - i915_gem_reset(dev);
> -
> - simulated = dev_priv->gpu_error.stop_rings != 0;
> -
> ret = intel_gpu_reset(dev);
>
> + /* Clear the reset counter. Before anyone else
> + * can grab the mutex, we will declare whether or
> + * not the GPU is wedged.
> + */
> + atomic_inc(&dev_priv->gpu_error.reset_counter);
> +
> /* Also reset the gpu hangman. */
> - if (simulated) {
> + if (dev_priv->gpu_error.stop_rings) {
> DRM_INFO("Simulated gpu hang, resetting stop_rings\n");
> dev_priv->gpu_error.stop_rings = 0;
> if (ret == -ENODEV) {
> @@ -820,6 +797,8 @@ int i915_reset(struct drm_device *dev)
> }
> }
>
> + i915_gem_reset(dev);
> +
> if (ret) {
> DRM_ERROR("Failed to reset chip: %i\n", ret);
> mutex_unlock(&dev->struct_mutex);
> @@ -843,14 +822,7 @@ int i915_reset(struct drm_device *dev)
> if (drm_core_check_feature(dev, DRIVER_MODESET) ||
> !dev_priv->ums.mm_suspended) {
> dev_priv->ums.mm_suspended = 0;
> -
> - /* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset */
> - dev_priv->gpu_error.reload_in_reset = true;
> -
> ret = i915_gem_init_hw(dev);
> -
> - dev_priv->gpu_error.reload_in_reset = false;
> -
> mutex_unlock(&dev->struct_mutex);
> if (ret) {
> DRM_ERROR("Failed hw init on reset %d\n", ret);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 19d2b060c18c..9529b6b0fef6 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -37,7 +37,6 @@
> #include "intel_ringbuffer.h"
> #include "intel_lrc.h"
> #include "i915_gem_gtt.h"
> -#include "i915_gem_render_state.h"
> #include <linux/io-mapping.h>
> #include <linux/i2c.h>
> #include <linux/i2c-algo-bit.h>
> @@ -194,6 +193,7 @@ enum hpd_pin {
> struct drm_i915_private;
> struct i915_mm_struct;
> struct i915_mmu_object;
> +struct i915_gem_request;
>
> enum intel_dpll_id {
> DPLL_ID_PRIVATE = -1, /* non-shared dpll in use */
> @@ -323,6 +323,7 @@ struct drm_i915_error_state {
> u32 pgtbl_er;
> u32 ier;
> u32 gtier[4];
> + u32 gtimr[4];
> u32 ccid;
> u32 derrmr;
> u32 forcewake;
> @@ -340,23 +341,26 @@ struct drm_i915_error_state {
> struct drm_i915_error_object *semaphore_obj;
>
> struct drm_i915_error_ring {
> + int id;
> bool valid;
> /* Software tracked state */
> bool waiting;
> int hangcheck_score;
> - enum intel_ring_hangcheck_action hangcheck_action;
> + enum intel_engine_hangcheck_action hangcheck_action;
> int num_requests;
>
> /* our own tracking of ring head and tail */
> u32 cpu_ring_head;
> u32 cpu_ring_tail;
> -
> - u32 semaphore_seqno[I915_NUM_RINGS - 1];
> + u32 interrupts;
> + u32 irq_count;
>
> /* Register state */
> u32 tail;
> u32 head;
> + u32 start;
> u32 ctl;
> + u32 mode;
> u32 hws;
> u32 ipeir;
> u32 ipehr;
> @@ -364,13 +368,15 @@ struct drm_i915_error_state {
> u32 bbstate;
> u32 instpm;
> u32 instps;
> - u32 seqno;
> + u32 seqno, request, tag, hangcheck;
> + u32 breadcrumb[I915_NUM_ENGINES];
> u64 bbaddr;
> u64 acthd;
> u32 fault_reg;
> u64 faddr;
> u32 rc_psmi; /* sleep state */
> - u32 semaphore_mboxes[I915_NUM_RINGS - 1];
> + u32 semaphore_mboxes[I915_NUM_ENGINES];
> + u32 semaphore_sync[I915_NUM_ENGINES];
>
> struct drm_i915_error_object {
> int page_count;
> @@ -380,8 +386,14 @@ struct drm_i915_error_state {
>
> struct drm_i915_error_request {
> long jiffies;
> - u32 seqno;
> + long pid;
> + u32 batch;
> + u32 head;
> u32 tail;
> + u32 seqno;
> + u32 breadcrumb[I915_NUM_ENGINES];
> + u32 complete;
> + u32 tag;
> } *requests;
>
> struct {
> @@ -394,12 +406,12 @@ struct drm_i915_error_state {
>
> pid_t pid;
> char comm[TASK_COMM_LEN];
> - } ring[I915_NUM_RINGS];
> + } ring[I915_NUM_ENGINES];
>
> struct drm_i915_error_buffer {
> u32 size;
> u32 name;
> - u32 rseqno, wseqno;
> + u32 rseqno[I915_NUM_ENGINES], wseqno, fseqno;
> u32 gtt_offset;
> u32 read_domains;
> u32 write_domain;
> @@ -471,10 +483,10 @@ struct drm_i915_display_funcs {
> struct drm_display_mode *mode);
> void (*fdi_link_train)(struct drm_crtc *crtc);
> void (*init_clock_gating)(struct drm_device *dev);
> - int (*queue_flip)(struct drm_device *dev, struct drm_crtc *crtc,
> + int (*queue_flip)(struct i915_gem_request *rq,
> + struct intel_crtc *crtc,
> struct drm_framebuffer *fb,
> struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *ring,
> uint32_t flags);
> void (*update_primary_plane)(struct drm_crtc *crtc,
> struct drm_framebuffer *fb,
> @@ -626,24 +638,18 @@ struct i915_ctx_hang_stats {
> */
> struct intel_context {
> struct kref ref;
> + struct drm_i915_private *i915;
> int user_handle;
> uint8_t remap_slice;
> struct drm_i915_file_private *file_priv;
> struct i915_ctx_hang_stats hang_stats;
> struct i915_hw_ppgtt *ppgtt;
>
> - /* Legacy ring buffer submission */
> - struct {
> - struct drm_i915_gem_object *rcs_state;
> - bool initialized;
> - } legacy_hw_ctx;
> -
> - /* Execlists */
> - bool rcs_initialized;
> - struct {
> + struct intel_engine_context {
> + struct intel_ringbuffer *ring;
> struct drm_i915_gem_object *state;
> - struct intel_ringbuffer *ringbuf;
> - } engine[I915_NUM_RINGS];
> + bool initialized;
> + } ring[I915_NUM_ENGINES];
>
> struct list_head link;
> };
> @@ -1028,7 +1034,6 @@ struct intel_ilk_power_mgmt {
> int r_t;
>
> struct drm_i915_gem_object *pwrctx;
> - struct drm_i915_gem_object *renderctx;
> };
>
> struct drm_i915_private;
> @@ -1253,9 +1258,6 @@ struct i915_gpu_error {
>
> /* For missed irq/seqno simulation. */
> unsigned int test_irq_rings;
> -
> - /* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset */
> - bool reload_in_reset;
> };
>
> enum modeset_restore {
> @@ -1460,9 +1462,10 @@ struct drm_i915_private {
> wait_queue_head_t gmbus_wait_queue;
>
> struct pci_dev *bridge_dev;
> - struct intel_engine_cs ring[I915_NUM_RINGS];
> + struct intel_engine_cs engine[I915_NUM_ENGINES];
> + struct intel_context *default_context;
> struct drm_i915_gem_object *semaphore_obj;
> - uint32_t last_seqno, next_seqno;
> + uint32_t next_seqno;
>
> drm_dma_handle_t *status_page_dmah;
> struct resource mch_res;
> @@ -1673,21 +1676,6 @@ struct drm_i915_private {
>
> /* Old ums support infrastructure, same warning applies. */
> struct i915_ums_state ums;
> -
> - /* Abstract the submission mechanism (legacy ringbuffer or execlists) away */
> - struct {
> - int (*do_execbuf)(struct drm_device *dev, struct drm_file *file,
> - struct intel_engine_cs *ring,
> - struct intel_context *ctx,
> - struct drm_i915_gem_execbuffer2 *args,
> - struct list_head *vmas,
> - struct drm_i915_gem_object *batch_obj,
> - u64 exec_start, u32 flags);
> - int (*init_rings)(struct drm_device *dev);
> - void (*cleanup_ring)(struct intel_engine_cs *ring);
> - void (*stop_ring)(struct intel_engine_cs *ring);
> - } gt;
> -
> /*
> * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
> * will be rejected. Instead look for a better place.
> @@ -1700,9 +1688,11 @@ static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
> }
>
> /* Iterate over initialised rings */
> -#define for_each_ring(ring__, dev_priv__, i__) \
> - for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
> - if (((ring__) = &(dev_priv__)->ring[(i__)]), intel_ring_initialized((ring__)))
> +#define for_each_engine(engine__, dev_priv__, i__) \
> + for ((i__) = 0; (i__) < I915_NUM_ENGINES; (i__)++) \
> + if (((engine__) = &(dev_priv__)->engine[(i__)]), intel_engine_initialized((engine__)))
> +
> +#define RCS_ENGINE(x) (&__I915__(x)->engine[RCS])
>
> enum hdmi_force_audio {
> HDMI_AUDIO_OFF_DVI = -2, /* no aux data for HDMI-DVI converter */
> @@ -1767,16 +1757,15 @@ struct drm_i915_gem_object {
> struct drm_mm_node *stolen;
> struct list_head global_list;
>
> - struct list_head ring_list;
> /** Used in execbuf to temporarily hold a ref */
> struct list_head obj_exec_link;
>
> /**
> * This is set if the object is on the active lists (has pending
> - * rendering and so a non-zero seqno), and is not set if it i s on
> - * inactive (ready to be unbound) list.
> + * rendering and so a submitted request), and is not set if it is on
> + * inactive (ready to be unbound) list. We track activity per engine.
> */
> - unsigned int active:1;
> + unsigned int active:I915_NUM_ENGINE_BITS;
>
> /**
> * This is set if the object has been written to since last bound
> @@ -1844,13 +1833,11 @@ struct drm_i915_gem_object {
> void *dma_buf_vmapping;
> int vmapping_count;
>
> - struct intel_engine_cs *ring;
> -
> - /** Breadcrumb of last rendering to the buffer. */
> - uint32_t last_read_seqno;
> - uint32_t last_write_seqno;
> - /** Breadcrumb of last fenced GPU access to the buffer. */
> - uint32_t last_fenced_seqno;
> + /** Breadcrumbs of last rendering to the buffer. */
> + struct {
> + struct i915_gem_request *request;
> + struct list_head engine_list;
> + } last_write, last_read[I915_NUM_ENGINES], last_fence;
>
> /** Current tiling stride for the object, if it's tiled. */
> uint32_t stride;
> @@ -1888,44 +1875,13 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
> unsigned frontbuffer_bits);
>
> /**
> - * Request queue structure.
> - *
> - * The request queue allows us to note sequence numbers that have been emitted
> - * and may be associated with active buffers to be retired.
> - *
> - * By keeping this list, we can avoid having to do questionable
> - * sequence-number comparisons on buffer last_rendering_seqnos, and associate
> - * an emission time with seqnos for tracking how far ahead of the GPU we are.
> + * Returns true if seq1 is later than seq2.
> */
> -struct drm_i915_gem_request {
> - /** On Which ring this request was generated */
> - struct intel_engine_cs *ring;
> -
> - /** GEM sequence number associated with this request. */
> - uint32_t seqno;
> -
> - /** Position in the ringbuffer of the start of the request */
> - u32 head;
> -
> - /** Position in the ringbuffer of the end of the request */
> - u32 tail;
> -
> - /** Context related to this request */
> - struct intel_context *ctx;
> -
> - /** Batch buffer related to this request if any */
> - struct drm_i915_gem_object *batch_obj;
> -
> - /** Time at which this request was emitted, in jiffies. */
> - unsigned long emitted_jiffies;
> -
> - /** global list entry for this request */
> - struct list_head list;
> -
> - struct drm_i915_file_private *file_priv;
> - /** file_priv list entry for this request */
> - struct list_head client_list;
> -};
> +static inline bool
> +__i915_seqno_passed(uint32_t seq1, uint32_t seq2)
> +{
> + return (int32_t)(seq1 - seq2) >= 0;
> +}
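
The open-coded signed difference is what keeps this correct across u32
wraparound; a quick worked example (mine, not part of the patch):

/*
 * __i915_seqno_passed(2, 0xfffffffeu):
 *	(int32_t)(2 - 0xfffffffe) == 4, i.e. >= 0  ->  true
 * __i915_seqno_passed(0xfffffffeu, 2):
 *	(int32_t)(0xfffffffe - 2) == -4, i.e. < 0  ->  false
 *
 * so a seqno issued just after the wrap still counts as later than one
 * issued just before it.
 */
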
>
> struct drm_i915_file_private {
> struct drm_i915_private *dev_priv;
> @@ -1939,7 +1895,7 @@ struct drm_i915_file_private {
> struct idr context_idr;
>
> atomic_t rps_wait_boost;
> - struct intel_engine_cs *bsd_ring;
> + struct intel_engine_cs *bsd_engine;
> };
>
> /*
> @@ -2119,7 +2075,7 @@ struct drm_i915_cmd_table {
> to_i915(dev)->ellc_size)
> #define I915_NEED_GFX_HWS(dev) (INTEL_INFO(dev)->need_gfx_hws)
>
> -#define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 6)
> +#define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 5)
> #define HAS_LOGICAL_RING_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 8)
> #define HAS_ALIASING_PPGTT(dev) (INTEL_INFO(dev)->gen >= 6)
> #define HAS_PPGTT(dev) (INTEL_INFO(dev)->gen >= 7 && !IS_GEN8(dev))
> @@ -2227,7 +2183,7 @@ struct i915_params {
> };
> extern struct i915_params i915 __read_mostly;
>
> - /* i915_dma.c */
> +/* i915_dma.c */
> extern int i915_driver_load(struct drm_device *, unsigned long flags);
> extern int i915_driver_unload(struct drm_device *);
> extern int i915_driver_open(struct drm_device *dev, struct drm_file *file);
> @@ -2297,20 +2253,6 @@ int i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file_priv);
> int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file_priv);
> -void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
> - struct intel_engine_cs *ring);
> -void i915_gem_execbuffer_retire_commands(struct drm_device *dev,
> - struct drm_file *file,
> - struct intel_engine_cs *ring,
> - struct drm_i915_gem_object *obj);
> -int i915_gem_ringbuffer_submission(struct drm_device *dev,
> - struct drm_file *file,
> - struct intel_engine_cs *ring,
> - struct intel_context *ctx,
> - struct drm_i915_gem_execbuffer2 *args,
> - struct list_head *vmas,
> - struct drm_i915_gem_object *batch_obj,
> - u64 exec_start, u32 flags);
> int i915_gem_execbuffer(struct drm_device *dev, void *data,
> struct drm_file *file_priv);
> int i915_gem_execbuffer2(struct drm_device *dev, void *data,
> @@ -2397,22 +2339,12 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
>
> int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
> int i915_gem_object_sync(struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *to);
> -void i915_vma_move_to_active(struct i915_vma *vma,
> - struct intel_engine_cs *ring);
> + struct i915_gem_request *rq);
> int i915_gem_dumb_create(struct drm_file *file_priv,
> struct drm_device *dev,
> struct drm_mode_create_dumb *args);
> int i915_gem_mmap_gtt(struct drm_file *file_priv, struct drm_device *dev,
> uint32_t handle, uint64_t *offset);
> -/**
> - * Returns true if seq1 is later than seq2.
> - */
> -static inline bool
> -i915_seqno_passed(uint32_t seq1, uint32_t seq2)
> -{
> - return (int32_t)(seq1 - seq2) >= 0;
> -}
>
> int __must_check i915_gem_get_seqno(struct drm_device *dev, u32 *seqno);
> int __must_check i915_gem_set_seqno(struct drm_device *dev, u32 seqno);
> @@ -2422,24 +2354,33 @@ int __must_check i915_gem_object_put_fence(struct drm_i915_gem_object *obj);
> bool i915_gem_object_pin_fence(struct drm_i915_gem_object *obj);
> void i915_gem_object_unpin_fence(struct drm_i915_gem_object *obj);
>
> -struct drm_i915_gem_request *
> -i915_gem_find_active_request(struct intel_engine_cs *ring);
> -
> bool i915_gem_retire_requests(struct drm_device *dev);
> -void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
> -int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
> - bool interruptible);
> -int __must_check i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno);
> +void i915_gem_retire_requests__engine(struct intel_engine_cs *engine);
> +
> +static inline bool __i915_reset_in_progress(unsigned x)
> +{
> + return unlikely(x & I915_RESET_IN_PROGRESS_FLAG);
> +}
>
> static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
> {
> - return unlikely(atomic_read(&error->reset_counter)
> - & (I915_RESET_IN_PROGRESS_FLAG | I915_WEDGED));
> + return __i915_reset_in_progress(atomic_read(&error->reset_counter));
> +}
> +
> +static inline bool __i915_terminally_wedged(unsigned x)
> +{
> + return unlikely(x & I915_WEDGED);
> }
>
> static inline bool i915_terminally_wedged(struct i915_gpu_error *error)
> {
> - return atomic_read(&error->reset_counter) & I915_WEDGED;
> + return __i915_terminally_wedged(atomic_read(&error->reset_counter));
> +}
> +
> +static inline bool i915_recovery_pending(struct i915_gpu_error *error)
> +{
> + unsigned x = atomic_read(&error->reset_counter);
> + return __i915_reset_in_progress(x) && !__i915_terminally_wedged(x);
> }
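
If I follow the reworked helpers, the old reload_in_reset special case can go
because callers now distinguish "reset still in flight" from "reset failed".
A hypothetical wait-path check, only to show how the predicates compose (not
in the patch):

static int example_check_error(struct i915_gpu_error *error,
			       bool interruptible)
{
	if (i915_terminally_wedged(error))
		return -EIO;	/* recovery failed; nothing will complete */

	if (i915_recovery_pending(error))
		return interruptible ? -EAGAIN : -EIO;	/* retry after reset */

	return 0;
}
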
>
> static inline u32 i915_reset_count(struct i915_gpu_error *error)
> @@ -2463,21 +2404,11 @@ void i915_gem_reset(struct drm_device *dev);
> bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
> int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
> int __must_check i915_gem_init(struct drm_device *dev);
> -int i915_gem_init_rings(struct drm_device *dev);
> int __must_check i915_gem_init_hw(struct drm_device *dev);
> -int i915_gem_l3_remap(struct intel_engine_cs *ring, int slice);
> +void i915_gem_fini(struct drm_device *dev);
> void i915_gem_init_swizzling(struct drm_device *dev);
> -void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
> int __must_check i915_gpu_idle(struct drm_device *dev);
> int __must_check i915_gem_suspend(struct drm_device *dev);
> -int __i915_add_request(struct intel_engine_cs *ring,
> - struct drm_file *file,
> - struct drm_i915_gem_object *batch_obj,
> - u32 *seqno);
> -#define i915_add_request(ring, seqno) \
> - __i915_add_request(ring, NULL, NULL, seqno)
> -int __must_check i915_wait_seqno(struct intel_engine_cs *ring,
> - uint32_t seqno);
> int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf);
> int __must_check
> i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj,
> @@ -2487,7 +2418,7 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write);
> int __must_check
> i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> u32 alignment,
> - struct intel_engine_cs *pipelined);
> + struct i915_gem_request *pipelined);
> void i915_gem_object_unpin_from_display_plane(struct drm_i915_gem_object *obj);
> int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj,
> int align);
> @@ -2534,13 +2465,10 @@ static inline bool i915_gem_obj_is_pinned(struct drm_i915_gem_object *obj) {
> }
>
> /* Some GGTT VM helpers */
> -#define i915_obj_to_ggtt(obj) \
> - (&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
> +#define i915_obj_to_ggtt(obj) (&to_i915((obj)->base.dev)->gtt.base)
> static inline bool i915_is_ggtt(struct i915_address_space *vm)
> {
> - struct i915_address_space *ggtt =
> - &((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base;
> - return vm == ggtt;
> + return vm == &to_i915(vm->dev)->gtt.base;
> }
>
> static inline struct i915_hw_ppgtt *
> @@ -2589,12 +2517,12 @@ void i915_gem_object_ggtt_unpin(struct drm_i915_gem_object *obj);
> /* i915_gem_context.c */
> int __must_check i915_gem_context_init(struct drm_device *dev);
> void i915_gem_context_fini(struct drm_device *dev);
> -void i915_gem_context_reset(struct drm_device *dev);
> int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
> int i915_gem_context_enable(struct drm_i915_private *dev_priv);
> void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
> -int i915_switch_context(struct intel_engine_cs *ring,
> - struct intel_context *to);
> +int i915_request_switch_context(struct i915_gem_request *rq);
> +void i915_request_switch_context__commit(struct i915_gem_request *rq);
> +void i915_request_switch_context__undo(struct i915_gem_request *rq);
> struct intel_context *
> i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id);
> void i915_gem_context_free(struct kref *ctx_ref);
> @@ -2624,6 +2552,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
> int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file_priv);
>
> +/* i915_gem_render_state.c */
> +int i915_gem_render_state_init(struct i915_gem_request *rq);
> /* i915_gem_evict.c */
> int __must_check i915_gem_evict_something(struct drm_device *dev,
> struct i915_address_space *vm,
> @@ -2643,6 +2573,160 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev)
> intel_gtt_chipset_flush();
> }
>
> +/* i915_gem_request.c */
> +
> +/**
> + * Request queue structure.
> + *
> + * The request queue allows us to note sequence numbers that have been emitted
> + * and may be associated with active buffers to be retired.
> + *
> + * By keeping this list, we can avoid having to do questionable
> + * sequence-number comparisons on buffer last_rendering_seqnos, and associate
> + * an emission time with seqnos for tracking how far ahead of the GPU we are.
> + */
> +struct i915_gem_request {
> + struct kref kref;
> +
> + /** On which ring/engine/ctx this request was generated */
> + struct drm_i915_private *i915;
> + struct intel_context *ctx;
> + struct intel_engine_cs *engine;
> + struct intel_ringbuffer *ring;
> +
> + /** How many GPU resets ago was this request first constructed? */
> + unsigned reset_counter;
> +
> + /** GEM sequence number/breadcrumb associated with this request. */
> + u32 seqno;
> + u32 breadcrumb[I915_NUM_ENGINES];
> + u32 semaphore[I915_NUM_ENGINES];
> +
> + /** Position in the ringbuffer of the request */
> + u32 head, tail;
> +
> + /** Batch buffer and objects related to this request if any */
> + struct i915_vma *batch;
> + struct list_head vmas;
> +
> + /** Time at which this request was emitted, in jiffies. */
> + unsigned long emitted_jiffies;
> +
> + /** global list entry for this request */
> + struct list_head engine_list;
> + struct list_head breadcrumb_link;
> +
> + struct drm_i915_file_private *file_priv;
> + /** file_priv list entry for this request */
> + struct list_head client_list;
> +
> + u16 tag;
> + unsigned remap_l3:8;
> + unsigned pending_flush:4;
> + bool outstanding:1;
> + bool has_ctx_switch:1;
> +
> + bool completed; /* kept separate for atomicity */
> +};
> +
> +static inline struct intel_engine_cs *i915_request_engine(struct i915_gem_request *rq)
> +{
> + return rq ? rq->engine : NULL;
> +}
> +
> +static inline int i915_request_engine_id(struct i915_gem_request *rq)
> +{
> + return rq ? rq->engine->id : -1;
> +}
> +
> +static inline u32 i915_request_seqno(struct i915_gem_request *rq)
> +{
> + return rq ? rq->seqno : 0;
> +}
> +
> +bool __i915_request_complete__wa(struct i915_gem_request *rq);
> +
> +static inline bool
> +i915_request_complete(struct i915_gem_request *rq)
> +{
> + if (!rq->completed && rq->engine->is_complete(rq)) {
> + trace_i915_gem_request_complete(rq);
> + rq->completed = true;
> + }
> + return rq->completed;
> +}
> +
> +static inline struct i915_gem_request *
> +i915_request_get(struct i915_gem_request *rq)
> +{
> + if (rq)
> + kref_get(&rq->kref);
> + return rq;
> +}
> +
> +void __i915_request_free(struct kref *kref);
> +
> +static inline void
> +i915_request_put(struct i915_gem_request *rq)
> +{
> + if (rq == NULL)
> + return;
> +
> + lockdep_assert_held(&rq->i915->dev->struct_mutex);
> + kref_put(&rq->kref, __i915_request_free);
> +}
> +
> +static inline void
> +i915_request_put__unlocked(struct i915_gem_request *rq)
> +{
> + if (!atomic_add_unless(&rq->kref.refcount, -1, 1)) {
> + struct drm_device *dev = rq->i915->dev;
> +
> + mutex_lock(&dev->struct_mutex);
> + if (likely(atomic_dec_and_test(&rq->kref.refcount)))
> + __i915_request_free(&rq->kref);
> + mutex_unlock(&dev->struct_mutex);
> + }
> +}
> +
> +int __must_check
> +i915_request_add_vma(struct i915_gem_request *rq,
> + struct i915_vma *vma,
> + unsigned fenced);
> +#define VMA_IS_FENCED 0x1
> +#define VMA_HAS_FENCE 0x2
> +int __must_check
> +i915_request_emit_flush(struct i915_gem_request *rq,
> + unsigned flags);
> +int __must_check
> +__i915_request_emit_breadcrumb(struct i915_gem_request *rq, int id);
> +static inline int __must_check
> +i915_request_emit_breadcrumb(struct i915_gem_request *rq)
> +{
> + return __i915_request_emit_breadcrumb(rq, rq->engine->id);
> +}
> +static inline int __must_check
> +i915_request_emit_semaphore(struct i915_gem_request *rq, int id)
> +{
> + return __i915_request_emit_breadcrumb(rq, id);
> +}
> +int __must_check
> +i915_request_emit_batchbuffer(struct i915_gem_request *rq,
> + struct i915_vma *batch,
> + uint64_t start, uint32_t len,
> + unsigned flags);
> +int __must_check
> +i915_request_commit(struct i915_gem_request *rq);
> +struct i915_gem_request *
> +i915_request_get_breadcrumb(struct i915_gem_request *rq);
> +int __must_check
> +i915_request_wait(struct i915_gem_request *rq);
> +int __i915_request_wait(struct i915_gem_request *rq,
> + bool interruptible,
> + s64 *timeout,
> + struct drm_i915_file_private *file);
> +void i915_request_retire(struct i915_gem_request *rq);
> +
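
Just to check my understanding of the new flow: all ring access is funnelled
through a request allocated against a context, roughly as below. This is my
sketch, not from the patch; error unwinding is trimmed and I am assuming 0 is
an acceptable flags value for emit_batchbuffer():

static int example_execute(struct intel_engine_cs *engine,
			   struct intel_context *ctx,
			   struct i915_vma *batch,
			   uint64_t start, uint32_t len)
{
	struct i915_gem_request *rq;
	int ret;

	/* the request is the only route to the ring and carries the context */
	rq = intel_engine_alloc_request(engine, ctx);
	if (IS_ERR(rq))
		return PTR_ERR(rq);

	ret = i915_request_emit_batchbuffer(rq, batch, start, len, 0);
	if (ret == 0)
		ret = i915_request_commit(rq);
	if (ret == 0)
		ret = i915_request_wait(rq);

	i915_request_put(rq);
	return ret;
}
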
> /* i915_gem_stolen.c */
> int i915_gem_init_stolen(struct drm_device *dev);
> int i915_gem_stolen_setup_compression(struct drm_device *dev, int size, int fb_cpp);
> @@ -2669,13 +2753,6 @@ void i915_gem_detect_bit_6_swizzle(struct drm_device *dev);
> void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj);
> void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj);
>
> -/* i915_gem_debug.c */
> -#if WATCH_LISTS
> -int i915_verify_lists(struct drm_device *dev);
> -#else
> -#define i915_verify_lists(dev) 0
> -#endif
> -
> /* i915_debugfs.c */
> int i915_debugfs_init(struct drm_minor *minor);
> void i915_debugfs_cleanup(struct drm_minor *minor);
> @@ -2710,10 +2787,10 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type);
>
> /* i915_cmd_parser.c */
> int i915_cmd_parser_get_version(void);
> -int i915_cmd_parser_init_ring(struct intel_engine_cs *ring);
> -void i915_cmd_parser_fini_ring(struct intel_engine_cs *ring);
> -bool i915_needs_cmd_parser(struct intel_engine_cs *ring);
> -int i915_parse_cmds(struct intel_engine_cs *ring,
> +int i915_cmd_parser_init_engine(struct intel_engine_cs *engine);
> +void i915_cmd_parser_fini_engine(struct intel_engine_cs *engine);
> +bool i915_needs_cmd_parser(struct intel_engine_cs *engine);
> +int i915_parse_cmds(struct intel_engine_cs *engine,
> struct drm_i915_gem_object *batch_obj,
> u32 batch_start_offset,
> bool is_master);
> @@ -2812,14 +2889,11 @@ extern void intel_detect_pch(struct drm_device *dev);
> extern int intel_trans_dp_port_sel(struct drm_crtc *crtc);
> extern int intel_enable_rc6(const struct drm_device *dev);
>
> -extern bool i915_semaphore_is_enabled(struct drm_device *dev);
> int i915_reg_read_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file);
> int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file);
>
> -void intel_notify_mmio_flip(struct intel_engine_cs *ring);
> -
> /* overlay */
> extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
> extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f4553b2bee8e..46d3aced7a50 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -44,9 +44,6 @@ static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *o
> static __must_check int
> i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
> bool readonly);
> -static void
> -i915_gem_object_retire(struct drm_i915_gem_object *obj);
> -
> static void i915_gem_write_fence(struct drm_device *dev, int reg,
> struct drm_i915_gem_object *obj);
> static void i915_gem_object_update_fence(struct drm_i915_gem_object *obj,
> @@ -108,23 +105,95 @@ static void i915_gem_info_remove_obj(struct drm_i915_private *dev_priv,
> spin_unlock(&dev_priv->mm.object_stat_lock);
> }
>
> +static void
> +i915_gem_object_retire__write(struct drm_i915_gem_object *obj)
> +{
> + intel_fb_obj_flush(obj, true);
> + list_del_init(&obj->last_write.engine_list);
> + i915_request_put(obj->last_write.request);
> + obj->last_write.request = NULL;
> +}
> +
> +static void
> +i915_gem_object_retire__fence(struct drm_i915_gem_object *obj)
> +{
> + list_del_init(&obj->last_fence.engine_list);
> + i915_request_put(obj->last_fence.request);
> + obj->last_fence.request = NULL;
> +}
> +
> +static void
> +i915_gem_object_retire__read(struct drm_i915_gem_object *obj,
> + struct intel_engine_cs *engine)
> +{
> + struct i915_vma *vma;
> +
> + BUG_ON(obj->active == 0);
> +
> + list_del_init(&obj->last_read[engine->id].engine_list);
> + i915_request_put(obj->last_read[engine->id].request);
> + obj->last_read[engine->id].request = NULL;
> +
> + if (obj->last_write.request &&
> + obj->last_write.request->engine == engine)
> + i915_gem_object_retire__write(obj);
> +
> + if (obj->last_fence.request &&
> + obj->last_fence.request->engine == engine)
> + i915_gem_object_retire__fence(obj);
> +
> + if (--obj->active)
> + return;
> +
> + list_for_each_entry(vma, &obj->vma_list, vma_link)
> + if (!list_empty(&vma->mm_list))
> + list_move_tail(&vma->mm_list, &vma->vm->inactive_list);
> +
> + drm_gem_object_unreference(&obj->base);
> +}
> +
> +static void
> +i915_gem_object_retire(struct drm_i915_gem_object *obj)
> +{
> + struct i915_gem_request *rq;
> + int i;
> +
> + /* We should only be called from code paths where we know we
> + * hold both the active reference *and* a user reference.
> + * Therefore we can safely access the object after retiring as
> + * we will hold a second reference and not free the object.
> + */
> +
> + rq = obj->last_write.request;
> + if (rq && i915_request_complete(rq))
> + i915_gem_object_retire__write(obj);
> +
> + rq = obj->last_fence.request;
> + if (rq && i915_request_complete(rq))
> + i915_gem_object_retire__fence(obj);
> +
> + for (i = 0; i < I915_NUM_ENGINES; i++) {
> + rq = obj->last_read[i].request;
> + if (rq && i915_request_complete(rq))
> + i915_gem_object_retire__read(obj, rq->engine);
> + }
> +
> + if (!obj->active)
> + i915_gem_retire_requests(obj->base.dev);
> +}
> +
> static int
> i915_gem_wait_for_error(struct i915_gpu_error *error)
> {
> int ret;
>
> -#define EXIT_COND (!i915_reset_in_progress(error) || \
> - i915_terminally_wedged(error))
> - if (EXIT_COND)
> - return 0;
> -
> /*
> * Only wait 10 seconds for the gpu reset to complete to avoid hanging
> * userspace. If it takes that long something really bad is going on and
> * we should simply try to bail out and fail as gracefully as possible.
> */
> ret = wait_event_interruptible_timeout(error->reset_queue,
> - EXIT_COND,
> + !i915_recovery_pending(error),
> 10*HZ);
> if (ret == 0) {
> DRM_ERROR("Timed out waiting for the gpu reset to complete\n");
> @@ -132,7 +201,6 @@ i915_gem_wait_for_error(struct i915_gpu_error *error)
> } else if (ret < 0) {
> return ret;
> }
> -#undef EXIT_COND
>
> return 0;
> }
> @@ -152,7 +220,6 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
> if (ret)
> return ret;
>
> - WARN_ON(i915_verify_lists(dev));
> return 0;
> }
>
> @@ -476,8 +543,6 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
> ret = i915_gem_object_wait_rendering(obj, true);
> if (ret)
> return ret;
> -
> - i915_gem_object_retire(obj);
> }
>
> ret = i915_gem_object_get_pages(obj);
> @@ -893,8 +958,6 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
> ret = i915_gem_object_wait_rendering(obj, false);
> if (ret)
> return ret;
> -
> - i915_gem_object_retire(obj);
> }
> /* Same trick applies to invalidate partially written cachelines read
> * before writing. */
> @@ -1073,235 +1136,6 @@ unlock:
> return ret;
> }
>
> -int
> -i915_gem_check_wedge(struct i915_gpu_error *error,
> - bool interruptible)
> -{
> - if (i915_reset_in_progress(error)) {
> - /* Non-interruptible callers can't handle -EAGAIN, hence return
> - * -EIO unconditionally for these. */
> - if (!interruptible)
> - return -EIO;
> -
> - /* Recovery complete, but the reset failed ... */
> - if (i915_terminally_wedged(error))
> - return -EIO;
> -
> - /*
> - * Check if GPU Reset is in progress - we need intel_ring_begin
> - * to work properly to reinit the hw state while the gpu is
> - * still marked as reset-in-progress. Handle this with a flag.
> - */
> - if (!error->reload_in_reset)
> - return -EAGAIN;
> - }
> -
> - return 0;
> -}
> -
> -/*
> - * Compare seqno against outstanding lazy request. Emit a request if they are
> - * equal.
> - */
> -int
> -i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno)
> -{
> - int ret;
> -
> - BUG_ON(!mutex_is_locked(&ring->dev->struct_mutex));
> -
> - ret = 0;
> - if (seqno == ring->outstanding_lazy_seqno)
> - ret = i915_add_request(ring, NULL);
> -
> - return ret;
> -}
> -
> -static void fake_irq(unsigned long data)
> -{
> - wake_up_process((struct task_struct *)data);
> -}
> -
> -static bool missed_irq(struct drm_i915_private *dev_priv,
> - struct intel_engine_cs *ring)
> -{
> - return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
> -}
> -
> -static bool can_wait_boost(struct drm_i915_file_private *file_priv)
> -{
> - if (file_priv == NULL)
> - return true;
> -
> - return !atomic_xchg(&file_priv->rps_wait_boost, true);
> -}
> -
> -/**
> - * __wait_seqno - wait until execution of seqno has finished
> - * @ring: the ring expected to report seqno
> - * @seqno: duh!
> - * @reset_counter: reset sequence associated with the given seqno
> - * @interruptible: do an interruptible wait (normally yes)
> - * @timeout: in - how long to wait (NULL forever); out - how much time remaining
> - *
> - * Note: It is of utmost importance that the passed in seqno and reset_counter
> - * values have been read by the caller in an smp safe manner. Where read-side
> - * locks are involved, it is sufficient to read the reset_counter before
> - * unlocking the lock that protects the seqno. For lockless tricks, the
> - * reset_counter _must_ be read before, and an appropriate smp_rmb must be
> - * inserted.
> - *
> - * Returns 0 if the seqno was found within the allotted time. Else returns the
> - * errno with remaining time filled in timeout argument.
> - */
> -static int __wait_seqno(struct intel_engine_cs *ring, u32 seqno,
> - unsigned reset_counter,
> - bool interruptible,
> - s64 *timeout,
> - struct drm_i915_file_private *file_priv)
> -{
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - const bool irq_test_in_progress =
> - ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) & intel_ring_flag(ring);
> - DEFINE_WAIT(wait);
> - unsigned long timeout_expire;
> - s64 before, now;
> - int ret;
> -
> - WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
> -
> - if (i915_seqno_passed(ring->get_seqno(ring, true), seqno))
> - return 0;
> -
> - timeout_expire = timeout ? jiffies + nsecs_to_jiffies((u64)*timeout) : 0;
> -
> - if (INTEL_INFO(dev)->gen >= 6 && ring->id == RCS && can_wait_boost(file_priv)) {
> - gen6_rps_boost(dev_priv);
> - if (file_priv)
> - mod_delayed_work(dev_priv->wq,
> - &file_priv->mm.idle_work,
> - msecs_to_jiffies(100));
> - }
> -
> - if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring)))
> - return -ENODEV;
> -
> - /* Record current time in case interrupted by signal, or wedged */
> - trace_i915_gem_request_wait_begin(ring, seqno);
> - before = ktime_get_raw_ns();
> - for (;;) {
> - struct timer_list timer;
> -
> - prepare_to_wait(&ring->irq_queue, &wait,
> - interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE);
> -
> - /* We need to check whether any gpu reset happened in between
> - * the caller grabbing the seqno and now ... */
> - if (reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter)) {
> - /* ... but upgrade the -EAGAIN to an -EIO if the gpu
> - * is truly gone. */
> - ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
> - if (ret == 0)
> - ret = -EAGAIN;
> - break;
> - }
> -
> - if (i915_seqno_passed(ring->get_seqno(ring, false), seqno)) {
> - ret = 0;
> - break;
> - }
> -
> - if (interruptible && signal_pending(current)) {
> - ret = -ERESTARTSYS;
> - break;
> - }
> -
> - if (timeout && time_after_eq(jiffies, timeout_expire)) {
> - ret = -ETIME;
> - break;
> - }
> -
> - timer.function = NULL;
> - if (timeout || missed_irq(dev_priv, ring)) {
> - unsigned long expire;
> -
> - setup_timer_on_stack(&timer, fake_irq, (unsigned long)current);
> - expire = missed_irq(dev_priv, ring) ? jiffies + 1 : timeout_expire;
> - mod_timer(&timer, expire);
> - }
> -
> - io_schedule();
> -
> - if (timer.function) {
> - del_singleshot_timer_sync(&timer);
> - destroy_timer_on_stack(&timer);
> - }
> - }
> - now = ktime_get_raw_ns();
> - trace_i915_gem_request_wait_end(ring, seqno);
> -
> - if (!irq_test_in_progress)
> - ring->irq_put(ring);
> -
> - finish_wait(&ring->irq_queue, &wait);
> -
> - if (timeout) {
> - s64 tres = *timeout - (now - before);
> -
> - *timeout = tres < 0 ? 0 : tres;
> - }
> -
> - return ret;
> -}
> -
> -/**
> - * Waits for a sequence number to be signaled, and cleans up the
> - * request and object lists appropriately for that event.
> - */
> -int
> -i915_wait_seqno(struct intel_engine_cs *ring, uint32_t seqno)
> -{
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - bool interruptible = dev_priv->mm.interruptible;
> - int ret;
> -
> - BUG_ON(!mutex_is_locked(&dev->struct_mutex));
> - BUG_ON(seqno == 0);
> -
> - ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
> - if (ret)
> - return ret;
> -
> - ret = i915_gem_check_olr(ring, seqno);
> - if (ret)
> - return ret;
> -
> - return __wait_seqno(ring, seqno,
> - atomic_read(&dev_priv->gpu_error.reset_counter),
> - interruptible, NULL, NULL);
> -}
> -
> -static int
> -i915_gem_object_wait_rendering__tail(struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *ring)
> -{
> - if (!obj->active)
> - return 0;
> -
> - /* Manually manage the write flush as we may have not yet
> - * retired the buffer.
> - *
> - * Note that the last_write_seqno is always the earlier of
> - * the two (read/write) seqno, so if we have successfully waited,
> - * we know we have passed the last write.
> - */
> - obj->last_write_seqno = 0;
> -
> - return 0;
> -}
> -
> /**
> * Ensures that all rendering to the object has completed and the object is
> * safe to unbind from the GTT or access from the CPU.
> @@ -1310,19 +1144,30 @@ static __must_check int
> i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
> bool readonly)
> {
> - struct intel_engine_cs *ring = obj->ring;
> - u32 seqno;
> - int ret;
> + int i, ret;
>
> - seqno = readonly ? obj->last_write_seqno : obj->last_read_seqno;
> - if (seqno == 0)
> + if (!obj->active)
> return 0;
>
> - ret = i915_wait_seqno(ring, seqno);
> - if (ret)
> - return ret;
> + if (readonly) {
> + if (obj->last_write.request) {
> + ret = i915_request_wait(obj->last_write.request);
> + if (ret)
> + return ret;
> + }
> + } else {
> + for (i = 0; i < I915_NUM_ENGINES; i++) {
> + if (obj->last_read[i].request == NULL)
> + continue;
> +
> + ret = i915_request_wait(obj->last_read[i].request);
> + if (ret)
> + return ret;
> + }
> + }
>
> - return i915_gem_object_wait_rendering__tail(obj, ring);
> + i915_gem_object_retire(obj);
> + return 0;
> }
>
> /* A nonblocking variant of the above wait. This is a highly dangerous routine
> @@ -1335,34 +1180,51 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
> {
> struct drm_device *dev = obj->base.dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = obj->ring;
> - unsigned reset_counter;
> - u32 seqno;
> - int ret;
> + struct i915_gem_request *rq[I915_NUM_ENGINES] = {};
> + int i, n, ret;
>
> BUG_ON(!mutex_is_locked(&dev->struct_mutex));
> BUG_ON(!dev_priv->mm.interruptible);
>
> - seqno = readonly ? obj->last_write_seqno : obj->last_read_seqno;
> - if (seqno == 0)
> + n = 0;
> + if (readonly) {
> + if (obj->last_write.request) {
> + rq[n] = i915_request_get_breadcrumb(obj->last_write.request);
> + if (IS_ERR(rq[n]))
> + return PTR_ERR(rq[n]);
> + n++;
> + }
> + } else {
> + for (i = 0; i < I915_NUM_ENGINES; i++) {
> + if (obj->last_read[i].request == NULL)
> + continue;
> +
> + rq[n] = i915_request_get_breadcrumb(obj->last_read[i].request);
> + if (IS_ERR(rq[n])) {
> + ret = PTR_ERR(rq[n]);
> + goto out;
> + }
> + n++;
> + }
> + }
> + if (n == 0)
> return 0;
>
> - ret = i915_gem_check_wedge(&dev_priv->gpu_error, true);
> - if (ret)
> - return ret;
> + mutex_unlock(&dev->struct_mutex);
>
> - ret = i915_gem_check_olr(ring, seqno);
> - if (ret)
> - return ret;
> + for (i = 0; i < n; i++) {
> + ret = __i915_request_wait(rq[i], true, NULL, file_priv);
> + if (ret)
> + break;
> + }
>
> - reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
> - mutex_unlock(&dev->struct_mutex);
> - ret = __wait_seqno(ring, seqno, reset_counter, true, NULL, file_priv);
> mutex_lock(&dev->struct_mutex);
> - if (ret)
> - return ret;
>
> - return i915_gem_object_wait_rendering__tail(obj, ring);
> +out:
> + for (i = 0; i < n; i++)
> + i915_request_put(rq[i]);
> +
> + return ret;
> }
>
> /**
> @@ -2165,459 +2027,115 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
> return 0;
> }
>
> -static void
> -i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *ring)
> +int i915_gem_set_seqno(struct drm_device *dev, u32 seqno)
> {
> - u32 seqno = intel_ring_get_seqno(ring);
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + struct intel_engine_cs *signaller, *waiter;
> + int ret, i, j;
>
> - BUG_ON(ring == NULL);
> - if (obj->ring != ring && obj->last_write_seqno) {
> - /* Keep the seqno relative to the current ring */
> - obj->last_write_seqno = seqno;
> - }
> - obj->ring = ring;
> + if (seqno == 0)
> + return -EINVAL;
>
> - /* Add a reference if we're newly entering the active list. */
> - if (!obj->active) {
> - drm_gem_object_reference(&obj->base);
> - obj->active = 1;
> - }
> + if (seqno == dev_priv->next_seqno)
> + return 0;
>
> - list_move_tail(&obj->ring_list, &ring->active_list);
> + do {
> + /* Flush the breadcrumbs */
> + ret = i915_gpu_idle(dev);
> + if (ret)
> + return ret;
>
> - obj->last_read_seqno = seqno;
> -}
> + if (!i915_gem_retire_requests(dev))
> + return -EIO;
>
> -void i915_vma_move_to_active(struct i915_vma *vma,
> - struct intel_engine_cs *ring)
> -{
> - list_move_tail(&vma->mm_list, &vma->vm->active_list);
> - return i915_gem_object_move_to_active(vma->obj, ring);
> -}
> + /* Update all semaphores to the current value */
> + for_each_engine(signaller, to_i915(dev), i) {
> + struct i915_gem_request *rq;
>
> -static void
> -i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> -{
> - struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> - struct i915_address_space *vm;
> - struct i915_vma *vma;
> + if (!signaller->semaphore.signal)
> + continue;
>
> - BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> - BUG_ON(!obj->active);
> + rq = intel_engine_alloc_request(signaller,
> + signaller->default_context);
> + if (IS_ERR(rq))
> + return PTR_ERR(rq);
>
> - list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> - vma = i915_gem_obj_to_vma(obj, vm);
> - if (vma && !list_empty(&vma->mm_list))
> - list_move_tail(&vma->mm_list, &vm->inactive_list);
> - }
> + for_each_engine(waiter, to_i915(dev), j) {
> + if (signaller == waiter)
> + continue;
>
> - intel_fb_obj_flush(obj, true);
> + if (!waiter->semaphore.wait)
> + continue;
>
> - list_del_init(&obj->ring_list);
> - obj->ring = NULL;
> + ret = i915_request_emit_semaphore(rq, waiter->id);
> + if (ret)
> + break;
> + }
>
> - obj->last_read_seqno = 0;
> - obj->last_write_seqno = 0;
> - obj->base.write_domain = 0;
> + if (ret == 0)
> + ret = i915_request_commit(rq);
> + i915_request_put(rq);
> + if (ret)
> + return ret;
> + }
>
> - obj->last_fenced_seqno = 0;
> + /* We can only roll seqno forwards across a wraparound.
> + * This ship is not for turning!
> + */
> + if (!__i915_seqno_passed(dev_priv->next_seqno, seqno))
> + break;
>
> - obj->active = 0;
> - drm_gem_object_unreference(&obj->base);
> + dev_priv->next_seqno += 0x40000000;
> + } while (1);
>
> - WARN_ON(i915_verify_lists(dev));
> + dev_priv->next_seqno = seqno;
> + return 0;
> }
>
> -static void
> -i915_gem_object_retire(struct drm_i915_gem_object *obj)
> +void i915_gem_restore_fences(struct drm_device *dev)
> {
> - struct intel_engine_cs *ring = obj->ring;
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + int i;
>
> - if (ring == NULL)
> - return;
> + for (i = 0; i < dev_priv->num_fence_regs; i++) {
> + struct drm_i915_fence_reg *reg = &dev_priv->fence_regs[i];
>
> - if (i915_seqno_passed(ring->get_seqno(ring, true),
> - obj->last_read_seqno))
> - i915_gem_object_move_to_inactive(obj);
> + /*
> + * Commit delayed tiling changes if we have an object still
> + * attached to the fence, otherwise just clear the fence.
> + */
> + if (reg->obj) {
> + i915_gem_object_update_fence(reg->obj, reg,
> + reg->obj->tiling_mode);
> + } else {
> + i915_gem_write_fence(dev, i, NULL);
> + }
> + }
> }
>
> -static int
> -i915_gem_init_seqno(struct drm_device *dev, u32 seqno)
> +void i915_gem_reset(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> - int ret, i, j;
> + struct intel_engine_cs *engine;
> + int i;
>
> - /* Carefully retire all requests without writing to the rings */
> - for_each_ring(ring, dev_priv, i) {
> - ret = intel_ring_idle(ring);
> - if (ret)
> - return ret;
> - }
> - i915_gem_retire_requests(dev);
> + for_each_engine(engine, dev_priv, i) {
> + /* Clearing the read list will also clear the write
> + * and fence lists, 3 birds with one stone.
> + */
> + while (!list_empty(&engine->read_list)) {
> + struct drm_i915_gem_object *obj;
> +
> + obj = list_first_entry(&engine->read_list,
> + struct drm_i915_gem_object,
> + last_read[i].engine_list);
>
> - /* Finally reset hw state */
> - for_each_ring(ring, dev_priv, i) {
> - intel_ring_init_seqno(ring, seqno);
> + i915_gem_object_retire__read(obj, engine);
> + }
>
> - for (j = 0; j < ARRAY_SIZE(ring->semaphore.sync_seqno); j++)
> - ring->semaphore.sync_seqno[j] = 0;
> + intel_engine_reset(engine);
> }
>
> - return 0;
> -}
> -
> -int i915_gem_set_seqno(struct drm_device *dev, u32 seqno)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - int ret;
> -
> - if (seqno == 0)
> - return -EINVAL;
> -
> - /* HWS page needs to be set less than what we
> - * will inject to ring
> - */
> - ret = i915_gem_init_seqno(dev, seqno - 1);
> - if (ret)
> - return ret;
> -
> - /* Carefully set the last_seqno value so that wrap
> - * detection still works
> - */
> - dev_priv->next_seqno = seqno;
> - dev_priv->last_seqno = seqno - 1;
> - if (dev_priv->last_seqno == 0)
> - dev_priv->last_seqno--;
> -
> - return 0;
> -}
> -
> -int
> -i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> -
> - /* reserve 0 for non-seqno */
> - if (dev_priv->next_seqno == 0) {
> - int ret = i915_gem_init_seqno(dev, 0);
> - if (ret)
> - return ret;
> -
> - dev_priv->next_seqno = 1;
> - }
> -
> - *seqno = dev_priv->last_seqno = dev_priv->next_seqno++;
> - return 0;
> -}
> -
> -int __i915_add_request(struct intel_engine_cs *ring,
> - struct drm_file *file,
> - struct drm_i915_gem_object *obj,
> - u32 *out_seqno)
> -{
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> - struct drm_i915_gem_request *request;
> - struct intel_ringbuffer *ringbuf;
> - u32 request_ring_position, request_start;
> - int ret;
> -
> - request = ring->preallocated_lazy_request;
> - if (WARN_ON(request == NULL))
> - return -ENOMEM;
> -
> - if (i915.enable_execlists) {
> - struct intel_context *ctx = request->ctx;
> - ringbuf = ctx->engine[ring->id].ringbuf;
> - } else
> - ringbuf = ring->buffer;
> -
> - request_start = intel_ring_get_tail(ringbuf);
> - /*
> - * Emit any outstanding flushes - execbuf can fail to emit the flush
> - * after having emitted the batchbuffer command. Hence we need to fix
> - * things up similar to emitting the lazy request. The difference here
> - * is that the flush _must_ happen before the next request, no matter
> - * what.
> - */
> - if (i915.enable_execlists) {
> - ret = logical_ring_flush_all_caches(ringbuf);
> - if (ret)
> - return ret;
> - } else {
> - ret = intel_ring_flush_all_caches(ring);
> - if (ret)
> - return ret;
> - }
> -
> - /* Record the position of the start of the request so that
> - * should we detect the updated seqno part-way through the
> - * GPU processing the request, we never over-estimate the
> - * position of the head.
> - */
> - request_ring_position = intel_ring_get_tail(ringbuf);
> -
> - if (i915.enable_execlists) {
> - ret = ring->emit_request(ringbuf);
> - if (ret)
> - return ret;
> - } else {
> - ret = ring->add_request(ring);
> - if (ret)
> - return ret;
> - }
> -
> - request->seqno = intel_ring_get_seqno(ring);
> - request->ring = ring;
> - request->head = request_start;
> - request->tail = request_ring_position;
> -
> - /* Whilst this request exists, batch_obj will be on the
> - * active_list, and so will hold the active reference. Only when this
> - * request is retired will the batch_obj be moved onto the
> - * inactive_list and lose its active reference. Hence we do not need
> - * to explicitly hold another reference here.
> - */
> - request->batch_obj = obj;
> -
> - if (!i915.enable_execlists) {
> - /* Hold a reference to the current context so that we can inspect
> - * it later in case a hangcheck error event fires.
> - */
> - request->ctx = ring->last_context;
> - if (request->ctx)
> - i915_gem_context_reference(request->ctx);
> - }
> -
> - request->emitted_jiffies = jiffies;
> - list_add_tail(&request->list, &ring->request_list);
> - request->file_priv = NULL;
> -
> - if (file) {
> - struct drm_i915_file_private *file_priv = file->driver_priv;
> -
> - spin_lock(&file_priv->mm.lock);
> - request->file_priv = file_priv;
> - list_add_tail(&request->client_list,
> - &file_priv->mm.request_list);
> - spin_unlock(&file_priv->mm.lock);
> - }
> -
> - trace_i915_gem_request_add(ring, request->seqno);
> - ring->outstanding_lazy_seqno = 0;
> - ring->preallocated_lazy_request = NULL;
> -
> - if (!dev_priv->ums.mm_suspended) {
> - i915_queue_hangcheck(ring->dev);
> -
> - cancel_delayed_work_sync(&dev_priv->mm.idle_work);
> - queue_delayed_work(dev_priv->wq,
> - &dev_priv->mm.retire_work,
> - round_jiffies_up_relative(HZ));
> - intel_mark_busy(dev_priv->dev);
> - }
> -
> - if (out_seqno)
> - *out_seqno = request->seqno;
> - return 0;
> -}
> -
> -static inline void
> -i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
> -{
> - struct drm_i915_file_private *file_priv = request->file_priv;
> -
> - if (!file_priv)
> - return;
> -
> - spin_lock(&file_priv->mm.lock);
> - list_del(&request->client_list);
> - request->file_priv = NULL;
> - spin_unlock(&file_priv->mm.lock);
> -}
> -
> -static bool i915_context_is_banned(struct drm_i915_private *dev_priv,
> - const struct intel_context *ctx)
> -{
> - unsigned long elapsed;
> -
> - elapsed = get_seconds() - ctx->hang_stats.guilty_ts;
> -
> - if (ctx->hang_stats.banned)
> - return true;
> -
> - if (ctx->hang_stats.ban_period_seconds &&
> - elapsed <= ctx->hang_stats.ban_period_seconds) {
> - if (!i915_gem_context_is_default(ctx)) {
> - DRM_DEBUG("context hanging too fast, banning!\n");
> - return true;
> - } else if (i915_stop_ring_allow_ban(dev_priv)) {
> - if (i915_stop_ring_allow_warn(dev_priv))
> - DRM_ERROR("gpu hanging too fast, banning!\n");
> - return true;
> - }
> - }
> -
> - return false;
> -}
> -
> -static void i915_set_reset_status(struct drm_i915_private *dev_priv,
> - struct intel_context *ctx,
> - const bool guilty)
> -{
> - struct i915_ctx_hang_stats *hs;
> -
> - if (WARN_ON(!ctx))
> - return;
> -
> - hs = &ctx->hang_stats;
> -
> - if (guilty) {
> - hs->banned = i915_context_is_banned(dev_priv, ctx);
> - hs->batch_active++;
> - hs->guilty_ts = get_seconds();
> - } else {
> - hs->batch_pending++;
> - }
> -}
> -
> -static void i915_gem_free_request(struct drm_i915_gem_request *request)
> -{
> - list_del(&request->list);
> - i915_gem_request_remove_from_client(request);
> -
> - if (request->ctx)
> - i915_gem_context_unreference(request->ctx);
> -
> - kfree(request);
> -}
> -
> -struct drm_i915_gem_request *
> -i915_gem_find_active_request(struct intel_engine_cs *ring)
> -{
> - struct drm_i915_gem_request *request;
> - u32 completed_seqno;
> -
> - completed_seqno = ring->get_seqno(ring, false);
> -
> - list_for_each_entry(request, &ring->request_list, list) {
> - if (i915_seqno_passed(completed_seqno, request->seqno))
> - continue;
> -
> - return request;
> - }
> -
> - return NULL;
> -}
> -
> -static void i915_gem_reset_ring_status(struct drm_i915_private *dev_priv,
> - struct intel_engine_cs *ring)
> -{
> - struct drm_i915_gem_request *request;
> - bool ring_hung;
> -
> - request = i915_gem_find_active_request(ring);
> -
> - if (request == NULL)
> - return;
> -
> - ring_hung = ring->hangcheck.score >= HANGCHECK_SCORE_RING_HUNG;
> -
> - i915_set_reset_status(dev_priv, request->ctx, ring_hung);
> -
> - list_for_each_entry_continue(request, &ring->request_list, list)
> - i915_set_reset_status(dev_priv, request->ctx, false);
> -}
> -
> -static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
> - struct intel_engine_cs *ring)
> -{
> - while (!list_empty(&ring->active_list)) {
> - struct drm_i915_gem_object *obj;
> -
> - obj = list_first_entry(&ring->active_list,
> - struct drm_i915_gem_object,
> - ring_list);
> -
> - i915_gem_object_move_to_inactive(obj);
> - }
> -
> - /*
> - * We must free the requests after all the corresponding objects have
> - * been moved off active lists. Which is the same order as the normal
> - * retire_requests function does. This is important if objects hold
> - * implicit references on things like e.g. ppgtt address spaces through
> - * the request.
> - */
> - while (!list_empty(&ring->request_list)) {
> - struct drm_i915_gem_request *request;
> -
> - request = list_first_entry(&ring->request_list,
> - struct drm_i915_gem_request,
> - list);
> -
> - i915_gem_free_request(request);
> - }
> -
> - while (!list_empty(&ring->execlist_queue)) {
> - struct intel_ctx_submit_request *submit_req;
> -
> - submit_req = list_first_entry(&ring->execlist_queue,
> - struct intel_ctx_submit_request,
> - execlist_link);
> - list_del(&submit_req->execlist_link);
> - intel_runtime_pm_put(dev_priv);
> - i915_gem_context_unreference(submit_req->ctx);
> - kfree(submit_req);
> - }
> -
> - /* These may not have been flush before the reset, do so now */
> - kfree(ring->preallocated_lazy_request);
> - ring->preallocated_lazy_request = NULL;
> - ring->outstanding_lazy_seqno = 0;
> -}
> -
> -void i915_gem_restore_fences(struct drm_device *dev)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - int i;
> -
> - for (i = 0; i < dev_priv->num_fence_regs; i++) {
> - struct drm_i915_fence_reg *reg = &dev_priv->fence_regs[i];
> -
> - /*
> - * Commit delayed tiling changes if we have an object still
> - * attached to the fence, otherwise just clear the fence.
> - */
> - if (reg->obj) {
> - i915_gem_object_update_fence(reg->obj, reg,
> - reg->obj->tiling_mode);
> - } else {
> - i915_gem_write_fence(dev, i, NULL);
> - }
> - }
> -}
> -
> -void i915_gem_reset(struct drm_device *dev)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> - int i;
> -
> - /*
> - * Before we free the objects from the requests, we need to inspect
> - * them for finding the guilty party. As the requests only borrow
> - * their reference to the objects, the inspection must be done first.
> - */
> - for_each_ring(ring, dev_priv, i)
> - i915_gem_reset_ring_status(dev_priv, ring);
> -
> - for_each_ring(ring, dev_priv, i)
> - i915_gem_reset_ring_cleanup(dev_priv, ring);
> -
> - i915_gem_context_reset(dev);
> -
> i915_gem_restore_fences(dev);
> }
>
> @@ -2625,100 +2143,95 @@ void i915_gem_reset(struct drm_device *dev)
> * This function clears the request list as sequence numbers are passed.
> */
> void
> -i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
> +i915_gem_retire_requests__engine(struct intel_engine_cs *engine)
> {
> - uint32_t seqno;
> -
> - if (list_empty(&ring->request_list))
> + if (engine->last_request == NULL)
> return;
>
> - WARN_ON(i915_verify_lists(ring->dev));
> -
> - seqno = ring->get_seqno(ring, true);
> + if (!intel_engine_retire(engine, engine->get_seqno(engine)))
> + return;
>
> - /* Move any buffers on the active list that are no longer referenced
> - * by the ringbuffer to the flushing/inactive lists as appropriate,
> - * before we free the context associated with the requests.
> - */
> - while (!list_empty(&ring->active_list)) {
> + while (!list_empty(&engine->write_list)) {
> struct drm_i915_gem_object *obj;
>
> - obj = list_first_entry(&ring->active_list,
> - struct drm_i915_gem_object,
> - ring_list);
> + obj = list_first_entry(&engine->write_list,
> + struct drm_i915_gem_object,
> + last_write.engine_list);
>
> - if (!i915_seqno_passed(seqno, obj->last_read_seqno))
> + if (!obj->last_write.request->completed)
> break;
>
> - i915_gem_object_move_to_inactive(obj);
> + i915_gem_object_retire__write(obj);
> }
>
> + while (!list_empty(&engine->fence_list)) {
> + struct drm_i915_gem_object *obj;
>
> - while (!list_empty(&ring->request_list)) {
> - struct drm_i915_gem_request *request;
> - struct intel_ringbuffer *ringbuf;
> -
> - request = list_first_entry(&ring->request_list,
> - struct drm_i915_gem_request,
> - list);
> + obj = list_first_entry(&engine->fence_list,
> + struct drm_i915_gem_object,
> + last_fence.engine_list);
>
> - if (!i915_seqno_passed(seqno, request->seqno))
> + if (!obj->last_fence.request->completed)
> break;
>
> - trace_i915_gem_request_retire(ring, request->seqno);
> + i915_gem_object_retire__fence(obj);
> + }
>
> - /* This is one of the few common intersection points
> - * between legacy ringbuffer submission and execlists:
> - * we need to tell them apart in order to find the correct
> - * ringbuffer to which the request belongs to.
> - */
> - if (i915.enable_execlists) {
> - struct intel_context *ctx = request->ctx;
> - ringbuf = ctx->engine[ring->id].ringbuf;
> - } else
> - ringbuf = ring->buffer;
> -
> - /* We know the GPU must have read the request to have
> - * sent us the seqno + interrupt, so use the position
> - * of tail of the request to update the last known position
> - * of the GPU head.
> - */
> - ringbuf->last_retired_head = request->tail;
> + while (!list_empty(&engine->read_list)) {
> + struct drm_i915_gem_object *obj;
>
> - i915_gem_free_request(request);
> - }
> + obj = list_first_entry(&engine->read_list,
> + struct drm_i915_gem_object,
> + last_read[engine->id].engine_list);
>
> - if (unlikely(ring->trace_irq_seqno &&
> - i915_seqno_passed(seqno, ring->trace_irq_seqno))) {
> - ring->irq_put(ring);
> - ring->trace_irq_seqno = 0;
> - }
> + if (!obj->last_read[engine->id].request->completed)
> + break;
>
> - WARN_ON(i915_verify_lists(ring->dev));
> + i915_gem_object_retire__read(obj, engine);
> + }
> }
>
> bool
> i915_gem_retire_requests(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> bool idle = true;
> int i;
>
> - for_each_ring(ring, dev_priv, i) {
> - i915_gem_retire_requests_ring(ring);
> - idle &= list_empty(&ring->request_list);
> + for_each_engine(engine, dev_priv, i) {
> + i915_gem_retire_requests__engine(engine);
> + idle &= engine->last_request == NULL;
> }
>
> if (idle)
> mod_delayed_work(dev_priv->wq,
> - &dev_priv->mm.idle_work,
> - msecs_to_jiffies(100));
> + &dev_priv->mm.idle_work,
> + msecs_to_jiffies(100));
>
> return idle;
> }
>
> static void
> +i915_gem_flush_requests(struct drm_device *dev)
> +{
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + struct intel_engine_cs *engine;
> + int i, ignored;
> +
> + for_each_engine(engine, dev_priv, i) {
> + if (engine->last_request == NULL)
> + continue;
> +
> + if (engine->last_request->breadcrumb[engine->id])
> + continue;
> +
> + ignored = intel_engine_flush(engine, engine->last_request->ctx);
> + }
> + (void)ignored;
> +}
> +
> +static void
> i915_gem_retire_work_handler(struct work_struct *work)
> {
> struct drm_i915_private *dev_priv =
> @@ -2730,10 +2243,13 @@ i915_gem_retire_work_handler(struct work_struct *work)
> idle = false;
> if (mutex_trylock(&dev->struct_mutex)) {
> idle = i915_gem_retire_requests(dev);
> + if (!idle)
> + i915_gem_flush_requests(dev);
> mutex_unlock(&dev->struct_mutex);
> }
> if (!idle)
> - queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work,
> + queue_delayed_work(dev_priv->wq,
> + &dev_priv->mm.retire_work,
> round_jiffies_up_relative(HZ));
> }
>
> @@ -2756,14 +2272,16 @@ i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
> {
> int ret;
>
> - if (obj->active) {
> - ret = i915_gem_check_olr(obj->ring, obj->last_read_seqno);
> + if (!obj->active)
> + return 0;
> +
> + if (obj->last_write.request) {
> + ret = i915_request_emit_breadcrumb(obj->last_write.request);
> if (ret)
> return ret;
> -
> - i915_gem_retire_requests_ring(obj->ring);
> }
>
> + i915_gem_object_retire(obj);
> return 0;
> }
>
> @@ -2792,13 +2310,10 @@ i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
> int
> i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_i915_gem_wait *args = data;
> struct drm_i915_gem_object *obj;
> - struct intel_engine_cs *ring = NULL;
> - unsigned reset_counter;
> - u32 seqno = 0;
> - int ret = 0;
> + struct i915_gem_request *rq[I915_NUM_ENGINES] = {};
> + int i, n, ret = 0;
>
> ret = i915_mutex_lock_interruptible(dev);
> if (ret)
> @@ -2815,13 +2330,8 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> if (ret)
> goto out;
>
> - if (obj->active) {
> - seqno = obj->last_read_seqno;
> - ring = obj->ring;
> - }
> -
> - if (seqno == 0)
> - goto out;
> + if (!obj->active)
> + goto out;
>
> /* Do this after OLR check to make sure we make forward progress polling
> * on this IOCTL with a timeout <=0 (like busy ioctl)
> @@ -2831,12 +2341,31 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> goto out;
> }
>
> + for (i = n = 0; i < I915_NUM_ENGINES; i++) {
> + if (obj->last_read[i].request == NULL)
> + continue;
> +
> + rq[n] = i915_request_get_breadcrumb(obj->last_read[i].request);
> + if (IS_ERR(rq[n])) {
> + ret = PTR_ERR(rq[n]);
> + break;
> + }
> + n++;
> + }
> +
> drm_gem_object_unreference(&obj->base);
> - reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
> mutex_unlock(&dev->struct_mutex);
>
> - return __wait_seqno(ring, seqno, reset_counter, true, &args->timeout_ns,
> - file->driver_priv);
> + for (i = 0; i < n; i++) {
> + if (ret == 0)
> + ret = __i915_request_wait(rq[i], true,
> + &args->timeout_ns,
> + file->driver_priv);
> +
> + i915_request_put__unlocked(rq[i]);
> + }
> +
> + return ret;
>
> out:
> drm_gem_object_unreference(&obj->base);
> @@ -2844,6 +2373,50 @@ out:
> return ret;
> }
>
> +static int
> +__i915_request_sync(struct i915_gem_request *waiter,
> + struct i915_gem_request *signaller,
> + struct drm_i915_gem_object *obj,
> + bool *retire)
> +{
> + int ret;
> +
> + if (signaller == NULL || i915_request_complete(signaller))
> + return 0;
> +
> + if (waiter == NULL)
> + goto wait;
> +
> + /* XXX still true with execlists? */
> + if (waiter->engine == signaller->engine)
> + return 0;
> +
> + if (!waiter->engine->semaphore.wait)
> + goto wait;
> +
> + /* Try to emit only one wait per request per ring */
> + if (waiter->semaphore[signaller->engine->id] &&
> + __i915_seqno_passed(waiter->semaphore[signaller->engine->id],
> + signaller->seqno))
> + return 0;
> +
> + ret = i915_request_emit_semaphore(signaller, waiter->engine->id);
> + if (ret)
> + goto wait;
> +
> + trace_i915_gem_ring_wait(signaller, waiter);
> + if (waiter->engine->semaphore.wait(waiter, signaller))
> + goto wait;
> +
> + waiter->pending_flush &= ~I915_COMMAND_BARRIER;
> + waiter->semaphore[signaller->engine->id] = signaller->breadcrumb[waiter->engine->id];
> + return 0;
> +
> +wait:
> + *retire = true;
> + return i915_request_wait(signaller);
> +}
> +
> /**
> * i915_gem_object_sync - sync an object to a ring.
> *
> @@ -2858,38 +2431,23 @@ out:
> */
> int
> i915_gem_object_sync(struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *to)
> + struct i915_gem_request *rq)
> {
> - struct intel_engine_cs *from = obj->ring;
> - u32 seqno;
> - int ret, idx;
> -
> - if (from == NULL || to == from)
> - return 0;
> + int ret = 0, i;
> + bool retire = false;
>
> - if (to == NULL || !i915_semaphore_is_enabled(obj->base.dev))
> - return i915_gem_object_wait_rendering(obj, false);
> -
> - idx = intel_ring_sync_index(from, to);
> -
> - seqno = obj->last_read_seqno;
> - /* Optimization: Avoid semaphore sync when we are sure we already
> - * waited for an object with higher seqno */
> - if (seqno <= from->semaphore.sync_seqno[idx])
> - return 0;
> -
> - ret = i915_gem_check_olr(obj->ring, seqno);
> - if (ret)
> - return ret;
> + if (obj->base.pending_write_domain == 0) {
> + ret = __i915_request_sync(rq, obj->last_write.request, obj, &retire);
> + } else {
> + for (i = 0; i < I915_NUM_ENGINES; i++) {
> + ret = __i915_request_sync(rq, obj->last_read[i].request, obj, &retire);
> + if (ret)
> + break;
> + }
> + }
>
> - trace_i915_gem_ring_sync_to(from, to, seqno);
> - ret = to->semaphore.sync_to(to, from, seqno);
> - if (!ret)
> - /* We use last_read_seqno because sync_to()
> - * might have just caused seqno wrap under
> - * the radar.
> - */
> - from->semaphore.sync_seqno[idx] = obj->last_read_seqno;
> + if (retire)
> + i915_gem_object_retire(obj);
>
> return ret;
> }
> @@ -2983,19 +2541,22 @@ int i915_vma_unbind(struct i915_vma *vma)
>
> int i915_gpu_idle(struct drm_device *dev)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> - int ret, i;
> + struct intel_engine_cs *engine;
> + int i;
>
> - /* Flush everything onto the inactive list. */
> - for_each_ring(ring, dev_priv, i) {
> - if (!i915.enable_execlists) {
> - ret = i915_switch_context(ring, ring->default_context);
> - if (ret)
> - return ret;
> - }
> + /* Flush everything including contexts onto the inactive list. */
> + for_each_engine(engine, to_i915(dev), i) {
> + struct i915_gem_request *rq;
> + int ret;
> +
> + rq = intel_engine_alloc_request(engine,
> + engine->default_context);
> + if (IS_ERR(rq))
> + return PTR_ERR(rq);
> +
> + ret = i915_request_wait(rq);
> + i915_request_put(rq);
>
> - ret = intel_ring_idle(ring);
> if (ret)
> return ret;
> }
> @@ -3199,14 +2760,16 @@ static void i915_gem_object_update_fence(struct drm_i915_gem_object *obj,
> static int
> i915_gem_object_wait_fence(struct drm_i915_gem_object *obj)
> {
> - if (obj->last_fenced_seqno) {
> - int ret = i915_wait_seqno(obj->ring, obj->last_fenced_seqno);
> - if (ret)
> - return ret;
> + int ret;
>
> - obj->last_fenced_seqno = 0;
> - }
> + if (obj->last_fence.request == NULL)
> + return 0;
>
> + ret = i915_request_wait(obj->last_fence.request);
> + if (ret)
> + return ret;
> +
> + i915_gem_object_retire__fence(obj);
> return 0;
> }
>
> @@ -3641,7 +3204,6 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> if (ret)
> return ret;
>
> - i915_gem_object_retire(obj);
> i915_gem_object_flush_cpu_write_domain(obj, false);
>
> /* Serialise direct access to this object with the barriers for
> @@ -3660,14 +3222,12 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> BUG_ON((obj->base.write_domain & ~I915_GEM_DOMAIN_GTT) != 0);
> obj->base.read_domains |= I915_GEM_DOMAIN_GTT;
> if (write) {
> + intel_fb_obj_invalidate(obj, NULL);
> obj->base.read_domains = I915_GEM_DOMAIN_GTT;
> obj->base.write_domain = I915_GEM_DOMAIN_GTT;
> obj->dirty = 1;
> }
>
> - if (write)
> - intel_fb_obj_invalidate(obj, NULL);
> -
> trace_i915_gem_object_change_domain(obj,
> old_read_domains,
> old_write_domain);
> @@ -3739,7 +3299,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> * in obj->write_domain and have been skipping the clflushes.
> * Just set it to the CPU cache for now.
> */
> - i915_gem_object_retire(obj);
> WARN_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU);
>
> old_read_domains = obj->base.read_domains;
> @@ -3865,17 +3424,15 @@ static bool is_pin_display(struct drm_i915_gem_object *obj)
> int
> i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> u32 alignment,
> - struct intel_engine_cs *pipelined)
> + struct i915_gem_request *pipelined)
> {
> u32 old_read_domains, old_write_domain;
> bool was_pin_display;
> int ret;
>
> - if (pipelined != obj->ring) {
> - ret = i915_gem_object_sync(obj, pipelined);
> - if (ret)
> - return ret;
> - }
> + ret = i915_gem_object_sync(obj, pipelined);
> + if (ret)
> + return ret;
>
> /* Mark the pin_display early so that we account for the
> * display coherency whilst setting up the cache domains.
> @@ -3971,7 +3528,6 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
> if (ret)
> return ret;
>
> - i915_gem_object_retire(obj);
> i915_gem_object_flush_gtt_write_domain(obj);
>
> old_write_domain = obj->base.write_domain;
> @@ -3984,78 +3540,25 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
> obj->base.read_domains |= I915_GEM_DOMAIN_CPU;
> }
>
> - /* It should now be out of any other write domains, and we can update
> - * the domain values for our changes.
> - */
> - BUG_ON((obj->base.write_domain & ~I915_GEM_DOMAIN_CPU) != 0);
> -
> - /* If we're writing through the CPU, then the GPU read domains will
> - * need to be invalidated at next use.
> - */
> - if (write) {
> - obj->base.read_domains = I915_GEM_DOMAIN_CPU;
> - obj->base.write_domain = I915_GEM_DOMAIN_CPU;
> - }
> -
> - if (write)
> - intel_fb_obj_invalidate(obj, NULL);
> -
> - trace_i915_gem_object_change_domain(obj,
> - old_read_domains,
> - old_write_domain);
> -
> - return 0;
> -}
> -
> -/* Throttle our rendering by waiting until the ring has completed our requests
> - * emitted over 20 msec ago.
> - *
> - * Note that if we were to use the current jiffies each time around the loop,
> - * we wouldn't escape the function with any frames outstanding if the time to
> - * render a frame was over 20ms.
> - *
> - * This should get us reasonable parallelism between CPU and GPU but also
> - * relatively low latency when blocking on a particular request to finish.
> - */
> -static int
> -i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct drm_i915_file_private *file_priv = file->driver_priv;
> - unsigned long recent_enough = jiffies - msecs_to_jiffies(20);
> - struct drm_i915_gem_request *request;
> - struct intel_engine_cs *ring = NULL;
> - unsigned reset_counter;
> - u32 seqno = 0;
> - int ret;
> -
> - ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
> - if (ret)
> - return ret;
> -
> - ret = i915_gem_check_wedge(&dev_priv->gpu_error, false);
> - if (ret)
> - return ret;
> -
> - spin_lock(&file_priv->mm.lock);
> - list_for_each_entry(request, &file_priv->mm.request_list, client_list) {
> - if (time_after_eq(request->emitted_jiffies, recent_enough))
> - break;
> -
> - ring = request->ring;
> - seqno = request->seqno;
> - }
> - reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
> - spin_unlock(&file_priv->mm.lock);
> -
> - if (seqno == 0)
> - return 0;
> + /* It should now be out of any other write domains, and we can update
> + * the domain values for our changes.
> + */
> + BUG_ON((obj->base.write_domain & ~I915_GEM_DOMAIN_CPU) != 0);
>
> - ret = __wait_seqno(ring, seqno, reset_counter, true, NULL, NULL);
> - if (ret == 0)
> - queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
> + /* If we're writing through the CPU, then the GPU read domains will
> + * need to be invalidated at next use.
> + */
> + if (write) {
> + intel_fb_obj_invalidate(obj, NULL);
> + obj->base.read_domains = I915_GEM_DOMAIN_CPU;
> + obj->base.write_domain = I915_GEM_DOMAIN_CPU;
> + }
>
> - return ret;
> + trace_i915_gem_object_change_domain(obj,
> + old_read_domains,
> + old_write_domain);
> +
> + return 0;
> }
>
> static bool
> @@ -4268,7 +3771,7 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
> {
> struct drm_i915_gem_busy *args = data;
> struct drm_i915_gem_object *obj;
> - int ret;
> + int ret, i;
>
> ret = i915_mutex_lock_interruptible(dev);
> if (ret)
> @@ -4287,10 +3790,16 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
> */
> ret = i915_gem_object_flush_active(obj);
>
> - args->busy = obj->active;
> - if (obj->ring) {
> - BUILD_BUG_ON(I915_NUM_RINGS > 16);
> - args->busy |= intel_ring_flag(obj->ring) << 16;
> + args->busy = 0;
> + if (obj->active) {
> + BUILD_BUG_ON(I915_NUM_ENGINES > 16);
> + args->busy |= 1;
> + for (i = 0; i < I915_NUM_ENGINES; i++) {
> + if (obj->last_read[i].request == NULL)
> + continue;
> +
> + args->busy |= 1 << (16 + i);
> + }
> }
>
> drm_gem_object_unreference(&obj->base);
> @@ -4299,11 +3808,58 @@ unlock:
> return ret;
> }
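
For userspace following along: the new busy value packs an "is busy" bit plus one
bit per engine that still has an outstanding read (the BUILD_BUG_ON caps this at 16
engines). A rough decode, with names and engine numbering purely illustrative and
not ABI from this patch, could look like:

#include <stdint.h>
#include <stdio.h>

/* Decode the value returned in drm_i915_gem_busy.busy:
 * bit 0       - object has outstanding GPU work
 * bits 16..31 - engine i is still reading the object (bit 16 + i)
 */
static void decode_busy(uint32_t busy)
{
	int i;

	if (!(busy & 1)) {
		printf("idle\n");
		return;
	}

	printf("busy on engines:");
	for (i = 0; i < 16; i++)
		if (busy & (1u << (16 + i)))
			printf(" %d", i);
	printf("\n");
}
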
>
> +/* Throttle our rendering by waiting until the ring has completed our requests
> + * emitted over 20 msec ago.
> + *
> + * Note that if we were to use the current jiffies each time around the loop,
> + * we wouldn't escape the function with any frames outstanding if the time to
> + * render a frame was over 20ms.
> + *
> + * This should get us reasonable parallelism between CPU and GPU but also
> + * relatively low latency when blocking on a particular request to finish.
> + */
> int
> i915_gem_throttle_ioctl(struct drm_device *dev, void *data,
> - struct drm_file *file_priv)
> + struct drm_file *file)
> {
> - return i915_gem_ring_throttle(dev, file_priv);
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_file_private *file_priv = file->driver_priv;
> + unsigned long recent_enough = jiffies - msecs_to_jiffies(20);
> + struct i915_gem_request *rq, *tmp;
> + int ret;
> +
> + ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
> + if (ret)
> + return ret;
> +
> +	/* used for querying whether the GPU is wedged by legacy userspace */
> + if (i915_terminally_wedged(&dev_priv->gpu_error))
> + return -EIO;
> +
> + spin_lock(&file_priv->mm.lock);
> + rq = NULL;
> + list_for_each_entry(tmp, &file_priv->mm.request_list, client_list) {
> + if (time_after_eq(tmp->emitted_jiffies, recent_enough))
> + break;
> + rq = tmp;
> + }
> + rq = i915_request_get(rq);
> + spin_unlock(&file_priv->mm.lock);
> +
> + if (rq != NULL) {
> + if (rq->breadcrumb[rq->engine->id] == 0) {
> + ret = i915_mutex_lock_interruptible(dev);
> + if (ret == 0) {
> + ret = i915_request_emit_breadcrumb(rq);
> + mutex_unlock(&dev->struct_mutex);
> + }
> + }
> + if (ret == 0)
> + ret = __i915_request_wait(rq, true, NULL, NULL);
> + i915_request_put__unlocked(rq);
> + }
> +
> + return ret;
> }
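
The i915_request_get() under the client spinlock paired with
i915_request_put__unlocked() after the wait is the usual kref idiom: pin the
request so it cannot be retired and freed while we sleep outside the lock. A rough
sketch of that pattern (illustrative names, not the patch's actual helpers):

#include <linux/kernel.h>
#include <linux/kref.h>
#include <linux/slab.h>

/* Illustrative only: a stand-in for the request's kref handling. */
struct sketch_request {
	struct kref ref;
	/* ... request bookkeeping ... */
};

static void sketch_request_free(struct kref *kref)
{
	kfree(container_of(kref, struct sketch_request, ref));
}

/* Safe under a spinlock: pin the request so it cannot be retired and
 * freed once the lock is dropped and we sleep in the wait.
 */
static struct sketch_request *sketch_request_get(struct sketch_request *rq)
{
	if (rq)
		kref_get(&rq->ref);
	return rq;
}

static void sketch_request_put(struct sketch_request *rq)
{
	if (rq)
		kref_put(&rq->ref, sketch_request_free);
}
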
>
> int
> @@ -4356,8 +3912,13 @@ unlock:
> void i915_gem_object_init(struct drm_i915_gem_object *obj,
> const struct drm_i915_gem_object_ops *ops)
> {
> + int i;
> +
> INIT_LIST_HEAD(&obj->global_list);
> - INIT_LIST_HEAD(&obj->ring_list);
> + INIT_LIST_HEAD(&obj->last_fence.engine_list);
> + INIT_LIST_HEAD(&obj->last_write.engine_list);
> + for (i = 0; i < I915_NUM_ENGINES; i++)
> + INIT_LIST_HEAD(&obj->last_read[i].engine_list);
> INIT_LIST_HEAD(&obj->obj_exec_link);
> INIT_LIST_HEAD(&obj->vma_list);
>
> @@ -4543,121 +4104,59 @@ void i915_gem_vma_destroy(struct i915_vma *vma)
> }
>
> static void
> -i915_gem_stop_ringbuffers(struct drm_device *dev)
> +i915_gem_cleanup_engines(struct drm_device *dev)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> int i;
>
> - for_each_ring(ring, dev_priv, i)
> - dev_priv->gt.stop_ring(ring);
> -}
> -
> -int
> -i915_gem_suspend(struct drm_device *dev)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - int ret = 0;
> -
> - mutex_lock(&dev->struct_mutex);
> - if (dev_priv->ums.mm_suspended)
> - goto err;
> -
> - ret = i915_gpu_idle(dev);
> - if (ret)
> - goto err;
> -
> - i915_gem_retire_requests(dev);
> -
> - /* Under UMS, be paranoid and evict. */
> - if (!drm_core_check_feature(dev, DRIVER_MODESET))
> - i915_gem_evict_everything(dev);
> -
> - i915_gem_stop_ringbuffers(dev);
> -
> - /* Hack! Don't let anybody do execbuf while we don't control the chip.
> - * We need to replace this with a semaphore, or something.
> - * And not confound ums.mm_suspended!
> - */
> - dev_priv->ums.mm_suspended = !drm_core_check_feature(dev,
> - DRIVER_MODESET);
> - mutex_unlock(&dev->struct_mutex);
> -
> - del_timer_sync(&dev_priv->gpu_error.hangcheck_timer);
> - cancel_delayed_work_sync(&dev_priv->mm.retire_work);
> - flush_delayed_work(&dev_priv->mm.idle_work);
> + /* Not the regular for_each_engine so we can cleanup a failed setup */
> +	for (i = 0; i < I915_NUM_ENGINES; i++) {
> + struct intel_engine_cs *engine = &to_i915(dev)->engine[i];
>
> - return 0;
> + if (engine->i915 == NULL)
> + continue;
>
> -err:
> - mutex_unlock(&dev->struct_mutex);
> - return ret;
> + intel_engine_cleanup(engine);
> + }
> }
>
> -int i915_gem_l3_remap(struct intel_engine_cs *ring, int slice)
> +static int
> +i915_gem_resume_engines(struct drm_device *dev)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - u32 reg_base = GEN7_L3LOG_BASE + (slice * 0x200);
> - u32 *remap_info = dev_priv->l3_parity.remap_info[slice];
> + struct intel_engine_cs *engine;
> int i, ret;
>
> - if (!HAS_L3_DPF(dev) || !remap_info)
> - return 0;
> -
> - ret = intel_ring_begin(ring, GEN7_L3LOG_SIZE / 4 * 3);
> - if (ret)
> - return ret;
> -
> - /*
> - * Note: We do not worry about the concurrent register cacheline hang
> - * here because no other code should access these registers other than
> - * at initialization time.
> - */
> - for (i = 0; i < GEN7_L3LOG_SIZE; i += 4) {
> - intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> - intel_ring_emit(ring, reg_base + i);
> - intel_ring_emit(ring, remap_info[i/4]);
> + for_each_engine(engine, to_i915(dev), i) {
> + ret = intel_engine_resume(engine);
> + if (ret)
> + return ret;
> }
>
> - intel_ring_advance(ring);
> -
> - return ret;
> + return 0;
> }
>
> -void i915_gem_init_swizzling(struct drm_device *dev)
> +static int
> +i915_gem_suspend_engines(struct drm_device *dev)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> -
> - if (INTEL_INFO(dev)->gen < 5 ||
> - dev_priv->mm.bit_6_swizzle_x == I915_BIT_6_SWIZZLE_NONE)
> - return;
> -
> - I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
> - DISP_TILE_SURFACE_SWIZZLING);
> + struct intel_engine_cs *engine;
> + int i, ret;
>
> - if (IS_GEN5(dev))
> - return;
> + for_each_engine(engine, to_i915(dev), i) {
> + ret = intel_engine_suspend(engine);
> + if (ret)
> + return ret;
> + }
>
> - I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_SWZCTL);
> - if (IS_GEN6(dev))
> - I915_WRITE(ARB_MODE, _MASKED_BIT_ENABLE(ARB_MODE_SWIZZLE_SNB));
> - else if (IS_GEN7(dev))
> - I915_WRITE(ARB_MODE, _MASKED_BIT_ENABLE(ARB_MODE_SWIZZLE_IVB));
> - else if (IS_GEN8(dev))
> - I915_WRITE(GAMTARBMODE, _MASKED_BIT_ENABLE(ARB_MODE_SWIZZLE_BDW));
> - else
> - BUG();
> + return 0;
> }
>
> static bool
> -intel_enable_blt(struct drm_device *dev)
> +intel_enable_blt(struct drm_i915_private *dev_priv)
> {
> - if (!HAS_BLT(dev))
> + if (!HAS_BLT(dev_priv))
> return false;
>
> /* The blitter was dysfunctional on early prototypes */
> - if (IS_GEN6(dev) && dev->pdev->revision < 8) {
> + if (IS_GEN6(dev_priv) && dev_priv->dev->pdev->revision < 8) {
> DRM_INFO("BLT not supported on this pre-production hardware;"
> " graphics performance will be degraded.\n");
> return false;
> @@ -4666,34 +4165,32 @@ intel_enable_blt(struct drm_device *dev)
> return true;
> }
>
> -static void init_unused_ring(struct drm_device *dev, u32 base)
> +static void stop_unused_ring(struct drm_i915_private *dev_priv, u32 base)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> -
> I915_WRITE(RING_CTL(base), 0);
> I915_WRITE(RING_HEAD(base), 0);
> I915_WRITE(RING_TAIL(base), 0);
> I915_WRITE(RING_START(base), 0);
> }
>
> -static void init_unused_rings(struct drm_device *dev)
> +static void stop_unused_rings(struct drm_i915_private *dev_priv)
> {
> - if (IS_I830(dev)) {
> - init_unused_ring(dev, PRB1_BASE);
> - init_unused_ring(dev, SRB0_BASE);
> - init_unused_ring(dev, SRB1_BASE);
> - init_unused_ring(dev, SRB2_BASE);
> - init_unused_ring(dev, SRB3_BASE);
> - } else if (IS_GEN2(dev)) {
> - init_unused_ring(dev, SRB0_BASE);
> - init_unused_ring(dev, SRB1_BASE);
> - } else if (IS_GEN3(dev)) {
> - init_unused_ring(dev, PRB1_BASE);
> - init_unused_ring(dev, PRB2_BASE);
> + if (IS_I830(dev_priv)) {
> + stop_unused_ring(dev_priv, PRB1_BASE);
> + stop_unused_ring(dev_priv, SRB0_BASE);
> + stop_unused_ring(dev_priv, SRB1_BASE);
> + stop_unused_ring(dev_priv, SRB2_BASE);
> + stop_unused_ring(dev_priv, SRB3_BASE);
> + } else if (IS_GEN2(dev_priv)) {
> + stop_unused_ring(dev_priv, SRB0_BASE);
> + stop_unused_ring(dev_priv, SRB1_BASE);
> + } else if (IS_GEN3(dev_priv)) {
> + stop_unused_ring(dev_priv, PRB1_BASE);
> + stop_unused_ring(dev_priv, PRB2_BASE);
> }
> }
>
> -int i915_gem_init_rings(struct drm_device *dev)
> +static int i915_gem_setup_engines(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> int ret;
> @@ -4704,61 +4201,116 @@ int i915_gem_init_rings(struct drm_device *dev)
> * will prevent c3 entry. Makes sure all unused rings
> * are totally idle.
> */
> - init_unused_rings(dev);
> + stop_unused_rings(dev_priv);
>
> - ret = intel_init_render_ring_buffer(dev);
> + ret = intel_init_render_engine(dev_priv);
> if (ret)
> - return ret;
> + goto cleanup;
>
> - if (HAS_BSD(dev)) {
> - ret = intel_init_bsd_ring_buffer(dev);
> + if (HAS_BSD(dev_priv)) {
> + ret = intel_init_bsd_engine(dev_priv);
> if (ret)
> - goto cleanup_render_ring;
> + goto cleanup;
> }
>
> - if (intel_enable_blt(dev)) {
> - ret = intel_init_blt_ring_buffer(dev);
> + if (intel_enable_blt(dev_priv)) {
> + ret = intel_init_blt_engine(dev_priv);
> if (ret)
> - goto cleanup_bsd_ring;
> + goto cleanup;
> }
>
> - if (HAS_VEBOX(dev)) {
> - ret = intel_init_vebox_ring_buffer(dev);
> + if (HAS_VEBOX(dev_priv)) {
> + ret = intel_init_vebox_engine(dev_priv);
> if (ret)
> - goto cleanup_blt_ring;
> + goto cleanup;
> }
>
> - if (HAS_BSD2(dev)) {
> - ret = intel_init_bsd2_ring_buffer(dev);
> + if (HAS_BSD2(dev_priv)) {
> + ret = intel_init_bsd2_engine(dev_priv);
> if (ret)
> - goto cleanup_vebox_ring;
> + goto cleanup;
> }
>
> - ret = i915_gem_set_seqno(dev, ((u32)~0 - 0x1000));
> + return 0;
> +
> +cleanup:
> + i915_gem_cleanup_engines(dev);
> + return ret;
> +}
> +
> +int
> +i915_gem_suspend(struct drm_device *dev)
> +{
> + struct drm_i915_private *dev_priv = dev->dev_private;
> + int ret = 0;
> +
> + mutex_lock(&dev->struct_mutex);
> + if (dev_priv->ums.mm_suspended)
> + goto err;
> +
> + ret = i915_gpu_idle(dev);
> if (ret)
> - goto cleanup_bsd2_ring;
> + goto err;
>
> - return 0;
> + i915_gem_retire_requests(dev);
> +
> + /* Under UMS, be paranoid and evict. */
> + if (!drm_core_check_feature(dev, DRIVER_MODESET))
> + i915_gem_evict_everything(dev);
> +
> + ret = i915_gem_suspend_engines(dev);
> + if (ret)
> + goto err;
> +
> + /* Hack! Don't let anybody do execbuf while we don't control the chip.
> + * We need to replace this with a semaphore, or something.
> + * And not confound ums.mm_suspended!
> + */
> + dev_priv->ums.mm_suspended = !drm_core_check_feature(dev,
> + DRIVER_MODESET);
> + mutex_unlock(&dev->struct_mutex);
> +
> + del_timer_sync(&dev_priv->gpu_error.hangcheck_timer);
> + cancel_delayed_work_sync(&dev_priv->mm.retire_work);
> + flush_delayed_work(&dev_priv->mm.idle_work);
>
> -cleanup_bsd2_ring:
> - intel_cleanup_ring_buffer(&dev_priv->ring[VCS2]);
> -cleanup_vebox_ring:
> - intel_cleanup_ring_buffer(&dev_priv->ring[VECS]);
> -cleanup_blt_ring:
> - intel_cleanup_ring_buffer(&dev_priv->ring[BCS]);
> -cleanup_bsd_ring:
> - intel_cleanup_ring_buffer(&dev_priv->ring[VCS]);
> -cleanup_render_ring:
> - intel_cleanup_ring_buffer(&dev_priv->ring[RCS]);
> + return 0;
>
> +err:
> + mutex_unlock(&dev->struct_mutex);
> return ret;
> }
>
> +void i915_gem_init_swizzling(struct drm_device *dev)
> +{
> + struct drm_i915_private *dev_priv = dev->dev_private;
> +
> + if (INTEL_INFO(dev)->gen < 5 ||
> + dev_priv->mm.bit_6_swizzle_x == I915_BIT_6_SWIZZLE_NONE)
> + return;
> +
> + I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
> + DISP_TILE_SURFACE_SWIZZLING);
> +
> + if (IS_GEN5(dev))
> + return;
> +
> + I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_SWZCTL);
> + if (IS_GEN6(dev))
> + I915_WRITE(ARB_MODE, _MASKED_BIT_ENABLE(ARB_MODE_SWIZZLE_SNB));
> + else if (IS_GEN7(dev))
> + I915_WRITE(ARB_MODE, _MASKED_BIT_ENABLE(ARB_MODE_SWIZZLE_IVB));
> + else if (IS_GEN8(dev))
> + I915_WRITE(GAMTARBMODE, _MASKED_BIT_ENABLE(ARB_MODE_SWIZZLE_BDW));
> + else
> + BUG();
> +}
> +
> int
> i915_gem_init_hw(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - int ret, i;
> + int ret;
>
> if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
> return -EIO;
> @@ -4784,33 +4336,11 @@ i915_gem_init_hw(struct drm_device *dev)
>
> i915_gem_init_swizzling(dev);
>
> - ret = dev_priv->gt.init_rings(dev);
> - if (ret)
> - return ret;
> -
> - for (i = 0; i < NUM_L3_SLICES(dev); i++)
> - i915_gem_l3_remap(&dev_priv->ring[RCS], i);
> -
> - /*
> - * XXX: Contexts should only be initialized once. Doing a switch to the
> - * default context switch however is something we'd like to do after
> - * reset or thaw (the latter may not actually be necessary for HW, but
> - * goes with our code better). Context switching requires rings (for
> - * the do_switch), but before enabling PPGTT. So don't move this.
> - */
> - ret = i915_gem_context_enable(dev_priv);
> - if (ret && ret != -EIO) {
> - DRM_ERROR("Context enable failed %d\n", ret);
> - i915_gem_cleanup_ringbuffer(dev);
> -
> - return ret;
> - }
> -
> ret = i915_ppgtt_init_hw(dev);
> - if (ret && ret != -EIO) {
> - DRM_ERROR("PPGTT enable failed %d\n", ret);
> - i915_gem_cleanup_ringbuffer(dev);
> - }
> + if (ret == 0)
> + ret = i915_gem_context_enable(dev_priv);
> + if (ret == 0)
> + ret = i915_gem_resume_engines(dev);
>
> return ret;
> }
> @@ -4820,9 +4350,6 @@ int i915_gem_init(struct drm_device *dev)
> struct drm_i915_private *dev_priv = dev->dev_private;
> int ret;
>
> - i915.enable_execlists = intel_sanitize_enable_execlists(dev,
> - i915.enable_execlists);
> -
> mutex_lock(&dev->struct_mutex);
>
> if (IS_VALLEYVIEW(dev)) {
> @@ -4833,18 +4360,6 @@ int i915_gem_init(struct drm_device *dev)
> DRM_DEBUG_DRIVER("allow wake ack timed out\n");
> }
>
> - if (!i915.enable_execlists) {
> - dev_priv->gt.do_execbuf = i915_gem_ringbuffer_submission;
> - dev_priv->gt.init_rings = i915_gem_init_rings;
> - dev_priv->gt.cleanup_ring = intel_cleanup_ring_buffer;
> - dev_priv->gt.stop_ring = intel_stop_ring_buffer;
> - } else {
> - dev_priv->gt.do_execbuf = intel_execlists_submission;
> - dev_priv->gt.init_rings = intel_logical_rings_init;
> - dev_priv->gt.cleanup_ring = intel_logical_ring_cleanup;
> - dev_priv->gt.stop_ring = intel_logical_ring_stop;
> - }
> -
> ret = i915_gem_init_userptr(dev);
> if (ret) {
> mutex_unlock(&dev->struct_mutex);
> @@ -4853,13 +4368,12 @@ int i915_gem_init(struct drm_device *dev)
>
> i915_gem_init_global_gtt(dev);
>
> - ret = i915_gem_context_init(dev);
> - if (ret) {
> - mutex_unlock(&dev->struct_mutex);
> - return ret;
> - }
> -
> - ret = i915_gem_init_hw(dev);
> + gen6_gt_force_wake_get(dev_priv, FORCEWAKE_ALL);
> + ret = i915_gem_setup_engines(dev);
> + if (ret == 0)
> + ret = i915_gem_context_init(dev);
> + if (ret == 0)
> + ret = i915_gem_init_hw(dev);
> if (ret == -EIO) {
> /* Allow ring initialisation to fail by marking the GPU as
> * wedged. But we only want to do this where the GPU is angry,
> @@ -4869,20 +4383,16 @@ int i915_gem_init(struct drm_device *dev)
> atomic_set_mask(I915_WEDGED, &dev_priv->gpu_error.reset_counter);
> ret = 0;
> }
> + gen6_gt_force_wake_put(dev_priv, FORCEWAKE_ALL);
> mutex_unlock(&dev->struct_mutex);
>
> return ret;
> }
>
> -void
> -i915_gem_cleanup_ringbuffer(struct drm_device *dev)
> +void i915_gem_fini(struct drm_device *dev)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> - int i;
> -
> - for_each_ring(ring, dev_priv, i)
> - dev_priv->gt.cleanup_ring(ring);
> + i915_gem_context_fini(dev);
> + i915_gem_cleanup_engines(dev);
> }
>
> int
> @@ -4901,26 +4411,12 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
> }
>
> mutex_lock(&dev->struct_mutex);
> - dev_priv->ums.mm_suspended = 0;
> -
> ret = i915_gem_init_hw(dev);
> - if (ret != 0) {
> - mutex_unlock(&dev->struct_mutex);
> - return ret;
> - }
> -
> + if (ret == 0)
> + ret = drm_irq_install(dev, dev->pdev->irq);
> + if (ret == 0)
> + dev_priv->ums.mm_suspended = 0;
> BUG_ON(!list_empty(&dev_priv->gtt.base.active_list));
> -
> - ret = drm_irq_install(dev, dev->pdev->irq);
> - if (ret)
> - goto cleanup_ringbuffer;
> - mutex_unlock(&dev->struct_mutex);
> -
> - return 0;
> -
> -cleanup_ringbuffer:
> - i915_gem_cleanup_ringbuffer(dev);
> - dev_priv->ums.mm_suspended = 1;
> mutex_unlock(&dev->struct_mutex);
>
> return ret;
> @@ -4954,10 +4450,13 @@ i915_gem_lastclose(struct drm_device *dev)
> }
>
> static void
> -init_ring_lists(struct intel_engine_cs *ring)
> +init_null_engine(struct intel_engine_cs *engine)
> {
> - INIT_LIST_HEAD(&ring->active_list);
> - INIT_LIST_HEAD(&ring->request_list);
> + INIT_LIST_HEAD(&engine->read_list);
> + INIT_LIST_HEAD(&engine->write_list);
> + INIT_LIST_HEAD(&engine->fence_list);
> + INIT_LIST_HEAD(&engine->requests);
> + INIT_LIST_HEAD(&engine->rings);
> }
>
> void i915_init_vm(struct drm_i915_private *dev_priv,
> @@ -4991,8 +4490,8 @@ i915_gem_load(struct drm_device *dev)
> INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
> INIT_LIST_HEAD(&dev_priv->mm.bound_list);
> INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> - for (i = 0; i < I915_NUM_RINGS; i++)
> - init_ring_lists(&dev_priv->ring[i]);
> + for (i = 0; i < I915_NUM_ENGINES; i++)
> + init_null_engine(&dev_priv->engine[i]);
> for (i = 0; i < I915_MAX_NUM_FENCES; i++)
> INIT_LIST_HEAD(&dev_priv->fence_regs[i].lru_list);
> INIT_DELAYED_WORK(&dev_priv->mm.retire_work,
> @@ -5052,13 +4551,13 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
> */
> spin_lock(&file_priv->mm.lock);
> while (!list_empty(&file_priv->mm.request_list)) {
> - struct drm_i915_gem_request *request;
> + struct i915_gem_request *rq;
>
> - request = list_first_entry(&file_priv->mm.request_list,
> - struct drm_i915_gem_request,
> - client_list);
> - list_del(&request->client_list);
> - request->file_priv = NULL;
> + rq = list_first_entry(&file_priv->mm.request_list,
> + struct i915_gem_request,
> + client_list);
> + list_del(&rq->client_list);
> + rq->file_priv = NULL;
> }
> spin_unlock(&file_priv->mm.lock);
> }
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 958d2cfad61a..c9b2a12be660 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -96,9 +96,9 @@
> #define GEN6_CONTEXT_ALIGN (64<<10)
> #define GEN7_CONTEXT_ALIGN 4096
>
> -static size_t get_context_alignment(struct drm_device *dev)
> +static size_t get_context_alignment(struct drm_i915_private *i915)
> {
> - if (IS_GEN6(dev))
> + if (IS_GEN6(i915))
> return GEN6_CONTEXT_ALIGN;
>
> return GEN7_CONTEXT_ALIGN;
> @@ -111,6 +111,9 @@ static int get_context_size(struct drm_device *dev)
> u32 reg;
>
> switch (INTEL_INFO(dev)->gen) {
> + case 5:
> + ret = ILK_CXT_TOTAL_SIZE;
> + break;
> case 6:
> reg = I915_READ(CXT_SIZE);
> ret = GEN6_CXT_TOTAL_SIZE(reg) * 64;
> @@ -134,16 +137,22 @@ static int get_context_size(struct drm_device *dev)
>
> void i915_gem_context_free(struct kref *ctx_ref)
> {
> - struct intel_context *ctx = container_of(ctx_ref,
> - typeof(*ctx), ref);
> -
> - if (i915.enable_execlists)
> - intel_lr_context_free(ctx);
> + struct intel_context *ctx =
> + container_of(ctx_ref, typeof(*ctx), ref);
> + struct drm_i915_private *dev_priv = ctx->i915;
> + int i;
>
> i915_ppgtt_put(ctx->ppgtt);
>
> - if (ctx->legacy_hw_ctx.rcs_state)
> - drm_gem_object_unreference(&ctx->legacy_hw_ctx.rcs_state->base);
> + for (i = 0; i < I915_NUM_ENGINES; i++) {
> + if (intel_engine_initialized(&dev_priv->engine[i]) &&
> + ctx->ring[i].ring != NULL)
> + dev_priv->engine[i].put_ring(ctx->ring[i].ring, ctx);
> +
> + if (ctx->ring[i].state != NULL)
> + drm_gem_object_unreference(&ctx->ring[i].state->base);
> + }
> +
> list_del(&ctx->link);
> kfree(ctx);
> }
> @@ -192,15 +201,16 @@ __create_hw_context(struct drm_device *dev,
>
> kref_init(&ctx->ref);
> list_add_tail(&ctx->link, &dev_priv->context_list);
> + ctx->i915 = dev_priv;
>
> if (dev_priv->hw_context_size) {
> struct drm_i915_gem_object *obj =
> - i915_gem_alloc_context_obj(dev, dev_priv->hw_context_size);
> + i915_gem_alloc_context_obj(dev, dev_priv->hw_context_size);
> if (IS_ERR(obj)) {
> ret = PTR_ERR(obj);
> goto err_out;
> }
> - ctx->legacy_hw_ctx.rcs_state = obj;
> + ctx->ring[RCS].state = obj;
> }
>
> /* Default context will never have a file_priv */
> @@ -228,18 +238,11 @@ err_out:
> return ERR_PTR(ret);
> }
>
> -/**
> - * The default context needs to exist per ring that uses contexts. It stores the
> - * context state of the GPU for applications that don't utilize HW contexts, as
> - * well as an idle case.
> - */
> static struct intel_context *
> i915_gem_create_context(struct drm_device *dev,
> struct drm_i915_file_private *file_priv)
> {
> - const bool is_global_default_ctx = file_priv == NULL;
> struct intel_context *ctx;
> - int ret = 0;
>
> BUG_ON(!mutex_is_locked(&dev->struct_mutex));
>
> @@ -247,82 +250,29 @@ i915_gem_create_context(struct drm_device *dev,
> if (IS_ERR(ctx))
> return ctx;
>
> - if (is_global_default_ctx && ctx->legacy_hw_ctx.rcs_state) {
> - /* We may need to do things with the shrinker which
> - * require us to immediately switch back to the default
> - * context. This can cause a problem as pinning the
> - * default context also requires GTT space which may not
> - * be available. To avoid this we always pin the default
> - * context.
> - */
> - ret = i915_gem_obj_ggtt_pin(ctx->legacy_hw_ctx.rcs_state,
> - get_context_alignment(dev), 0);
> - if (ret) {
> - DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
> - goto err_destroy;
> - }
> - }
> -
> if (USES_FULL_PPGTT(dev)) {
> struct i915_hw_ppgtt *ppgtt = i915_ppgtt_create(dev, file_priv);
>
> if (IS_ERR_OR_NULL(ppgtt)) {
> DRM_DEBUG_DRIVER("PPGTT setup failed (%ld)\n",
> PTR_ERR(ppgtt));
> - ret = PTR_ERR(ppgtt);
> - goto err_unpin;
> + i915_gem_context_unreference(ctx);
> + return ERR_CAST(ppgtt);
> }
>
> ctx->ppgtt = ppgtt;
> }
>
> return ctx;
> -
> -err_unpin:
> - if (is_global_default_ctx && ctx->legacy_hw_ctx.rcs_state)
> - i915_gem_object_ggtt_unpin(ctx->legacy_hw_ctx.rcs_state);
> -err_destroy:
> - i915_gem_context_unreference(ctx);
> - return ERR_PTR(ret);
> -}
> -
> -void i915_gem_context_reset(struct drm_device *dev)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - int i;
> -
> - /* In execlists mode we will unreference the context when the execlist
> - * queue is cleared and the requests destroyed.
> - */
> - if (i915.enable_execlists)
> - return;
> -
> - for (i = 0; i < I915_NUM_RINGS; i++) {
> - struct intel_engine_cs *ring = &dev_priv->ring[i];
> - struct intel_context *lctx = ring->last_context;
> -
> - if (lctx) {
> - if (lctx->legacy_hw_ctx.rcs_state && i == RCS)
> - i915_gem_object_ggtt_unpin(lctx->legacy_hw_ctx.rcs_state);
> -
> - i915_gem_context_unreference(lctx);
> - ring->last_context = NULL;
> - }
> - }
> }
>
> int i915_gem_context_init(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct intel_context *ctx;
> - int i;
> -
> - /* Init should only be called once per module load. Eventually the
> - * restriction on the context_disabled check can be loosened. */
> - if (WARN_ON(dev_priv->ring[RCS].default_context))
> - return 0;
> + int i, ret;
>
> - if (i915.enable_execlists) {
> + if (RCS_ENGINE(dev_priv)->execlists_enabled) {
> /* NB: intentionally left blank. We will allocate our own
> * backing objects as we need them, thank you very much */
> dev_priv->hw_context_size = 0;
> @@ -335,83 +285,112 @@ int i915_gem_context_init(struct drm_device *dev)
> }
> }
>
> - ctx = i915_gem_create_context(dev, NULL);
> +	/*
> + * The default context needs to exist per ring that uses contexts.
> + * It stores the context state of the GPU for applications that don't
> + * utilize HW contexts or per-process VM, as well as an idle case.
> + */
> + ctx = __create_hw_context(dev, NULL);
> if (IS_ERR(ctx)) {
> DRM_ERROR("Failed to create default global context (error %ld)\n",
> PTR_ERR(ctx));
> return PTR_ERR(ctx);
> }
>
> - for (i = 0; i < I915_NUM_RINGS; i++) {
> - struct intel_engine_cs *ring = &dev_priv->ring[i];
> + if (dev_priv->hw_context_size) {
> + /* We may need to do things with the shrinker which
> + * require us to immediately switch back to the default
> + * context. This can cause a problem as pinning the
> + * default context also requires GTT space which may not
> + * be available. To avoid this we always pin the default
> + * context.
> + */
> + ret = i915_gem_obj_ggtt_pin(ctx->ring[RCS].state,
> + get_context_alignment(dev_priv), 0);
> + if (ret) {
> + DRM_ERROR("Failed to pin global default context\n");
> + i915_gem_context_unreference(ctx);
> + return ret;
> + }
> + }
>
> - /* NB: RCS will hold a ref for all rings */
> - ring->default_context = ctx;
> + for (i = 0; i < I915_NUM_ENGINES; i++) {
> + struct intel_engine_cs *engine = &dev_priv->engine[i];
> +
> + if (engine->i915 == NULL)
> + continue;
> +
> + engine->default_context = ctx;
> + i915_gem_context_reference(ctx);
> }
>
> + dev_priv->default_context = ctx;
> +
> DRM_DEBUG_DRIVER("%s context support initialized\n",
> - i915.enable_execlists ? "LR" :
> - dev_priv->hw_context_size ? "HW" : "fake");
> + RCS_ENGINE(dev_priv)->execlists_enabled ? "LR" :
> + dev_priv->hw_context_size ? "HW" : "fake");
> return 0;
> }
>
> void i915_gem_context_fini(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_context *dctx = dev_priv->ring[RCS].default_context;
> + struct intel_engine_cs *engine;
> int i;
>
> - if (dctx->legacy_hw_ctx.rcs_state) {
> + if (dev_priv->hw_context_size)
> /* The only known way to stop the gpu from accessing the hw context is
> * to reset it. Do this as the very last operation to avoid confusing
> * other code, leading to spurious errors. */
> intel_gpu_reset(dev);
>
> - /* When default context is created and switched to, base object refcount
> - * will be 2 (+1 from object creation and +1 from do_switch()).
> - * i915_gem_context_fini() will be called after gpu_idle() has switched
> - * to default context. So we need to unreference the base object once
> - * to offset the do_switch part, so that i915_gem_context_unreference()
> - * can then free the base object correctly. */
> - WARN_ON(!dev_priv->ring[RCS].last_context);
> - if (dev_priv->ring[RCS].last_context == dctx) {
> - /* Fake switch to NULL context */
> - WARN_ON(dctx->legacy_hw_ctx.rcs_state->active);
> - i915_gem_object_ggtt_unpin(dctx->legacy_hw_ctx.rcs_state);
> - i915_gem_context_unreference(dctx);
> - dev_priv->ring[RCS].last_context = NULL;
> - }
> -
> - i915_gem_object_ggtt_unpin(dctx->legacy_hw_ctx.rcs_state);
> + for_each_engine(engine, dev_priv, i) {
> + i915_gem_context_unreference(engine->default_context);
> + engine->default_context = NULL;
> }
>
> - for (i = 0; i < I915_NUM_RINGS; i++) {
> - struct intel_engine_cs *ring = &dev_priv->ring[i];
> -
> - if (ring->last_context)
> - i915_gem_context_unreference(ring->last_context);
> -
> - ring->default_context = NULL;
> - ring->last_context = NULL;
> + if (dev_priv->default_context) {
> + if (dev_priv->hw_context_size)
> + i915_gem_object_ggtt_unpin(dev_priv->default_context->ring[RCS].state);
> + i915_gem_context_unreference(dev_priv->default_context);
> + dev_priv->default_context = NULL;
> }
> -
> - i915_gem_context_unreference(dctx);
> }
>
> int i915_gem_context_enable(struct drm_i915_private *dev_priv)
> {
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> int ret, i;
>
> - BUG_ON(!dev_priv->ring[RCS].default_context);
> + for_each_engine(engine, dev_priv, i) {
> + struct intel_context *ctx = engine->default_context;
> + struct i915_gem_request *rq;
>
> - if (i915.enable_execlists)
> - return 0;
> + if (HAS_L3_DPF(dev_priv))
> + ctx->remap_slice = (1 << NUM_L3_SLICES(dev_priv)) - 1;
>
> - for_each_ring(ring, dev_priv, i) {
> - ret = i915_switch_context(ring, ring->default_context);
> - if (ret)
> + rq = intel_engine_alloc_request(engine, ctx);
> + if (IS_ERR(rq)) {
> + ret = PTR_ERR(rq);
> + goto err;
> + }
> +
> + ret = 0;
> + /*
> +		 * Workarounds applied in this fn are part of the register state
> +		 * context; they need to be re-applied following a GPU reset,
> +		 * suspend/resume or module reload.
> + */
> + if (engine->init_context)
> + ret = engine->init_context(rq);
> + if (ret == 0)
> + ret = i915_request_commit(rq);
> + i915_request_put(rq);
> + if (ret) {
> +err:
> +			DRM_ERROR("failed to enable contexts (%s): %d\n", engine->name, ret);
> return ret;
> + }
> }
>
> return 0;
> @@ -421,7 +400,9 @@ static int context_idr_cleanup(int id, void *p, void *data)
> {
> struct intel_context *ctx = p;
>
> + ctx->file_priv = NULL;
> i915_gem_context_unreference(ctx);
> +
> return 0;
> }
>
> @@ -465,41 +446,48 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
> }
>
> static inline int
> -mi_set_context(struct intel_engine_cs *ring,
> - struct intel_context *new_context,
> - u32 hw_flags)
> +mi_set_context(struct i915_gem_request *rq,
> + struct intel_engine_context *new_context,
> + u32 flags)
> {
> - u32 flags = hw_flags | MI_MM_SPACE_GTT;
> - int ret;
> + struct intel_ringbuffer *ring;
> + int len;
>
> /* w/a: If Flush TLB Invalidation Mode is enabled, driver must do a TLB
> * invalidation prior to MI_SET_CONTEXT. On GEN6 we don't set the value
> - * explicitly, so we rely on the value at ring init, stored in
> + * explicitly, so we rely on the value at engine init, stored in
> * itlb_before_ctx_switch.
> */
> - if (IS_GEN6(ring->dev)) {
> - ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, 0);
> - if (ret)
> - return ret;
> - }
> + if (IS_GEN6(rq->i915))
> + rq->pending_flush |= I915_INVALIDATE_CACHES;
>
> - /* These flags are for resource streamer on HSW+ */
> - if (!IS_HASWELL(ring->dev) && INTEL_INFO(ring->dev)->gen < 8)
> - flags |= (MI_SAVE_EXT_STATE_EN | MI_RESTORE_EXT_STATE_EN);
> + len = 3;
> + switch (INTEL_INFO(rq->i915)->gen) {
> + case 8:
> + case 7:
> + case 5: len += 2;
> + break;
> + }
>
> - ret = intel_ring_begin(ring, 6);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, len);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> - /* WaProgramMiArbOnOffAroundMiSetContext:ivb,vlv,hsw,bdw,chv */
> - if (INTEL_INFO(ring->dev)->gen >= 7)
> + switch (INTEL_INFO(rq->i915)->gen) {
> + case 8:
> + case 7:
> + /* WaProgramMiArbOnOffAroundMiSetContext:ivb,vlv,hsw,bdw,chv */
> intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_DISABLE);
> - else
> - intel_ring_emit(ring, MI_NOOP);
> + break;
> + case 5:
> + intel_ring_emit(ring, MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN);
> + break;
> + }
>
> - intel_ring_emit(ring, MI_NOOP);
> intel_ring_emit(ring, MI_SET_CONTEXT);
> - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(new_context->legacy_hw_ctx.rcs_state) |
> + intel_ring_emit(ring,
> + i915_gem_obj_ggtt_offset(new_context->state) |
> + MI_MM_SPACE_GTT |
> flags);
> /*
> * w/a: MI_SET_CONTEXT must always be followed by MI_NOOP
> @@ -507,60 +495,106 @@ mi_set_context(struct intel_engine_cs *ring,
> */
> intel_ring_emit(ring, MI_NOOP);
>
> - if (INTEL_INFO(ring->dev)->gen >= 7)
> + switch (INTEL_INFO(rq->i915)->gen) {
> + case 8:
> + case 7:
> intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
> - else
> - intel_ring_emit(ring, MI_NOOP);
> + break;
> + case 5:
> + intel_ring_emit(ring, MI_SUSPEND_FLUSH);
> + break;
> + }
>
> intel_ring_advance(ring);
>
> - return ret;
> + rq->pending_flush &= ~I915_COMMAND_BARRIER;
> + return 0;
> }
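
For reviewers counting dwords, my reading of the len accounting above is:

/* Dword budget for mi_set_context() as emitted above (sketch, my reading):
 *
 *   gen6:      MI_SET_CONTEXT + address|flags + MI_NOOP                 = 3
 *   gen7/8:    the above, bracketed by MI_ARB_ON_OFF disable/enable     = 5
 *   gen5:      the above, bracketed by MI_SUSPEND_FLUSH enable/disable  = 5
 *
 * hence len = 3, plus 2 for gens 5, 7 and 8.
 */
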
>
> -static int do_switch(struct intel_engine_cs *ring,
> - struct intel_context *to)
> +static int l3_remap(struct i915_gem_request *rq, int slice)
> {
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> - struct intel_context *from = ring->last_context;
> + const u32 reg_base = GEN7_L3LOG_BASE + (slice * 0x200);
> + const u32 *remap_info;
> + struct intel_ringbuffer *ring;
> + int i;
> +
> + remap_info = rq->i915->l3_parity.remap_info[slice];
> + if (remap_info == NULL)
> + return 0;
> +
> + ring = intel_ring_begin(rq, GEN7_L3LOG_SIZE / 4 * 3);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
> +
> + /*
> + * Note: We do not worry about the concurrent register cacheline hang
> + * here because no other code should access these registers other than
> + * at initialization time.
> + */
> + for (i = 0; i < GEN7_L3LOG_SIZE; i += 4) {
> + intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> + intel_ring_emit(ring, reg_base + i);
> + intel_ring_emit(ring, remap_info[i/4]);
> + }
> +
> + intel_ring_advance(ring);
> + return 0;
> +}
> +
> +/**
> + * i915_request_switch_context() - perform a GPU context switch.
> + * @rq: request and ring/ctx for which we'll execute the context switch
> + *
> + * The context life cycle is simple. The context refcount is incremented on
> + * create and decremented on destroy. If the context is in use by the GPU,
> + * it will have a refcount > 1. This allows us to destroy the context abstract
> + * object while letting the normal object tracking destroy the backing BO.
> + */
> +int i915_request_switch_context(struct i915_gem_request *rq)
> +{
> + struct intel_context *to = rq->ctx;
> + struct intel_engine_context *ctx = &to->ring[rq->engine->id];
> + struct intel_context *from;
> u32 hw_flags = 0;
> - bool uninitialized = false;
> int ret, i;
>
> - if (from != NULL && ring == &dev_priv->ring[RCS]) {
> - BUG_ON(from->legacy_hw_ctx.rcs_state == NULL);
> - BUG_ON(!i915_gem_obj_is_pinned(from->legacy_hw_ctx.rcs_state));
> - }
> + lockdep_assert_held(&rq->i915->dev->struct_mutex);
> +
> + if (ctx->state == NULL)
> + return 0;
>
> - if (from == to && !to->remap_slice)
> + if (rq->ring->last_context == to && !to->remap_slice)
> return 0;
>
> /* Trying to pin first makes error handling easier. */
> - if (ring == &dev_priv->ring[RCS]) {
> - ret = i915_gem_obj_ggtt_pin(to->legacy_hw_ctx.rcs_state,
> - get_context_alignment(ring->dev), 0);
> - if (ret)
> - return ret;
> - }
> + ret = i915_gem_obj_ggtt_pin(ctx->state,
> + get_context_alignment(rq->i915), 0);
> + if (ret)
> + return ret;
>
> /*
> * Pin can switch back to the default context if we end up calling into
> * evict_everything - as a last ditch gtt defrag effort that also
> * switches to the default context. Hence we need to reload from here.
> */
> - from = ring->last_context;
> + from = rq->ring->last_context;
> +
> + /* With execlists enabled, the ring, vm and logical state are
> + * interwined and we do not need to explicitly load the mm or
> + * logical state as it is loaded along with the LRCA.
> + *
> + * But we still want to pin the state (for global usage tracking)
> + * whilst in use and reload the l3 mapping if it has changed.
> + */
> + if (rq->engine->execlists_enabled)
> + goto load_l3_map;
>
> if (to->ppgtt) {
> - ret = to->ppgtt->switch_mm(to->ppgtt, ring);
> + ret = to->ppgtt->switch_mm(rq, to->ppgtt);
> if (ret)
> goto unpin_out;
> }
>
> - if (ring != &dev_priv->ring[RCS]) {
> - if (from)
> - i915_gem_context_unreference(from);
> - goto done;
> - }
> -
> /*
> * Clear this page out of any CPU caches for coherent swap-in/out. Note
> * that thanks to write = false in this call and us not setting any gpu
> @@ -569,33 +603,39 @@ static int do_switch(struct intel_engine_cs *ring,
> *
> * XXX: We need a real interface to do this instead of trickery.
> */
> - ret = i915_gem_object_set_to_gtt_domain(to->legacy_hw_ctx.rcs_state, false);
> + ret = i915_gem_object_set_to_gtt_domain(ctx->state, false);
> if (ret)
> goto unpin_out;
>
> - if (!to->legacy_hw_ctx.rcs_state->has_global_gtt_mapping) {
> - struct i915_vma *vma = i915_gem_obj_to_vma(to->legacy_hw_ctx.rcs_state,
> - &dev_priv->gtt.base);
> - vma->bind_vma(vma, to->legacy_hw_ctx.rcs_state->cache_level, GLOBAL_BIND);
> + if (!ctx->state->has_global_gtt_mapping) {
> + struct i915_vma *vma = i915_gem_obj_to_vma(ctx->state,
> + &rq->i915->gtt.base);
> + vma->bind_vma(vma, ctx->state->cache_level, GLOBAL_BIND);
> }
>
> - if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to))
> + if (!ctx->initialized || i915_gem_context_is_default(to))
> hw_flags |= MI_RESTORE_INHIBIT;
>
> - ret = mi_set_context(ring, to, hw_flags);
> + /* These flags are for resource streamer on HSW+ */
> + if (!IS_HASWELL(rq->i915) && INTEL_INFO(rq->i915)->gen < 8) {
> + if (ctx->initialized)
> + hw_flags |= MI_RESTORE_EXT_STATE_EN;
> + hw_flags |= MI_SAVE_EXT_STATE_EN;
> + }
> +
> + trace_i915_gem_ring_switch_context(rq->engine, to, hw_flags);
> + ret = mi_set_context(rq, ctx, hw_flags);
> if (ret)
> goto unpin_out;
>
> +load_l3_map:
> for (i = 0; i < MAX_L3_SLICES; i++) {
> if (!(to->remap_slice & (1<<i)))
> continue;
>
> - ret = i915_gem_l3_remap(ring, i);
> /* If it failed, try again next round */
> - if (ret)
> - DRM_DEBUG_DRIVER("L3 remapping failed\n");
> - else
> - to->remap_slice &= ~(1<<i);
> + if (l3_remap(rq, i) == 0)
> + rq->remap_l3 |= 1 << i;
> }
>
> /* The backing object for the context is done after switching to the
> @@ -605,8 +645,16 @@ static int do_switch(struct intel_engine_cs *ring,
> * MI_SET_CONTEXT instead of when the next seqno has completed.
> */
> if (from != NULL) {
> - from->legacy_hw_ctx.rcs_state->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> - i915_vma_move_to_active(i915_gem_obj_to_ggtt(from->legacy_hw_ctx.rcs_state), ring);
> + struct drm_i915_gem_object *from_obj = from->ring[rq->engine->id].state;
> +
> + from_obj->base.pending_read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> + /* obj is kept alive until the next request by its active ref */
> + ret = i915_request_add_vma(rq,
> + i915_gem_obj_to_ggtt(from_obj),
> + 0);
> + if (ret)
> + goto unpin_out;
> +
> /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
> * whole damn pipeline, we don't need to explicitly mark the
> * object dirty. The only exception is that the context must be
> @@ -614,79 +662,61 @@ static int do_switch(struct intel_engine_cs *ring,
> * able to defer doing this until we know the object would be
> * swapped, but there is no way to do that yet.
> */
> - from->legacy_hw_ctx.rcs_state->dirty = 1;
> - BUG_ON(from->legacy_hw_ctx.rcs_state->ring != ring);
> -
> - /* obj is kept alive until the next request by its active ref */
> - i915_gem_object_ggtt_unpin(from->legacy_hw_ctx.rcs_state);
> - i915_gem_context_unreference(from);
> - }
> -
> - uninitialized = !to->legacy_hw_ctx.initialized && from == NULL;
> - to->legacy_hw_ctx.initialized = true;
> -
> -done:
> - i915_gem_context_reference(to);
> - ring->last_context = to;
> -
> - if (uninitialized) {
> - if (ring->init_context) {
> - ret = ring->init_context(ring);
> - if (ret)
> - DRM_ERROR("ring init context: %d\n", ret);
> - }
> -
> - ret = i915_gem_render_state_init(ring);
> - if (ret)
> - DRM_ERROR("init render state: %d\n", ret);
> + from_obj->dirty = 1;
> }
>
> + rq->has_ctx_switch = true;
> return 0;
>
> unpin_out:
> - if (ring->id == RCS)
> - i915_gem_object_ggtt_unpin(to->legacy_hw_ctx.rcs_state);
> + i915_gem_object_ggtt_unpin(ctx->state);
> return ret;
> }
>
> /**
> - * i915_switch_context() - perform a GPU context switch.
> - * @ring: ring for which we'll execute the context switch
> - * @to: the context to switch to
> - *
> - * The context life cycle is simple. The context refcount is incremented and
> - * decremented by 1 and create and destroy. If the context is in use by the GPU,
> - * it will have a refcount > 1. This allows us to destroy the context abstract
> - * object while letting the normal object tracking destroy the backing BO.
> - *
> - * This function should not be used in execlists mode. Instead the context is
> - * switched by writing to the ELSP and requests keep a reference to their
> - * context.
> + * i915_request_switch_context__commit() - commit the context switch
> + * @rq: request for which we have executed the context switch
> */
> -int i915_switch_context(struct intel_engine_cs *ring,
> - struct intel_context *to)
> +void i915_request_switch_context__commit(struct i915_gem_request *rq)
> {
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> + struct intel_context *ctx;
>
> - WARN_ON(i915.enable_execlists);
> - WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
> + lockdep_assert_held(&rq->i915->dev->struct_mutex);
>
> - if (to->legacy_hw_ctx.rcs_state == NULL) { /* We have the fake context */
> - if (to != ring->last_context) {
> - i915_gem_context_reference(to);
> - if (ring->last_context)
> - i915_gem_context_unreference(ring->last_context);
> - ring->last_context = to;
> - }
> - return 0;
> - }
> + if (!rq->has_ctx_switch)
> + return;
> +
> + ctx = rq->ring->last_context;
> + if (ctx)
> + i915_gem_object_ggtt_unpin(ctx->ring[rq->engine->id].state);
>
> - return do_switch(ring, to);
> + ctx = rq->ctx;
> + ctx->remap_slice &= ~rq->remap_l3;
> + ctx->ring[rq->engine->id].initialized = true;
> +
> + rq->has_ctx_switch = false;
> +}
> +
> +/**
> + * i915_request_switch_context__undo() - unwind the context switch
> + * @rq: request for which we have executed the context switch
> + */
> +void i915_request_switch_context__undo(struct i915_gem_request *rq)
> +{
> + lockdep_assert_held(&rq->i915->dev->struct_mutex);
> +
> + if (!rq->has_ctx_switch)
> + return;
> +
> + i915_gem_object_ggtt_unpin(rq->ctx->ring[rq->engine->id].state);
> }
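
Taken together, the switch/commit/undo trio appears intended to be used roughly
like this; a sketch of my understanding only, where emit_commands() is a
hypothetical stand-in for whatever the caller actually adds to the request:

/* Sketch only: presumed calling sequence, not code from this patch. */
static int emit_commands(struct i915_gem_request *rq); /* hypothetical */

static int submit_with_context(struct i915_gem_request *rq)
{
	int ret;

	ret = i915_request_switch_context(rq);	/* pins ctx->ring[engine].state */
	if (ret)
		return ret;

	ret = emit_commands(rq);
	if (ret) {
		/* nothing committed: drop the pin we just took */
		i915_request_switch_context__undo(rq);
		return ret;
	}

	/* unpin the previous context and mark this one initialized */
	i915_request_switch_context__commit(rq);
	return 0;
}
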
>
> -static bool contexts_enabled(struct drm_device *dev)
> +static bool contexts_enabled(struct drm_i915_private *dev_priv)
> {
> - return i915.enable_execlists || to_i915(dev)->hw_context_size;
> + if (RCS_ENGINE(dev_priv)->execlists_enabled)
> + return true;
> +
> + return dev_priv->hw_context_size;
> }
>
> int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> @@ -697,7 +727,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> struct intel_context *ctx;
> int ret;
>
> - if (!contexts_enabled(dev))
> + if (!contexts_enabled(to_i915(dev)))
> return -ENODEV;
>
> ret = i915_mutex_lock_interruptible(dev);
> diff --git a/drivers/gpu/drm/i915/i915_gem_debug.c b/drivers/gpu/drm/i915/i915_gem_debug.c
> deleted file mode 100644
> index f462d1b51d97..000000000000
> --- a/drivers/gpu/drm/i915/i915_gem_debug.c
> +++ /dev/null
> @@ -1,118 +0,0 @@
> -/*
> - * Copyright © 2008 Intel Corporation
> - *
> - * Permission is hereby granted, free of charge, to any person obtaining a
> - * copy of this software and associated documentation files (the "Software"),
> - * to deal in the Software without restriction, including without limitation
> - * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> - * and/or sell copies of the Software, and to permit persons to whom the
> - * Software is furnished to do so, subject to the following conditions:
> - *
> - * The above copyright notice and this permission notice (including the next
> - * paragraph) shall be included in all copies or substantial portions of the
> - * Software.
> - *
> - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> - * IN THE SOFTWARE.
> - *
> - * Authors:
> - * Keith Packard <keithp at keithp.com>
> - *
> - */
> -
> -#include <drm/drmP.h>
> -#include <drm/i915_drm.h>
> -#include "i915_drv.h"
> -
> -#if WATCH_LISTS
> -int
> -i915_verify_lists(struct drm_device *dev)
> -{
> - static int warned;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct drm_i915_gem_object *obj;
> - int err = 0;
> -
> - if (warned)
> - return 0;
> -
> - list_for_each_entry(obj, &dev_priv->render_ring.active_list, list) {
> - if (obj->base.dev != dev ||
> - !atomic_read(&obj->base.refcount.refcount)) {
> - DRM_ERROR("freed render active %p\n", obj);
> - err++;
> - break;
> - } else if (!obj->active ||
> - (obj->base.read_domains & I915_GEM_GPU_DOMAINS) == 0) {
> - DRM_ERROR("invalid render active %p (a %d r %x)\n",
> - obj,
> - obj->active,
> - obj->base.read_domains);
> - err++;
> - } else if (obj->base.write_domain && list_empty(&obj->gpu_write_list)) {
> - DRM_ERROR("invalid render active %p (w %x, gwl %d)\n",
> - obj,
> - obj->base.write_domain,
> - !list_empty(&obj->gpu_write_list));
> - err++;
> - }
> - }
> -
> - list_for_each_entry(obj, &dev_priv->mm.flushing_list, list) {
> - if (obj->base.dev != dev ||
> - !atomic_read(&obj->base.refcount.refcount)) {
> - DRM_ERROR("freed flushing %p\n", obj);
> - err++;
> - break;
> - } else if (!obj->active ||
> - (obj->base.write_domain & I915_GEM_GPU_DOMAINS) == 0 ||
> - list_empty(&obj->gpu_write_list)) {
> - DRM_ERROR("invalid flushing %p (a %d w %x gwl %d)\n",
> - obj,
> - obj->active,
> - obj->base.write_domain,
> - !list_empty(&obj->gpu_write_list));
> - err++;
> - }
> - }
> -
> - list_for_each_entry(obj, &dev_priv->mm.gpu_write_list, gpu_write_list) {
> - if (obj->base.dev != dev ||
> - !atomic_read(&obj->base.refcount.refcount)) {
> - DRM_ERROR("freed gpu write %p\n", obj);
> - err++;
> - break;
> - } else if (!obj->active ||
> - (obj->base.write_domain & I915_GEM_GPU_DOMAINS) == 0) {
> - DRM_ERROR("invalid gpu write %p (a %d w %x)\n",
> - obj,
> - obj->active,
> - obj->base.write_domain);
> - err++;
> - }
> - }
> -
> - list_for_each_entry(obj, &i915_gtt_vm->inactive_list, list) {
> - if (obj->base.dev != dev ||
> - !atomic_read(&obj->base.refcount.refcount)) {
> - DRM_ERROR("freed inactive %p\n", obj);
> - err++;
> - break;
> - } else if (obj->pin_count || obj->active ||
> - (obj->base.write_domain & I915_GEM_GPU_DOMAINS)) {
> - DRM_ERROR("invalid inactive %p (p %d a %d w %x)\n",
> - obj,
> - obj->pin_count, obj->active,
> - obj->base.write_domain);
> - err++;
> - }
> - }
> -
> - return warned = err;
> -}
> -#endif /* WATCH_LIST */
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index c9016c439649..5ee96db71b37 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -42,6 +42,7 @@
>
> struct eb_vmas {
> struct list_head vmas;
> + struct i915_vma *batch;
> int and;
> union {
> struct i915_vma *lut[0];
> @@ -88,6 +89,26 @@ eb_reset(struct eb_vmas *eb)
> memset(eb->buckets, 0, (eb->and+1)*sizeof(struct hlist_head));
> }
>
> +static struct i915_vma *
> +eb_get_batch(struct eb_vmas *eb)
> +{
> + struct i915_vma *vma =
> + list_entry(eb->vmas.prev, typeof(*vma), exec_list);
> +
> + /*
> + * SNA is doing fancy tricks with compressing batch buffers, which leads
> + * to negative relocation deltas. Usually that works out ok since the
> + * relocate address is still positive, except when the batch is placed
> + * very low in the GTT. Ensure this doesn't happen.
> + *
> + * Note that actual hangs have only been observed on gen7, but for
> + * paranoia do it everywhere.
> + */
> + vma->exec_entry->flags |= __EXEC_OBJECT_NEEDS_BIAS;
> +
> + return vma;
> +}
> +
> static int
> eb_lookup_vmas(struct eb_vmas *eb,
> struct drm_i915_gem_exec_object2 *exec,
> @@ -165,6 +186,9 @@ eb_lookup_vmas(struct eb_vmas *eb,
> ++i;
> }
>
> + /* take note of the batch buffer before we might reorder the lists */
> + eb->batch = eb_get_batch(eb);
> +
> return 0;
>
>
> @@ -256,7 +280,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj,
> {
> struct drm_device *dev = obj->base.dev;
> uint32_t page_offset = offset_in_page(reloc->offset);
> - uint64_t delta = reloc->delta + target_offset;
> + uint64_t delta = (int)reloc->delta + target_offset;
> char *vaddr;
> int ret;
>
> @@ -292,7 +316,7 @@ relocate_entry_gtt(struct drm_i915_gem_object *obj,
> {
> struct drm_device *dev = obj->base.dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> - uint64_t delta = reloc->delta + target_offset;
> + uint64_t delta = (int)reloc->delta + target_offset;
> uint64_t offset;
> void __iomem *reloc_page;
> int ret;
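
The (int) casts added to relocate_entry_cpu() and relocate_entry_gtt() above matter
because reloc->delta is a u32 that may carry a negative value; the cast
sign-extends it before the 64-bit add instead of zero-extending it. A tiny
standalone sketch of the difference (illustrative, outside the driver):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t delta = (uint32_t)-8;		/* negative delta stored in the u32 field */
	uint64_t target = 0x100000;

	uint64_t wrong = delta + target;	/* zero-extends: 0x1000ffff8 */
	uint64_t right = (int)delta + target;	/* sign-extends: 0xffff8 */

	printf("%#llx vs %#llx\n",
	       (unsigned long long)wrong, (unsigned long long)right);
	return 0;
}
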
> @@ -422,13 +446,11 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> ret = relocate_entry_cpu(obj, reloc, target_offset);
> else
> ret = relocate_entry_gtt(obj, reloc, target_offset);
> -
> if (ret)
> return ret;
>
> /* and update the user's relocation entry */
> reloc->presumed_offset = target_offset;
> -
> return 0;
> }
>
> @@ -521,7 +543,7 @@ i915_gem_execbuffer_relocate(struct eb_vmas *eb)
>
> static int
> i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> - struct intel_engine_cs *ring,
> + struct intel_engine_cs *engine,
> bool *need_reloc)
> {
> struct drm_i915_gem_object *obj = vma->obj;
> @@ -610,7 +632,7 @@ eb_vma_misplaced(struct i915_vma *vma)
> }
>
> static int
> -i915_gem_execbuffer_reserve(struct intel_engine_cs *ring,
> +i915_gem_execbuffer_reserve(struct intel_engine_cs *engine,
> struct list_head *vmas,
> bool *need_relocs)
> {
> @@ -618,10 +640,10 @@ i915_gem_execbuffer_reserve(struct intel_engine_cs *ring,
> struct i915_vma *vma;
> struct i915_address_space *vm;
> struct list_head ordered_vmas;
> - bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
> + bool has_fenced_gpu_access = INTEL_INFO(engine->i915)->gen < 4;
> int retry;
>
> - i915_gem_retire_requests_ring(ring);
> + i915_gem_retire_requests__engine(engine);
>
> vm = list_first_entry(vmas, struct i915_vma, exec_list)->vm;
>
> @@ -676,7 +698,7 @@ i915_gem_execbuffer_reserve(struct intel_engine_cs *ring,
> if (eb_vma_misplaced(vma))
> ret = i915_vma_unbind(vma);
> else
> - ret = i915_gem_execbuffer_reserve_vma(vma, ring, need_relocs);
> + ret = i915_gem_execbuffer_reserve_vma(vma, engine, need_relocs);
> if (ret)
> goto err;
> }
> @@ -686,7 +708,7 @@ i915_gem_execbuffer_reserve(struct intel_engine_cs *ring,
> if (drm_mm_node_allocated(&vma->node))
> continue;
>
> - ret = i915_gem_execbuffer_reserve_vma(vma, ring, need_relocs);
> + ret = i915_gem_execbuffer_reserve_vma(vma, engine, need_relocs);
> if (ret)
> goto err;
> }
> @@ -706,10 +728,10 @@ err:
> }
>
> static int
> -i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> +i915_gem_execbuffer_relocate_slow(struct drm_i915_private *i915,
> struct drm_i915_gem_execbuffer2 *args,
> struct drm_file *file,
> - struct intel_engine_cs *ring,
> + struct intel_engine_cs *engine,
> struct eb_vmas *eb,
> struct drm_i915_gem_exec_object2 *exec)
> {
> @@ -731,7 +753,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> drm_gem_object_unreference(&vma->obj->base);
> }
>
> - mutex_unlock(&dev->struct_mutex);
> + mutex_unlock(&i915->dev->struct_mutex);
>
> total = 0;
> for (i = 0; i < count; i++)
> @@ -742,7 +764,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> if (reloc == NULL || reloc_offset == NULL) {
> drm_free_large(reloc);
> drm_free_large(reloc_offset);
> - mutex_lock(&dev->struct_mutex);
> + mutex_lock(&i915->dev->struct_mutex);
> return -ENOMEM;
> }
>
> @@ -757,7 +779,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> if (copy_from_user(reloc+total, user_relocs,
> exec[i].relocation_count * sizeof(*reloc))) {
> ret = -EFAULT;
> - mutex_lock(&dev->struct_mutex);
> + mutex_lock(&i915->dev->struct_mutex);
> goto err;
> }
>
> @@ -775,7 +797,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> &invalid_offset,
> sizeof(invalid_offset))) {
> ret = -EFAULT;
> - mutex_lock(&dev->struct_mutex);
> + mutex_lock(&i915->dev->struct_mutex);
> goto err;
> }
> }
> @@ -784,9 +806,9 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> total += exec[i].relocation_count;
> }
>
> - ret = i915_mutex_lock_interruptible(dev);
> + ret = i915_mutex_lock_interruptible(i915->dev);
> if (ret) {
> - mutex_lock(&dev->struct_mutex);
> + mutex_lock(&i915->dev->struct_mutex);
> goto err;
> }
>
> @@ -797,7 +819,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
> goto err;
>
> need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> - ret = i915_gem_execbuffer_reserve(ring, &eb->vmas, &need_relocs);
> + ret = i915_gem_execbuffer_reserve(engine, &eb->vmas, &need_relocs);
> if (ret)
> goto err;
>
> @@ -822,17 +844,19 @@ err:
> }
>
> static int
> -i915_gem_execbuffer_move_to_gpu(struct intel_engine_cs *ring,
> - struct list_head *vmas)
> +vmas_move_to_rq(struct list_head *vmas,
> + struct i915_gem_request *rq)
> {
> struct i915_vma *vma;
> uint32_t flush_domains = 0;
> bool flush_chipset = false;
> int ret;
>
> + /* 1: flush/serialise damage from other sources */
> list_for_each_entry(vma, vmas, exec_list) {
> struct drm_i915_gem_object *obj = vma->obj;
> - ret = i915_gem_object_sync(obj, ring);
> +
> + ret = i915_gem_object_sync(obj, rq);
> if (ret)
> return ret;
>
> @@ -840,18 +864,39 @@ i915_gem_execbuffer_move_to_gpu(struct intel_engine_cs *ring,
> flush_chipset |= i915_gem_clflush_object(obj, false);
>
> flush_domains |= obj->base.write_domain;
> + if (obj->last_read[rq->engine->id].request == NULL)
> + rq->pending_flush |= I915_INVALIDATE_CACHES;
> }
>
> if (flush_chipset)
> - i915_gem_chipset_flush(ring->dev);
> + i915_gem_chipset_flush(rq->i915->dev);
>
> if (flush_domains & I915_GEM_DOMAIN_GTT)
> wmb();
>
> - /* Unconditionally invalidate gpu caches and ensure that we do flush
> - * any residual writes from the previous batch.
> - */
> - return intel_ring_invalidate_all_caches(ring);
> + /* 2: invalidate the caches from this ring after emitting semaphores */
> + ret = i915_request_emit_flush(rq, I915_INVALIDATE_CACHES);
> + if (ret)
> + return ret;
> +
> + /* 3: track flushes and objects for this rq */
> + list_for_each_entry(vma, vmas, exec_list) {
> + struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
> + unsigned fenced;
> +
> + fenced = 0;
> + if (entry->flags & EXEC_OBJECT_NEEDS_FENCE) {
> + fenced |= VMA_IS_FENCED;
> + if (entry->flags & __EXEC_OBJECT_HAS_FENCE)
> + fenced |= VMA_HAS_FENCE;
> + }
> +
> + ret = i915_request_add_vma(rq, vma, fenced);
> + if (ret)
> + return ret;
> + }
> +
> + return 0;
> }
>
> static bool
> @@ -864,7 +909,7 @@ i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec)
> }
>
> static int
> -validate_exec_list(struct drm_device *dev,
> +validate_exec_list(struct drm_i915_private *dev_priv,
> struct drm_i915_gem_exec_object2 *exec,
> int count)
> {
> @@ -874,7 +919,7 @@ validate_exec_list(struct drm_device *dev,
> int i;
>
> invalid_flags = __EXEC_OBJECT_UNKNOWN_FLAGS;
> - if (USES_FULL_PPGTT(dev))
> + if (USES_FULL_PPGTT(dev_priv))
> invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
>
> for (i = 0; i < count; i++) {
> @@ -912,13 +957,14 @@ validate_exec_list(struct drm_device *dev,
> }
>
> static struct intel_context *
> -i915_gem_validate_context(struct drm_device *dev, struct drm_file *file,
> - struct intel_engine_cs *ring, const u32 ctx_id)
> +i915_gem_validate_context(struct drm_file *file,
> + struct intel_engine_cs *engine,
> + const u32 ctx_id)
> {
> struct intel_context *ctx = NULL;
> struct i915_ctx_hang_stats *hs;
>
> - if (ring->id != RCS && ctx_id != DEFAULT_CONTEXT_HANDLE)
> + if (engine->id != RCS && ctx_id != DEFAULT_CONTEXT_HANDLE)
> return ERR_PTR(-EINVAL);
>
> ctx = i915_gem_context_get(file->driver_priv, ctx_id);
> @@ -931,86 +977,23 @@ i915_gem_validate_context(struct drm_device *dev, struct drm_file *file,
> return ERR_PTR(-EIO);
> }
>
> - if (i915.enable_execlists && !ctx->engine[ring->id].state) {
> - int ret = intel_lr_context_deferred_create(ctx, ring);
> - if (ret) {
> - DRM_DEBUG("Could not create LRC %u: %d\n", ctx_id, ret);
> - return ERR_PTR(ret);
> - }
> - }
> -
> return ctx;
> }
>
> -void
> -i915_gem_execbuffer_move_to_active(struct list_head *vmas,
> - struct intel_engine_cs *ring)
> -{
> - u32 seqno = intel_ring_get_seqno(ring);
> - struct i915_vma *vma;
> -
> - list_for_each_entry(vma, vmas, exec_list) {
> - struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
> - struct drm_i915_gem_object *obj = vma->obj;
> - u32 old_read = obj->base.read_domains;
> - u32 old_write = obj->base.write_domain;
> -
> - obj->base.write_domain = obj->base.pending_write_domain;
> - if (obj->base.write_domain == 0)
> - obj->base.pending_read_domains |= obj->base.read_domains;
> - obj->base.read_domains = obj->base.pending_read_domains;
> -
> - i915_vma_move_to_active(vma, ring);
> - if (obj->base.write_domain) {
> - obj->dirty = 1;
> - obj->last_write_seqno = seqno;
> -
> - intel_fb_obj_invalidate(obj, ring);
> -
> - /* update for the implicit flush after a batch */
> - obj->base.write_domain &= ~I915_GEM_GPU_DOMAINS;
> - }
> - if (entry->flags & EXEC_OBJECT_NEEDS_FENCE) {
> - obj->last_fenced_seqno = seqno;
> - if (entry->flags & __EXEC_OBJECT_HAS_FENCE) {
> - struct drm_i915_private *dev_priv = to_i915(ring->dev);
> - list_move_tail(&dev_priv->fence_regs[obj->fence_reg].lru_list,
> - &dev_priv->mm.fence_list);
> - }
> - }
> -
> - trace_i915_gem_object_change_domain(obj, old_read, old_write);
> - }
> -}
> -
> -void
> -i915_gem_execbuffer_retire_commands(struct drm_device *dev,
> - struct drm_file *file,
> - struct intel_engine_cs *ring,
> - struct drm_i915_gem_object *obj)
> -{
> - /* Unconditionally force add_request to emit a full flush. */
> - ring->gpu_caches_dirty = true;
> -
> - /* Add a breadcrumb for the completion of the batch buffer */
> - (void)__i915_add_request(ring, file, obj, NULL);
> -}
> -
> static int
> -i915_reset_gen7_sol_offsets(struct drm_device *dev,
> - struct intel_engine_cs *ring)
> +reset_sol_offsets(struct i915_gem_request *rq)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - int ret, i;
> + struct intel_ringbuffer *ring;
> + int i;
>
> - if (!IS_GEN7(dev) || ring != &dev_priv->ring[RCS]) {
> + if (!IS_GEN7(rq->i915) || rq->engine->id != RCS) {
> DRM_DEBUG("sol reset is gen7/rcs only\n");
> return -EINVAL;
> }
>
> - ret = intel_ring_begin(ring, 4 * 3);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 4 * 3);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> for (i = 0; i < 4; i++) {
> intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> @@ -1019,74 +1002,119 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
> }
>
> intel_ring_advance(ring);
> -
> return 0;
> }
>
> static int
> -i915_emit_box(struct intel_engine_cs *ring,
> - struct drm_clip_rect *box,
> - int DR1, int DR4)
> +emit_box(struct i915_gem_request *rq,
> + struct drm_clip_rect *box,
> + int DR1, int DR4)
> {
> - int ret;
> + struct intel_ringbuffer *ring;
>
> if (box->y2 <= box->y1 || box->x2 <= box->x1 ||
> box->y2 <= 0 || box->x2 <= 0) {
> - DRM_ERROR("Bad box %d,%d..%d,%d\n",
> + DRM_DEBUG("Bad box %d,%d..%d,%d\n",
> box->x1, box->y1, box->x2, box->y2);
> return -EINVAL;
> }
>
> - if (INTEL_INFO(ring->dev)->gen >= 4) {
> - ret = intel_ring_begin(ring, 4);
> - if (ret)
> - return ret;
> + if (INTEL_INFO(rq->i915)->gen >= 4) {
> + ring = intel_ring_begin(rq, 4);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring, GFX_OP_DRAWRECT_INFO_I965);
> - intel_ring_emit(ring, (box->x1 & 0xffff) | box->y1 << 16);
> - intel_ring_emit(ring, ((box->x2 - 1) & 0xffff) | (box->y2 - 1) << 16);
> - intel_ring_emit(ring, DR4);
> } else {
> - ret = intel_ring_begin(ring, 6);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 5);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring, GFX_OP_DRAWRECT_INFO);
> intel_ring_emit(ring, DR1);
> - intel_ring_emit(ring, (box->x1 & 0xffff) | box->y1 << 16);
> - intel_ring_emit(ring, ((box->x2 - 1) & 0xffff) | (box->y2 - 1) << 16);
> - intel_ring_emit(ring, DR4);
> - intel_ring_emit(ring, 0);
> }
> + intel_ring_emit(ring, (box->x1 & 0xffff) | box->y1 << 16);
> + intel_ring_emit(ring, ((box->x2 - 1) & 0xffff) | (box->y2 - 1) << 16);
> + intel_ring_emit(ring, DR4);
> intel_ring_advance(ring);
>
> return 0;
> }
>
> +static int set_constants_base(struct i915_gem_request *rq,
> + struct drm_i915_gem_execbuffer2 *args)
> +{
> + int mode = args->flags & I915_EXEC_CONSTANTS_MASK;
> + u32 mask = I915_EXEC_CONSTANTS_MASK;
>
> -int
> -i915_gem_ringbuffer_submission(struct drm_device *dev, struct drm_file *file,
> - struct intel_engine_cs *ring,
> - struct intel_context *ctx,
> - struct drm_i915_gem_execbuffer2 *args,
> - struct list_head *vmas,
> - struct drm_i915_gem_object *batch_obj,
> - u64 exec_start, u32 flags)
> + switch (mode) {
> + case I915_EXEC_CONSTANTS_REL_GENERAL:
> + case I915_EXEC_CONSTANTS_ABSOLUTE:
> + case I915_EXEC_CONSTANTS_REL_SURFACE:
> + if (mode != 0 && rq->engine->id != RCS) {
> + DRM_DEBUG("non-0 rel constants mode on non-RCS\n");
> + return -EINVAL;
> + }
> +
> + if (mode != rq->engine->i915->relative_constants_mode) {
> + if (INTEL_INFO(rq->engine->i915)->gen < 4) {
> + DRM_DEBUG("no rel constants on pre-gen4\n");
> + return -EINVAL;
> + }
> +
> + if (INTEL_INFO(rq->engine->i915)->gen > 5 &&
> + mode == I915_EXEC_CONSTANTS_REL_SURFACE) {
> + DRM_DEBUG("rel surface constants mode invalid on gen5+\n");
> + return -EINVAL;
> + }
> +
> + /* The HW changed the meaning on this bit on gen6 */
> + if (INTEL_INFO(rq->i915)->gen >= 6)
> + mask &= ~I915_EXEC_CONSTANTS_REL_SURFACE;
> + }
> + break;
> + default:
> + DRM_DEBUG("execbuf with unknown constants: %d\n", mode);
> + return -EINVAL;
> + }
> +
> + /* XXX INSTPM is per-context not global etc */
> + if (rq->engine->id == RCS && mode != rq->i915->relative_constants_mode) {
> + struct intel_ringbuffer *ring;
> +
> + ring = intel_ring_begin(rq, 3);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
> +
> + intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> + intel_ring_emit(ring, INSTPM);
> + intel_ring_emit(ring, mask << 16 | mode);
> + intel_ring_advance(ring);
> +
> + rq->i915->relative_constants_mode = mode;
> + }
> +
> + return 0;
> +}
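
The INSTPM write above follows the hardware's masked-register convention: the top 16 bits of the payload select which bits are allowed to change and the bottom 16 supply their new values, which is why the function emits "mask << 16 | mode". A throwaway sketch of that encoding (helper name is mine, not part of the patch):

static inline u32 masked_register_value(u32 mask, u32 bits)
{
        /* upper half: write-enable mask; lower half: the new bit values */
        return (mask << 16) | (bits & mask);
}
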
> +
> +static int
> +submit_execbuf(struct intel_engine_cs *engine,
> + struct intel_context *ctx,
> + struct drm_i915_gem_execbuffer2 *args,
> + struct eb_vmas *eb,
> + u64 exec_start, u32 flags)
> {
> struct drm_clip_rect *cliprects = NULL;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - u64 exec_len;
> - int instp_mode;
> - u32 instp_mask;
> + struct i915_gem_request *rq = NULL;
> int i, ret = 0;
>
> if (args->num_cliprects != 0) {
> - if (ring != &dev_priv->ring[RCS]) {
> + if (engine->id != RCS) {
> DRM_DEBUG("clip rectangles are only valid with the render ring\n");
> return -EINVAL;
> }
>
> - if (INTEL_INFO(dev)->gen >= 5) {
> + if (INTEL_INFO(engine->i915)->gen >= 5) {
> DRM_DEBUG("clip rectangles are only valid on pre-gen5\n");
> return -EINVAL;
> }
> @@ -1108,7 +1136,6 @@ i915_gem_ringbuffer_submission(struct drm_device *dev, struct drm_file *file,
> if (copy_from_user(cliprects,
> to_user_ptr(args->cliprects_ptr),
> sizeof(*cliprects)*args->num_cliprects)) {
>  			ret = -EFAULT;
> goto error;
> }
> } else {
> @@ -1123,168 +1150,108 @@ i915_gem_ringbuffer_submission(struct drm_device *dev, struct drm_file *file,
> }
> }
>
> - ret = i915_gem_execbuffer_move_to_gpu(ring, vmas);
> - if (ret)
> - goto error;
> + rq = intel_engine_alloc_request(engine, ctx);
> + if (IS_ERR(rq)) {
> + kfree(cliprects);
> + return PTR_ERR(rq);
> + }
>
> - ret = i915_switch_context(ring, ctx);
> + ret = vmas_move_to_rq(&eb->vmas, rq);
> if (ret)
> goto error;
>
> - instp_mode = args->flags & I915_EXEC_CONSTANTS_MASK;
> - instp_mask = I915_EXEC_CONSTANTS_MASK;
> - switch (instp_mode) {
> - case I915_EXEC_CONSTANTS_REL_GENERAL:
> - case I915_EXEC_CONSTANTS_ABSOLUTE:
> - case I915_EXEC_CONSTANTS_REL_SURFACE:
> - if (instp_mode != 0 && ring != &dev_priv->ring[RCS]) {
> - DRM_DEBUG("non-0 rel constants mode on non-RCS\n");
> - ret = -EINVAL;
> - goto error;
> - }
> -
> - if (instp_mode != dev_priv->relative_constants_mode) {
> - if (INTEL_INFO(dev)->gen < 4) {
> - DRM_DEBUG("no rel constants on pre-gen4\n");
> - ret = -EINVAL;
> - goto error;
> - }
> -
> - if (INTEL_INFO(dev)->gen > 5 &&
> - instp_mode == I915_EXEC_CONSTANTS_REL_SURFACE) {
> - DRM_DEBUG("rel surface constants mode invalid on gen5+\n");
> - ret = -EINVAL;
> - goto error;
> - }
> -
> - /* The HW changed the meaning on this bit on gen6 */
> - if (INTEL_INFO(dev)->gen >= 6)
> - instp_mask &= ~I915_EXEC_CONSTANTS_REL_SURFACE;
> - }
> - break;
> - default:
> - DRM_DEBUG("execbuf with unknown constants: %d\n", instp_mode);
> - ret = -EINVAL;
> + ret = set_constants_base(rq, args);
> + if (ret)
> goto error;
> - }
> -
> - if (ring == &dev_priv->ring[RCS] &&
> - instp_mode != dev_priv->relative_constants_mode) {
> - ret = intel_ring_begin(ring, 4);
> - if (ret)
> - goto error;
> -
> - intel_ring_emit(ring, MI_NOOP);
> - intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> - intel_ring_emit(ring, INSTPM);
> - intel_ring_emit(ring, instp_mask << 16 | instp_mode);
> - intel_ring_advance(ring);
> -
> - dev_priv->relative_constants_mode = instp_mode;
> - }
>
> if (args->flags & I915_EXEC_GEN7_SOL_RESET) {
> - ret = i915_reset_gen7_sol_offsets(dev, ring);
> + ret = reset_sol_offsets(rq);
> if (ret)
> goto error;
> }
>
> - exec_len = args->batch_len;
> if (cliprects) {
> for (i = 0; i < args->num_cliprects; i++) {
> - ret = i915_emit_box(ring, &cliprects[i],
> - args->DR1, args->DR4);
> + ret = emit_box(rq, &cliprects[i],
> + args->DR1, args->DR4);
> if (ret)
> goto error;
>
> - ret = ring->dispatch_execbuffer(ring,
> - exec_start, exec_len,
> - flags);
> + ret = i915_request_emit_batchbuffer(rq, eb->batch,
> + exec_start,
> + args->batch_len,
> + flags);
> if (ret)
> goto error;
> }
> } else {
> - ret = ring->dispatch_execbuffer(ring,
> - exec_start, exec_len,
> - flags);
> + ret = i915_request_emit_batchbuffer(rq, eb->batch,
> + exec_start,
> + args->batch_len,
> + flags);
> if (ret)
> - return ret;
> + goto error;
> }
>
> - trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
> + ret = i915_request_commit(rq);
> + if (ret)
> + goto error;
> +
> + i915_queue_hangcheck(rq->i915->dev);
>
> - i915_gem_execbuffer_move_to_active(vmas, ring);
> - i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
> + cancel_delayed_work_sync(&rq->i915->mm.idle_work);
> + queue_delayed_work(rq->i915->wq,
> + &rq->i915->mm.retire_work,
> + round_jiffies_up_relative(HZ));
> + intel_mark_busy(rq->i915->dev);
>
> error:
> + i915_request_put(rq);
> kfree(cliprects);
> return ret;
> }
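
Taken together with eb_lookup_vmas() earlier, the control flow of the new submission path reduces to the pattern below. This is only a rough caller-side sketch using the helpers this patch introduces, with the legacy cliprect loop and the error unwinding trimmed:

static int submit_sketch(struct intel_engine_cs *engine,
                         struct intel_context *ctx,
                         struct eb_vmas *eb,
                         u64 exec_start, u32 exec_len, u32 dispatch_flags)
{
        struct i915_gem_request *rq;
        int ret;

        /* every ring access now starts by allocating a request for a
         * known engine + context */
        rq = intel_engine_alloc_request(engine, ctx);
        if (IS_ERR(rq))
                return PTR_ERR(rq);

        /* serialise against other writers and record the vmas on the rq */
        ret = vmas_move_to_rq(&eb->vmas, rq);
        if (ret == 0)
                ret = i915_request_emit_batchbuffer(rq, eb->batch,
                                                    exec_start, exec_len,
                                                    dispatch_flags);
        /* finalise the ring contents and hand the request to the engine */
        if (ret == 0)
                ret = i915_request_commit(rq);

        i915_request_put(rq);
        return ret;
}
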
>
> /**
> * Find one BSD ring to dispatch the corresponding BSD command.
> - * The Ring ID is returned.
> */
> -static int gen8_dispatch_bsd_ring(struct drm_device *dev,
> - struct drm_file *file)
> +static struct intel_engine_cs *
> +gen8_select_bsd_engine(struct drm_i915_private *dev_priv,
> + struct drm_file *file)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> struct drm_i915_file_private *file_priv = file->driver_priv;
>
> - /* Check whether the file_priv is using one ring */
> - if (file_priv->bsd_ring)
> - return file_priv->bsd_ring->id;
> - else {
> - /* If no, use the ping-pong mechanism to select one ring */
> - int ring_id;
> + /* Use the ping-pong mechanism to select one ring for this client */
> + if (file_priv->bsd_engine == NULL) {
> + int id;
>
> - mutex_lock(&dev->struct_mutex);
> + mutex_lock(&dev_priv->dev->struct_mutex);
> if (dev_priv->mm.bsd_ring_dispatch_index == 0) {
> - ring_id = VCS;
> + id = VCS;
> dev_priv->mm.bsd_ring_dispatch_index = 1;
> } else {
> - ring_id = VCS2;
> + id = VCS2;
> dev_priv->mm.bsd_ring_dispatch_index = 0;
> }
> - file_priv->bsd_ring = &dev_priv->ring[ring_id];
> - mutex_unlock(&dev->struct_mutex);
> - return ring_id;
> + file_priv->bsd_engine = &dev_priv->engine[id];
> + mutex_unlock(&dev_priv->dev->struct_mutex);
> }
> -}
> -
> -static struct drm_i915_gem_object *
> -eb_get_batch(struct eb_vmas *eb)
> -{
> - struct i915_vma *vma = list_entry(eb->vmas.prev, typeof(*vma), exec_list);
>
> - /*
> - * SNA is doing fancy tricks with compressing batch buffers, which leads
> - * to negative relocation deltas. Usually that works out ok since the
> - * relocate address is still positive, except when the batch is placed
> - * very low in the GTT. Ensure this doesn't happen.
> - *
> - * Note that actual hangs have only been observed on gen7, but for
> - * paranoia do it everywhere.
> - */
> - vma->exec_entry->flags |= __EXEC_OBJECT_NEEDS_BIAS;
> -
> - return vma->obj;
> + return file_priv->bsd_engine;
> }
>
> static int
> -i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> +i915_gem_do_execbuffer(struct drm_i915_private *dev_priv, void *data,
> struct drm_file *file,
> struct drm_i915_gem_execbuffer2 *args,
> struct drm_i915_gem_exec_object2 *exec)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> struct eb_vmas *eb;
> - struct drm_i915_gem_object *batch_obj;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> struct intel_context *ctx;
> struct i915_address_space *vm;
> const u32 ctx_id = i915_execbuffer2_get_context_id(*args);
> u64 exec_start = args->batch_start_offset;
> + struct drm_i915_gem_object *batch;
> u32 flags;
> int ret;
> bool need_relocs;
> @@ -1292,7 +1259,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> if (!i915_gem_check_execbuffer(args))
> return -EINVAL;
>
> - ret = validate_exec_list(dev, exec, args->buffer_count);
> + ret = validate_exec_list(dev_priv, exec, args->buffer_count);
> if (ret)
> return ret;
>
> @@ -1313,18 +1280,16 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> }
>
> if ((args->flags & I915_EXEC_RING_MASK) == I915_EXEC_DEFAULT)
> - ring = &dev_priv->ring[RCS];
> + engine = &dev_priv->engine[RCS];
> else if ((args->flags & I915_EXEC_RING_MASK) == I915_EXEC_BSD) {
> - if (HAS_BSD2(dev)) {
> - int ring_id;
> - ring_id = gen8_dispatch_bsd_ring(dev, file);
> - ring = &dev_priv->ring[ring_id];
> - } else
> - ring = &dev_priv->ring[VCS];
> + if (HAS_BSD2(dev_priv))
> + engine = gen8_select_bsd_engine(dev_priv, file);
> + else
> + engine = &dev_priv->engine[VCS];
> } else
> - ring = &dev_priv->ring[(args->flags & I915_EXEC_RING_MASK) - 1];
> + engine = &dev_priv->engine[(args->flags & I915_EXEC_RING_MASK) - 1];
>
> - if (!intel_ring_initialized(ring)) {
> + if (!intel_engine_initialized(engine)) {
> DRM_DEBUG("execbuf with invalid ring: %d\n",
> (int)(args->flags & I915_EXEC_RING_MASK));
> return -EINVAL;
> @@ -1337,19 +1302,19 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>
> intel_runtime_pm_get(dev_priv);
>
> - ret = i915_mutex_lock_interruptible(dev);
> + ret = i915_mutex_lock_interruptible(dev_priv->dev);
> if (ret)
> goto pre_mutex_err;
>
> if (dev_priv->ums.mm_suspended) {
> - mutex_unlock(&dev->struct_mutex);
> + mutex_unlock(&dev_priv->dev->struct_mutex);
> ret = -EBUSY;
> goto pre_mutex_err;
> }
>
> - ctx = i915_gem_validate_context(dev, file, ring, ctx_id);
> + ctx = i915_gem_validate_context(file, engine, ctx_id);
> if (IS_ERR(ctx)) {
> - mutex_unlock(&dev->struct_mutex);
> + mutex_unlock(&dev_priv->dev->struct_mutex);
> ret = PTR_ERR(ctx);
> goto pre_mutex_err;
> }
> @@ -1364,7 +1329,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> eb = eb_create(args);
> if (eb == NULL) {
> i915_gem_context_unreference(ctx);
> - mutex_unlock(&dev->struct_mutex);
> + mutex_unlock(&dev_priv->dev->struct_mutex);
> ret = -ENOMEM;
> goto pre_mutex_err;
> }
> @@ -1374,12 +1339,9 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> if (ret)
> goto err;
>
> - /* take note of the batch buffer before we might reorder the lists */
> - batch_obj = eb_get_batch(eb);
> -
> /* Move the objects en-masse into the GTT, evicting if necessary. */
> need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
> - ret = i915_gem_execbuffer_reserve(ring, &eb->vmas, &need_relocs);
> + ret = i915_gem_execbuffer_reserve(engine, &eb->vmas, &need_relocs);
> if (ret)
> goto err;
>
> @@ -1388,25 +1350,25 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> ret = i915_gem_execbuffer_relocate(eb);
> if (ret) {
> if (ret == -EFAULT) {
> - ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
> + ret = i915_gem_execbuffer_relocate_slow(dev_priv, args, file, engine,
> eb, exec);
> - BUG_ON(!mutex_is_locked(&dev->struct_mutex));
> + BUG_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
> }
> if (ret)
> goto err;
> }
>
> /* Set the pending read domains for the batch buffer to COMMAND */
> - if (batch_obj->base.pending_write_domain) {
> + batch = eb->batch->obj;
> + if (batch->base.pending_write_domain) {
> DRM_DEBUG("Attempting to use self-modifying batch buffer\n");
> ret = -EINVAL;
> goto err;
> }
> - batch_obj->base.pending_read_domains |= I915_GEM_DOMAIN_COMMAND;
> + batch->base.pending_read_domains |= I915_GEM_DOMAIN_COMMAND;
>
> - if (i915_needs_cmd_parser(ring)) {
> - ret = i915_parse_cmds(ring,
> - batch_obj,
> + if (i915_needs_cmd_parser(engine)) {
> + ret = i915_parse_cmds(engine, batch,
> args->batch_start_offset,
> file->is_master);
> if (ret)
> @@ -1436,16 +1398,15 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> * fitting due to fragmentation.
> * So this is actually safe.
> */
> - ret = i915_gem_obj_ggtt_pin(batch_obj, 0, 0);
> + ret = i915_gem_obj_ggtt_pin(batch, 0, 0);
> if (ret)
> goto err;
>
> - exec_start += i915_gem_obj_ggtt_offset(batch_obj);
> + exec_start += i915_gem_obj_ggtt_offset(batch);
> } else
> - exec_start += i915_gem_obj_offset(batch_obj, vm);
> + exec_start += i915_gem_obj_offset(batch, vm);
>
> - ret = dev_priv->gt.do_execbuf(dev, file, ring, ctx, args,
> - &eb->vmas, batch_obj, exec_start, flags);
> + ret = submit_execbuf(engine, ctx, args, eb, exec_start, flags);
>
> /*
> * FIXME: We crucially rely upon the active tracking for the (ppgtt)
> @@ -1454,13 +1415,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> * active.
> */
> if (flags & I915_DISPATCH_SECURE)
> - i915_gem_object_ggtt_unpin(batch_obj);
> + i915_gem_object_ggtt_unpin(batch);
> err:
> /* the request owns the ref now */
> i915_gem_context_unreference(ctx);
> eb_destroy(eb);
>
> - mutex_unlock(&dev->struct_mutex);
> + mutex_unlock(&dev_priv->dev->struct_mutex);
>
> pre_mutex_err:
> /* intel_gpu_busy should also get a ref, so it will free when the device
> @@ -1532,7 +1493,7 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
> exec2.flags = I915_EXEC_RENDER;
> i915_execbuffer2_set_context_id(exec2, 0);
>
> - ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
> + ret = i915_gem_do_execbuffer(to_i915(dev), data, file, &exec2, exec2_list);
> if (!ret) {
> struct drm_i915_gem_exec_object __user *user_exec_list =
> to_user_ptr(args->buffers_ptr);
> @@ -1596,7 +1557,7 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
> return -EFAULT;
> }
>
> - ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
> + ret = i915_gem_do_execbuffer(to_i915(dev), data, file, args, exec2_list);
> if (!ret) {
> /* Copy the new buffer offsets back to the user's exec list. */
> struct drm_i915_gem_exec_object2 __user *user_exec_list =
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 6f410cfb0510..ba9bce1a2f07 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -203,30 +203,28 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr,
> }
>
> /* Broadwell Page Directory Pointer Descriptors */
> -static int gen8_write_pdp(struct intel_engine_cs *ring, unsigned entry,
> - uint64_t val)
> +static int gen8_write_pdp(struct i915_gem_request *rq, unsigned entry, uint64_t val)
> {
> - int ret;
> + struct intel_ringbuffer *ring;
>
> BUG_ON(entry >= 4);
>
> - ret = intel_ring_begin(ring, 6);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 5);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> - intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> - intel_ring_emit(ring, GEN8_RING_PDP_UDW(ring, entry));
> + intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(2));
> + intel_ring_emit(ring, GEN8_RING_PDP_UDW(rq->engine, entry));
> intel_ring_emit(ring, (u32)(val >> 32));
> - intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> - intel_ring_emit(ring, GEN8_RING_PDP_LDW(ring, entry));
> + intel_ring_emit(ring, GEN8_RING_PDP_LDW(rq->engine, entry));
> intel_ring_emit(ring, (u32)(val));
> intel_ring_advance(ring);
>
> return 0;
> }
>
> -static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
> - struct intel_engine_cs *ring)
> +static int gen8_mm_switch(struct i915_gem_request *rq,
> + struct i915_hw_ppgtt *ppgtt)
> {
> int i, ret;
>
> @@ -235,7 +233,7 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
>
> for (i = used_pd - 1; i >= 0; i--) {
> dma_addr_t addr = ppgtt->pd_dma_addr[i];
> - ret = gen8_write_pdp(ring, i, addr);
> + ret = gen8_write_pdp(rq, i, addr);
> if (ret)
> return ret;
> }
> @@ -699,94 +697,81 @@ static uint32_t get_pd_offset(struct i915_hw_ppgtt *ppgtt)
> return (ppgtt->pd_offset / 64) << 16;
> }
>
> -static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
> - struct intel_engine_cs *ring)
> +static int hsw_mm_switch(struct i915_gem_request *rq,
> + struct i915_hw_ppgtt *ppgtt)
> {
> + struct intel_ringbuffer *ring;
> int ret;
>
> /* NB: TLBs must be flushed and invalidated before a switch */
> - ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
> + ret = i915_request_emit_flush(rq, I915_INVALIDATE_CACHES);
> if (ret)
> return ret;
>
> - ret = intel_ring_begin(ring, 6);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 5);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(2));
> - intel_ring_emit(ring, RING_PP_DIR_DCLV(ring));
> + intel_ring_emit(ring, RING_PP_DIR_DCLV(rq->engine));
> intel_ring_emit(ring, PP_DIR_DCLV_2G);
> - intel_ring_emit(ring, RING_PP_DIR_BASE(ring));
> + intel_ring_emit(ring, RING_PP_DIR_BASE(rq->engine));
> intel_ring_emit(ring, get_pd_offset(ppgtt));
> - intel_ring_emit(ring, MI_NOOP);
> intel_ring_advance(ring);
>
> return 0;
> }
>
> -static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
> - struct intel_engine_cs *ring)
> +static int gen7_mm_switch(struct i915_gem_request *rq,
> + struct i915_hw_ppgtt *ppgtt)
> {
> + struct intel_ringbuffer *ring;
> int ret;
>
> /* NB: TLBs must be flushed and invalidated before a switch */
> - ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
> + ret = i915_request_emit_flush(rq, I915_INVALIDATE_CACHES);
> if (ret)
> return ret;
>
> - ret = intel_ring_begin(ring, 6);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 5);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(2));
> - intel_ring_emit(ring, RING_PP_DIR_DCLV(ring));
> + intel_ring_emit(ring, RING_PP_DIR_DCLV(rq->engine));
> intel_ring_emit(ring, PP_DIR_DCLV_2G);
> - intel_ring_emit(ring, RING_PP_DIR_BASE(ring));
> + intel_ring_emit(ring, RING_PP_DIR_BASE(rq->engine));
> intel_ring_emit(ring, get_pd_offset(ppgtt));
> - intel_ring_emit(ring, MI_NOOP);
> intel_ring_advance(ring);
>
> /* XXX: RCS is the only one to auto invalidate the TLBs? */
> - if (ring->id != RCS) {
> - ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
> - if (ret)
> - return ret;
> - }
> + if (rq->engine->id != RCS)
> + rq->pending_flush |= I915_INVALIDATE_CACHES;
>
> return 0;
> }
>
> -static int gen6_mm_switch(struct i915_hw_ppgtt *ppgtt,
> - struct intel_engine_cs *ring)
> +static int gen6_mm_switch(struct i915_gem_request *rq,
> + struct i915_hw_ppgtt *ppgtt)
> {
> - struct drm_device *dev = ppgtt->base.dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> -
> -
> - I915_WRITE(RING_PP_DIR_DCLV(ring), PP_DIR_DCLV_2G);
> - I915_WRITE(RING_PP_DIR_BASE(ring), get_pd_offset(ppgtt));
> -
> - POSTING_READ(RING_PP_DIR_DCLV(ring));
> -
> - return 0;
> + return -ENODEV;
> }
>
> static void gen8_ppgtt_enable(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> int j;
>
> - for_each_ring(ring, dev_priv, j) {
> - I915_WRITE(RING_MODE_GEN7(ring),
> + for_each_engine(engine, dev_priv, j)
> + I915_WRITE(RING_MODE_GEN7(engine),
> _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
> - }
> }
>
> static void gen7_ppgtt_enable(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> uint32_t ecochk, ecobits;
> int i;
>
> @@ -802,10 +787,15 @@ static void gen7_ppgtt_enable(struct drm_device *dev)
> }
> I915_WRITE(GAM_ECOCHK, ecochk);
>
> - for_each_ring(ring, dev_priv, i) {
> + for_each_engine(engine, dev_priv, i) {
> /* GFX_MODE is per-ring on gen7+ */
> - I915_WRITE(RING_MODE_GEN7(ring),
> + I915_WRITE(RING_MODE_GEN7(engine),
> _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
> +
> + I915_WRITE(RING_PP_DIR_DCLV(engine), PP_DIR_DCLV_2G);
> + I915_WRITE(RING_PP_DIR_BASE(engine), get_pd_offset(dev_priv->mm.aliasing_ppgtt));
> +
> + POSTING_READ(RING_PP_DIR_DCLV(engine));
> }
> }
>
> @@ -813,6 +803,8 @@ static void gen6_ppgtt_enable(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> uint32_t ecochk, gab_ctl, ecobits;
> + struct intel_engine_cs *engine;
> + int i;
>
> ecobits = I915_READ(GAC_ECO_BITS);
> I915_WRITE(GAC_ECO_BITS, ecobits | ECOBITS_SNB_BIT |
> @@ -825,6 +817,13 @@ static void gen6_ppgtt_enable(struct drm_device *dev)
> I915_WRITE(GAM_ECOCHK, ecochk | ECOCHK_SNB_BIT | ECOCHK_PPGTT_CACHE64B);
>
> I915_WRITE(GFX_MODE, _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
> +
> + for_each_engine(engine, dev_priv, i) {
> + I915_WRITE(RING_PP_DIR_DCLV(engine), PP_DIR_DCLV_2G);
> + I915_WRITE(RING_PP_DIR_BASE(engine), get_pd_offset(dev_priv->mm.aliasing_ppgtt));
> +
> + POSTING_READ(RING_PP_DIR_DCLV(engine));
> + }
> }
>
> /* PPGTT support for Sandybdrige/Gen6 and later */
> @@ -1115,18 +1114,13 @@ int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
>
> int i915_ppgtt_init_hw(struct drm_device *dev)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> - struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
> - int i, ret = 0;
> + if (!USES_PPGTT(dev))
> + return 0;
>
> /* In the case of execlists, PPGTT is enabled by the context descriptor
> * and the PDPs are contained within the context itself. We don't
> * need to do anything here. */
> - if (i915.enable_execlists)
> - return 0;
> -
> - if (!USES_PPGTT(dev))
> + if (RCS_ENGINE(dev)->execlists_enabled)
> return 0;
>
> if (IS_GEN6(dev))
> @@ -1138,15 +1132,7 @@ int i915_ppgtt_init_hw(struct drm_device *dev)
> else
> WARN_ON(1);
>
> - if (ppgtt) {
> - for_each_ring(ring, dev_priv, i) {
> - ret = ppgtt->switch_mm(ppgtt, ring);
> - if (ret != 0)
> - return ret;
> - }
> - }
> -
> - return ret;
> + return 0;
> }
> struct i915_hw_ppgtt *
> i915_ppgtt_create(struct drm_device *dev, struct drm_i915_file_private *fpriv)
> @@ -1247,15 +1233,15 @@ static void undo_idling(struct drm_i915_private *dev_priv, bool interruptible)
> void i915_check_and_clear_faults(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> int i;
>
> if (INTEL_INFO(dev)->gen < 6)
> return;
>
> - for_each_ring(ring, dev_priv, i) {
> + for_each_engine(engine, dev_priv, i) {
> u32 fault_reg;
> - fault_reg = I915_READ(RING_FAULT_REG(ring));
> + fault_reg = I915_READ(RING_FAULT_REG(engine));
> if (fault_reg & RING_FAULT_VALID) {
> DRM_DEBUG_DRIVER("Unexpected fault\n"
> "\tAddr: 0x%08lx\\n"
> @@ -1266,11 +1252,11 @@ void i915_check_and_clear_faults(struct drm_device *dev)
> fault_reg & RING_FAULT_GTTSEL_MASK ? "GGTT" : "PPGTT",
> RING_FAULT_SRCID(fault_reg),
> RING_FAULT_FAULT_TYPE(fault_reg));
> - I915_WRITE(RING_FAULT_REG(ring),
> + I915_WRITE(RING_FAULT_REG(engine),
> fault_reg & ~RING_FAULT_VALID);
> }
> }
> - POSTING_READ(RING_FAULT_REG(&dev_priv->ring[RCS]));
> + POSTING_READ(RING_FAULT_REG(RCS_ENGINE(dev_priv)));
> }
>
> void i915_gem_suspend_gtt_mappings(struct drm_device *dev)
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index d5c14af51e99..0802832df28c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -263,8 +263,8 @@ struct i915_hw_ppgtt {
> struct drm_i915_file_private *file_priv;
>
> int (*enable)(struct i915_hw_ppgtt *ppgtt);
> - int (*switch_mm)(struct i915_hw_ppgtt *ppgtt,
> - struct intel_engine_cs *ring);
> + int (*switch_mm)(struct i915_gem_request *rq,
> + struct i915_hw_ppgtt *ppgtt);
> void (*debug_dump)(struct i915_hw_ppgtt *ppgtt, struct seq_file *m);
> };
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
> index a9a62d75aa57..fffd26dfa4dd 100644
> --- a/drivers/gpu/drm/i915/i915_gem_render_state.c
> +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
> @@ -28,8 +28,15 @@
> #include "i915_drv.h"
> #include "intel_renderstate.h"
>
> +struct render_state {
> + const struct intel_renderstate_rodata *rodata;
> + struct drm_i915_gem_object *obj;
> + u64 ggtt_offset;
> + int gen;
> +};
> +
> static const struct intel_renderstate_rodata *
> -render_state_get_rodata(struct drm_device *dev, const int gen)
> +render_state_get_rodata(const int gen)
> {
> switch (gen) {
> case 6:
> @@ -43,19 +50,19 @@ render_state_get_rodata(struct drm_device *dev, const int gen)
> return NULL;
> }
>
> -static int render_state_init(struct render_state *so, struct drm_device *dev)
> +static int render_state_init(struct render_state *so, struct i915_gem_request *rq)
> {
> int ret;
>
> - so->gen = INTEL_INFO(dev)->gen;
> - so->rodata = render_state_get_rodata(dev, so->gen);
> + so->gen = INTEL_INFO(rq->i915)->gen;
> + so->rodata = render_state_get_rodata(so->gen);
> if (so->rodata == NULL)
> return 0;
>
> if (so->rodata->batch_items * 4 > 4096)
> return -EINVAL;
>
> - so->obj = i915_gem_alloc_object(dev, 4096);
> + so->obj = i915_gem_alloc_object(rq->i915->dev, 4096);
> if (so->obj == NULL)
> return -ENOMEM;
>
> @@ -108,10 +115,6 @@ static int render_state_setup(struct render_state *so)
> }
> kunmap(page);
>
> - ret = i915_gem_object_set_to_gtt_domain(so->obj, false);
> - if (ret)
> - return ret;
> -
> if (rodata->reloc[reloc_index] != -1) {
> DRM_ERROR("only %d relocs resolved\n", reloc_index);
> return -EINVAL;
> @@ -120,60 +123,46 @@ static int render_state_setup(struct render_state *so)
> return 0;
> }
>
> -void i915_gem_render_state_fini(struct render_state *so)
> +static void render_state_fini(struct render_state *so)
> {
> i915_gem_object_ggtt_unpin(so->obj);
> drm_gem_object_unreference(&so->obj->base);
> }
>
> -int i915_gem_render_state_prepare(struct intel_engine_cs *ring,
> - struct render_state *so)
> +int i915_gem_render_state_init(struct i915_gem_request *rq)
> {
> + struct render_state so;
> int ret;
>
> - if (WARN_ON(ring->id != RCS))
> + if (WARN_ON(rq->engine->id != RCS))
> return -ENOENT;
>
> - ret = render_state_init(so, ring->dev);
> + ret = render_state_init(&so, rq);
> if (ret)
> return ret;
>
> - if (so->rodata == NULL)
> + if (so.rodata == NULL)
> return 0;
>
> - ret = render_state_setup(so);
> - if (ret) {
> - i915_gem_render_state_fini(so);
> - return ret;
> - }
> -
> - return 0;
> -}
> -
> -int i915_gem_render_state_init(struct intel_engine_cs *ring)
> -{
> - struct render_state so;
> - int ret;
> -
> - ret = i915_gem_render_state_prepare(ring, &so);
> + ret = render_state_setup(&so);
> if (ret)
> - return ret;
> + goto out;
>
> - if (so.rodata == NULL)
> - return 0;
> + if (i915_gem_clflush_object(so.obj, false))
> + i915_gem_chipset_flush(rq->i915->dev);
>
> - ret = ring->dispatch_execbuffer(ring,
> - so.ggtt_offset,
> - so.rodata->batch_items * 4,
> - I915_DISPATCH_SECURE);
> + ret = i915_request_emit_batchbuffer(rq, NULL,
> + so.ggtt_offset,
> + so.rodata->batch_items * 4,
> + I915_DISPATCH_SECURE);
> if (ret)
> goto out;
>
> - i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
> + so.obj->base.pending_read_domains = I915_GEM_DOMAIN_COMMAND;
> + ret = i915_request_add_vma(rq, i915_gem_obj_to_ggtt(so.obj), 0);
>
> - ret = __i915_add_request(ring, NULL, so.obj, NULL);
> /* __i915_add_request moves object to inactive if it fails */
> out:
> - i915_gem_render_state_fini(&so);
> + render_state_fini(&so);
> return ret;
> }
> diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.h b/drivers/gpu/drm/i915/i915_gem_render_state.h
> deleted file mode 100644
> index c44961ed3fad..000000000000
> --- a/drivers/gpu/drm/i915/i915_gem_render_state.h
> +++ /dev/null
> @@ -1,47 +0,0 @@
> -/*
> - * Copyright © 2014 Intel Corporation
> - *
> - * Permission is hereby granted, free of charge, to any person obtaining a
> - * copy of this software and associated documentation files (the "Software"),
> - * to deal in the Software without restriction, including without limitation
> - * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> - * and/or sell copies of the Software, and to permit persons to whom the
> - * Software is furnished to do so, subject to the following conditions:
> - *
> - * The above copyright notice and this permission notice (including the next
> - * paragraph) shall be included in all copies or substantial portions of the
> - * Software.
> - *
> - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> - * DEALINGS IN THE SOFTWARE.
> - */
> -
> -#ifndef _I915_GEM_RENDER_STATE_H_
> -#define _I915_GEM_RENDER_STATE_H_
> -
> -#include <linux/types.h>
> -
> -struct intel_renderstate_rodata {
> - const u32 *reloc;
> - const u32 *batch;
> - const u32 batch_items;
> -};
> -
> -struct render_state {
> - const struct intel_renderstate_rodata *rodata;
> - struct drm_i915_gem_object *obj;
> - u64 ggtt_offset;
> - int gen;
> -};
> -
> -int i915_gem_render_state_init(struct intel_engine_cs *ring);
> -void i915_gem_render_state_fini(struct render_state *so);
> -int i915_gem_render_state_prepare(struct intel_engine_cs *ring,
> - struct render_state *so);
> -
> -#endif /* _I915_GEM_RENDER_STATE_H_ */
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> new file mode 100644
> index 000000000000..582c5df2933e
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -0,0 +1,651 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +
> +#include <drm/drmP.h>
> +#include "i915_drv.h"
> +#include <drm/i915_drm.h>
> +#include "i915_trace.h"
> +#include "intel_drv.h"
> +
> +struct i915_gem_request__vma {
> + struct list_head link;
> + struct i915_vma *vma;
> + struct drm_i915_gem_object *obj;
> + u32 write, fence;
> +};
> +
> +static bool check_reset(struct i915_gem_request *rq)
> +{
> + unsigned reset = atomic_read(&rq->i915->gpu_error.reset_counter);
> + return likely(reset == rq->reset_counter);
> +}
> +
> +int
> +i915_request_add_vma(struct i915_gem_request *rq,
> + struct i915_vma *vma,
> + unsigned fenced)
> +{
> + struct drm_i915_gem_object *obj = vma->obj;
> + u32 old_read = obj->base.read_domains;
> + u32 old_write = obj->base.write_domain;
> + struct i915_gem_request__vma *ref;
> +
> + lockdep_assert_held(&rq->i915->dev->struct_mutex);
> + BUG_ON(!rq->outstanding);
> +
> + obj->base.write_domain = obj->base.pending_write_domain;
> + if (obj->base.write_domain == 0)
> + obj->base.pending_read_domains |= obj->base.read_domains;
> + obj->base.read_domains = obj->base.pending_read_domains;
> +
> + obj->base.pending_read_domains = 0;
> + obj->base.pending_write_domain = 0;
> +
> + trace_i915_gem_object_change_domain(obj, old_read, old_write);
> + if (obj->base.read_domains == 0)
> + return 0;
> +
> + ref = kmalloc(sizeof(*ref), GFP_KERNEL);
> + if (ref == NULL)
> + return -ENOMEM;
> +
> + list_add(&ref->link, &rq->vmas);
> + ref->vma = vma;
> + ref->obj = obj;
> + drm_gem_object_reference(&obj->base);
> + ref->write = obj->base.write_domain & I915_GEM_GPU_DOMAINS;
> + ref->fence = fenced;
> +
> + if (ref->write) {
> + rq->pending_flush |= I915_FLUSH_CACHES;
> + intel_fb_obj_invalidate(obj, rq);
> + }
> +
> + /* update for the implicit flush after the rq */
> + obj->base.write_domain &= ~I915_GEM_GPU_DOMAINS;
> + return 0;
> +}
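
The read/write domain juggling at the top of i915_request_add_vma() is easier to follow with concrete values. Below is a toy model of just that promotion rule, detached from the driver so the before/after states are visible (the struct, helper name and worked values are mine):

struct domain_demo {
        u32 read, write;                  /* current domains */
        u32 pending_read, pending_write;  /* what the new batch will do */
};

static void promote_domains(struct domain_demo *d)
{
        d->write = d->pending_write;
        if (d->write == 0)
                d->pending_read |= d->read;     /* read-only use keeps old readers */
        d->read = d->pending_read;
        d->pending_read = d->pending_write = 0;
}

/*
 * Example: a texture previously in the GTT domain, now sampled by the GPU:
 *   { .read = I915_GEM_DOMAIN_GTT, .pending_read = I915_GEM_DOMAIN_RENDER }
 * promote_domains() leaves write == 0 and read == GTT | RENDER, whereas a
 * GPU write (pending_write = RENDER) replaces the read domains outright and,
 * in the real function, also sets I915_FLUSH_CACHES on the request.
 */
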
> +
> +static void vma_free(struct i915_gem_request__vma *ref)
> +{
> + drm_gem_object_unreference(&ref->obj->base);
> + list_del(&ref->link);
> + kfree(ref);
> +}
> +
> +int
> +i915_request_emit_flush(struct i915_gem_request *rq,
> + unsigned flags)
> +{
> + struct intel_engine_cs *engine = rq->engine;
> + int ret;
> +
> + lockdep_assert_held(&rq->i915->dev->struct_mutex);
> + BUG_ON(!rq->outstanding);
> +
> + if ((flags & rq->pending_flush) == 0)
> + return 0;
> +
> + trace_i915_gem_request_emit_flush(rq);
> + ret = engine->emit_flush(rq, rq->pending_flush);
> + if (ret)
> + return ret;
> +
> + rq->pending_flush = 0;
> + return 0;
> +}
> +
> +int
> +__i915_request_emit_breadcrumb(struct i915_gem_request *rq, int id)
> +{
> + struct intel_engine_cs *engine = rq->engine;
> + u32 seqno;
> + int ret;
> +
> + lockdep_assert_held(&rq->i915->dev->struct_mutex);
> +
> + if (rq->breadcrumb[id])
> + return 0;
> +
> + if (rq->outstanding) {
> + ret = i915_request_emit_flush(rq, I915_COMMAND_BARRIER);
> + if (ret)
> + return ret;
> +
> + trace_i915_gem_request_emit_breadcrumb(rq);
> + if (id == engine->id)
> + ret = engine->emit_breadcrumb(rq);
> + else
> + ret = engine->semaphore.signal(rq, id);
> + if (ret)
> + return ret;
> +
> + seqno = rq->seqno;
> + } else if (engine->breadcrumb[id] == 0 ||
> + __i915_seqno_passed(rq->seqno, engine->breadcrumb[id])) {
> + struct i915_gem_request *tmp;
> +
> + tmp = intel_engine_alloc_request(engine,
> + rq->ring->last_context);
> + if (IS_ERR(tmp))
> + return PTR_ERR(tmp);
> +
> + /* Masquerade as a continuation of the earlier request */
> + tmp->reset_counter = rq->reset_counter;
> +
> + ret = __i915_request_emit_breadcrumb(tmp, id);
> + if (ret == 0 && id != engine->id) {
> + /* semaphores are unstable across a wrap */
> + if (tmp->seqno < engine->breadcrumb[id])
> + ret = i915_request_wait(tmp);
> + }
> + if (ret == 0)
> + ret = i915_request_commit(tmp);
> +
> + i915_request_put(tmp);
> + if (ret)
> + return ret;
> +
> + seqno = tmp->seqno;
> + } else
> + seqno = engine->breadcrumb[id];
> +
> + rq->breadcrumb[id] = seqno;
> + return 0;
> +}
> +
> +int
> +i915_request_emit_batchbuffer(struct i915_gem_request *rq,
> + struct i915_vma *batch,
> + uint64_t start, uint32_t len,
> + unsigned flags)
> +{
> + struct intel_engine_cs *engine = rq->engine;
> + int ret;
> +
> + lockdep_assert_held(&rq->i915->dev->struct_mutex);
> + BUG_ON(!rq->outstanding);
> + BUG_ON(rq->breadcrumb[rq->engine->id]);
> +
> + trace_i915_gem_request_emit_batch(rq);
> + ret = engine->emit_batchbuffer(rq, start, len, flags);
> + if (ret)
> + return ret;
> +
> + /* We track the associated batch vma for debugging and error capture.
> + * Whilst this request exists, the batch obj will be on the active_list,
> + * and so will hold the active reference. Only when this request is
> +	 * retired will the batch be moved onto the inactive_list and lose
> + * its active reference. Hence we do not need to explicitly hold
> + * another reference here.
> + */
> + rq->batch = batch;
> + rq->pending_flush |= I915_COMMAND_BARRIER;
> + return 0;
> +}
> +
> +/* Track the batches submitted by clients for throttling */
> +static void
> +add_to_client(struct i915_gem_request *rq)
> +{
> + struct drm_i915_file_private *file_priv = rq->ctx->file_priv;
> +
> + if (file_priv) {
> + spin_lock(&file_priv->mm.lock);
> + list_add_tail(&rq->client_list,
> + &file_priv->mm.request_list);
> + rq->file_priv = file_priv;
> + spin_unlock(&file_priv->mm.lock);
> + }
> +}
> +
> +static void
> +remove_from_client(struct i915_gem_request *rq)
> +{
> + struct drm_i915_file_private *file_priv = rq->file_priv;
> +
> + if (!file_priv)
> + return;
> +
> + spin_lock(&file_priv->mm.lock);
> + if (rq->file_priv) {
> + list_del(&rq->client_list);
> + rq->file_priv = NULL;
> + }
> + spin_unlock(&file_priv->mm.lock);
> +}
> +
> +/* Activity tracking on the object so that we can serialise CPU access to
> + * the object's memory with the GPU.
> + */
> +static void
> +add_to_obj(struct i915_gem_request *rq,
> + struct i915_gem_request__vma *ref)
> +{
> + struct drm_i915_gem_object *obj = ref->obj;
> + struct intel_engine_cs *engine = rq->engine;
> +
> + /* Add a reference if we're newly entering the active list. */
> + if (obj->last_read[engine->id].request == NULL && obj->active++ == 0)
> + drm_gem_object_reference(&obj->base);
> +
> + if (ref->write) {
> + obj->dirty = 1;
> + i915_request_put(obj->last_write.request);
> + obj->last_write.request = i915_request_get(rq);
> + list_move_tail(&obj->last_write.engine_list,
> + &engine->write_list);
> +
> + if (obj->active > 1) {
> + int i;
> +
> + for (i = 0; i < I915_NUM_ENGINES; i++) {
> + if (obj->last_read[i].request == NULL)
> + continue;
> +
> + list_del_init(&obj->last_read[i].engine_list);
> + i915_request_put(obj->last_read[i].request);
> + obj->last_read[i].request = NULL;
> + }
> +
> + obj->active = 1;
> + }
> + }
> +
> + if (ref->fence & VMA_IS_FENCED) {
> + i915_request_put(obj->last_fence.request);
> + obj->last_fence.request = i915_request_get(rq);
> + list_move_tail(&obj->last_fence.engine_list,
> + &engine->fence_list);
> + if (ref->fence & VMA_HAS_FENCE)
> + list_move_tail(&rq->i915->fence_regs[obj->fence_reg].lru_list,
> + &rq->i915->mm.fence_list);
> + }
> +
> + i915_request_put(obj->last_read[engine->id].request);
> + obj->last_read[engine->id].request = i915_request_get(rq);
> + list_move_tail(&obj->last_read[engine->id].engine_list,
> + &engine->read_list);
> +
> + list_move_tail(&ref->vma->mm_list, &ref->vma->vm->active_list);
> +}
> +
> +static bool leave_breadcrumb(struct i915_gem_request *rq)
> +{
> + if (rq->breadcrumb[rq->engine->id])
> + return false;
> +
> + /* Auto-report HEAD every 4k to make sure that we can always wait on
> + * some available ring space in the future. This also caps the
> + * latency of future waits for missed breadcrumbs.
> + */
> + if (__intel_ring_space(rq->ring->tail, rq->ring->breadcrumb_tail,
> + rq->ring->size, 0) >= PAGE_SIZE)
> + return true;
> +
> + return false;
> +}
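
The PAGE_SIZE test above is simply "how many bytes have been emitted since the last breadcrumb". Assuming __intel_ring_space(a, b, size, 0) evaluates to the forward distance from b to a modulo the ring size (which is how its other callers in this patch read), the arithmetic is equivalent to this sketch:

/* sketch only: bytes emitted between two offsets in a power-of-two ring */
static u32 ring_bytes_since(u32 from, u32 to, u32 ring_size)
{
        return (to - from) & (ring_size - 1);
}

/* leave_breadcrumb() asks: ring_bytes_since(breadcrumb_tail, tail, size) >= PAGE_SIZE */
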
> +
> +int i915_request_commit(struct i915_gem_request *rq)
> +{
> + int ret, n;
> +
> + lockdep_assert_held(&rq->i915->dev->struct_mutex);
> +
> + if (!rq->outstanding)
> + return 0;
> +
> + if (rq->head == rq->ring->tail) {
> + rq->completed = true;
> + goto done;
> + }
> +
> + if (intel_engine_hang(rq->engine))
> + i915_handle_error(rq->i915->dev, true, "Simulated hang");
> +
> + if (!check_reset(rq))
> + return rq->i915->mm.interruptible ? -EAGAIN : -EIO;
> +
> + if (leave_breadcrumb(rq)) {
> + ret = i915_request_emit_breadcrumb(rq);
> + if (ret)
> + return ret;
> + }
> +
> + /* TAIL must be aligned to a qword */
> + if ((rq->ring->tail / sizeof (uint32_t)) & 1) {
> + intel_ring_emit(rq->ring, MI_NOOP);
> + intel_ring_advance(rq->ring);
> + }
> + rq->tail = rq->ring->tail;
> + rq->emitted_jiffies = jiffies;
> +
> + intel_runtime_pm_get(rq->i915);
> +
> + trace_i915_gem_request_commit(rq);
> + ret = rq->engine->add_request(rq);
> + if (ret) {
> + intel_runtime_pm_put(rq->i915);
> + return ret;
> + }
> +
> + i915_request_get(rq);
> +
> + rq->outstanding = false;
> + if (rq->breadcrumb[rq->engine->id]) {
> + list_add_tail(&rq->breadcrumb_link, &rq->ring->breadcrumbs);
> + rq->ring->breadcrumb_tail = rq->tail;
> + }
> +
> + memcpy(rq->engine->semaphore.sync,
> + rq->semaphore,
> + sizeof(rq->semaphore));
> + for (n = 0; n < ARRAY_SIZE(rq->breadcrumb); n++)
> + if (rq->breadcrumb[n])
> + rq->engine->breadcrumb[n] = rq->breadcrumb[n];
> +
> + rq->ring->pending_flush = rq->pending_flush;
> +
> + if (rq->batch) {
> + add_to_client(rq);
> + while (!list_empty(&rq->vmas)) {
> + struct i915_gem_request__vma *ref =
> + list_first_entry(&rq->vmas, typeof(*ref), link);
> +
> + add_to_obj(rq, ref);
> + vma_free(ref);
> + }
> + }
> +
> + i915_request_switch_context__commit(rq);
> +
> + rq->engine->last_request = rq;
> +done:
> + rq->ring->last_context = rq->ctx;
> + return 0;
> +}
> +
> +static void fake_irq(unsigned long data)
> +{
> + wake_up_process((struct task_struct *)data);
> +}
> +
> +static bool missed_irq(struct i915_gem_request *rq)
> +{
> + return test_bit(rq->engine->id, &rq->i915->gpu_error.missed_irq_rings);
> +}
> +
> +static bool can_wait_boost(struct drm_i915_file_private *file_priv)
> +{
> + if (file_priv == NULL)
> + return true;
> +
> + return !atomic_xchg(&file_priv->rps_wait_boost, true);
> +}
> +
> +bool __i915_request_complete__wa(struct i915_gem_request *rq)
> +{
> + struct drm_i915_private *dev_priv = rq->i915;
> + unsigned head, tail;
> +
> + if (i915_request_complete(rq))
> + return true;
> +
> + /* With execlists, we rely on interrupts to track request completion */
> + if (rq->engine->execlists_enabled)
> + return false;
> +
> + /* As we may not emit a breadcrumb with every request, we
> + * often have unflushed requests. In the event of an emergency,
> + * just assume that if the RING_HEAD has reached the tail, then
> + * the request is complete. However, note that the RING_HEAD
> + * advances before the instruction completes, so this is quite lax,
> + * and should only be used carefully.
> + *
> + * As we treat this as only an advisory completion, we forgo
> + * marking the request as actually complete.
> + */
> + head = __intel_ring_space(I915_READ_HEAD(rq->engine) & HEAD_ADDR,
> + rq->ring->tail, rq->ring->size, 0);
> + tail = __intel_ring_space(rq->tail,
> + rq->ring->tail, rq->ring->size, 0);
> + return head >= tail;
> +}
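
The head >= tail comparison above works because both values are distances measured forward from the current software tail: a request tail that sits closer to the software tail than RING_HEAD does (modulo the ring size) lies in the stretch of ring the GPU has already read. A hedged standalone rendering of that test (helper and argument names are mine):

static bool request_probably_consumed(u32 hw_head, u32 rq_tail,
                                      u32 sw_tail, u32 ring_size)
{
        /* forward distance from the software tail, wrapping the ring */
        u32 head_dist = (hw_head - sw_tail) & (ring_size - 1);
        u32 rq_dist = (rq_tail - sw_tail) & (ring_size - 1);

        /* the span [sw_tail, hw_head) has already been read by the GPU */
        return head_dist >= rq_dist;
}
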
> +
> +/**
> + * __i915_request_wait - wait until execution of request has finished
> + * @rq: the request to wait upon
> + * @interruptible: do an interruptible wait (normally yes)
> + * @timeout_ns: in - how long to wait (NULL forever); out - how much time remaining
> + * @file_priv: client whose RPS wait-boost state to update (may be NULL)
> + *
> + * Returns 0 if the request was completed within the allotted time. Else returns the
> + * errno with the remaining time filled in the timeout argument.
> + */
> +int __i915_request_wait(struct i915_gem_request *rq,
> + bool interruptible,
> + s64 *timeout_ns,
> + struct drm_i915_file_private *file_priv)
> +{
> + const bool irq_test_in_progress =
> + ACCESS_ONCE(rq->i915->gpu_error.test_irq_rings) & intel_engine_flag(rq->engine);
> + DEFINE_WAIT(wait);
> + unsigned long timeout_expire;
> + unsigned long before, now;
> + int ret = 0;
> +
> + WARN(!intel_irqs_enabled(rq->i915), "IRQs disabled");
> +
> + if (i915_request_complete(rq))
> + return 0;
> +
> + timeout_expire = timeout_ns ? jiffies + nsecs_to_jiffies((u64)*timeout_ns) : 0;
> +
> + if (INTEL_INFO(rq->i915)->gen >= 6 && rq->engine->id == RCS && can_wait_boost(file_priv)) {
> + gen6_rps_boost(rq->i915);
> + if (file_priv)
> + mod_delayed_work(rq->i915->wq,
> + &file_priv->mm.idle_work,
> + msecs_to_jiffies(100));
> + }
> +
> + if (!irq_test_in_progress && WARN_ON(!rq->engine->irq_get(rq->engine)))
> + return -ENODEV;
> +
> + /* Record current time in case interrupted by signal, or wedged */
> + trace_i915_gem_request_wait_begin(rq);
> + before = jiffies;
> + for (;;) {
> + struct timer_list timer;
> +
> + prepare_to_wait(&rq->engine->irq_queue, &wait,
> + interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE);
> +
> + if (!check_reset(rq))
> + break;
> +
> + rq->engine->irq_barrier(rq->engine);
> +
> + if (i915_request_complete(rq))
> + break;
> +
> + if (timeout_ns && time_after_eq(jiffies, timeout_expire)) {
> + ret = -ETIME;
> + break;
> + }
> +
> + if (interruptible && signal_pending(current)) {
> + ret = -ERESTARTSYS;
> + break;
> + }
> +
> + timer.function = NULL;
> + if (timeout_ns || missed_irq(rq)) {
> + unsigned long expire;
> +
> + setup_timer_on_stack(&timer, fake_irq, (unsigned long)current);
> + expire = missed_irq(rq) ? jiffies + 1 : timeout_expire;
> + mod_timer(&timer, expire);
> + }
> +
> + io_schedule();
> +
> + if (timer.function) {
> + del_singleshot_timer_sync(&timer);
> + destroy_timer_on_stack(&timer);
> + }
> + }
> + now = jiffies;
> + trace_i915_gem_request_wait_end(rq);
> +
> + if (!irq_test_in_progress)
> + rq->engine->irq_put(rq->engine);
> +
> + finish_wait(&rq->engine->irq_queue, &wait);
> +
> + if (timeout_ns) {
> + s64 tres = *timeout_ns - jiffies_to_nsecs(now - before);
> + *timeout_ns = tres <= 0 ? 0 : tres;
> + }
> +
> + return ret;
> +}
> +
> +struct i915_gem_request *
> +i915_request_get_breadcrumb(struct i915_gem_request *rq)
> +{
> + struct list_head *list;
> + u32 seqno;
> + int ret;
> +
> + /* Writes are only coherent from the cpu (in the general case) when
> + * the interrupt following the write to memory is complete. That is
> + * when the breadcrumb after the write request is complete.
> + *
> + * Reads are only complete when the command streamer barrier is
> + * passed.
> + *
> + * In both cases, the CPU needs to wait upon the subsequent breadcrumb,
> + * which ensures that all pending flushes have been emitted and are
> + * complete, before reporting that the request is finished and
> + * the CPU's view of memory is coherent with the GPU.
> + */
> +
> + ret = i915_request_emit_breadcrumb(rq);
> + if (ret)
> + return ERR_PTR(ret);
> +
> + ret = i915_request_commit(rq);
> + if (ret)
> + return ERR_PTR(ret);
> +
> + if (!list_empty(&rq->breadcrumb_link))
> + return i915_request_get(rq);
> +
> + seqno = rq->breadcrumb[rq->engine->id];
> + list = &rq->ring->breadcrumbs;
> + list_for_each_entry_reverse(rq, list, breadcrumb_link) {
> + if (rq->seqno == seqno)
> + return i915_request_get(rq);
> + }
> +
> + return ERR_PTR(-EIO);
> +}
> +
> +int
> +i915_request_wait(struct i915_gem_request *rq)
> +{
> + int ret;
> +
> + lockdep_assert_held(&rq->i915->dev->struct_mutex);
> +
> + rq = i915_request_get_breadcrumb(rq);
> + if (IS_ERR(rq))
> + return PTR_ERR(rq);
> +
> + ret = __i915_request_wait(rq, rq->i915->mm.interruptible,
> + NULL, NULL);
> + i915_request_put(rq);
> +
> + return ret;
> +}
> +
> +void
> +i915_request_retire(struct i915_gem_request *rq)
> +{
> + lockdep_assert_held(&rq->i915->dev->struct_mutex);
> +
> + if (!rq->completed) {
> + trace_i915_gem_request_complete(rq);
> + rq->completed = true;
> + }
> + trace_i915_gem_request_retire(rq);
> +
> + /* We know the GPU must have read the request to have
> + * sent us the seqno + interrupt, so we can use the position
> + * of the request's tail to update the last known position
> + * of the GPU head.
> + */
> + if (!list_empty(&rq->breadcrumb_link))
> + rq->ring->retired_head = rq->tail;
> +
> + rq->batch = NULL;
> +
> + /* We need to protect against simultaneous hangcheck/capture */
> + spin_lock(&rq->engine->lock);
> + if (rq->engine->last_request == rq)
> + rq->engine->last_request = NULL;
> + list_del(&rq->engine_list);
> + spin_unlock(&rq->engine->lock);
> +
> + list_del(&rq->breadcrumb_link);
> + remove_from_client(rq);
> +
> + intel_runtime_pm_put(rq->i915);
> + i915_request_put(rq);
> +}
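
The retired_head update is neat - I take it the ringbuffer then refreshes
its consumer pointer from it before computing free space, roughly along
these lines (my sketch; the sentinel and slack constant are guesses, not
taken from the patch):

    /* Fold the last retired request's tail back into the ring's head
     * before working out how much space is left for new commands.
     */
    if (ring->retired_head != -1) {
            ring->head = ring->retired_head;
            ring->retired_head = -1;
    }
    space = ring->head - (ring->tail + I915_RING_FREE_SPACE);
    if (space < 0)
            space += ring->size;
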
> +
> +void
> +__i915_request_free(struct kref *kref)
> +{
> + struct i915_gem_request *rq = container_of(kref, struct i915_gem_request, kref);
> +
> + lockdep_assert_held(&rq->i915->dev->struct_mutex);
> +
> + if (rq->outstanding) {
> + /* Roll back this partial transaction as we never committed
> + * the request to the hardware queue.
> + */
> + rq->ring->tail = rq->head;
> + rq->ring->space = intel_ring_space(rq->ring);
> + }
> +
> + while (!list_empty(&rq->vmas))
> + vma_free(list_first_entry(&rq->vmas,
> + struct i915_gem_request__vma,
> + link));
> +
> + i915_request_switch_context__undo(rq);
> + i915_gem_context_unreference(rq->ctx);
> + kfree(rq);
> +}
> diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> index 2cefb597df6d..a48355c4ef88 100644
> --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> @@ -383,7 +383,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
>
> if (ret == 0) {
> obj->fence_dirty =
> - obj->last_fenced_seqno ||
> + obj->last_fence.request ||
> obj->fence_reg != I915_FENCE_REG_NONE;
>
> obj->tiling_mode = args->tiling_mode;
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 2c87a797213f..adb6358a8f6e 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -192,15 +192,18 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m,
> struct drm_i915_error_buffer *err,
> int count)
> {
> - err_printf(m, " %s [%d]:\n", name, count);
> + int n;
>
> + err_printf(m, " %s [%d]:\n", name, count);
> while (count--) {
> - err_printf(m, " %08x %8u %02x %02x %x %x",
> + err_printf(m, " %08x %8u %02x %02x [",
> err->gtt_offset,
> err->size,
> err->read_domains,
> - err->write_domain,
> - err->rseqno, err->wseqno);
> + err->write_domain);
> + for (n = 0; n < ARRAY_SIZE(err->rseqno); n++)
> + err_printf(m, " %x", err->rseqno[n]);
> + err_printf(m, " ] %x %x ", err->wseqno, err->fseqno);
> err_puts(m, pin_flag(err->pinned));
> err_puts(m, tiling_flag(err->tiling));
> err_puts(m, dirty_flag(err->dirty));
> @@ -220,11 +223,13 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m,
> }
> }
>
> -static const char *hangcheck_action_to_str(enum intel_ring_hangcheck_action a)
> +static const char *hangcheck_action_to_str(enum intel_engine_hangcheck_action a)
> {
> switch (a) {
> case HANGCHECK_IDLE:
> return "idle";
> + case HANGCHECK_IDLE_WAITERS:
> + return "idle (with waiters)";
> case HANGCHECK_WAIT:
> return "wait";
> case HANGCHECK_ACTIVE:
> @@ -244,13 +249,19 @@ static void i915_ring_error_state(struct drm_i915_error_state_buf *m,
> struct drm_device *dev,
> struct drm_i915_error_ring *ring)
> {
> + int n;
> +
> if (!ring->valid)
> return;
>
> - err_printf(m, " HEAD: 0x%08x\n", ring->head);
> - err_printf(m, " TAIL: 0x%08x\n", ring->tail);
> - err_printf(m, " CTL: 0x%08x\n", ring->ctl);
> - err_printf(m, " HWS: 0x%08x\n", ring->hws);
> + err_printf(m, "%s command stream:\n", ring_str(ring->id));
> +
> + err_printf(m, " START: 0x%08x\n", ring->start);
> + err_printf(m, " HEAD: 0x%08x\n", ring->head);
> + err_printf(m, " TAIL: 0x%08x\n", ring->tail);
> + err_printf(m, " CTL: 0x%08x\n", ring->ctl);
> + err_printf(m, " MODE: 0x%08x [idle? %d]\n", ring->mode, !!(ring->mode & MODE_IDLE));
> + err_printf(m, " HWS: 0x%08x\n", ring->hws);
> err_printf(m, " ACTHD: 0x%08x %08x\n", (u32)(ring->acthd>>32), (u32)ring->acthd);
> err_printf(m, " IPEIR: 0x%08x\n", ring->ipeir);
> err_printf(m, " IPEHR: 0x%08x\n", ring->ipehr);
> @@ -266,17 +277,13 @@ static void i915_ring_error_state(struct drm_i915_error_state_buf *m,
> if (INTEL_INFO(dev)->gen >= 6) {
> err_printf(m, " RC PSMI: 0x%08x\n", ring->rc_psmi);
> err_printf(m, " FAULT_REG: 0x%08x\n", ring->fault_reg);
> - err_printf(m, " SYNC_0: 0x%08x [last synced 0x%08x]\n",
> - ring->semaphore_mboxes[0],
> - ring->semaphore_seqno[0]);
> - err_printf(m, " SYNC_1: 0x%08x [last synced 0x%08x]\n",
> - ring->semaphore_mboxes[1],
> - ring->semaphore_seqno[1]);
> - if (HAS_VEBOX(dev)) {
> - err_printf(m, " SYNC_2: 0x%08x [last synced 0x%08x]\n",
> - ring->semaphore_mboxes[2],
> - ring->semaphore_seqno[2]);
> - }
> + err_printf(m, " SYNC_0: 0x%08x\n",
> + ring->semaphore_mboxes[0]);
> + err_printf(m, " SYNC_1: 0x%08x\n",
> + ring->semaphore_mboxes[1]);
> + if (HAS_VEBOX(dev))
> + err_printf(m, " SYNC_2: 0x%08x\n",
> + ring->semaphore_mboxes[2]);
> }
> if (USES_PPGTT(dev)) {
> err_printf(m, " GFX_MODE: 0x%08x\n", ring->vm_info.gfx_mode);
> @@ -291,8 +298,20 @@ static void i915_ring_error_state(struct drm_i915_error_state_buf *m,
> ring->vm_info.pp_dir_base);
> }
> }
> - err_printf(m, " seqno: 0x%08x\n", ring->seqno);
> - err_printf(m, " waiting: %s\n", yesno(ring->waiting));
> + err_printf(m, " tag: 0x%04x\n", ring->tag);
> + err_printf(m, " seqno: 0x%08x [hangcheck 0x%08x, breadcrumb 0x%08x, request 0x%08x]\n",
> + ring->seqno, ring->hangcheck, ring->breadcrumb[ring->id], ring->request);
> + err_printf(m, " sem.signal: [");
> + for (n = 0; n < ARRAY_SIZE(ring->breadcrumb); n++)
> + err_printf(m, " %s%08x", n == ring->id ? "*" : "", ring->breadcrumb[n]);
> + err_printf(m, " ]\n");
> + err_printf(m, " sem.waited: [");
> + for (n = 0; n < ARRAY_SIZE(ring->semaphore_sync); n++)
> + err_printf(m, " %s%08x", n == ring->id ? "*" : "", ring->semaphore_sync[n]);
> + err_printf(m, " ]\n");
> + err_printf(m, " waiting: %s [irq count %d]\n",
> + yesno(ring->waiting), ring->irq_count);
> + err_printf(m, " interrupts: %d\n", ring->interrupts);
> err_printf(m, " ring->head: 0x%08x\n", ring->cpu_ring_head);
> err_printf(m, " ring->tail: 0x%08x\n", ring->cpu_ring_tail);
> err_printf(m, " hangcheck: %s [%d]\n",
> @@ -362,11 +381,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
> err_printf(m, "EIR: 0x%08x\n", error->eir);
> err_printf(m, "IER: 0x%08x\n", error->ier);
> if (INTEL_INFO(dev)->gen >= 8) {
> - for (i = 0; i < 4; i++)
> + for (i = 0; i < 4; i++) {
> err_printf(m, "GTIER gt %d: 0x%08x\n", i,
> error->gtier[i]);
> - } else if (HAS_PCH_SPLIT(dev) || IS_VALLEYVIEW(dev))
> + err_printf(m, "GTIMR gt %d: 0x%08x\n", i,
> + error->gtimr[i]);
> + }
> + } else if (HAS_PCH_SPLIT(dev) || IS_VALLEYVIEW(dev)) {
> err_printf(m, "GTIER: 0x%08x\n", error->gtier[0]);
> + err_printf(m, "GTIMR: 0x%08x\n", error->gtimr[0]);
> + }
> err_printf(m, "PGTBL_ER: 0x%08x\n", error->pgtbl_er);
> err_printf(m, "FORCEWAKE: 0x%08x\n", error->forcewake);
> err_printf(m, "DERRMR: 0x%08x\n", error->derrmr);
> @@ -388,10 +412,8 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
> if (INTEL_INFO(dev)->gen == 7)
> err_printf(m, "ERR_INT: 0x%08x\n", error->err_int);
>
> - for (i = 0; i < ARRAY_SIZE(error->ring); i++) {
> - err_printf(m, "%s command stream:\n", ring_str(i));
> + for (i = 0; i < ARRAY_SIZE(error->ring); i++)
> i915_ring_error_state(m, dev, &error->ring[i]);
> - }
>
> for (i = 0; i < error->vm_count; i++) {
> err_printf(m, "vm[%d]\n", i);
> @@ -406,48 +428,53 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
> }
>
> for (i = 0; i < ARRAY_SIZE(error->ring); i++) {
> - obj = error->ring[i].batchbuffer;
> + const struct drm_i915_error_ring *ering = &error->ring[i];
> + const char *name = dev_priv->engine[ering->id].name;
> +
> + obj = ering->batchbuffer;
> if (obj) {
> - err_puts(m, dev_priv->ring[i].name);
> - if (error->ring[i].pid != -1)
> + err_puts(m, name);
> + if (ering->pid != -1)
> err_printf(m, " (submitted by %s [%d])",
> - error->ring[i].comm,
> - error->ring[i].pid);
> + ering->comm, ering->pid);
> err_printf(m, " --- gtt_offset = 0x%08x\n",
> obj->gtt_offset);
> print_error_obj(m, obj);
> }
>
> - obj = error->ring[i].wa_batchbuffer;
> + obj = ering->wa_batchbuffer;
> if (obj) {
> err_printf(m, "%s (w/a) --- gtt_offset = 0x%08x\n",
> - dev_priv->ring[i].name, obj->gtt_offset);
> + name, obj->gtt_offset);
> print_error_obj(m, obj);
> }
>
> - if (error->ring[i].num_requests) {
> + if (ering->num_requests) {
> err_printf(m, "%s --- %d requests\n",
> - dev_priv->ring[i].name,
> - error->ring[i].num_requests);
> - for (j = 0; j < error->ring[i].num_requests; j++) {
> - err_printf(m, " seqno 0x%08x, emitted %ld, tail 0x%08x\n",
> - error->ring[i].requests[j].seqno,
> - error->ring[i].requests[j].jiffies,
> - error->ring[i].requests[j].tail);
> + name, ering->num_requests);
> + for (j = 0; j < ering->num_requests; j++) {
> + err_printf(m, " pid %ld, seqno 0x%08x, tag 0x%04x, emitted %dms ago (at %ld jiffies), head 0x%08x, tail 0x%08x, batch 0x%08x, complete? %d\n",
> + ering->requests[j].pid,
> + ering->requests[j].seqno,
> + ering->requests[j].tag,
> + jiffies_to_usecs(jiffies - ering->requests[j].jiffies) / 1000,
> + ering->requests[j].jiffies,
> + ering->requests[j].head,
> + ering->requests[j].tail,
> + ering->requests[j].batch,
> + ering->requests[j].complete);
> }
> }
>
> - if ((obj = error->ring[i].ringbuffer)) {
> + if ((obj = ering->ringbuffer)) {
> err_printf(m, "%s --- ringbuffer = 0x%08x\n",
> - dev_priv->ring[i].name,
> - obj->gtt_offset);
> + name, obj->gtt_offset);
> print_error_obj(m, obj);
> }
>
> - if ((obj = error->ring[i].hws_page)) {
> + if ((obj = ering->hws_page)) {
> err_printf(m, "%s --- HW Status = 0x%08x\n",
> - dev_priv->ring[i].name,
> - obj->gtt_offset);
> + name, obj->gtt_offset);
> offset = 0;
> for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
> err_printf(m, "[%04x] %08x %08x %08x %08x\n",
> @@ -462,8 +489,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>
> if ((obj = error->ring[i].ctx)) {
> err_printf(m, "%s --- HW Context = 0x%08x\n",
> - dev_priv->ring[i].name,
> - obj->gtt_offset);
> + name, obj->gtt_offset);
> print_error_obj(m, obj);
> }
> }
> @@ -561,16 +587,20 @@ static void i915_error_state_free(struct kref *error_ref)
>
> static struct drm_i915_error_object *
> i915_error_object_create(struct drm_i915_private *dev_priv,
> - struct drm_i915_gem_object *src,
> - struct i915_address_space *vm)
> + struct i915_vma *vma)
> {
> + struct drm_i915_gem_object *src;
> struct drm_i915_error_object *dst;
> int num_pages;
> bool use_ggtt;
> int i = 0;
> u32 reloc_offset;
>
> - if (src == NULL || src->pages == NULL)
> + if (vma == NULL)
> + return NULL;
> +
> + src = vma->obj;
> + if (src->pages == NULL)
> return NULL;
>
> num_pages = src->base.size >> PAGE_SHIFT;
> @@ -579,14 +609,11 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
> if (dst == NULL)
> return NULL;
>
> - if (i915_gem_obj_bound(src, vm))
> - dst->gtt_offset = i915_gem_obj_offset(src, vm);
> - else
> - dst->gtt_offset = -1;
> + dst->gtt_offset = vma->node.start;
>
> reloc_offset = dst->gtt_offset;
> use_ggtt = (src->cache_level == I915_CACHE_NONE &&
> - i915_is_ggtt(vm) &&
> + i915_is_ggtt(vma->vm) &&
> src->has_global_gtt_mapping &&
> reloc_offset + num_pages * PAGE_SIZE <= dev_priv->gtt.mappable_end);
>
> @@ -656,18 +683,31 @@ unwind:
> kfree(dst);
> return NULL;
> }
> -#define i915_error_ggtt_object_create(dev_priv, src) \
> - i915_error_object_create((dev_priv), (src), &(dev_priv)->gtt.base)
> +
> +static inline struct drm_i915_error_object *
> +i915_error_ggtt_object_create(struct drm_i915_private *i915,
> + struct drm_i915_gem_object *src)
> +{
> + if (src == NULL)
> + return NULL;
> +
> + return i915_error_object_create(i915,
> + i915_gem_obj_to_vma(src,
> + &i915->gtt.base));
> +}
>
> static void capture_bo(struct drm_i915_error_buffer *err,
> struct i915_vma *vma)
> {
> struct drm_i915_gem_object *obj = vma->obj;
> + int n;
>
> err->size = obj->base.size;
> err->name = obj->base.name;
> - err->rseqno = obj->last_read_seqno;
> - err->wseqno = obj->last_write_seqno;
> + for (n = 0; n < ARRAY_SIZE(obj->last_read); n++)
> + err->rseqno[n] = i915_request_seqno(obj->last_read[n].request);
> + err->wseqno = i915_request_seqno(obj->last_write.request);
> + err->fseqno = i915_request_seqno(obj->last_fence.request);
> err->gtt_offset = vma->node.start;
> err->read_domains = obj->base.read_domains;
> err->write_domain = obj->base.write_domain;
> @@ -681,7 +721,7 @@ static void capture_bo(struct drm_i915_error_buffer *err,
> err->dirty = obj->dirty;
> err->purgeable = obj->madv != I915_MADV_WILLNEED;
> err->userptr = obj->userptr.mm != NULL;
> - err->ring = obj->ring ? obj->ring->id : -1;
> + err->ring = i915_request_engine_id(obj->last_write.request);
> err->cache_level = obj->cache_level;
> }
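
Capturing one rseqno per engine fits the read/write split nicely. I'm
assuming the two helpers used above are just NULL-tolerant accessors,
something like (sketch only, names taken from the call sites):

    static inline u32 i915_request_seqno(struct i915_gem_request *rq)
    {
            return rq ? rq->seqno : 0;
    }

    static inline int i915_request_engine_id(struct i915_gem_request *rq)
    {
            return rq ? rq->engine->id : -1;
    }
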
>
> @@ -745,7 +785,7 @@ static uint32_t i915_error_generate_code(struct drm_i915_private *dev_priv,
> * synchronization commands which almost always appear in the case
> * strictly a client bug. Use instdone to differentiate those some.
> */
> - for (i = 0; i < I915_NUM_RINGS; i++) {
> + for (i = 0; i < I915_NUM_ENGINES; i++) {
> if (error->ring[i].hangcheck_action == HANGCHECK_HUNG) {
> if (ring_id)
> *ring_id = i;
> @@ -793,83 +833,77 @@ static void i915_gem_record_fences(struct drm_device *dev,
>
> static void gen8_record_semaphore_state(struct drm_i915_private *dev_priv,
> struct drm_i915_error_state *error,
> - struct intel_engine_cs *ring,
> + struct intel_engine_cs *engine,
> struct drm_i915_error_ring *ering)
> {
> struct intel_engine_cs *to;
> + u32 *mbox;
> int i;
>
> - if (!i915_semaphore_is_enabled(dev_priv->dev))
> + if (dev_priv->semaphore_obj == NULL)
> return;
>
> - if (!error->semaphore_obj)
> + if (error->semaphore_obj == NULL)
> error->semaphore_obj =
> - i915_error_object_create(dev_priv,
> - dev_priv->semaphore_obj,
> - &dev_priv->gtt.base);
> -
> - for_each_ring(to, dev_priv, i) {
> - int idx;
> - u16 signal_offset;
> - u32 *tmp;
> + i915_error_ggtt_object_create(dev_priv,
> + dev_priv->semaphore_obj);
> + if (error->semaphore_obj == NULL)
> + return;
>
> - if (ring == to)
> + mbox = error->semaphore_obj->pages[0];
> + for_each_engine(to, dev_priv, i) {
> + if (engine == to)
> continue;
>
> - signal_offset = (GEN8_SIGNAL_OFFSET(ring, i) & (PAGE_SIZE - 1))
> - / 4;
> - tmp = error->semaphore_obj->pages[0];
> - idx = intel_ring_sync_index(ring, to);
> -
> - ering->semaphore_mboxes[idx] = tmp[signal_offset];
> - ering->semaphore_seqno[idx] = ring->semaphore.sync_seqno[idx];
> + ering->semaphore_mboxes[i] =
> + mbox[(GEN8_SEMAPHORE_OFFSET(dev_priv,
> + engine->id,
> + i) & (PAGE_SIZE - 1)) / 4];
> }
> }
>
> static void gen6_record_semaphore_state(struct drm_i915_private *dev_priv,
> - struct intel_engine_cs *ring,
> + struct intel_engine_cs *engine,
> struct drm_i915_error_ring *ering)
> {
> - ering->semaphore_mboxes[0] = I915_READ(RING_SYNC_0(ring->mmio_base));
> - ering->semaphore_mboxes[1] = I915_READ(RING_SYNC_1(ring->mmio_base));
> - ering->semaphore_seqno[0] = ring->semaphore.sync_seqno[0];
> - ering->semaphore_seqno[1] = ring->semaphore.sync_seqno[1];
> -
> + ering->semaphore_mboxes[0] = I915_READ(RING_SYNC_0(engine->mmio_base));
> + ering->semaphore_mboxes[1] = I915_READ(RING_SYNC_1(engine->mmio_base));
> if (HAS_VEBOX(dev_priv->dev)) {
> ering->semaphore_mboxes[2] =
> - I915_READ(RING_SYNC_2(ring->mmio_base));
> - ering->semaphore_seqno[2] = ring->semaphore.sync_seqno[2];
> + I915_READ(RING_SYNC_2(engine->mmio_base));
> }
> }
>
> static void i915_record_ring_state(struct drm_device *dev,
> struct drm_i915_error_state *error,
> - struct intel_engine_cs *ring,
> + struct intel_engine_cs *engine,
> + struct i915_gem_request *rq,
> struct drm_i915_error_ring *ering)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> + struct intel_ringbuffer *ring;
>
> if (INTEL_INFO(dev)->gen >= 6) {
> - ering->rc_psmi = I915_READ(ring->mmio_base + 0x50);
> - ering->fault_reg = I915_READ(RING_FAULT_REG(ring));
> + ering->rc_psmi = I915_READ(engine->mmio_base + 0x50);
> + ering->fault_reg = I915_READ(RING_FAULT_REG(engine));
> if (INTEL_INFO(dev)->gen >= 8)
> - gen8_record_semaphore_state(dev_priv, error, ring, ering);
> + gen8_record_semaphore_state(dev_priv, error, engine, ering);
> else
> - gen6_record_semaphore_state(dev_priv, ring, ering);
> + gen6_record_semaphore_state(dev_priv, engine, ering);
> }
>
> if (INTEL_INFO(dev)->gen >= 4) {
> - ering->faddr = I915_READ(RING_DMA_FADD(ring->mmio_base));
> - ering->ipeir = I915_READ(RING_IPEIR(ring->mmio_base));
> - ering->ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
> - ering->instdone = I915_READ(RING_INSTDONE(ring->mmio_base));
> - ering->instps = I915_READ(RING_INSTPS(ring->mmio_base));
> - ering->bbaddr = I915_READ(RING_BBADDR(ring->mmio_base));
> + ering->faddr = I915_READ(RING_DMA_FADD(engine->mmio_base));
> + ering->ipeir = I915_READ(RING_IPEIR(engine->mmio_base));
> + ering->ipehr = I915_READ(RING_IPEHR(engine->mmio_base));
> + ering->instdone = I915_READ(RING_INSTDONE(engine->mmio_base));
> + ering->instps = I915_READ(RING_INSTPS(engine->mmio_base));
> + ering->bbaddr = I915_READ(RING_BBADDR(engine->mmio_base));
> if (INTEL_INFO(dev)->gen >= 8) {
> - ering->faddr |= (u64) I915_READ(RING_DMA_FADD_UDW(ring->mmio_base)) << 32;
> - ering->bbaddr |= (u64) I915_READ(RING_BBADDR_UDW(ring->mmio_base)) << 32;
> + ering->faddr |= (u64) I915_READ(RING_DMA_FADD_UDW(engine->mmio_base)) << 32;
> + ering->bbaddr |= (u64) I915_READ(RING_BBADDR_UDW(engine->mmio_base)) << 32;
> }
> - ering->bbstate = I915_READ(RING_BBSTATE(ring->mmio_base));
> + ering->bbstate = I915_READ(RING_BBSTATE(engine->mmio_base));
> } else {
> ering->faddr = I915_READ(DMA_FADD_I8XX);
> ering->ipeir = I915_READ(IPEIR);
> @@ -877,19 +911,29 @@ static void i915_record_ring_state(struct drm_device *dev,
> ering->instdone = I915_READ(INSTDONE);
> }
>
> - ering->waiting = waitqueue_active(&ring->irq_queue);
> - ering->instpm = I915_READ(RING_INSTPM(ring->mmio_base));
> - ering->seqno = ring->get_seqno(ring, false);
> - ering->acthd = intel_ring_get_active_head(ring);
> - ering->head = I915_READ_HEAD(ring);
> - ering->tail = I915_READ_TAIL(ring);
> - ering->ctl = I915_READ_CTL(ring);
> + ering->waiting = waitqueue_active(&engine->irq_queue);
> + ering->instpm = I915_READ(RING_INSTPM(engine->mmio_base));
> + ering->acthd = intel_engine_get_active_head(engine);
> + ering->seqno = engine->get_seqno(engine);
> + ering->request = engine->last_request ? engine->last_request->seqno : 0;
> + ering->hangcheck = engine->hangcheck.seqno;
> + memcpy(ering->breadcrumb, engine->breadcrumb, sizeof(ering->breadcrumb));
> + memcpy(ering->semaphore_sync, engine->semaphore.sync, sizeof(ering->semaphore_sync));
> + ering->tag = engine->tag;
> + ering->interrupts = atomic_read(&engine->interrupts);
> + ering->irq_count = engine->irq_refcount;
> + ering->start = I915_READ_START(engine);
> + ering->head = I915_READ_HEAD(engine);
> + ering->tail = I915_READ_TAIL(engine);
> + ering->ctl = I915_READ_CTL(engine);
> + if (!IS_GEN2(dev_priv))
> + ering->mode = I915_READ_MODE(engine);
>
> if (I915_NEED_GFX_HWS(dev)) {
> int mmio;
>
> if (IS_GEN7(dev)) {
> - switch (ring->id) {
> + switch (engine->id) {
> default:
> case RCS:
> mmio = RENDER_HWS_PGA_GEN7;
> @@ -904,56 +948,67 @@ static void i915_record_ring_state(struct drm_device *dev,
> mmio = VEBOX_HWS_PGA_GEN7;
> break;
> }
> - } else if (IS_GEN6(ring->dev)) {
> - mmio = RING_HWS_PGA_GEN6(ring->mmio_base);
> + } else if (IS_GEN6(engine->i915)) {
> + mmio = RING_HWS_PGA_GEN6(engine->mmio_base);
> } else {
> /* XXX: gen8 returns to sanity */
> - mmio = RING_HWS_PGA(ring->mmio_base);
> + mmio = RING_HWS_PGA(engine->mmio_base);
> }
>
> ering->hws = I915_READ(mmio);
> }
>
> - ering->hangcheck_score = ring->hangcheck.score;
> - ering->hangcheck_action = ring->hangcheck.action;
> + ring = rq ? rq->ctx->ring[engine->id].ring : engine->default_context->ring[engine->id].ring;
> + if (ring) {
> + ering->cpu_ring_head = ring->head;
> + ering->cpu_ring_tail = ring->tail;
> + ering->ringbuffer =
> + i915_error_ggtt_object_create(dev_priv, ring->obj);
> + }
> +
> + ering->hws_page =
> + i915_error_ggtt_object_create(dev_priv,
> + engine->status_page.obj);
> +
> + ering->hangcheck_score = engine->hangcheck.score;
> + ering->hangcheck_action = engine->hangcheck.action;
>
> if (USES_PPGTT(dev)) {
> int i;
>
> - ering->vm_info.gfx_mode = I915_READ(RING_MODE_GEN7(ring));
> + ering->vm_info.gfx_mode = I915_READ(RING_MODE_GEN7(engine));
>
> switch (INTEL_INFO(dev)->gen) {
> case 8:
> for (i = 0; i < 4; i++) {
> ering->vm_info.pdp[i] =
> - I915_READ(GEN8_RING_PDP_UDW(ring, i));
> + I915_READ(GEN8_RING_PDP_UDW(engine, i));
> ering->vm_info.pdp[i] <<= 32;
> ering->vm_info.pdp[i] |=
> - I915_READ(GEN8_RING_PDP_LDW(ring, i));
> + I915_READ(GEN8_RING_PDP_LDW(engine, i));
> }
> break;
> case 7:
> ering->vm_info.pp_dir_base =
> - I915_READ(RING_PP_DIR_BASE(ring));
> + I915_READ(RING_PP_DIR_BASE(engine));
> break;
> case 6:
> ering->vm_info.pp_dir_base =
> - I915_READ(RING_PP_DIR_BASE_READ(ring));
> + I915_READ(RING_PP_DIR_BASE_READ(engine));
> break;
> }
> }
> }
>
> -
> -static void i915_gem_record_active_context(struct intel_engine_cs *ring,
> +static void i915_gem_record_active_context(struct intel_engine_cs *engine,
> struct drm_i915_error_state *error,
> struct drm_i915_error_ring *ering)
> {
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> struct drm_i915_gem_object *obj;
>
> /* Currently render ring is the only HW context user */
> - if (ring->id != RCS || !error->ccid)
> + if (engine->id != RCS || !error->ccid)
> return;
>
> list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> @@ -971,49 +1026,40 @@ static void i915_gem_record_rings(struct drm_device *dev,
> struct drm_i915_error_state *error)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct drm_i915_gem_request *request;
> + struct i915_gem_request *rq;
> int i, count;
>
> - for (i = 0; i < I915_NUM_RINGS; i++) {
> - struct intel_engine_cs *ring = &dev_priv->ring[i];
> - struct intel_ringbuffer *rbuf;
> + for (i = 0; i < I915_NUM_ENGINES; i++) {
> + struct intel_engine_cs *engine = &dev_priv->engine[i];
>
> error->ring[i].pid = -1;
>
> - if (ring->dev == NULL)
> + if (engine->i915 == NULL)
> continue;
>
> error->ring[i].valid = true;
> + error->ring[i].id = i;
>
> - i915_record_ring_state(dev, error, ring, &error->ring[i]);
> -
> - request = i915_gem_find_active_request(ring);
> - if (request) {
> - struct i915_address_space *vm;
> -
> - vm = request->ctx && request->ctx->ppgtt ?
> - &request->ctx->ppgtt->base :
> - &dev_priv->gtt.base;
> -
> + spin_lock(&engine->lock);
> + rq = intel_engine_find_active_batch(engine);
> + if (rq) {
> /* We need to copy these to an anonymous buffer
> * as the simplest method to avoid being overwritten
> * by userspace.
> */
> error->ring[i].batchbuffer =
> - i915_error_object_create(dev_priv,
> - request->batch_obj,
> - vm);
> + i915_error_object_create(dev_priv, rq->batch);
>
> if (HAS_BROKEN_CS_TLB(dev_priv->dev))
> error->ring[i].wa_batchbuffer =
> i915_error_ggtt_object_create(dev_priv,
> - ring->scratch.obj);
> + engine->scratch.obj);
>
> - if (request->file_priv) {
> + if (rq->file_priv) {
> struct task_struct *task;
>
> rcu_read_lock();
> - task = pid_task(request->file_priv->file->pid,
> + task = pid_task(rq->file_priv->file->pid,
> PIDTYPE_PID);
> if (task) {
> strcpy(error->ring[i].comm, task->comm);
> @@ -1023,32 +1069,12 @@ static void i915_gem_record_rings(struct drm_device *dev,
> }
> }
>
> - if (i915.enable_execlists) {
> - /* TODO: This is only a small fix to keep basic error
> - * capture working, but we need to add more information
> - * for it to be useful (e.g. dump the context being
> - * executed).
> - */
> - if (request)
> - rbuf = request->ctx->engine[ring->id].ringbuf;
> - else
> - rbuf = ring->default_context->engine[ring->id].ringbuf;
> - } else
> - rbuf = ring->buffer;
> + i915_record_ring_state(dev, error, engine, rq, &error->ring[i]);
>
> - error->ring[i].cpu_ring_head = rbuf->head;
> - error->ring[i].cpu_ring_tail = rbuf->tail;
> -
> - error->ring[i].ringbuffer =
> - i915_error_ggtt_object_create(dev_priv, rbuf->obj);
> -
> - error->ring[i].hws_page =
> - i915_error_ggtt_object_create(dev_priv, ring->status_page.obj);
> -
> - i915_gem_record_active_context(ring, error, &error->ring[i]);
> + i915_gem_record_active_context(engine, error, &error->ring[i]);
>
> count = 0;
> - list_for_each_entry(request, &ring->request_list, list)
> + list_for_each_entry(rq, &engine->requests, engine_list)
> count++;
>
> error->ring[i].num_requests = count;
> @@ -1061,14 +1087,28 @@ static void i915_gem_record_rings(struct drm_device *dev,
> }
>
> count = 0;
> - list_for_each_entry(request, &ring->request_list, list) {
> + list_for_each_entry(rq, &engine->requests, engine_list) {
> struct drm_i915_error_request *erq;
> + struct task_struct *task;
>
> erq = &error->ring[i].requests[count++];
> - erq->seqno = request->seqno;
> - erq->jiffies = request->emitted_jiffies;
> - erq->tail = request->tail;
> + erq->seqno = rq->seqno;
> + erq->jiffies = rq->emitted_jiffies;
> + erq->head = rq->head;
> + erq->tail = rq->tail;
> + erq->batch = 0;
> + if (rq->batch)
> + erq->batch = rq->batch->node.start;
> + memcpy(erq->breadcrumb, rq->breadcrumb, sizeof(rq->breadcrumb));
> + erq->complete = i915_request_complete(rq);
> + erq->tag = rq->tag;
> +
> + rcu_read_lock();
> + task = rq->file_priv ? pid_task(rq->file_priv->file->pid, PIDTYPE_PID) : NULL;
> + erq->pid = task ? task->pid : 0;
> + rcu_read_unlock();
> }
> + spin_unlock(&engine->lock);
> }
> }
>
> @@ -1175,6 +1215,7 @@ static void i915_capture_reg_state(struct drm_i915_private *dev_priv,
> /* 1: Registers specific to a single generation */
> if (IS_VALLEYVIEW(dev)) {
> error->gtier[0] = I915_READ(GTIER);
> + error->gtimr[0] = I915_READ(GTIMR);
> error->ier = I915_READ(VLV_IER);
> error->forcewake = I915_READ(FORCEWAKE_VLV);
> }
> @@ -1210,11 +1251,14 @@ static void i915_capture_reg_state(struct drm_i915_private *dev_priv,
>
> if (INTEL_INFO(dev)->gen >= 8) {
> error->ier = I915_READ(GEN8_DE_MISC_IER);
> - for (i = 0; i < 4; i++)
> + for (i = 0; i < 4; i++) {
> error->gtier[i] = I915_READ(GEN8_GT_IER(i));
> + error->gtimr[i] = I915_READ(GEN8_GT_IMR(i));
> + }
> } else if (HAS_PCH_SPLIT(dev)) {
> error->ier = I915_READ(DEIER);
> error->gtier[0] = I915_READ(GTIER);
> + error->gtimr[0] = I915_READ(GTIMR);
> } else if (IS_GEN2(dev)) {
> error->ier = I915_READ16(IER);
> } else if (!IS_VALLEYVIEW(dev)) {
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index a40a8c9f9758..71bdd9b3784f 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1256,17 +1256,15 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
> }
>
> static void notify_ring(struct drm_device *dev,
> - struct intel_engine_cs *ring)
> + struct intel_engine_cs *engine)
> {
> - if (!intel_ring_initialized(ring))
> + if (!intel_engine_initialized(engine))
> return;
>
> - trace_i915_gem_request_complete(ring);
> + trace_i915_gem_ring_complete(engine);
> + atomic_inc(&engine->interrupts);
>
> - if (drm_core_check_feature(dev, DRIVER_MODESET))
> - intel_notify_mmio_flip(ring);
> -
> - wake_up_all(&ring->irq_queue);
> + wake_up_all(&engine->irq_queue);
> i915_queue_hangcheck(dev);
> }
>
> @@ -1584,9 +1582,9 @@ static void ilk_gt_irq_handler(struct drm_device *dev,
> {
> if (gt_iir &
> (GT_RENDER_USER_INTERRUPT | GT_RENDER_PIPECTL_NOTIFY_INTERRUPT))
> - notify_ring(dev, &dev_priv->ring[RCS]);
> + notify_ring(dev, &dev_priv->engine[RCS]);
> if (gt_iir & ILK_BSD_USER_INTERRUPT)
> - notify_ring(dev, &dev_priv->ring[VCS]);
> + notify_ring(dev, &dev_priv->engine[VCS]);
> }
>
> static void snb_gt_irq_handler(struct drm_device *dev,
> @@ -1596,11 +1594,11 @@ static void snb_gt_irq_handler(struct drm_device *dev,
>
> if (gt_iir &
> (GT_RENDER_USER_INTERRUPT | GT_RENDER_PIPECTL_NOTIFY_INTERRUPT))
> - notify_ring(dev, &dev_priv->ring[RCS]);
> + notify_ring(dev, &dev_priv->engine[RCS]);
> if (gt_iir & GT_BSD_USER_INTERRUPT)
> - notify_ring(dev, &dev_priv->ring[VCS]);
> + notify_ring(dev, &dev_priv->engine[VCS]);
> if (gt_iir & GT_BLT_USER_INTERRUPT)
> - notify_ring(dev, &dev_priv->ring[BCS]);
> + notify_ring(dev, &dev_priv->engine[BCS]);
>
> if (gt_iir & (GT_BLT_CS_ERROR_INTERRUPT |
> GT_BSD_CS_ERROR_INTERRUPT |
> @@ -1630,7 +1628,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
> struct drm_i915_private *dev_priv,
> u32 master_ctl)
> {
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> u32 rcs, bcs, vcs;
> uint32_t tmp = 0;
> irqreturn_t ret = IRQ_NONE;
> @@ -1642,18 +1640,18 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
> ret = IRQ_HANDLED;
>
> rcs = tmp >> GEN8_RCS_IRQ_SHIFT;
> - ring = &dev_priv->ring[RCS];
> + engine = &dev_priv->engine[RCS];
> if (rcs & GT_RENDER_USER_INTERRUPT)
> - notify_ring(dev, ring);
> + notify_ring(dev, engine);
> if (rcs & GT_CONTEXT_SWITCH_INTERRUPT)
> - intel_execlists_handle_ctx_events(ring);
> + intel_execlists_irq_handler(engine);
>
> bcs = tmp >> GEN8_BCS_IRQ_SHIFT;
> - ring = &dev_priv->ring[BCS];
> + engine = &dev_priv->engine[BCS];
> if (bcs & GT_RENDER_USER_INTERRUPT)
> - notify_ring(dev, ring);
> + notify_ring(dev, engine);
> if (bcs & GT_CONTEXT_SWITCH_INTERRUPT)
> - intel_execlists_handle_ctx_events(ring);
> + intel_execlists_irq_handler(engine);
> } else
> DRM_ERROR("The master control interrupt lied (GT0)!\n");
> }
> @@ -1665,18 +1663,18 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
> ret = IRQ_HANDLED;
>
> vcs = tmp >> GEN8_VCS1_IRQ_SHIFT;
> - ring = &dev_priv->ring[VCS];
> + engine = &dev_priv->engine[VCS];
> if (vcs & GT_RENDER_USER_INTERRUPT)
> - notify_ring(dev, ring);
> + notify_ring(dev, engine);
> if (vcs & GT_CONTEXT_SWITCH_INTERRUPT)
> - intel_execlists_handle_ctx_events(ring);
> + intel_execlists_irq_handler(engine);
>
> vcs = tmp >> GEN8_VCS2_IRQ_SHIFT;
> - ring = &dev_priv->ring[VCS2];
> + engine = &dev_priv->engine[VCS2];
> if (vcs & GT_RENDER_USER_INTERRUPT)
> - notify_ring(dev, ring);
> + notify_ring(dev, engine);
> if (vcs & GT_CONTEXT_SWITCH_INTERRUPT)
> - intel_execlists_handle_ctx_events(ring);
> + intel_execlists_irq_handler(engine);
> } else
> DRM_ERROR("The master control interrupt lied (GT1)!\n");
> }
> @@ -1699,11 +1697,11 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_device *dev,
> ret = IRQ_HANDLED;
>
> vcs = tmp >> GEN8_VECS_IRQ_SHIFT;
> - ring = &dev_priv->ring[VECS];
> + engine = &dev_priv->engine[VECS];
> if (vcs & GT_RENDER_USER_INTERRUPT)
> - notify_ring(dev, ring);
> + notify_ring(dev, engine);
> if (vcs & GT_CONTEXT_SWITCH_INTERRUPT)
> - intel_execlists_handle_ctx_events(ring);
> + intel_execlists_irq_handler(engine);
> } else
> DRM_ERROR("The master control interrupt lied (GT3)!\n");
> }
> @@ -2021,7 +2019,7 @@ static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir)
>
> if (HAS_VEBOX(dev_priv->dev)) {
> if (pm_iir & PM_VEBOX_USER_INTERRUPT)
> - notify_ring(dev_priv->dev, &dev_priv->ring[VECS]);
> + notify_ring(dev_priv->dev, &dev_priv->engine[VECS]);
>
> if (pm_iir & PM_VEBOX_CS_ERROR_INTERRUPT) {
> i915_handle_error(dev_priv->dev, false,
> @@ -2654,7 +2652,7 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg)
> static void i915_error_wake_up(struct drm_i915_private *dev_priv,
> bool reset_completed)
> {
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> int i;
>
> /*
> @@ -2665,8 +2663,8 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv,
> */
>
> /* Wake up __wait_seqno, potentially holding dev->struct_mutex. */
> - for_each_ring(ring, dev_priv, i)
> - wake_up_all(&ring->irq_queue);
> + for_each_engine(engine, dev_priv, i)
> + wake_up_all(&engine->irq_queue);
>
> /* Wake up intel_crtc_wait_for_pending_flips, holding crtc->mutex. */
> wake_up_all(&dev_priv->pending_flip_queue);
> @@ -2710,7 +2708,7 @@ static void i915_error_work_func(struct work_struct *work)
> * the reset in-progress bit is only ever set by code outside of this
> * work we don't need to worry about any other races.
> */
> - if (i915_reset_in_progress(error) && !i915_terminally_wedged(error)) {
> + if (i915_recovery_pending(error)) {
> DRM_DEBUG_DRIVER("resetting chip\n");
> kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE,
> reset_event);
> @@ -2746,9 +2744,7 @@ static void i915_error_work_func(struct work_struct *work)
> * updates before
> * the counter increment.
> */
> - smp_mb__before_atomic();
> - atomic_inc(&dev_priv->gpu_error.reset_counter);
> -
> + smp_mb__after_atomic();
> kobject_uevent_env(&dev->primary->kdev->kobj,
> KOBJ_CHANGE, reset_done_event);
> } else {
> @@ -3033,24 +3029,28 @@ static void gen8_disable_vblank(struct drm_device *dev, int pipe)
> spin_unlock_irqrestore(&dev_priv->irq_lock, irqflags);
> }
>
> -static u32
> -ring_last_seqno(struct intel_engine_cs *ring)
> -{
> - return list_entry(ring->request_list.prev,
> - struct drm_i915_gem_request, list)->seqno;
> -}
> -
> static bool
> -ring_idle(struct intel_engine_cs *ring, u32 seqno)
> +engine_idle(struct intel_engine_cs *engine)
> {
> - return (list_empty(&ring->request_list) ||
> - i915_seqno_passed(seqno, ring_last_seqno(ring)));
> + bool ret = true;
> +
> + spin_lock(&engine->lock);
> + if (engine->last_request) {
> + /* poke to make sure we retire before we wake up again */
> + queue_delayed_work(engine->i915->wq,
> + &engine->i915->mm.retire_work,
> + round_jiffies_up_relative(DRM_I915_HANGCHECK_JIFFIES/2));
> + ret = __i915_request_complete__wa(engine->last_request);
> + }
> + spin_unlock(&engine->lock);
> +
> + return ret;
> }
>
> static bool
> -ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
> +ipehr_is_semaphore_wait(struct drm_i915_private *i915, u32 ipehr)
> {
> - if (INTEL_INFO(dev)->gen >= 8) {
> + if (INTEL_INFO(i915)->gen >= 8) {
> return (ipehr >> 23) == 0x1c;
> } else {
> ipehr &= ~MI_SEMAPHORE_SYNC_MASK;
> @@ -3060,48 +3060,54 @@ ipehr_is_semaphore_wait(struct drm_device *dev, u32 ipehr)
> }
>
> static struct intel_engine_cs *
> -semaphore_wait_to_signaller_ring(struct intel_engine_cs *ring, u32 ipehr, u64 offset)
> +semaphore_wait_to_signaller_engine(struct intel_engine_cs *engine, u32 ipehr, u64 offset)
> {
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> struct intel_engine_cs *signaller;
> int i;
>
> if (INTEL_INFO(dev_priv->dev)->gen >= 8) {
> - for_each_ring(signaller, dev_priv, i) {
> - if (ring == signaller)
> + for_each_engine(signaller, dev_priv, i) {
> + if (engine == signaller)
> continue;
>
> - if (offset == signaller->semaphore.signal_ggtt[ring->id])
> + if (offset == GEN8_SEMAPHORE_OFFSET(dev_priv, signaller->id, engine->id))
> return signaller;
> }
> } else {
> u32 sync_bits = ipehr & MI_SEMAPHORE_SYNC_MASK;
>
> - for_each_ring(signaller, dev_priv, i) {
> - if(ring == signaller)
> + for_each_engine(signaller, dev_priv, i) {
> + if (engine == signaller)
> continue;
>
> - if (sync_bits == signaller->semaphore.mbox.wait[ring->id])
> + if (sync_bits == signaller->semaphore.mbox.wait[engine->id])
> return signaller;
> }
> }
>
> DRM_ERROR("No signaller ring found for ring %i, ipehr 0x%08x, offset 0x%016llx\n",
> - ring->id, ipehr, offset);
> + engine->id, ipehr, offset);
>
> return NULL;
> }
>
> static struct intel_engine_cs *
> -semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
> +semaphore_waits_for(struct intel_engine_cs *engine, u32 *seqno)
> {
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> + struct intel_ringbuffer *ring;
> u32 cmd, ipehr, head;
> u64 offset = 0;
> int i, backwards;
>
> - ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
> - if (!ipehr_is_semaphore_wait(ring->dev, ipehr))
> + ipehr = I915_READ(RING_IPEHR(engine->mmio_base));
> + if (!ipehr_is_semaphore_wait(engine->i915, ipehr))
> + return NULL;
> +
> + /* XXX execlists */
> + ring = engine->default_context->ring[RCS].ring;
> + if (ring == NULL)
> return NULL;
>
> /*
> @@ -3112,19 +3118,19 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
> * point at at batch, and semaphores are always emitted into the
> * ringbuffer itself.
> */
> - head = I915_READ_HEAD(ring) & HEAD_ADDR;
> - backwards = (INTEL_INFO(ring->dev)->gen >= 8) ? 5 : 4;
> + head = I915_READ_HEAD(engine) & HEAD_ADDR;
> + backwards = (INTEL_INFO(dev_priv)->gen >= 8) ? 5 : 4;
>
> for (i = backwards; i; --i) {
> /*
> * Be paranoid and presume the hw has gone off into the wild -
> - * our ring is smaller than what the hardware (and hence
> + * our ringbuffer is smaller than what the hardware (and hence
> * HEAD_ADDR) allows. Also handles wrap-around.
> */
> - head &= ring->buffer->size - 1;
> + head &= ring->size - 1;
>
> /* This here seems to blow up */
> - cmd = ioread32(ring->buffer->virtual_start + head);
> + cmd = ioread32(ring->virtual_start + head);
> if (cmd == ipehr)
> break;
>
> @@ -3134,32 +3140,37 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
> if (!i)
> return NULL;
>
> - *seqno = ioread32(ring->buffer->virtual_start + head + 4) + 1;
> - if (INTEL_INFO(ring->dev)->gen >= 8) {
> - offset = ioread32(ring->buffer->virtual_start + head + 12);
> + *seqno = ioread32(ring->virtual_start + head + 4) + 1;
> + if (INTEL_INFO(dev_priv)->gen >= 8) {
> + offset = ioread32(ring->virtual_start + head + 12);
> offset <<= 32;
> - offset = ioread32(ring->buffer->virtual_start + head + 8);
> + offset = ioread32(ring->virtual_start + head + 8);
> }
> - return semaphore_wait_to_signaller_ring(ring, ipehr, offset);
> + return semaphore_wait_to_signaller_engine(engine, ipehr, offset);
> }
>
> -static int semaphore_passed(struct intel_engine_cs *ring)
> +static int semaphore_passed(struct intel_engine_cs *engine)
> {
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> struct intel_engine_cs *signaller;
> + struct i915_gem_request *rq;
> u32 seqno;
>
> - ring->hangcheck.deadlock++;
> + engine->hangcheck.deadlock++;
>
> - signaller = semaphore_waits_for(ring, &seqno);
> + if (engine->semaphore.wait == NULL)
> + return -1;
> +
> + signaller = semaphore_waits_for(engine, &seqno);
> if (signaller == NULL)
> return -1;
>
> /* Prevent pathological recursion due to driver bugs */
> - if (signaller->hangcheck.deadlock >= I915_NUM_RINGS)
> + if (signaller->hangcheck.deadlock >= I915_NUM_ENGINES)
> return -1;
>
> - if (i915_seqno_passed(signaller->get_seqno(signaller, false), seqno))
> + rq = intel_engine_seqno_to_request(engine, seqno);
> + if (rq == NULL || i915_request_complete(rq))
> return 1;
>
> /* cursory check for an unkickable deadlock */
> @@ -3172,30 +3183,29 @@ static int semaphore_passed(struct intel_engine_cs *ring)
>
> static void semaphore_clear_deadlocks(struct drm_i915_private *dev_priv)
> {
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> int i;
>
> - for_each_ring(ring, dev_priv, i)
> - ring->hangcheck.deadlock = 0;
> + for_each_engine(engine, dev_priv, i)
> + engine->hangcheck.deadlock = 0;
> }
>
> -static enum intel_ring_hangcheck_action
> -ring_stuck(struct intel_engine_cs *ring, u64 acthd)
> +static enum intel_engine_hangcheck_action
> +engine_stuck(struct intel_engine_cs *engine, u64 acthd)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> u32 tmp;
>
> - if (acthd != ring->hangcheck.acthd) {
> - if (acthd > ring->hangcheck.max_acthd) {
> - ring->hangcheck.max_acthd = acthd;
> + if (acthd != engine->hangcheck.acthd) {
> + if (acthd > engine->hangcheck.max_acthd) {
> + engine->hangcheck.max_acthd = acthd;
> return HANGCHECK_ACTIVE;
> }
>
> return HANGCHECK_ACTIVE_LOOP;
> }
>
> - if (IS_GEN2(dev))
> + if (IS_GEN2(dev_priv))
> return HANGCHECK_HUNG;
>
> /* Is the chip hanging on a WAIT_FOR_EVENT?
> @@ -3203,24 +3213,24 @@ ring_stuck(struct intel_engine_cs *ring, u64 acthd)
> * and break the hang. This should work on
> * all but the second generation chipsets.
> */
> - tmp = I915_READ_CTL(ring);
> + tmp = I915_READ_CTL(engine);
> if (tmp & RING_WAIT) {
> - i915_handle_error(dev, false,
> + i915_handle_error(dev_priv->dev, false,
> "Kicking stuck wait on %s",
> - ring->name);
> - I915_WRITE_CTL(ring, tmp);
> + engine->name);
> + I915_WRITE_CTL(engine, tmp);
> return HANGCHECK_KICK;
> }
>
> - if (INTEL_INFO(dev)->gen >= 6 && tmp & RING_WAIT_SEMAPHORE) {
> - switch (semaphore_passed(ring)) {
> + if (INTEL_INFO(dev_priv)->gen >= 6 && tmp & RING_WAIT_SEMAPHORE) {
> + switch (semaphore_passed(engine)) {
> default:
> return HANGCHECK_HUNG;
> case 1:
> - i915_handle_error(dev, false,
> + i915_handle_error(dev_priv->dev, false,
> "Kicking stuck semaphore on %s",
> - ring->name);
> - I915_WRITE_CTL(ring, tmp);
> + engine->name);
> + I915_WRITE_CTL(engine, tmp);
> return HANGCHECK_KICK;
> case 0:
> return HANGCHECK_WAIT;
> @@ -3232,7 +3242,7 @@ ring_stuck(struct intel_engine_cs *ring, u64 acthd)
>
> /**
> * This is called when the chip hasn't reported back with completed
> - * batchbuffers in a long time. We keep track per ring seqno progress and
> + * batchbuffers in a long time. We keep track of per-engine seqno progress and
> * if there are no progress, hangcheck score for that ring is increased.
> * Further, acthd is inspected to see if the ring is stuck. On stuck case
> * we kick the ring. If we see no progress on three subsequent calls
> @@ -3240,12 +3250,11 @@ ring_stuck(struct intel_engine_cs *ring, u64 acthd)
> */
> static void i915_hangcheck_elapsed(unsigned long data)
> {
> - struct drm_device *dev = (struct drm_device *)data;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct drm_i915_private *dev_priv = (struct drm_i915_private *)data;
> + struct intel_engine_cs *engine;
> int i;
> int busy_count = 0, rings_hung = 0;
> - bool stuck[I915_NUM_RINGS] = { 0 };
> + bool stuck[I915_NUM_ENGINES] = { 0 };
> #define BUSY 1
> #define KICK 5
> #define HUNG 20
> @@ -3253,104 +3262,108 @@ static void i915_hangcheck_elapsed(unsigned long data)
> if (!i915.enable_hangcheck)
> return;
>
> - for_each_ring(ring, dev_priv, i) {
> + for_each_engine(engine, dev_priv, i) {
> u64 acthd;
> u32 seqno;
> + u32 interrupts;
> bool busy = true;
>
> semaphore_clear_deadlocks(dev_priv);
>
> - seqno = ring->get_seqno(ring, false);
> - acthd = intel_ring_get_active_head(ring);
> -
> - if (ring->hangcheck.seqno == seqno) {
> - if (ring_idle(ring, seqno)) {
> - ring->hangcheck.action = HANGCHECK_IDLE;
> -
> - if (waitqueue_active(&ring->irq_queue)) {
> - /* Issue a wake-up to catch stuck h/w. */
> - if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) {
> - if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring)))
> - DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
> - ring->name);
> - else
> - DRM_INFO("Fake missed irq on %s\n",
> - ring->name);
> - wake_up_all(&ring->irq_queue);
> - }
> - /* Safeguard against driver failure */
> - ring->hangcheck.score += BUSY;
> - } else
> - busy = false;
> + acthd = intel_engine_get_active_head(engine);
> + seqno = engine->get_seqno(engine);
> + interrupts = atomic_read(&engine->interrupts);
> +
> + if (engine_idle(engine)) {
> + if (waitqueue_active(&engine->irq_queue)) {
> + /* Issue a wake-up to catch stuck h/w. */
> + if (engine->hangcheck.action == HANGCHECK_IDLE_WAITERS &&
> + engine->hangcheck.interrupts == interrupts &&
> + !test_and_set_bit(engine->id, &dev_priv->gpu_error.missed_irq_rings)) {
> + if (!(dev_priv->gpu_error.test_irq_rings & intel_engine_flag(engine)))
> + DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
> + engine->name);
> + else
> + DRM_INFO("Fake missed irq on %s\n",
> + engine->name);
> + wake_up_all(&engine->irq_queue);
> + }
> +
> + /* Safeguard against driver failure */
> + engine->hangcheck.score += BUSY;
> + engine->hangcheck.action = HANGCHECK_IDLE_WAITERS;
> } else {
> - /* We always increment the hangcheck score
> - * if the ring is busy and still processing
> - * the same request, so that no single request
> - * can run indefinitely (such as a chain of
> - * batches). The only time we do not increment
> - * the hangcheck score on this ring, if this
> - * ring is in a legitimate wait for another
> - * ring. In that case the waiting ring is a
> - * victim and we want to be sure we catch the
> - * right culprit. Then every time we do kick
> - * the ring, add a small increment to the
> - * score so that we can catch a batch that is
> - * being repeatedly kicked and so responsible
> - * for stalling the machine.
> - */
> - ring->hangcheck.action = ring_stuck(ring,
> - acthd);
> -
> - switch (ring->hangcheck.action) {
> + busy = false;
> + engine->hangcheck.action = HANGCHECK_IDLE;
> + }
> + } else if (engine->hangcheck.seqno == seqno) {
> + /* We always increment the hangcheck score
> + * if the ring is busy and still processing
> + * the same request, so that no single request
> + * can run indefinitely (such as a chain of
> + * batches). The only time we do not increment
> + * the hangcheck score on this ring is if this
> + * ring is in a legitimate wait for another
> + * ring. In that case the waiting ring is a
> + * victim and we want to be sure we catch the
> + * right culprit. Then every time we do kick
> + * the ring, add a small increment to the
> + * score so that we can catch a batch that is
> + * being repeatedly kicked and so responsible
> + * for stalling the machine.
> + */
> + engine->hangcheck.action = engine_stuck(engine, acthd);
> + switch (engine->hangcheck.action) {
> case HANGCHECK_IDLE:
> + case HANGCHECK_IDLE_WAITERS:
> case HANGCHECK_WAIT:
> case HANGCHECK_ACTIVE:
> break;
> case HANGCHECK_ACTIVE_LOOP:
> - ring->hangcheck.score += BUSY;
> + engine->hangcheck.score += BUSY;
> break;
> case HANGCHECK_KICK:
> - ring->hangcheck.score += KICK;
> + engine->hangcheck.score += KICK;
> break;
> case HANGCHECK_HUNG:
> - ring->hangcheck.score += HUNG;
> + engine->hangcheck.score += HUNG;
> stuck[i] = true;
> break;
> - }
> }
> } else {
> - ring->hangcheck.action = HANGCHECK_ACTIVE;
> + engine->hangcheck.action = HANGCHECK_ACTIVE;
>
> /* Gradually reduce the count so that we catch DoS
> * attempts across multiple batches.
> */
> - if (ring->hangcheck.score > 0)
> - ring->hangcheck.score--;
> + if (engine->hangcheck.score > 0)
> + engine->hangcheck.score--;
>
> - ring->hangcheck.acthd = ring->hangcheck.max_acthd = 0;
> + engine->hangcheck.acthd = engine->hangcheck.max_acthd = 0;
> }
>
> - ring->hangcheck.seqno = seqno;
> - ring->hangcheck.acthd = acthd;
> + engine->hangcheck.interrupts = interrupts;
> + engine->hangcheck.seqno = seqno;
> + engine->hangcheck.acthd = acthd;
> busy_count += busy;
> }
>
> - for_each_ring(ring, dev_priv, i) {
> - if (ring->hangcheck.score >= HANGCHECK_SCORE_RING_HUNG) {
> + for_each_engine(engine, dev_priv, i) {
> + if (engine->hangcheck.score >= HANGCHECK_SCORE_RING_HUNG) {
> DRM_INFO("%s on %s\n",
> stuck[i] ? "stuck" : "no progress",
> - ring->name);
> + engine->name);
> rings_hung++;
> }
> }
>
> if (rings_hung)
> - return i915_handle_error(dev, true, "Ring hung");
> + return i915_handle_error(dev_priv->dev, true, "Ring hung");
>
> if (busy_count)
> /* Reset timer case chip hangs without another request
> * being added */
> - i915_queue_hangcheck(dev);
> + i915_queue_hangcheck(dev_priv->dev);
> }
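
One behavioural change worth calling out: the fake missed-irq path now
needs two consecutive hangcheck samples in the idle-with-waiters state
with the engine's interrupt counter unchanged before it fires, i.e.
effectively (sketch of the condition as I read it, not a helper in the
patch):

    static bool missed_irq_suspected(struct intel_engine_cs *engine,
                                     u32 interrupts)
    {
            /* Previous sample was already idle-with-waiters and no new
             * user interrupt has been seen since then.
             */
            return engine->hangcheck.action == HANGCHECK_IDLE_WAITERS &&
                   engine->hangcheck.interrupts == interrupts;
    }

That should cut down on the false positives compared to firing on the
first idle sample.
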
>
> void i915_queue_hangcheck(struct drm_device *dev)
> @@ -4110,7 +4123,7 @@ static irqreturn_t i8xx_irq_handler(int irq, void *arg)
> new_iir = I915_READ16(IIR); /* Flush posted writes */
>
> if (iir & I915_USER_INTERRUPT)
> - notify_ring(dev, &dev_priv->ring[RCS]);
> + notify_ring(dev, &dev_priv->engine[RCS]);
>
> for_each_pipe(dev_priv, pipe) {
> int plane = pipe;
> @@ -4303,7 +4316,7 @@ static irqreturn_t i915_irq_handler(int irq, void *arg)
> new_iir = I915_READ(IIR); /* Flush posted writes */
>
> if (iir & I915_USER_INTERRUPT)
> - notify_ring(dev, &dev_priv->ring[RCS]);
> + notify_ring(dev, &dev_priv->engine[RCS]);
>
> for_each_pipe(dev_priv, pipe) {
> int plane = pipe;
> @@ -4533,9 +4546,9 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
> new_iir = I915_READ(IIR); /* Flush posted writes */
>
> if (iir & I915_USER_INTERRUPT)
> - notify_ring(dev, &dev_priv->ring[RCS]);
> + notify_ring(dev, &dev_priv->engine[RCS]);
> if (iir & I915_BSD_USER_INTERRUPT)
> - notify_ring(dev, &dev_priv->ring[VCS]);
> + notify_ring(dev, &dev_priv->engine[VCS]);
>
> for_each_pipe(dev_priv, pipe) {
> if (pipe_stats[pipe] & PIPE_START_VBLANK_INTERRUPT_STATUS &&
> @@ -4663,7 +4676,7 @@ void intel_irq_init(struct drm_device *dev)
>
> setup_timer(&dev_priv->gpu_error.hangcheck_timer,
> i915_hangcheck_elapsed,
> - (unsigned long) dev);
> + (unsigned long) dev_priv);
> INIT_DELAYED_WORK(&dev_priv->hotplug_reenable_work,
> intel_hpd_irq_reenable);
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 15c0eaa9f97f..59f0852d89d6 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -287,7 +287,7 @@
> #define MI_STORE_REGISTER_MEM(x) MI_INSTR(0x24, 2*(x)-1)
> #define MI_STORE_REGISTER_MEM_GEN8(x) MI_INSTR(0x24, 3*(x)-1)
> #define MI_SRM_LRM_GLOBAL_GTT (1<<22)
> -#define MI_FLUSH_DW MI_INSTR(0x26, 1) /* for GEN6 */
> +#define MI_FLUSH_DW MI_INSTR(0x26, 0) /* for GEN6 */
> #define MI_FLUSH_DW_STORE_INDEX (1<<21)
> #define MI_INVALIDATE_TLB (1<<18)
> #define MI_FLUSH_DW_OP_STOREDW (1<<14)
> @@ -2295,6 +2295,7 @@ enum punit_power_well {
> * doesn't need saving on GT1
> */
> #define CXT_SIZE 0x21a0
> +#define ILK_CXT_TOTAL_SIZE (1 * PAGE_SIZE)
> #define GEN6_CXT_POWER_SIZE(cxt_reg) ((cxt_reg >> 24) & 0x3f)
> #define GEN6_CXT_RING_SIZE(cxt_reg) ((cxt_reg >> 18) & 0x3f)
> #define GEN6_CXT_RENDER_SIZE(cxt_reg) ((cxt_reg >> 12) & 0x3f)
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index f5aa0067755a..8bb51dcb10f3 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -325,11 +325,10 @@ TRACE_EVENT(i915_gem_evict_vm,
> TP_printk("dev=%d, vm=%p", __entry->dev, __entry->vm)
> );
>
> -TRACE_EVENT(i915_gem_ring_sync_to,
> - TP_PROTO(struct intel_engine_cs *from,
> - struct intel_engine_cs *to,
> - u32 seqno),
> - TP_ARGS(from, to, seqno),
> +TRACE_EVENT(i915_gem_ring_wait,
> + TP_PROTO(struct i915_gem_request *waiter,
> + struct i915_gem_request *signaller),
> + TP_ARGS(waiter, signaller),
>
> TP_STRUCT__entry(
> __field(u32, dev)
> @@ -339,18 +338,40 @@ TRACE_EVENT(i915_gem_ring_sync_to,
> ),
>
> TP_fast_assign(
> - __entry->dev = from->dev->primary->index;
> - __entry->sync_from = from->id;
> - __entry->sync_to = to->id;
> - __entry->seqno = seqno;
> + __entry->dev = waiter->i915->dev->primary->index;
> + __entry->sync_from = waiter->engine->id;
> + __entry->sync_to = signaller->engine->id;
> + __entry->seqno = signaller->breadcrumb[waiter->engine->id];
> ),
>
> - TP_printk("dev=%u, sync-from=%u, sync-to=%u, seqno=%u",
> + TP_printk("dev=%u, sync-from=%u, sync-to=%u, seqno=%x",
> __entry->dev,
> __entry->sync_from, __entry->sync_to,
> __entry->seqno)
> );
>
> +TRACE_EVENT(i915_gem_ring_switch_context,
> + TP_PROTO(struct intel_engine_cs *engine, struct intel_context *ctx, u32 flags),
> + TP_ARGS(engine, ctx, flags),
> +
> + TP_STRUCT__entry(
> + __field(u32, dev)
> + __field(u32, ring)
> + __field(u32, ctx)
> + __field(u32, flags)
> + ),
> +
> + TP_fast_assign(
> + __entry->dev = engine->i915->dev->primary->index;
> + __entry->ring = engine->id;
> + __entry->ctx = ctx->file_priv ? ctx->user_handle : -1;
> + __entry->flags = flags;
> + ),
> +
> + TP_printk("dev=%u, ring=%u, ctx=%d, flags=0x%08x",
> + __entry->dev, __entry->ring, __entry->ctx, __entry->flags)
> +);
> +
> TRACE_EVENT(i915_gem_ring_dispatch,
> TP_PROTO(struct intel_engine_cs *ring, u32 seqno, u32 flags),
> TP_ARGS(ring, seqno, flags),
> @@ -363,66 +384,84 @@ TRACE_EVENT(i915_gem_ring_dispatch,
> ),
>
> TP_fast_assign(
> - __entry->dev = ring->dev->primary->index;
> + __entry->dev = ring->i915->dev->primary->index;
> __entry->ring = ring->id;
> __entry->seqno = seqno;
> __entry->flags = flags;
> i915_trace_irq_get(ring, seqno);
> ),
>
> - TP_printk("dev=%u, ring=%u, seqno=%u, flags=%x",
> + TP_printk("dev=%u, ring=%u, seqno=%x, flags=%x",
> __entry->dev, __entry->ring, __entry->seqno, __entry->flags)
> );
>
> -TRACE_EVENT(i915_gem_ring_flush,
> - TP_PROTO(struct intel_engine_cs *ring, u32 invalidate, u32 flush),
> - TP_ARGS(ring, invalidate, flush),
> +TRACE_EVENT(intel_ringbuffer_begin,
> + TP_PROTO(struct intel_ringbuffer *ring, int need),
> + TP_ARGS(ring, need),
>
> TP_STRUCT__entry(
> __field(u32, dev)
> __field(u32, ring)
> - __field(u32, invalidate)
> - __field(u32, flush)
> + __field(u32, need)
> + __field(u32, space)
> ),
>
> TP_fast_assign(
> - __entry->dev = ring->dev->primary->index;
> - __entry->ring = ring->id;
> - __entry->invalidate = invalidate;
> - __entry->flush = flush;
> + __entry->dev = ring->engine->i915->dev->primary->index;
> + __entry->ring = ring->engine->id;
> + __entry->need = need;
> + __entry->space = intel_ring_space(ring);
> ),
>
> - TP_printk("dev=%u, ring=%x, invalidate=%04x, flush=%04x",
> - __entry->dev, __entry->ring,
> - __entry->invalidate, __entry->flush)
> + TP_printk("dev=%u, ring=%u, need=%u, space=%u",
> + __entry->dev, __entry->ring, __entry->need, __entry->space)
> );
>
> -DECLARE_EVENT_CLASS(i915_gem_request,
> - TP_PROTO(struct intel_engine_cs *ring, u32 seqno),
> - TP_ARGS(ring, seqno),
> +TRACE_EVENT(intel_ringbuffer_wait,
> + TP_PROTO(struct intel_ringbuffer *ring, int need),
> + TP_ARGS(ring, need),
>
> TP_STRUCT__entry(
> __field(u32, dev)
> __field(u32, ring)
> - __field(u32, seqno)
> + __field(u32, need)
> + __field(u32, space)
> ),
>
> TP_fast_assign(
> - __entry->dev = ring->dev->primary->index;
> - __entry->ring = ring->id;
> - __entry->seqno = seqno;
> + __entry->dev = ring->engine->i915->dev->primary->index;
> + __entry->ring = ring->engine->id;
> + __entry->need = need;
> + __entry->space = intel_ring_space(ring);
> ),
>
> - TP_printk("dev=%u, ring=%u, seqno=%u",
> - __entry->dev, __entry->ring, __entry->seqno)
> + TP_printk("dev=%u, ring=%u, need=%u, space=%u",
> + __entry->dev, __entry->ring, __entry->need, __entry->space)
> );
>
> -DEFINE_EVENT(i915_gem_request, i915_gem_request_add,
> - TP_PROTO(struct intel_engine_cs *ring, u32 seqno),
> - TP_ARGS(ring, seqno)
> +TRACE_EVENT(intel_ringbuffer_wrap,
> + TP_PROTO(struct intel_ringbuffer *ring, int rem),
> + TP_ARGS(ring, rem),
> +
> + TP_STRUCT__entry(
> + __field(u32, dev)
> + __field(u32, ring)
> + __field(u32, rem)
> + __field(u32, size)
> + ),
> +
> + TP_fast_assign(
> + __entry->dev = ring->engine->i915->dev->primary->index;
> + __entry->ring = ring->engine->id;
> + __entry->rem = rem;
> + __entry->size = ring->effective_size;
> + ),
> +
> + TP_printk("dev=%u, ring=%u, rem=%u, size=%u",
> + __entry->dev, __entry->ring, __entry->rem, __entry->size)
> );
>
> -TRACE_EVENT(i915_gem_request_complete,
> +TRACE_EVENT(i915_gem_ring_complete,
> TP_PROTO(struct intel_engine_cs *ring),
> TP_ARGS(ring),
>
> @@ -433,23 +472,68 @@ TRACE_EVENT(i915_gem_request_complete,
> ),
>
> TP_fast_assign(
> - __entry->dev = ring->dev->primary->index;
> + __entry->dev = ring->i915->dev->primary->index;
> __entry->ring = ring->id;
> - __entry->seqno = ring->get_seqno(ring, false);
> + __entry->seqno = ring->get_seqno(ring);
> ),
>
> - TP_printk("dev=%u, ring=%u, seqno=%u",
> + TP_printk("dev=%u, ring=%u, seqno=%x",
> __entry->dev, __entry->ring, __entry->seqno)
> );
>
> +DECLARE_EVENT_CLASS(i915_gem_request,
> + TP_PROTO(struct i915_gem_request *rq),
> + TP_ARGS(rq),
> +
> + TP_STRUCT__entry(
> + __field(u32, dev)
> + __field(u32, ring)
> + __field(u32, seqno)
> + ),
> +
> + TP_fast_assign(
> + __entry->dev = rq->i915->dev->primary->index;
> + __entry->ring = rq->engine->id;
> + __entry->seqno = rq->seqno;
> + ),
> +
> + TP_printk("dev=%u, ring=%u, seqno=%x",
> + __entry->dev, __entry->ring, __entry->seqno)
> +);
> +
> +DEFINE_EVENT(i915_gem_request, i915_gem_request_emit_flush,
> + TP_PROTO(struct i915_gem_request *rq),
> + TP_ARGS(rq)
> +);
> +
> +DEFINE_EVENT(i915_gem_request, i915_gem_request_emit_batch,
> + TP_PROTO(struct i915_gem_request *rq),
> + TP_ARGS(rq)
> +);
> +
> +DEFINE_EVENT(i915_gem_request, i915_gem_request_emit_breadcrumb,
> + TP_PROTO(struct i915_gem_request *rq),
> + TP_ARGS(rq)
> +);
> +
> +DEFINE_EVENT(i915_gem_request, i915_gem_request_commit,
> + TP_PROTO(struct i915_gem_request *rq),
> + TP_ARGS(rq)
> +);
> +
> +DEFINE_EVENT(i915_gem_request, i915_gem_request_complete,
> + TP_PROTO(struct i915_gem_request *rq),
> + TP_ARGS(rq)
> +);
> +
> DEFINE_EVENT(i915_gem_request, i915_gem_request_retire,
> - TP_PROTO(struct intel_engine_cs *ring, u32 seqno),
> - TP_ARGS(ring, seqno)
> + TP_PROTO(struct i915_gem_request *rq),
> + TP_ARGS(rq)
> );
>
> TRACE_EVENT(i915_gem_request_wait_begin,
> - TP_PROTO(struct intel_engine_cs *ring, u32 seqno),
> - TP_ARGS(ring, seqno),
> + TP_PROTO(struct i915_gem_request *rq),
> + TP_ARGS(rq),
>
> TP_STRUCT__entry(
> __field(u32, dev)
> @@ -465,47 +549,38 @@ TRACE_EVENT(i915_gem_request_wait_begin,
> * less desirable.
> */
> TP_fast_assign(
> - __entry->dev = ring->dev->primary->index;
> - __entry->ring = ring->id;
> - __entry->seqno = seqno;
> - __entry->blocking = mutex_is_locked(&ring->dev->struct_mutex);
> + __entry->dev = rq->i915->dev->primary->index;
> + __entry->ring = rq->engine->id;
> + __entry->seqno = rq->seqno;
> + __entry->blocking = mutex_is_locked(&rq->i915->dev->struct_mutex);
> ),
>
> - TP_printk("dev=%u, ring=%u, seqno=%u, blocking=%s",
> + TP_printk("dev=%u, ring=%u, seqno=%x, blocking?=%s",
> __entry->dev, __entry->ring, __entry->seqno,
> __entry->blocking ? "yes (NB)" : "no")
> );
>
> -DEFINE_EVENT(i915_gem_request, i915_gem_request_wait_end,
> - TP_PROTO(struct intel_engine_cs *ring, u32 seqno),
> - TP_ARGS(ring, seqno)
> -);
> -
> -DECLARE_EVENT_CLASS(i915_ring,
> - TP_PROTO(struct intel_engine_cs *ring),
> - TP_ARGS(ring),
> +TRACE_EVENT(i915_gem_request_wait_end,
> + TP_PROTO(struct i915_gem_request *rq),
> + TP_ARGS(rq),
>
> TP_STRUCT__entry(
> __field(u32, dev)
> __field(u32, ring)
> + __field(u32, seqno)
> + __field(bool, completed)
> ),
>
> TP_fast_assign(
> - __entry->dev = ring->dev->primary->index;
> - __entry->ring = ring->id;
> + __entry->dev = rq->i915->dev->primary->index;
> + __entry->ring = rq->engine->id;
> + __entry->seqno = rq->seqno;
> + __entry->completed = rq->completed;
> ),
>
> - TP_printk("dev=%u, ring=%u", __entry->dev, __entry->ring)
> -);
> -
> -DEFINE_EVENT(i915_ring, i915_ring_wait_begin,
> - TP_PROTO(struct intel_engine_cs *ring),
> - TP_ARGS(ring)
> -);
> -
> -DEFINE_EVENT(i915_ring, i915_ring_wait_end,
> - TP_PROTO(struct intel_engine_cs *ring),
> - TP_ARGS(ring)
> + TP_printk("dev=%u, ring=%u, seqno=%x, completed=%s",
> + __entry->dev, __entry->ring, __entry->seqno,
> + __entry->completed ? "yes" : "no")
> );
>
> TRACE_EVENT(i915_flip_request,
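
The request-centric event class above means the tracepoints now derive (dev, ring, seqno) from the request itself and the individual events are just DEFINE_EVENT stubs. For anyone updating tooling, the call sites should then reduce to something like this (placement is my guess, not part of this hunk):

	/* illustrative only: where I would expect the new hooks to fire */
	static void example_submit_path(struct i915_gem_request *rq)
	{
		trace_i915_gem_request_emit_flush(rq);      /* cache flush emitted */
		trace_i915_gem_request_emit_batch(rq);      /* MI_BATCH_BUFFER_START emitted */
		trace_i915_gem_request_emit_breadcrumb(rq); /* seqno write emitted */
		trace_i915_gem_request_commit(rq);          /* tail handed to hardware */
	}
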
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 479e50a2ef98..049eb0fc09f3 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2189,7 +2189,7 @@ static int intel_align_height(struct drm_device *dev, int height, bool tiled)
> int
> intel_pin_and_fence_fb_obj(struct drm_device *dev,
> struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *pipelined)
> + struct i915_gem_request *pipelined)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> u32 alignment;
> @@ -9053,7 +9053,7 @@ out:
> */
> static void intel_mark_fb_busy(struct drm_device *dev,
> unsigned frontbuffer_bits,
> - struct intel_engine_cs *ring)
> + struct i915_gem_request *rq)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> enum pipe pipe;
> @@ -9066,24 +9066,24 @@ static void intel_mark_fb_busy(struct drm_device *dev,
> continue;
>
> intel_increase_pllclock(dev, pipe);
> - if (ring && intel_fbc_enabled(dev))
> - ring->fbc_dirty = true;
> + if (rq && intel_fbc_enabled(dev))
> + rq->pending_flush |= I915_KICK_FBC;
> }
> }
>
> /**
> * intel_fb_obj_invalidate - invalidate frontbuffer object
> * @obj: GEM object to invalidate
> - * @ring: set for asynchronous rendering
> + * @rq: set for asynchronous rendering
> *
> * This function gets called every time rendering on the given object starts and
> * frontbuffer caching (fbc, low refresh rate for DRRS, panel self refresh) must
> - * be invalidated. If @ring is non-NULL any subsequent invalidation will be delayed
> + * be invalidated. If @rq is non-NULL any subsequent invalidation will be delayed
> * until the rendering completes or a flip on this frontbuffer plane is
> * scheduled.
> */
> void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *ring)
> + struct i915_gem_request *rq)
> {
> struct drm_device *dev = obj->base.dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -9093,7 +9093,7 @@ void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
> if (!obj->frontbuffer_bits)
> return;
>
> - if (ring) {
> + if (rq) {
> mutex_lock(&dev_priv->fb_tracking.lock);
> dev_priv->fb_tracking.busy_bits
> |= obj->frontbuffer_bits;
> @@ -9102,7 +9102,7 @@ void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
> mutex_unlock(&dev_priv->fb_tracking.lock);
> }
>
> - intel_mark_fb_busy(dev, obj->frontbuffer_bits, ring);
> + intel_mark_fb_busy(dev, obj->frontbuffer_bits, rq);
>
> intel_edp_psr_invalidate(dev, obj->frontbuffer_bits);
> }
> @@ -9256,6 +9256,7 @@ static void intel_unpin_work_fn(struct work_struct *__work)
> intel_unpin_fb_obj(work->old_fb_obj);
> drm_gem_object_unreference(&work->pending_flip_obj->base);
> drm_gem_object_unreference(&work->old_fb_obj->base);
> + i915_request_put(work->flip_queued_request);
>
> intel_update_fbc(dev);
> mutex_unlock(&dev->struct_mutex);
> @@ -9379,97 +9380,86 @@ static inline void intel_mark_page_flip_active(struct intel_crtc *intel_crtc)
> smp_wmb();
> }
>
> -static int intel_gen2_queue_flip(struct drm_device *dev,
> - struct drm_crtc *crtc,
> +static int intel_gen2_queue_flip(struct i915_gem_request *rq,
> + struct intel_crtc *crtc,
> struct drm_framebuffer *fb,
> struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *ring,
> uint32_t flags)
> {
> - struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> + struct intel_ringbuffer *ring;
> u32 flip_mask;
> - int ret;
>
> - ret = intel_ring_begin(ring, 6);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 5);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> /* Can't queue multiple flips, so wait for the previous
> * one to finish before executing the next.
> */
> - if (intel_crtc->plane)
> + if (crtc->plane)
> flip_mask = MI_WAIT_FOR_PLANE_B_FLIP;
> else
> flip_mask = MI_WAIT_FOR_PLANE_A_FLIP;
> intel_ring_emit(ring, MI_WAIT_FOR_EVENT | flip_mask);
> - intel_ring_emit(ring, MI_NOOP);
> intel_ring_emit(ring, MI_DISPLAY_FLIP |
> - MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
> + MI_DISPLAY_FLIP_PLANE(crtc->plane));
> intel_ring_emit(ring, fb->pitches[0]);
> - intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset);
> + intel_ring_emit(ring, crtc->unpin_work->gtt_offset);
> intel_ring_emit(ring, 0); /* aux display base address, unused */
> + intel_ring_advance(ring);
>
> - intel_mark_page_flip_active(intel_crtc);
> - __intel_ring_advance(ring);
> return 0;
> }
>
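
The new intel_ring_begin() signature is a definite improvement: it takes the request and hands back the ringbuffer to emit into (or an ERR_PTR), and the same shape repeats in the gen3/4/6/7 variants below. As a sketch of the pattern (not lifted from the patch):

	struct intel_ringbuffer *ring;

	ring = intel_ring_begin(rq, 4);		/* reserve 4 dwords on rq's ring */
	if (IS_ERR(ring))
		return PTR_ERR(ring);

	intel_ring_emit(ring, MI_NOOP);
	/* ... three more dwords ... */
	intel_ring_advance(ring);
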
> -static int intel_gen3_queue_flip(struct drm_device *dev,
> - struct drm_crtc *crtc,
> +static int intel_gen3_queue_flip(struct i915_gem_request *rq,
> + struct intel_crtc *crtc,
> struct drm_framebuffer *fb,
> struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *ring,
> uint32_t flags)
> {
> - struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> + struct intel_ringbuffer *ring;
> u32 flip_mask;
> - int ret;
>
> - ret = intel_ring_begin(ring, 6);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 4);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> - if (intel_crtc->plane)
> + if (crtc->plane)
> flip_mask = MI_WAIT_FOR_PLANE_B_FLIP;
> else
> flip_mask = MI_WAIT_FOR_PLANE_A_FLIP;
> intel_ring_emit(ring, MI_WAIT_FOR_EVENT | flip_mask);
> - intel_ring_emit(ring, MI_NOOP);
> intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 |
> - MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
> + MI_DISPLAY_FLIP_PLANE(crtc->plane));
> intel_ring_emit(ring, fb->pitches[0]);
> - intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset);
> - intel_ring_emit(ring, MI_NOOP);
> + intel_ring_emit(ring, crtc->unpin_work->gtt_offset);
> + intel_ring_advance(ring);
>
> - intel_mark_page_flip_active(intel_crtc);
> - __intel_ring_advance(ring);
> return 0;
> }
>
> -static int intel_gen4_queue_flip(struct drm_device *dev,
> - struct drm_crtc *crtc,
> +static int intel_gen4_queue_flip(struct i915_gem_request *rq,
> + struct intel_crtc *crtc,
> struct drm_framebuffer *fb,
> struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *ring,
> uint32_t flags)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> + struct drm_i915_private *dev_priv = rq->i915;
> + struct intel_ringbuffer *ring;
> uint32_t pf, pipesrc;
> - int ret;
>
> - ret = intel_ring_begin(ring, 4);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 4);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> /* i965+ uses the linear or tiled offsets from the
> * Display Registers (which do not change across a page-flip)
> * so we need only reprogram the base address.
> */
> intel_ring_emit(ring, MI_DISPLAY_FLIP |
> - MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
> + MI_DISPLAY_FLIP_PLANE(crtc->plane));
> intel_ring_emit(ring, fb->pitches[0]);
> - intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset |
> + intel_ring_emit(ring, crtc->unpin_work->gtt_offset |
> obj->tiling_mode);
>
> /* XXX Enabling the panel-fitter across page-flip is so far
> @@ -9477,62 +9467,57 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
> * pf = I915_READ(pipe == 0 ? PFA_CTL_1 : PFB_CTL_1) & PF_ENABLE;
> */
> pf = 0;
> - pipesrc = I915_READ(PIPESRC(intel_crtc->pipe)) & 0x0fff0fff;
> + pipesrc = I915_READ(PIPESRC(crtc->pipe)) & 0x0fff0fff;
> intel_ring_emit(ring, pf | pipesrc);
> + intel_ring_advance(ring);
>
> - intel_mark_page_flip_active(intel_crtc);
> - __intel_ring_advance(ring);
> return 0;
> }
>
> -static int intel_gen6_queue_flip(struct drm_device *dev,
> - struct drm_crtc *crtc,
> +static int intel_gen6_queue_flip(struct i915_gem_request *rq,
> + struct intel_crtc *crtc,
> struct drm_framebuffer *fb,
> struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *ring,
> uint32_t flags)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> + struct drm_i915_private *dev_priv = rq->i915;
> + struct intel_ringbuffer *ring;
> uint32_t pf, pipesrc;
> - int ret;
>
> - ret = intel_ring_begin(ring, 4);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 4);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring, MI_DISPLAY_FLIP |
> - MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
> + MI_DISPLAY_FLIP_PLANE(crtc->plane));
> intel_ring_emit(ring, fb->pitches[0] | obj->tiling_mode);
> - intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset);
> + intel_ring_emit(ring, crtc->unpin_work->gtt_offset);
>
> /* Contrary to the suggestions in the documentation,
> * "Enable Panel Fitter" does not seem to be required when page
> * flipping with a non-native mode, and worse causes a normal
> * modeset to fail.
> - * pf = I915_READ(PF_CTL(intel_crtc->pipe)) & PF_ENABLE;
> + * pf = I915_READ(PF_CTL(crtc->pipe)) & PF_ENABLE;
> */
> pf = 0;
> - pipesrc = I915_READ(PIPESRC(intel_crtc->pipe)) & 0x0fff0fff;
> + pipesrc = I915_READ(PIPESRC(crtc->pipe)) & 0x0fff0fff;
> intel_ring_emit(ring, pf | pipesrc);
> + intel_ring_advance(ring);
>
> - intel_mark_page_flip_active(intel_crtc);
> - __intel_ring_advance(ring);
> return 0;
> }
>
> -static int intel_gen7_queue_flip(struct drm_device *dev,
> - struct drm_crtc *crtc,
> +static int intel_gen7_queue_flip(struct i915_gem_request *rq,
> + struct intel_crtc *crtc,
> struct drm_framebuffer *fb,
> struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *ring,
> uint32_t flags)
> {
> - struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> + struct intel_ringbuffer *ring;
> uint32_t plane_bit = 0;
> int len, ret;
>
> - switch (intel_crtc->plane) {
> + switch (crtc->plane) {
> case PLANE_A:
> plane_bit = MI_DISPLAY_FLIP_IVB_PLANE_A;
> break;
> @@ -9547,16 +9532,16 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
> return -ENODEV;
> }
>
> - len = 4;
> - if (ring->id == RCS) {
> + len = 3;
> + if (rq->engine->id == RCS) {
> len += 6;
> /*
> * On Gen 8, SRM is now taking an extra dword to accommodate
> * 48bits addresses, and we need a NOOP for the batch size to
> * stay even.
> */
> - if (IS_GEN8(dev))
> - len += 2;
> + if (IS_GEN8(rq->i915))
> + len += 1;
> }
>
> /*
> @@ -9569,13 +9554,13 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
> * then do the cacheline alignment, and finally emit the
> * MI_DISPLAY_FLIP.
> */
> - ret = intel_ring_cacheline_align(ring);
> + ret = intel_ring_cacheline_align(rq);
> if (ret)
> return ret;
>
> - ret = intel_ring_begin(ring, len);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, len);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> /* Unmask the flip-done completion message. Note that the bspec says that
> * we should do this for both the BCS and RCS, and that we must not unmask
> @@ -9586,37 +9571,33 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
> * for the RCS also doesn't appear to drop events. Setting the DERRMR
> * to zero does lead to lockups within MI_DISPLAY_FLIP.
> */
> - if (ring->id == RCS) {
> + if (rq->engine->id == RCS) {
> intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> intel_ring_emit(ring, DERRMR);
> intel_ring_emit(ring, ~(DERRMR_PIPEA_PRI_FLIP_DONE |
> DERRMR_PIPEB_PRI_FLIP_DONE |
> DERRMR_PIPEC_PRI_FLIP_DONE));
> - if (IS_GEN8(dev))
> + if (IS_GEN8(rq->i915))
> intel_ring_emit(ring, MI_STORE_REGISTER_MEM_GEN8(1) |
> MI_SRM_LRM_GLOBAL_GTT);
> else
> intel_ring_emit(ring, MI_STORE_REGISTER_MEM(1) |
> MI_SRM_LRM_GLOBAL_GTT);
> intel_ring_emit(ring, DERRMR);
> - intel_ring_emit(ring, ring->scratch.gtt_offset + 256);
> - if (IS_GEN8(dev)) {
> + intel_ring_emit(ring, rq->engine->scratch.gtt_offset + 256);
> + if (IS_GEN8(rq->i915))
> intel_ring_emit(ring, 0);
> - intel_ring_emit(ring, MI_NOOP);
> - }
> }
>
> intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | plane_bit);
> - intel_ring_emit(ring, (fb->pitches[0] | obj->tiling_mode));
> - intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset);
> - intel_ring_emit(ring, (MI_NOOP));
> + intel_ring_emit(ring, fb->pitches[0] | obj->tiling_mode);
> + intel_ring_emit(ring, crtc->unpin_work->gtt_offset);
> + intel_ring_advance(ring);
>
> - intel_mark_page_flip_active(intel_crtc);
> - __intel_ring_advance(ring);
> return 0;
> }
>
> -static bool use_mmio_flip(struct intel_engine_cs *ring,
> +static bool use_mmio_flip(struct intel_engine_cs *engine,
> struct drm_i915_gem_object *obj)
> {
> /*
> @@ -9627,20 +9608,18 @@ static bool use_mmio_flip(struct intel_engine_cs *ring,
> * So using MMIO flips there would disrupt this mechanism.
> */
>
> - if (ring == NULL)
> + if (engine == NULL)
> return true;
>
> - if (INTEL_INFO(ring->dev)->gen < 5)
> + if (INTEL_INFO(engine->i915)->gen < 5)
> return false;
>
> if (i915.use_mmio_flip < 0)
> return false;
> else if (i915.use_mmio_flip > 0)
> return true;
> - else if (i915.enable_execlists)
> - return true;
> else
> - return ring != obj->ring;
> + return engine != i915_request_engine(obj->last_write.request);
> }
>
> static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
> @@ -9671,102 +9650,62 @@ static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
> POSTING_READ(DSPSURF(intel_crtc->plane));
> }
>
> -static int intel_postpone_flip(struct drm_i915_gem_object *obj)
> -{
> - struct intel_engine_cs *ring;
> - int ret;
> -
> - lockdep_assert_held(&obj->base.dev->struct_mutex);
> -
> - if (!obj->last_write_seqno)
> - return 0;
> -
> - ring = obj->ring;
> -
> - if (i915_seqno_passed(ring->get_seqno(ring, true),
> - obj->last_write_seqno))
> - return 0;
> +struct flip_work {
> + struct work_struct work;
> + struct i915_gem_request *rq;
> + struct intel_crtc *crtc;
> +};
>
> - ret = i915_gem_check_olr(ring, obj->last_write_seqno);
> - if (ret)
> - return ret;
> +static void intel_mmio_flip_work(struct work_struct *work)
> +{
> + struct flip_work *flip = container_of(work, struct flip_work, work);
>
> - if (WARN_ON(!ring->irq_get(ring)))
> - return 0;
> + if (__i915_request_wait(flip->rq, false, NULL, NULL) == 0)
> + intel_do_mmio_flip(flip->crtc);
>
> - return 1;
> + i915_request_put__unlocked(flip->rq);
> + kfree(flip);
> }
>
> -void intel_notify_mmio_flip(struct intel_engine_cs *ring)
> +static int intel_queue_mmio_flip(struct intel_crtc *crtc,
> + struct i915_gem_request *rq)
> {
> - struct drm_i915_private *dev_priv = to_i915(ring->dev);
> - struct intel_crtc *intel_crtc;
> - unsigned long irq_flags;
> - u32 seqno;
> -
> - seqno = ring->get_seqno(ring, false);
> + struct flip_work *flip;
>
> - spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> - for_each_intel_crtc(ring->dev, intel_crtc) {
> - struct intel_mmio_flip *mmio_flip;
> -
> - mmio_flip = &intel_crtc->mmio_flip;
> - if (mmio_flip->seqno == 0)
> - continue;
> -
> - if (ring->id != mmio_flip->ring_id)
> - continue;
> + if (WARN_ON(crtc->mmio_flip))
> + return -EBUSY;
>
> - if (i915_seqno_passed(seqno, mmio_flip->seqno)) {
> - intel_do_mmio_flip(intel_crtc);
> - mmio_flip->seqno = 0;
> - ring->irq_put(ring);
> - }
> + if (rq == NULL) {
> + intel_do_mmio_flip(crtc);
> + return 0;
> }
> - spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> -}
>
> -static int intel_queue_mmio_flip(struct drm_device *dev,
> - struct drm_crtc *crtc,
> - struct drm_framebuffer *fb,
> - struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *ring,
> - uint32_t flags)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> - unsigned long irq_flags;
> - int ret;
> + if (i915_request_complete(rq)) {
> + intel_do_mmio_flip(crtc);
> + return 0;
> + }
>
> - if (WARN_ON(intel_crtc->mmio_flip.seqno))
> - return -EBUSY;
> + flip = kmalloc(sizeof(*flip), GFP_KERNEL);
> + if (flip == NULL)
> + return -ENOMEM;
>
> - ret = intel_postpone_flip(obj);
> - if (ret < 0)
> + INIT_WORK(&flip->work, intel_mmio_flip_work);
> + flip->crtc = crtc;
> + flip->rq = i915_request_get_breadcrumb(rq);
> + if (IS_ERR(flip->rq)) {
> + int ret = PTR_ERR(flip->rq);
> + kfree(flip);
> return ret;
> - if (ret == 0) {
> - intel_do_mmio_flip(intel_crtc);
> - return 0;
> }
>
> - spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> - intel_crtc->mmio_flip.seqno = obj->last_write_seqno;
> - intel_crtc->mmio_flip.ring_id = obj->ring->id;
> - spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> -
> - /*
> - * Double check to catch cases where irq fired before
> - * mmio flip data was ready
> - */
> - intel_notify_mmio_flip(obj->ring);
> + schedule_work(&flip->work);
> return 0;
> }
>
> -static int intel_default_queue_flip(struct drm_device *dev,
> - struct drm_crtc *crtc,
> +static int intel_default_queue_flip(struct i915_gem_request *rq,
> + struct intel_crtc *crtc,
> struct drm_framebuffer *fb,
> struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *ring,
> uint32_t flags)
> {
> return -ENODEV;
> @@ -9787,9 +9726,8 @@ static bool __intel_pageflip_stall_check(struct drm_device *dev,
> return false;
>
> if (work->flip_ready_vblank == 0) {
> - if (work->flip_queued_ring &&
> - !i915_seqno_passed(work->flip_queued_ring->get_seqno(work->flip_queued_ring, true),
> - work->flip_queued_seqno))
> + if (work->flip_queued_request &&
> + !i915_request_complete(work->flip_queued_request))
> return false;
>
> work->flip_ready_vblank = drm_vblank_count(dev, intel_crtc->pipe);
> @@ -9843,7 +9781,8 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
> struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> enum pipe pipe = intel_crtc->pipe;
> struct intel_unpin_work *work;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> + struct i915_gem_request *rq;
> unsigned long flags;
> int ret;
>
> @@ -9930,45 +9869,63 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
> work->flip_count = I915_READ(PIPE_FLIPCOUNT_GM45(pipe)) + 1;
>
> if (IS_VALLEYVIEW(dev)) {
> - ring = &dev_priv->ring[BCS];
> + engine = &dev_priv->engine[BCS];
> if (obj->tiling_mode != work->old_fb_obj->tiling_mode)
> /* vlv: DISPLAY_FLIP fails to change tiling */
> - ring = NULL;
> + engine = NULL;
> } else if (IS_IVYBRIDGE(dev)) {
> - ring = &dev_priv->ring[BCS];
> + engine = &dev_priv->engine[BCS];
> } else if (INTEL_INFO(dev)->gen >= 7) {
> - ring = obj->ring;
> - if (ring == NULL || ring->id != RCS)
> - ring = &dev_priv->ring[BCS];
> + engine = i915_request_engine(obj->last_write.request);
> + if (engine == NULL || engine->id != RCS)
> + engine = &dev_priv->engine[BCS];
> } else {
> - ring = &dev_priv->ring[RCS];
> + engine = &dev_priv->engine[RCS];
> }
>
> - ret = intel_pin_and_fence_fb_obj(dev, obj, ring);
> - if (ret)
> - goto cleanup_pending;
> + if (use_mmio_flip(engine, obj)) {
> + rq = i915_request_get(obj->last_write.request);
> +
> + ret = intel_pin_and_fence_fb_obj(dev, obj, rq);
> + if (ret)
> + goto cleanup_rq;
>
> - work->gtt_offset =
> - i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset;
> + work->gtt_offset =
> + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset;
>
> - if (use_mmio_flip(ring, obj)) {
> - ret = intel_queue_mmio_flip(dev, crtc, fb, obj, ring,
> - page_flip_flags);
> + ret = intel_queue_mmio_flip(intel_crtc, rq);
> if (ret)
> goto cleanup_unpin;
> -
> - work->flip_queued_seqno = obj->last_write_seqno;
> - work->flip_queued_ring = obj->ring;
> } else {
> - ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring,
> + struct intel_context *ctx = engine->default_context;
> + if (obj->last_write.request)
> + ctx = obj->last_write.request->ctx;
> + rq = intel_engine_alloc_request(engine, ctx);
> + if (IS_ERR(rq)) {
> + ret = PTR_ERR(rq);
> + goto cleanup_pending;
> + }
> +
> + ret = intel_pin_and_fence_fb_obj(dev, obj, rq);
> + if (ret)
> + goto cleanup_rq;
> +
> + work->gtt_offset =
> + i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset;
> +
> + ret = dev_priv->display.queue_flip(rq, intel_crtc, fb, obj,
> page_flip_flags);
> if (ret)
> goto cleanup_unpin;
>
> - work->flip_queued_seqno = intel_ring_get_seqno(ring);
> - work->flip_queued_ring = ring;
> + intel_mark_page_flip_active(intel_crtc);
> +
> + ret = i915_request_commit(rq);
> + if (ret)
> + goto cleanup_unpin;
> }
>
> + work->flip_queued_request = rq;
> work->flip_queued_vblank = drm_vblank_count(dev, intel_crtc->pipe);
> work->enable_stall_check = true;
>
> @@ -9985,6 +9942,8 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
>
> cleanup_unpin:
> intel_unpin_fb_obj(obj);
> +cleanup_rq:
> + i915_request_put(rq);
> cleanup_pending:
> atomic_dec(&intel_crtc->unpin_work_count);
> crtc->primary->fb = old_fb;
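
Summarising the request ownership in the CS flip path above, since it is easy to lose in the diff: the request is allocated against the engine (using the object's last-write context if there is one), the flip is emitted into it and marked active, and only then is the request committed; every failure after allocation funnels through cleanup_rq so the reference is dropped. Condensed paraphrase (mine, not code from the patch):

	rq = intel_engine_alloc_request(engine, ctx);
	if (IS_ERR(rq))
		return PTR_ERR(rq);

	ret = dev_priv->display.queue_flip(rq, intel_crtc, fb, obj, page_flip_flags);
	if (ret == 0) {
		intel_mark_page_flip_active(intel_crtc);
		ret = i915_request_commit(rq);	/* breadcrumb emitted, request queued */
	}
	if (ret)
		i915_request_put(rq);		/* drop the reference on any error */
	else
		work->flip_queued_request = rq;	/* unpin work now owns the reference */
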
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 07ce04683c30..b0115e81fb6e 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -382,11 +382,6 @@ struct intel_pipe_wm {
> bool sprites_scaled;
> };
>
> -struct intel_mmio_flip {
> - u32 seqno;
> - u32 ring_id;
> -};
> -
> struct intel_crtc {
> struct drm_crtc base;
> enum pipe pipe;
> @@ -437,7 +432,7 @@ struct intel_crtc {
> } wm;
>
> int scanline_offset;
> - struct intel_mmio_flip mmio_flip;
> + struct i915_gem_request *mmio_flip;
> };
>
> struct intel_plane_wm_parameters {
> @@ -674,8 +669,7 @@ struct intel_unpin_work {
> #define INTEL_FLIP_COMPLETE 2
> u32 flip_count;
> u32 gtt_offset;
> - struct intel_engine_cs *flip_queued_ring;
> - u32 flip_queued_seqno;
> + struct i915_gem_request *flip_queued_request;
> int flip_queued_vblank;
> int flip_ready_vblank;
> bool enable_stall_check;
> @@ -795,7 +789,7 @@ bool intel_has_pending_fb_unpin(struct drm_device *dev);
> int intel_pch_rawclk(struct drm_device *dev);
> void intel_mark_busy(struct drm_device *dev);
> void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *ring);
> + struct i915_gem_request *rq);
> void intel_frontbuffer_flip_prepare(struct drm_device *dev,
> unsigned frontbuffer_bits);
> void intel_frontbuffer_flip_complete(struct drm_device *dev,
> @@ -853,7 +847,7 @@ void intel_release_load_detect_pipe(struct drm_connector *connector,
> struct intel_load_detect_pipe *old);
> int intel_pin_and_fence_fb_obj(struct drm_device *dev,
> struct drm_i915_gem_object *obj,
> - struct intel_engine_cs *pipelined);
> + struct i915_gem_request *pipelined);
> void intel_unpin_fb_obj(struct drm_i915_gem_object *obj);
> struct drm_framebuffer *
> __intel_framebuffer_create(struct drm_device *dev,
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index bd1b28d99920..d47af931d5ab 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -204,57 +204,12 @@ enum {
> };
> #define GEN8_CTX_ID_SHIFT 32
>
> -/**
> - * intel_sanitize_enable_execlists() - sanitize i915.enable_execlists
> - * @dev: DRM device.
> - * @enable_execlists: value of i915.enable_execlists module parameter.
> - *
> - * Only certain platforms support Execlists (the prerequisites being
> - * support for Logical Ring Contexts and Aliasing PPGTT or better),
> - * and only when enabled via module parameter.
> - *
> - * Return: 1 if Execlists is supported and has to be enabled.
> - */
> -int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists)
> -{
> - WARN_ON(i915.enable_ppgtt == -1);
> -
> - if (enable_execlists == 0)
> - return 0;
> -
> - if (HAS_LOGICAL_RING_CONTEXTS(dev) && USES_PPGTT(dev) &&
> - i915.use_mmio_flip >= 0)
> - return 1;
> -
> - return 0;
> -}
> -
> -/**
> - * intel_execlists_ctx_id() - get the Execlists Context ID
> - * @ctx_obj: Logical Ring Context backing object.
> - *
> - * Do not confuse with ctx->id! Unfortunately we have a name overload
> - * here: the old context ID we pass to userspace as a handler so that
> - * they can refer to a context, and the new context ID we pass to the
> - * ELSP so that the GPU can inform us of the context status via
> - * interrupts.
> - *
> - * Return: 20-bits globally unique context ID.
> - */
> -u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
> +static uint64_t execlists_ctx_descriptor(struct drm_i915_gem_object *ctx_obj,
> + u32 ctx_id)
> {
> - u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj);
> -
> - /* LRCA is required to be 4K aligned so the more significant 20 bits
> - * are globally unique */
> - return lrca >> 12;
> -}
> -
> -static uint64_t execlists_ctx_descriptor(struct drm_i915_gem_object *ctx_obj)
> -{
> - uint64_t desc;
> - uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj);
> + uint64_t desc, lrca;
>
> + lrca = i915_gem_obj_ggtt_offset(ctx_obj);
> WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
>
> desc = GEN8_CTX_VALID;
> @@ -262,7 +217,7 @@ static uint64_t execlists_ctx_descriptor(struct drm_i915_gem_object *ctx_obj)
> desc |= GEN8_CTX_L3LLC_COHERENT;
> desc |= GEN8_CTX_PRIVILEGE;
> desc |= lrca;
> - desc |= (u64)intel_execlists_ctx_id(ctx_obj) << GEN8_CTX_ID_SHIFT;
> + desc |= (u64)ctx_id << GEN8_CTX_ID_SHIFT;
>
> /* TODO: WaDisableLiteRestore when we start using semaphore
> * signalling between Command Streamers */
> @@ -271,26 +226,39 @@ static uint64_t execlists_ctx_descriptor(struct drm_i915_gem_object *ctx_obj)
> return desc;
> }
>
> -static void execlists_elsp_write(struct intel_engine_cs *ring,
> - struct drm_i915_gem_object *ctx_obj0,
> - struct drm_i915_gem_object *ctx_obj1)
> +static u32 execlists_ctx_write_tail(struct drm_i915_gem_object *obj, u32 tail, u32 tag)
> +{
> + uint32_t *reg_state;
> +
> + reg_state = kmap_atomic(i915_gem_object_get_page(obj, 1));
> + reg_state[CTX_RING_TAIL+1] = tail;
> + kunmap_atomic(reg_state);
> +
> + return execlists_ctx_descriptor(obj, tag);
> +}
> +
> +static void execlists_submit_pair(struct intel_engine_cs *engine,
> + struct i915_gem_request *rq[2])
> {
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> - uint64_t temp = 0;
> + struct drm_i915_private *dev_priv = engine->i915;
> + uint64_t tmp;
> uint32_t desc[4];
> unsigned long flags;
>
> /* XXX: You must always write both descriptors in the order below. */
> - if (ctx_obj1)
> - temp = execlists_ctx_descriptor(ctx_obj1);
> - else
> - temp = 0;
> - desc[1] = (u32)(temp >> 32);
> - desc[0] = (u32)temp;
>
> - temp = execlists_ctx_descriptor(ctx_obj0);
> - desc[3] = (u32)(temp >> 32);
> - desc[2] = (u32)temp;
> + tmp = execlists_ctx_write_tail(rq[0]->ctx->ring[engine->id].state,
> + rq[0]->tail, rq[0]->tag);
> + desc[3] = upper_32_bits(tmp);
> + desc[2] = lower_32_bits(tmp);
> +
> + if (rq[1])
> + tmp = execlists_ctx_write_tail(rq[1]->ctx->ring[engine->id].state,
> + rq[1]->tail, rq[1]->tag);
> + else
> + tmp = 0;
> + desc[1] = upper_32_bits(tmp);
> + desc[0] = lower_32_bits(tmp);
>
> /* Set Force Wakeup bit to prevent GT from entering C6 while ELSP writes
> * are in progress.
> @@ -304,14 +272,14 @@ static void execlists_elsp_write(struct intel_engine_cs *ring,
> dev_priv->uncore.funcs.force_wake_get(dev_priv, FORCEWAKE_ALL);
> spin_unlock_irqrestore(&dev_priv->uncore.lock, flags);
>
> - I915_WRITE(RING_ELSP(ring), desc[1]);
> - I915_WRITE(RING_ELSP(ring), desc[0]);
> - I915_WRITE(RING_ELSP(ring), desc[3]);
> + I915_WRITE(RING_ELSP(engine), desc[1]);
> + I915_WRITE(RING_ELSP(engine), desc[0]);
> + I915_WRITE(RING_ELSP(engine), desc[3]);
> /* The context is automatically loaded after the following */
> - I915_WRITE(RING_ELSP(ring), desc[2]);
> + I915_WRITE(RING_ELSP(engine), desc[2]);
>
> /* ELSP is a wo register, so use another nearby reg for posting instead */
> - POSTING_READ(RING_EXECLIST_STATUS(ring));
> + POSTING_READ(RING_EXECLIST_STATUS(engine));
>
> /* Release Force Wakeup (see the big comment above). */
> spin_lock_irqsave(&dev_priv->uncore.lock, flags);
> @@ -320,115 +288,58 @@ static void execlists_elsp_write(struct intel_engine_cs *ring,
> spin_unlock_irqrestore(&dev_priv->uncore.lock, flags);
> }
>
> -static int execlists_ctx_write_tail(struct drm_i915_gem_object *ctx_obj, u32 tail)
> -{
> - struct page *page;
> - uint32_t *reg_state;
> -
> - page = i915_gem_object_get_page(ctx_obj, 1);
> - reg_state = kmap_atomic(page);
> -
> - reg_state[CTX_RING_TAIL+1] = tail;
> -
> - kunmap_atomic(reg_state);
> -
> - return 0;
> -}
> -
> -static int execlists_submit_context(struct intel_engine_cs *ring,
> - struct intel_context *to0, u32 tail0,
> - struct intel_context *to1, u32 tail1)
> +static u16 next_tag(struct intel_engine_cs *engine)
> {
> - struct drm_i915_gem_object *ctx_obj0;
> - struct drm_i915_gem_object *ctx_obj1 = NULL;
> -
> - ctx_obj0 = to0->engine[ring->id].state;
> - BUG_ON(!ctx_obj0);
> - WARN_ON(!i915_gem_obj_is_pinned(ctx_obj0));
> -
> - execlists_ctx_write_tail(ctx_obj0, tail0);
> -
> - if (to1) {
> - ctx_obj1 = to1->engine[ring->id].state;
> - BUG_ON(!ctx_obj1);
> - WARN_ON(!i915_gem_obj_is_pinned(ctx_obj1));
> -
> - execlists_ctx_write_tail(ctx_obj1, tail1);
> - }
> -
> - execlists_elsp_write(ring, ctx_obj0, ctx_obj1);
> -
> - return 0;
> + /* status tags are limited to 20b, so we use a u16 for convenience */
> + if (++engine->next_tag == 0)
> + ++engine->next_tag;
> + WARN_ON((s16)(engine->next_tag - engine->tag) < 0);
> + return engine->next_tag;
> }
>
> -static void execlists_context_unqueue(struct intel_engine_cs *ring)
> +static void execlists_submit(struct intel_engine_cs *engine)
> {
> - struct intel_ctx_submit_request *req0 = NULL, *req1 = NULL;
> - struct intel_ctx_submit_request *cursor = NULL, *tmp = NULL;
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> + struct i915_gem_request *rq[2] = {};
> + int i = 0;
>
> - assert_spin_locked(&ring->execlist_lock);
> -
> - if (list_empty(&ring->execlist_queue))
> - return;
> + assert_spin_locked(&engine->irqlock);
>
> /* Try to read in pairs */
> - list_for_each_entry_safe(cursor, tmp, &ring->execlist_queue,
> - execlist_link) {
> - if (!req0) {
> - req0 = cursor;
> - } else if (req0->ctx == cursor->ctx) {
> + while (!list_empty(&engine->pending)) {
> + struct i915_gem_request *next;
> +
> + next = list_first_entry(&engine->pending,
> + typeof(*next),
> + engine_list);
> +
> + if (rq[i] == NULL) {
> +new_slot:
> + next->tag = next_tag(engine);
> + rq[i] = next;
> + } else if (rq[i]->ctx == next->ctx) {
> /* Same ctx: ignore first request, as second request
> * will update tail past first request's workload */
> - cursor->elsp_submitted = req0->elsp_submitted;
> - list_del(&req0->execlist_link);
> - queue_work(dev_priv->wq, &req0->work);
> - req0 = cursor;
> + next->tag = rq[i]->tag;
> + rq[i] = next;
> } else {
> - req1 = cursor;
> - break;
> - }
> - }
> -
> - WARN_ON(req1 && req1->elsp_submitted);
> -
> - WARN_ON(execlists_submit_context(ring, req0->ctx, req0->tail,
> - req1 ? req1->ctx : NULL,
> - req1 ? req1->tail : 0));
> -
> - req0->elsp_submitted++;
> - if (req1)
> - req1->elsp_submitted++;
> -}
> + if (++i == ARRAY_SIZE(rq))
> + break;
>
> -static bool execlists_check_remove_request(struct intel_engine_cs *ring,
> - u32 request_id)
> -{
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> - struct intel_ctx_submit_request *head_req;
> -
> - assert_spin_locked(&ring->execlist_lock);
> -
> - head_req = list_first_entry_or_null(&ring->execlist_queue,
> - struct intel_ctx_submit_request,
> - execlist_link);
> -
> - if (head_req != NULL) {
> - struct drm_i915_gem_object *ctx_obj =
> - head_req->ctx->engine[ring->id].state;
> - if (intel_execlists_ctx_id(ctx_obj) == request_id) {
> - WARN(head_req->elsp_submitted == 0,
> - "Never submitted head request\n");
> -
> - if (--head_req->elsp_submitted <= 0) {
> - list_del(&head_req->execlist_link);
> - queue_work(dev_priv->wq, &head_req->work);
> - return true;
> - }
> + goto new_slot;
> }
> +
> + /* The move onto the requests list is staged via the submitted
> + * list so that we can keep the main request list outside the
> + * spinlock coverage.
> + */
> + list_move_tail(&next->engine_list, &engine->submitted);
> }
>
> - return false;
> + execlists_submit_pair(engine, rq);
> +
> + engine->execlists_submitted++;
> + if (rq[1])
> + engine->execlists_submitted++;
> }
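
The coalescing rule in execlists_submit() took me a moment: consecutive pending requests that share a context collapse into one ELSP slot (the latest tail wins), and submission stops once two distinct contexts are slotted. A throwaway userspace model of just that decision, with made-up context ids:

	#include <stdio.h>

	int main(void)
	{
		int pending_ctx[] = { 1, 1, 2, 3 };	/* queued requests, oldest first */
		int slot[2] = { 0, 0 };			/* 0 == empty ELSP slot */
		int i, s = 0;

		for (i = 0; i < 4; i++) {
			if (slot[s] == 0 || slot[s] == pending_ctx[i])
				slot[s] = pending_ctx[i];	/* new slot, or same ctx: latest wins */
			else if (++s == 2)
				break;				/* two contexts in flight; ctx 3 stays pending */
			else
				slot[s] = pending_ctx[i];
		}
		printf("ELSP: slot0=ctx%d slot1=ctx%d\n", slot[0], slot[1]);
		return 0;
	}
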
>
> /**
> @@ -438,1308 +349,378 @@ static bool execlists_check_remove_request(struct intel_engine_cs *ring,
> * Check the unread Context Status Buffers and manage the submission of new
> * contexts to the ELSP accordingly.
> */
> -void intel_execlists_handle_ctx_events(struct intel_engine_cs *ring)
> +void intel_execlists_irq_handler(struct intel_engine_cs *engine)
> {
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> - u32 status_pointer;
> + struct drm_i915_private *dev_priv = engine->i915;
> + unsigned long flags;
> u8 read_pointer;
> u8 write_pointer;
> - u32 status;
> - u32 status_id;
> - u32 submit_contexts = 0;
>
> - status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring));
> -
> - read_pointer = ring->next_context_status_buffer;
> - write_pointer = status_pointer & 0x07;
> + read_pointer = engine->next_context_status_buffer;
> + write_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(engine)) & 0x07;
> if (read_pointer > write_pointer)
> write_pointer += 6;
>
> - spin_lock(&ring->execlist_lock);
> + spin_lock_irqsave(&engine->irqlock, flags);
>
> - while (read_pointer < write_pointer) {
> - read_pointer++;
> - status = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
> - (read_pointer % 6) * 8);
> - status_id = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
> - (read_pointer % 6) * 8 + 4);
> + while (read_pointer++ < write_pointer) {
> + u32 reg = (RING_CONTEXT_STATUS_BUF(engine) +
> + (read_pointer % 6) * 8);
> + u32 status = I915_READ(reg);
>
> if (status & GEN8_CTX_STATUS_PREEMPTED) {
> - if (status & GEN8_CTX_STATUS_LITE_RESTORE) {
> - if (execlists_check_remove_request(ring, status_id))
> - WARN(1, "Lite Restored request removed from queue\n");
> - } else
> + if (status & GEN8_CTX_STATUS_LITE_RESTORE)
> + WARN(1, "Lite Restored request removed from queue\n");
> + else
> WARN(1, "Preemption without Lite Restore\n");
> }
>
> - if ((status & GEN8_CTX_STATUS_ACTIVE_IDLE) ||
> - (status & GEN8_CTX_STATUS_ELEMENT_SWITCH)) {
> - if (execlists_check_remove_request(ring, status_id))
> - submit_contexts++;
> + if (status & (GEN8_CTX_STATUS_ACTIVE_IDLE | GEN8_CTX_STATUS_ELEMENT_SWITCH)) {
> + engine->tag = I915_READ(reg + 4);
> + engine->execlists_submitted--;
> }
> }
>
> - if (submit_contexts != 0)
> - execlists_context_unqueue(ring);
> -
> - spin_unlock(&ring->execlist_lock);
> + if (engine->execlists_submitted < 2)
> + execlists_submit(engine);
>
> - WARN(submit_contexts > 2, "More than two context complete events?\n");
> - ring->next_context_status_buffer = write_pointer % 6;
> + spin_unlock_irqrestore(&engine->irqlock, flags);
>
> - I915_WRITE(RING_CONTEXT_STATUS_PTR(ring),
> - ((u32)ring->next_context_status_buffer & 0x07) << 8);
> + engine->next_context_status_buffer = write_pointer % 6;
> + I915_WRITE(RING_CONTEXT_STATUS_PTR(engine),
> + ((u32)engine->next_context_status_buffer & 0x07) << 8);
> }
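
The modulo-6 context status buffer handling is easier to see with numbers: if we last stopped after entry 5 and the hardware write pointer reads back as 1, the wrap bumps it to 7, entries 0 and 1 get consumed, and 1 is stored as the next read position. Hypothetical standalone check of just that arithmetic:

	#include <stdio.h>

	int main(void)
	{
		unsigned read_pointer = 5;	/* engine->next_context_status_buffer */
		unsigned write_pointer = 1;	/* CONTEXT_STATUS_PTR & 0x07 */

		if (read_pointer > write_pointer)
			write_pointer += 6;	/* handle the wrap */

		while (read_pointer++ < write_pointer)
			printf("consume CSB entry %u\n", read_pointer % 6);

		printf("next_context_status_buffer = %u\n", write_pointer % 6);
		return 0;
	}
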
>
> -static void execlists_free_request_task(struct work_struct *work)
> +static int
> +populate_lr_context(struct intel_context *ctx,
> + struct drm_i915_gem_object *ctx_obj,
> + struct intel_engine_cs *engine)
> {
> - struct intel_ctx_submit_request *req =
> - container_of(work, struct intel_ctx_submit_request, work);
> - struct drm_device *dev = req->ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> -
> - intel_runtime_pm_put(dev_priv);
> -
> - mutex_lock(&dev->struct_mutex);
> - i915_gem_context_unreference(req->ctx);
> - mutex_unlock(&dev->struct_mutex);
> + struct intel_ringbuffer *ring = ctx->ring[engine->id].ring;
> + struct i915_hw_ppgtt *ppgtt;
> + uint32_t *reg_state;
> + int ret;
>
> - kfree(req);
> -}
> + ret = i915_gem_object_set_to_cpu_domain(ctx_obj, true);
> + if (ret) {
> + DRM_DEBUG_DRIVER("Could not set to CPU domain\n");
> + return ret;
> + }
>
> -static int execlists_context_queue(struct intel_engine_cs *ring,
> - struct intel_context *to,
> - u32 tail)
> -{
> - struct intel_ctx_submit_request *req = NULL, *cursor;
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> - unsigned long flags;
> - int num_elements = 0;
> + ret = i915_gem_object_get_pages(ctx_obj);
> + if (ret) {
> + DRM_DEBUG_DRIVER("Could not get object pages\n");
> + return ret;
> + }
>
> - req = kzalloc(sizeof(*req), GFP_KERNEL);
> - if (req == NULL)
> - return -ENOMEM;
> - req->ctx = to;
> - i915_gem_context_reference(req->ctx);
> - req->ring = ring;
> - req->tail = tail;
> - INIT_WORK(&req->work, execlists_free_request_task);
> + /* The second page of the context object contains some fields which must
> + * be set up prior to the first execution. */
> + reg_state = kmap_atomic(i915_gem_object_get_page(ctx_obj, 1));
>
> - intel_runtime_pm_get(dev_priv);
> + /* A context is actually a big batch buffer with several MI_LOAD_REGISTER_IMM
> + * commands followed by (reg, value) pairs. The values we are setting here are
> + * only for the first context restore: on a subsequent save, the GPU will
> + * recreate this batchbuffer with new values (including all the missing
> + * MI_LOAD_REGISTER_IMM commands that we are not initializing here). */
> + if (engine->id == RCS)
> + reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(14);
> + else
> + reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(11);
> + reg_state[CTX_LRI_HEADER_0] |= MI_LRI_FORCE_POSTED;
>
> - spin_lock_irqsave(&ring->execlist_lock, flags);
> + reg_state[CTX_CONTEXT_CONTROL] = RING_CONTEXT_CONTROL(engine);
> + reg_state[CTX_CONTEXT_CONTROL+1] =
> + _MASKED_BIT_ENABLE((1<<3) | MI_RESTORE_INHIBIT);
>
> - list_for_each_entry(cursor, &ring->execlist_queue, execlist_link)
> - if (++num_elements > 2)
> - break;
> + reg_state[CTX_RING_HEAD] = RING_HEAD(engine->mmio_base);
> + reg_state[CTX_RING_HEAD+1] = 0;
> + reg_state[CTX_RING_TAIL] = RING_TAIL(engine->mmio_base);
> + reg_state[CTX_RING_TAIL+1] = 0;
> + reg_state[CTX_RING_BUFFER_START] = RING_START(engine->mmio_base);
> + reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring->obj);
> + reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(engine->mmio_base);
> + reg_state[CTX_RING_BUFFER_CONTROL+1] =
> + ((ring->size - PAGE_SIZE) & RING_NR_PAGES) | RING_VALID;
>
> - if (num_elements > 2) {
> - struct intel_ctx_submit_request *tail_req;
> + reg_state[CTX_BB_HEAD_U] = engine->mmio_base + 0x168;
> + reg_state[CTX_BB_HEAD_U+1] = 0;
> + reg_state[CTX_BB_HEAD_L] = engine->mmio_base + 0x140;
> + reg_state[CTX_BB_HEAD_L+1] = 0;
> + reg_state[CTX_BB_STATE] = engine->mmio_base + 0x110;
> + reg_state[CTX_BB_STATE+1] = (1<<5);
>
> - tail_req = list_last_entry(&ring->execlist_queue,
> - struct intel_ctx_submit_request,
> - execlist_link);
> + reg_state[CTX_SECOND_BB_HEAD_U] = engine->mmio_base + 0x11c;
> + reg_state[CTX_SECOND_BB_HEAD_U+1] = 0;
> + reg_state[CTX_SECOND_BB_HEAD_L] = engine->mmio_base + 0x114;
> + reg_state[CTX_SECOND_BB_HEAD_L+1] = 0;
> + reg_state[CTX_SECOND_BB_STATE] = engine->mmio_base + 0x118;
> + reg_state[CTX_SECOND_BB_STATE+1] = 0;
>
> - if (to == tail_req->ctx) {
> - WARN(tail_req->elsp_submitted != 0,
> - "More than 2 already-submitted reqs queued\n");
> - list_del(&tail_req->execlist_link);
> - queue_work(dev_priv->wq, &tail_req->work);
> - }
> + if (engine->id == RCS) {
> + /* TODO: according to BSpec, the register state context
> + * for CHV does not have these. OTOH, these registers do
> + * exist in CHV. I'm waiting for a clarification */
> + reg_state[CTX_BB_PER_CTX_PTR] = engine->mmio_base + 0x1c0;
> + reg_state[CTX_BB_PER_CTX_PTR+1] = 0;
> + reg_state[CTX_RCS_INDIRECT_CTX] = engine->mmio_base + 0x1c4;
> + reg_state[CTX_RCS_INDIRECT_CTX+1] = 0;
> + reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = engine->mmio_base + 0x1c8;
> + reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0;
> }
>
> - list_add_tail(&req->execlist_link, &ring->execlist_queue);
> - if (num_elements == 0)
> - execlists_context_unqueue(ring);
> -
> - spin_unlock_irqrestore(&ring->execlist_lock, flags);
> + reg_state[CTX_LRI_HEADER_1] = MI_LOAD_REGISTER_IMM(9);
> + reg_state[CTX_LRI_HEADER_1] |= MI_LRI_FORCE_POSTED;
> + reg_state[CTX_CTX_TIMESTAMP] = engine->mmio_base + 0x3a8;
> + reg_state[CTX_CTX_TIMESTAMP+1] = 0;
>
> - return 0;
> -}
> + reg_state[CTX_PDP3_UDW] = GEN8_RING_PDP_UDW(engine, 3);
> + reg_state[CTX_PDP3_LDW] = GEN8_RING_PDP_LDW(engine, 3);
> + reg_state[CTX_PDP2_UDW] = GEN8_RING_PDP_UDW(engine, 2);
> + reg_state[CTX_PDP2_LDW] = GEN8_RING_PDP_LDW(engine, 2);
> + reg_state[CTX_PDP1_UDW] = GEN8_RING_PDP_UDW(engine, 1);
> + reg_state[CTX_PDP1_LDW] = GEN8_RING_PDP_LDW(engine, 1);
> + reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(engine, 0);
> + reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(engine, 0);
>
> -static int logical_ring_invalidate_all_caches(struct intel_ringbuffer *ringbuf)
> -{
> - struct intel_engine_cs *ring = ringbuf->ring;
> - uint32_t flush_domains;
> - int ret;
> + ppgtt = ctx->ppgtt ?: engine->i915->mm.aliasing_ppgtt;
> + reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->pd_dma_addr[3]);
> + reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->pd_dma_addr[3]);
> + reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->pd_dma_addr[2]);
> + reg_state[CTX_PDP2_LDW+1] = lower_32_bits(ppgtt->pd_dma_addr[2]);
> + reg_state[CTX_PDP1_UDW+1] = upper_32_bits(ppgtt->pd_dma_addr[1]);
> + reg_state[CTX_PDP1_LDW+1] = lower_32_bits(ppgtt->pd_dma_addr[1]);
> + reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->pd_dma_addr[0]);
> + reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->pd_dma_addr[0]);
>
> - flush_domains = 0;
> - if (ring->gpu_caches_dirty)
> - flush_domains = I915_GEM_GPU_DOMAINS;
> + if (engine->id == RCS) {
> + reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
> + reg_state[CTX_R_PWR_CLK_STATE] = 0x20c8;
> + reg_state[CTX_R_PWR_CLK_STATE+1] = 0;
> + }
>
> - ret = ring->emit_flush(ringbuf, I915_GEM_GPU_DOMAINS, flush_domains);
> - if (ret)
> - return ret;
> + kunmap_atomic(reg_state);
>
> - ring->gpu_caches_dirty = false;
> return 0;
> }
>
> -static int execlists_move_to_gpu(struct intel_ringbuffer *ringbuf,
> - struct list_head *vmas)
> +static uint32_t get_lr_context_size(struct intel_engine_cs *engine)
> {
> - struct intel_engine_cs *ring = ringbuf->ring;
> - struct i915_vma *vma;
> - uint32_t flush_domains = 0;
> - bool flush_chipset = false;
> - int ret;
> -
> - list_for_each_entry(vma, vmas, exec_list) {
> - struct drm_i915_gem_object *obj = vma->obj;
> -
> - ret = i915_gem_object_sync(obj, ring);
> - if (ret)
> - return ret;
> + int ret = 0;
>
> - if (obj->base.write_domain & I915_GEM_DOMAIN_CPU)
> - flush_chipset |= i915_gem_clflush_object(obj, false);
> + WARN_ON(INTEL_INFO(engine->i915)->gen != 8);
>
> - flush_domains |= obj->base.write_domain;
> + switch (engine->id) {
> + case RCS:
> + ret = GEN8_LR_CONTEXT_RENDER_SIZE;
> + break;
> + case VCS:
> + case BCS:
> + case VECS:
> + case VCS2:
> + ret = GEN8_LR_CONTEXT_OTHER_SIZE;
> + break;
> }
>
> - if (flush_domains & I915_GEM_DOMAIN_GTT)
> - wmb();
> -
> - /* Unconditionally invalidate gpu caches and ensure that we do flush
> - * any residual writes from the previous batch.
> - */
> - return logical_ring_invalidate_all_caches(ringbuf);
> + return ret;
> }
>
> -/**
> - * execlists_submission() - submit a batchbuffer for execution, Execlists style
> - * @dev: DRM device.
> - * @file: DRM file.
> - * @ring: Engine Command Streamer to submit to.
> - * @ctx: Context to employ for this submission.
> - * @args: execbuffer call arguments.
> - * @vmas: list of vmas.
> - * @batch_obj: the batchbuffer to submit.
> - * @exec_start: batchbuffer start virtual address pointer.
> - * @flags: translated execbuffer call flags.
> - *
> - * This is the evil twin version of i915_gem_ringbuffer_submission. It abstracts
> - * away the submission details of the execbuffer ioctl call.
> - *
> - * Return: non-zero if the submission fails.
> - */
> -int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
> - struct intel_engine_cs *ring,
> - struct intel_context *ctx,
> - struct drm_i915_gem_execbuffer2 *args,
> - struct list_head *vmas,
> - struct drm_i915_gem_object *batch_obj,
> - u64 exec_start, u32 flags)
> +static struct intel_ringbuffer *
> +execlists_get_ring(struct intel_engine_cs *engine,
> + struct intel_context *ctx)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
> - int instp_mode;
> - u32 instp_mask;
> + struct drm_i915_gem_object *ctx_obj;
> + struct intel_ringbuffer *ring;
> + uint32_t context_size;
> int ret;
>
> - instp_mode = args->flags & I915_EXEC_CONSTANTS_MASK;
> - instp_mask = I915_EXEC_CONSTANTS_MASK;
> - switch (instp_mode) {
> - case I915_EXEC_CONSTANTS_REL_GENERAL:
> - case I915_EXEC_CONSTANTS_ABSOLUTE:
> - case I915_EXEC_CONSTANTS_REL_SURFACE:
> - if (instp_mode != 0 && ring != &dev_priv->ring[RCS]) {
> - DRM_DEBUG("non-0 rel constants mode on non-RCS\n");
> - return -EINVAL;
> - }
> -
> - if (instp_mode != dev_priv->relative_constants_mode) {
> - if (instp_mode == I915_EXEC_CONSTANTS_REL_SURFACE) {
> - DRM_DEBUG("rel surface constants mode invalid on gen5+\n");
> - return -EINVAL;
> - }
> -
> - /* The HW changed the meaning on this bit on gen6 */
> - instp_mask &= ~I915_EXEC_CONSTANTS_REL_SURFACE;
> - }
> - break;
> - default:
> - DRM_DEBUG("execbuf with unknown constants: %d\n", instp_mode);
> - return -EINVAL;
> + ring = intel_engine_alloc_ring(engine, ctx, 32 * PAGE_SIZE);
> + if (IS_ERR(ring)) {
> + DRM_ERROR("Failed to allocate ringbuffer %s: %ld\n",
> + engine->name, PTR_ERR(ring));
> + return ERR_CAST(ring);
> }
>
> - if (args->num_cliprects != 0) {
> - DRM_DEBUG("clip rectangles are only valid on pre-gen5\n");
> - return -EINVAL;
> - } else {
> - if (args->DR4 == 0xffffffff) {
> - DRM_DEBUG("UXA submitting garbage DR4, fixing up\n");
> - args->DR4 = 0;
> - }
> + context_size = round_up(get_lr_context_size(engine), 4096);
>
> - if (args->DR1 || args->DR4 || args->cliprects_ptr) {
> - DRM_DEBUG("0 cliprects but dirt in cliprects fields\n");
> - return -EINVAL;
> - }
> + ctx_obj = i915_gem_alloc_context_obj(engine->i915->dev, context_size);
> + if (IS_ERR(ctx_obj)) {
> + ret = PTR_ERR(ctx_obj);
> + DRM_DEBUG_DRIVER("Alloc LRC backing obj failed: %d\n", ret);
> + return ERR_CAST(ctx_obj);
> }
>
> - if (args->flags & I915_EXEC_GEN7_SOL_RESET) {
> - DRM_DEBUG("sol reset is gen7 only\n");
> - return -EINVAL;
> + ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN, 0);
> + if (ret) {
> + DRM_DEBUG_DRIVER("Pin LRC backing obj failed: %d\n", ret);
> + goto err_unref;
> }
>
> - ret = execlists_move_to_gpu(ringbuf, vmas);
> - if (ret)
> - return ret;
> + ret = populate_lr_context(ctx, ctx_obj, engine);
> + if (ret) {
> + DRM_DEBUG_DRIVER("Failed to populate LRC: %d\n", ret);
> + goto err_unpin;
> + }
>
> - if (ring == &dev_priv->ring[RCS] &&
> - instp_mode != dev_priv->relative_constants_mode) {
> - ret = intel_logical_ring_begin(ringbuf, 4);
> - if (ret)
> - return ret;
> + ctx->ring[engine->id].state = ctx_obj;
>
> - intel_logical_ring_emit(ringbuf, MI_NOOP);
> - intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
> - intel_logical_ring_emit(ringbuf, INSTPM);
> - intel_logical_ring_emit(ringbuf, instp_mask << 16 | instp_mode);
> - intel_logical_ring_advance(ringbuf);
> + if (ctx == engine->default_context) {
> + struct drm_i915_private *dev_priv = engine->i915;
> + u32 reg;
>
> - dev_priv->relative_constants_mode = instp_mode;
> - }
> + /* The status page is offset 0 from the context object in LRCs. */
> + engine->status_page.gfx_addr = i915_gem_obj_ggtt_offset(ctx_obj);
> + engine->status_page.page_addr = kmap(sg_page(ctx_obj->pages->sgl));
> + if (engine->status_page.page_addr == NULL) {
> + ret = -ENOMEM;
> + goto err_unpin;
> + }
>
> - ret = ring->emit_bb_start(ringbuf, exec_start, flags);
> - if (ret)
> - return ret;
> + engine->status_page.obj = ctx_obj;
>
> - i915_gem_execbuffer_move_to_active(vmas, ring);
> - i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
> + reg = RING_HWS_PGA(engine->mmio_base);
> + I915_WRITE(reg, engine->status_page.gfx_addr);
> + POSTING_READ(reg);
> + }
>
> return 0;
> +
> +err_unpin:
> + i915_gem_object_ggtt_unpin(ctx_obj);
> +err_unref:
> + drm_gem_object_unreference(&ctx_obj->base);
> + return ERR_PTR(ret);
> }
>
> -void intel_logical_ring_stop(struct intel_engine_cs *ring)
> +static void execlists_put_ring(struct intel_ringbuffer *ring,
> + struct intel_context *ctx)
> {
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> - int ret;
> -
> - if (!intel_ring_initialized(ring))
> - return;
> -
> - ret = intel_ring_idle(ring);
> - if (ret && !i915_reset_in_progress(&to_i915(ring->dev)->gpu_error))
> - DRM_ERROR("failed to quiesce %s whilst cleaning up: %d\n",
> - ring->name, ret);
> -
> - /* TODO: Is this correct with Execlists enabled? */
> - I915_WRITE_MODE(ring, _MASKED_BIT_ENABLE(STOP_RING));
> - if (wait_for_atomic((I915_READ_MODE(ring) & MODE_IDLE) != 0, 1000)) {
> - DRM_ERROR("%s :timed out trying to stop ring\n", ring->name);
> - return;
> - }
> - I915_WRITE_MODE(ring, _MASKED_BIT_DISABLE(STOP_RING));
> + intel_ring_free(ring);
> }
>
> -int logical_ring_flush_all_caches(struct intel_ringbuffer *ringbuf)
> +static int execlists_add_request(struct i915_gem_request *rq)
> {
> - struct intel_engine_cs *ring = ringbuf->ring;
> - int ret;
> + unsigned long flags;
>
> - if (!ring->gpu_caches_dirty)
> - return 0;
> + spin_lock_irqsave(&rq->engine->irqlock, flags);
>
> - ret = ring->emit_flush(ringbuf, 0, I915_GEM_GPU_DOMAINS);
> - if (ret)
> - return ret;
> + list_add_tail(&rq->engine_list, &rq->engine->pending);
> + if (rq->engine->execlists_submitted < 2)
> + execlists_submit(rq->engine);
> +
> + spin_unlock_irqrestore(&rq->engine->irqlock, flags);
>
> - ring->gpu_caches_dirty = false;
> return 0;
> }
>
> -/**
> - * intel_logical_ring_advance_and_submit() - advance the tail and submit the workload
> - * @ringbuf: Logical Ringbuffer to advance.
> - *
> - * The tail is updated in our logical ringbuffer struct, not in the actual context. What
> - * really happens during submission is that the context and current tail will be placed
> - * on a queue waiting for the ELSP to be ready to accept a new context submission. At that
> - * point, the tail *inside* the context is updated and the ELSP written to.
> - */
> -void intel_logical_ring_advance_and_submit(struct intel_ringbuffer *ringbuf)
> +static bool execlists_rq_is_complete(struct i915_gem_request *rq)
> {
> - struct intel_engine_cs *ring = ringbuf->ring;
> - struct intel_context *ctx = ringbuf->FIXME_lrc_ctx;
> -
> - intel_logical_ring_advance(ringbuf);
> -
> - if (intel_ring_stopped(ring))
> - return;
> -
> - execlists_context_queue(ring, ctx, ringbuf->tail);
> + return (s16)(rq->engine->tag - rq->tag) >= 0;
> }
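
The (s16) comparison here (and the WARN_ON in next_tag() above) is serial-number arithmetic: the signed 16-bit difference gives the right answer across the u16 wrap, provided the hardware tag and the request tag never drift more than 32k apart. Tiny standalone illustration with invented values:

	#include <assert.h>
	#include <stdint.h>

	int main(void)
	{
		uint16_t hw_tag = 3;		/* engine->tag, just wrapped past 0xffff */
		uint16_t rq_tag = 0xfffe;	/* request tagged shortly before the wrap */

		assert((int16_t)(hw_tag - rq_tag) >= 0);	/* request is complete */
		assert((int16_t)(rq_tag - hw_tag) < 0);		/* and not vice versa */
		return 0;
	}
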
>
> -static int logical_ring_alloc_seqno(struct intel_engine_cs *ring,
> - struct intel_context *ctx)
> +static int execlists_suspend(struct intel_engine_cs *engine)
> {
> - if (ring->outstanding_lazy_seqno)
> - return 0;
> -
> - if (ring->preallocated_lazy_request == NULL) {
> - struct drm_i915_gem_request *request;
> -
> - request = kmalloc(sizeof(*request), GFP_KERNEL);
> - if (request == NULL)
> - return -ENOMEM;
> + struct drm_i915_private *dev_priv = engine->i915;
> + unsigned long flags;
>
> - /* Hold a reference to the context this request belongs to
> - * (we will need it when the time comes to emit/retire the
> - * request).
> - */
> - request->ctx = ctx;
> - i915_gem_context_reference(request->ctx);
> + /* disable submitting more requests until resume */
> + spin_lock_irqsave(&engine->irqlock, flags);
> + engine->execlists_submitted = ~0;
> + spin_unlock_irqrestore(&engine->irqlock, flags);
>
> - ring->preallocated_lazy_request = request;
> - }
> + I915_WRITE(RING_MODE_GEN7(engine),
> + _MASKED_BIT_ENABLE(GFX_REPLAY_MODE) |
> + _MASKED_BIT_DISABLE(GFX_RUN_LIST_ENABLE));
> + POSTING_READ(RING_MODE_GEN7(engine));
> + DRM_DEBUG_DRIVER("Execlists disabled for %s\n", engine->name);
>
> - return i915_gem_get_seqno(ring->dev, &ring->outstanding_lazy_seqno);
> + return 0;
> }
>
> -static int logical_ring_wait_request(struct intel_ringbuffer *ringbuf,
> - int bytes)
> +static int execlists_resume(struct intel_engine_cs *engine)
> {
> - struct intel_engine_cs *ring = ringbuf->ring;
> - struct drm_i915_gem_request *request;
> - u32 seqno = 0;
> - int ret;
> -
> - if (ringbuf->last_retired_head != -1) {
> - ringbuf->head = ringbuf->last_retired_head;
> - ringbuf->last_retired_head = -1;
> -
> - ringbuf->space = intel_ring_space(ringbuf);
> - if (ringbuf->space >= bytes)
> - return 0;
> - }
> -
> - list_for_each_entry(request, &ring->request_list, list) {
> - if (__intel_ring_space(request->tail, ringbuf->tail,
> - ringbuf->size) >= bytes) {
> - seqno = request->seqno;
> - break;
> - }
> - }
> + struct drm_i915_private *dev_priv = engine->i915;
> + unsigned long flags;
>
> - if (seqno == 0)
> - return -ENOSPC;
> + /* XXX */
> + I915_WRITE(RING_HWSTAM(engine->mmio_base), 0xffffffff);
>
> - ret = i915_wait_seqno(ring, seqno);
> - if (ret)
> - return ret;
> + I915_WRITE(RING_MODE_GEN7(engine),
> + _MASKED_BIT_DISABLE(GFX_REPLAY_MODE) |
> + _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
> + POSTING_READ(RING_MODE_GEN7(engine));
> + DRM_DEBUG_DRIVER("Execlists enabled for %s\n", engine->name);
>
> - i915_gem_retire_requests_ring(ring);
> - ringbuf->head = ringbuf->last_retired_head;
> - ringbuf->last_retired_head = -1;
> + spin_lock_irqsave(&engine->irqlock, flags);
> + engine->execlists_submitted = 0;
> + execlists_submit(engine);
> + spin_unlock_irqrestore(&engine->irqlock, flags);
>
> - ringbuf->space = intel_ring_space(ringbuf);
> return 0;
> }
>
> -static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
> - int bytes)
> +static void execlists_retire(struct intel_engine_cs *engine,
> + u32 seqno)
> {
> - struct intel_engine_cs *ring = ringbuf->ring;
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - unsigned long end;
> - int ret;
> -
> - ret = logical_ring_wait_request(ringbuf, bytes);
> - if (ret != -ENOSPC)
> - return ret;
> -
> - /* Force the context submission in case we have been skipping it */
> - intel_logical_ring_advance_and_submit(ringbuf);
> -
> - /* With GEM the hangcheck timer should kick us out of the loop,
> - * leaving it early runs the risk of corrupting GEM state (due
> - * to running on almost untested codepaths). But on resume
> - * timers don't work yet, so prevent a complete hang in that
> - * case by choosing an insanely large timeout. */
> - end = jiffies + 60 * HZ;
> -
> - do {
> - ringbuf->head = I915_READ_HEAD(ring);
> - ringbuf->space = intel_ring_space(ringbuf);
> - if (ringbuf->space >= bytes) {
> - ret = 0;
> - break;
> - }
> -
> - msleep(1);
> -
> - if (dev_priv->mm.interruptible && signal_pending(current)) {
> - ret = -ERESTARTSYS;
> - break;
> - }
> -
> - ret = i915_gem_check_wedge(&dev_priv->gpu_error,
> - dev_priv->mm.interruptible);
> - if (ret)
> - break;
> -
> - if (time_after(jiffies, end)) {
> - ret = -EBUSY;
> - break;
> - }
> - } while (1);
> + unsigned long flags;
>
> - return ret;
> + spin_lock_irqsave(&engine->irqlock, flags);
> + list_splice_tail_init(&engine->submitted, &engine->requests);
> + spin_unlock_irqrestore(&engine->irqlock, flags);
> }
>
> -static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf)
> +static void execlists_reset(struct intel_engine_cs *engine)
> {
> - uint32_t __iomem *virt;
> - int rem = ringbuf->size - ringbuf->tail;
> -
> - if (ringbuf->space < rem) {
> - int ret = logical_ring_wait_for_space(ringbuf, rem);
> -
> - if (ret)
> - return ret;
> - }
> -
> - virt = ringbuf->virtual_start + ringbuf->tail;
> - rem /= 4;
> - while (rem--)
> - iowrite32(MI_NOOP, virt++);
> -
> - ringbuf->tail = 0;
> - ringbuf->space = intel_ring_space(ringbuf);
> + unsigned long flags;
>
> - return 0;
> + spin_lock_irqsave(&engine->irqlock, flags);
> + list_splice_tail_init(&engine->pending, &engine->submitted);
> + spin_unlock_irqrestore(&engine->irqlock, flags);
> }
>
> -static int logical_ring_prepare(struct intel_ringbuffer *ringbuf, int bytes)
> +static bool enable_execlists(struct drm_i915_private *dev_priv)
> {
> - int ret;
> -
> - if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
> - ret = logical_ring_wrap_buffer(ringbuf);
> - if (unlikely(ret))
> - return ret;
> - }
> -
> - if (unlikely(ringbuf->space < bytes)) {
> - ret = logical_ring_wait_for_space(ringbuf, bytes);
> - if (unlikely(ret))
> - return ret;
> - }
> + if (!HAS_LOGICAL_RING_CONTEXTS(dev_priv) ||
> + !USES_PPGTT(dev_priv))
> + return false;
>
> - return 0;
> + return i915.enable_execlists;
> }
>
> -/**
> - * intel_logical_ring_begin() - prepare the logical ringbuffer to accept some commands
> - *
> - * @ringbuf: Logical ringbuffer.
> - * @num_dwords: number of DWORDs that we plan to write to the ringbuffer.
> - *
> - * The ringbuffer might not be ready to accept the commands right away (maybe it needs to
> - * be wrapped, or wait a bit for the tail to be updated). This function takes care of that
> - * and also preallocates a request (every workload submission is still mediated through
> - * requests, same as it did with legacy ringbuffer submission).
> - *
> - * Return: non-zero if the ringbuffer is not ready to be written to.
> - */
> -int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf, int num_dwords)
> +static const int gen8_irq_shift[] = {
> + [RCS] = GEN8_RCS_IRQ_SHIFT,
> + [VCS] = GEN8_VCS1_IRQ_SHIFT,
> + [BCS] = GEN8_BCS_IRQ_SHIFT,
> + [VECS] = GEN8_VECS_IRQ_SHIFT,
> + [VCS2] = GEN8_VCS2_IRQ_SHIFT,
> +};
> +
> +int intel_engine_enable_execlists(struct intel_engine_cs *engine)
> {
> - struct intel_engine_cs *ring = ringbuf->ring;
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - int ret;
> + if (!enable_execlists(engine->i915))
> + return 0;
>
> - ret = i915_gem_check_wedge(&dev_priv->gpu_error,
> - dev_priv->mm.interruptible);
> - if (ret)
> - return ret;
> + if (WARN_ON(!IS_GEN8(engine->i915)))
> + return 0;
>
> - ret = logical_ring_prepare(ringbuf, num_dwords * sizeof(uint32_t));
> - if (ret)
> - return ret;
> + engine->irq_keep_mask |=
> + GT_CONTEXT_SWITCH_INTERRUPT << gen8_irq_shift[engine->id];
>
> - /* Preallocate the olr before touching the ring */
> - ret = logical_ring_alloc_seqno(ring, ringbuf->FIXME_lrc_ctx);
> - if (ret)
> - return ret;
> + engine->get_ring = execlists_get_ring;
> + engine->put_ring = execlists_put_ring;
> + engine->add_request = execlists_add_request;
> + engine->is_complete = execlists_rq_is_complete;
>
> - ringbuf->space -= num_dwords * sizeof(uint32_t);
> - return 0;
> -}
> + /* Disable semaphores until further notice */
> + engine->semaphore.wait = NULL;
>
> -static int gen8_init_common_ring(struct intel_engine_cs *ring)
> -{
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + engine->suspend = execlists_suspend;
> + engine->resume = execlists_resume;
> + engine->reset = execlists_reset;
> + engine->retire = execlists_retire;
>
> - I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
> - I915_WRITE(RING_HWSTAM(ring->mmio_base), 0xffffffff);
> -
> - I915_WRITE(RING_MODE_GEN7(ring),
> - _MASKED_BIT_DISABLE(GFX_REPLAY_MODE) |
> - _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
> - POSTING_READ(RING_MODE_GEN7(ring));
> - DRM_DEBUG_DRIVER("Execlists enabled for %s\n", ring->name);
> -
> - memset(&ring->hangcheck, 0, sizeof(ring->hangcheck));
> -
> - return 0;
> -}
> -
> -static int gen8_init_render_ring(struct intel_engine_cs *ring)
> -{
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - int ret;
> -
> - ret = gen8_init_common_ring(ring);
> - if (ret)
> - return ret;
> -
> - /* We need to disable the AsyncFlip performance optimisations in order
> - * to use MI_WAIT_FOR_EVENT within the CS. It should already be
> - * programmed to '1' on all products.
> - *
> - * WaDisableAsyncFlipPerfMode:snb,ivb,hsw,vlv,bdw,chv
> - */
> - I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(ASYNC_FLIP_PERF_DISABLE));
> -
> - ret = intel_init_pipe_control(ring);
> - if (ret)
> - return ret;
> -
> - I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
> -
> - return ret;
> -}
> -
> -static int gen8_emit_bb_start(struct intel_ringbuffer *ringbuf,
> - u64 offset, unsigned flags)
> -{
> - bool ppgtt = !(flags & I915_DISPATCH_SECURE);
> - int ret;
> -
> - ret = intel_logical_ring_begin(ringbuf, 4);
> - if (ret)
> - return ret;
> -
> - /* FIXME(BDW): Address space and security selectors. */
> - intel_logical_ring_emit(ringbuf, MI_BATCH_BUFFER_START_GEN8 | (ppgtt<<8));
> - intel_logical_ring_emit(ringbuf, lower_32_bits(offset));
> - intel_logical_ring_emit(ringbuf, upper_32_bits(offset));
> - intel_logical_ring_emit(ringbuf, MI_NOOP);
> - intel_logical_ring_advance(ringbuf);
> -
> - return 0;
> -}
> -
> -static bool gen8_logical_ring_get_irq(struct intel_engine_cs *ring)
> -{
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - unsigned long flags;
> -
> - if (!dev->irq_enabled)
> - return false;
> -
> - spin_lock_irqsave(&dev_priv->irq_lock, flags);
> - if (ring->irq_refcount++ == 0) {
> - I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
> - POSTING_READ(RING_IMR(ring->mmio_base));
> - }
> - spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
> -
> - return true;
> -}
> -
> -static void gen8_logical_ring_put_irq(struct intel_engine_cs *ring)
> -{
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - unsigned long flags;
> -
> - spin_lock_irqsave(&dev_priv->irq_lock, flags);
> - if (--ring->irq_refcount == 0) {
> - I915_WRITE_IMR(ring, ~ring->irq_keep_mask);
> - POSTING_READ(RING_IMR(ring->mmio_base));
> - }
> - spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
> -}
> -
> -static int gen8_emit_flush(struct intel_ringbuffer *ringbuf,
> - u32 invalidate_domains,
> - u32 unused)
> -{
> - struct intel_engine_cs *ring = ringbuf->ring;
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - uint32_t cmd;
> - int ret;
> -
> - ret = intel_logical_ring_begin(ringbuf, 4);
> - if (ret)
> - return ret;
> -
> - cmd = MI_FLUSH_DW + 1;
> -
> - if (ring == &dev_priv->ring[VCS]) {
> - if (invalidate_domains & I915_GEM_GPU_DOMAINS)
> - cmd |= MI_INVALIDATE_TLB | MI_INVALIDATE_BSD |
> - MI_FLUSH_DW_STORE_INDEX |
> - MI_FLUSH_DW_OP_STOREDW;
> - } else {
> - if (invalidate_domains & I915_GEM_DOMAIN_RENDER)
> - cmd |= MI_INVALIDATE_TLB | MI_FLUSH_DW_STORE_INDEX |
> - MI_FLUSH_DW_OP_STOREDW;
> - }
> -
> - intel_logical_ring_emit(ringbuf, cmd);
> - intel_logical_ring_emit(ringbuf,
> - I915_GEM_HWS_SCRATCH_ADDR |
> - MI_FLUSH_DW_USE_GTT);
> - intel_logical_ring_emit(ringbuf, 0); /* upper addr */
> - intel_logical_ring_emit(ringbuf, 0); /* value */
> - intel_logical_ring_advance(ringbuf);
> + /* start suspended */
> + engine->execlists_enabled = true;
> + engine->execlists_submitted = ~0;
>
> return 0;
> }
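
While reviewing the suspend/resume hooks above: the patch uses
execlists_submitted == ~0 as a "gate closed" sentinel, so
intel_engine_enable_execlists() starts the engine suspended and
execlists_resume() clears the sentinel before kicking execlists_submit().
A minimal standalone model of that gate, purely for my own understanding
(struct engine_model, try_submit and ELSP_DEPTH are my stand-ins, not
driver code; the two-element limit is the ELSP's):

#include <stdbool.h>
#include <stdio.h>

#define ELSP_DEPTH 2			/* the ELSP holds at most two contexts */
#define ENGINE_SUSPENDED (~0u)		/* sentinel used while suspended */

struct engine_model {
	unsigned int submitted;		/* in flight, or ENGINE_SUSPENDED */
	unsigned int pending;		/* requests waiting for a slot */
};

static bool try_submit(struct engine_model *e)
{
	if (e->submitted == ENGINE_SUSPENDED)	/* gate closed until resume */
		return false;

	while (e->pending && e->submitted < ELSP_DEPTH) {
		e->pending--;
		e->submitted++;
	}
	return true;
}

int main(void)
{
	struct engine_model e = { .submitted = ENGINE_SUSPENDED, .pending = 3 };

	printf("while suspended: %s\n", try_submit(&e) ? "submitted" : "blocked");

	e.submitted = 0;	/* what execlists_resume() does before resubmitting */
	try_submit(&e);
	printf("after resume: %u in flight, %u still pending\n",
	       e.submitted, e.pending);
	return 0;
}
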
> -
> -static int gen8_emit_flush_render(struct intel_ringbuffer *ringbuf,
> - u32 invalidate_domains,
> - u32 flush_domains)
> -{
> - struct intel_engine_cs *ring = ringbuf->ring;
> - u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
> - u32 flags = 0;
> - int ret;
> -
> - flags |= PIPE_CONTROL_CS_STALL;
> -
> - if (flush_domains) {
> - flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> - flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
> - }
> -
> - if (invalidate_domains) {
> - flags |= PIPE_CONTROL_TLB_INVALIDATE;
> - flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_QW_WRITE;
> - flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
> - }
> -
> - ret = intel_logical_ring_begin(ringbuf, 6);
> - if (ret)
> - return ret;
> -
> - intel_logical_ring_emit(ringbuf, GFX_OP_PIPE_CONTROL(6));
> - intel_logical_ring_emit(ringbuf, flags);
> - intel_logical_ring_emit(ringbuf, scratch_addr);
> - intel_logical_ring_emit(ringbuf, 0);
> - intel_logical_ring_emit(ringbuf, 0);
> - intel_logical_ring_emit(ringbuf, 0);
> - intel_logical_ring_advance(ringbuf);
> -
> - return 0;
> -}
> -
> -static u32 gen8_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
> -{
> - return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
> -}
> -
> -static void gen8_set_seqno(struct intel_engine_cs *ring, u32 seqno)
> -{
> - intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno);
> -}
> -
> -static int gen8_emit_request(struct intel_ringbuffer *ringbuf)
> -{
> - struct intel_engine_cs *ring = ringbuf->ring;
> - u32 cmd;
> - int ret;
> -
> - ret = intel_logical_ring_begin(ringbuf, 6);
> - if (ret)
> - return ret;
> -
> - cmd = MI_STORE_DWORD_IMM_GEN8;
> - cmd |= MI_GLOBAL_GTT;
> -
> - intel_logical_ring_emit(ringbuf, cmd);
> - intel_logical_ring_emit(ringbuf,
> - (ring->status_page.gfx_addr +
> - (I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT)));
> - intel_logical_ring_emit(ringbuf, 0);
> - intel_logical_ring_emit(ringbuf, ring->outstanding_lazy_seqno);
> - intel_logical_ring_emit(ringbuf, MI_USER_INTERRUPT);
> - intel_logical_ring_emit(ringbuf, MI_NOOP);
> - intel_logical_ring_advance_and_submit(ringbuf);
> -
> - return 0;
> -}
> -
> -/**
> - * intel_logical_ring_cleanup() - deallocate the Engine Command Streamer
> - *
> - * @ring: Engine Command Streamer.
> - *
> - */
> -void intel_logical_ring_cleanup(struct intel_engine_cs *ring)
> -{
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> -
> - if (!intel_ring_initialized(ring))
> - return;
> -
> - intel_logical_ring_stop(ring);
> - WARN_ON((I915_READ_MODE(ring) & MODE_IDLE) == 0);
> - ring->preallocated_lazy_request = NULL;
> - ring->outstanding_lazy_seqno = 0;
> -
> - if (ring->cleanup)
> - ring->cleanup(ring);
> -
> - i915_cmd_parser_fini_ring(ring);
> -
> - if (ring->status_page.obj) {
> - kunmap(sg_page(ring->status_page.obj->pages->sgl));
> - ring->status_page.obj = NULL;
> - }
> -}
> -
> -static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *ring)
> -{
> - int ret;
> -
> - /* Intentionally left blank. */
> - ring->buffer = NULL;
> -
> - ring->dev = dev;
> - INIT_LIST_HEAD(&ring->active_list);
> - INIT_LIST_HEAD(&ring->request_list);
> - init_waitqueue_head(&ring->irq_queue);
> -
> - INIT_LIST_HEAD(&ring->execlist_queue);
> - spin_lock_init(&ring->execlist_lock);
> - ring->next_context_status_buffer = 0;
> -
> - ret = i915_cmd_parser_init_ring(ring);
> - if (ret)
> - return ret;
> -
> - if (ring->init) {
> - ret = ring->init(ring);
> - if (ret)
> - return ret;
> - }
> -
> - ret = intel_lr_context_deferred_create(ring->default_context, ring);
> -
> - return ret;
> -}
> -
> -static int logical_render_ring_init(struct drm_device *dev)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[RCS];
> -
> - ring->name = "render ring";
> - ring->id = RCS;
> - ring->mmio_base = RENDER_RING_BASE;
> - ring->irq_enable_mask =
> - GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT;
> - ring->irq_keep_mask =
> - GT_CONTEXT_SWITCH_INTERRUPT << GEN8_RCS_IRQ_SHIFT;
> - if (HAS_L3_DPF(dev))
> - ring->irq_keep_mask |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
> -
> - ring->init = gen8_init_render_ring;
> - ring->cleanup = intel_fini_pipe_control;
> - ring->get_seqno = gen8_get_seqno;
> - ring->set_seqno = gen8_set_seqno;
> - ring->emit_request = gen8_emit_request;
> - ring->emit_flush = gen8_emit_flush_render;
> - ring->irq_get = gen8_logical_ring_get_irq;
> - ring->irq_put = gen8_logical_ring_put_irq;
> - ring->emit_bb_start = gen8_emit_bb_start;
> -
> - return logical_ring_init(dev, ring);
> -}
> -
> -static int logical_bsd_ring_init(struct drm_device *dev)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[VCS];
> -
> - ring->name = "bsd ring";
> - ring->id = VCS;
> - ring->mmio_base = GEN6_BSD_RING_BASE;
> - ring->irq_enable_mask =
> - GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
> - ring->irq_keep_mask =
> - GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
> -
> - ring->init = gen8_init_common_ring;
> - ring->get_seqno = gen8_get_seqno;
> - ring->set_seqno = gen8_set_seqno;
> - ring->emit_request = gen8_emit_request;
> - ring->emit_flush = gen8_emit_flush;
> - ring->irq_get = gen8_logical_ring_get_irq;
> - ring->irq_put = gen8_logical_ring_put_irq;
> - ring->emit_bb_start = gen8_emit_bb_start;
> -
> - return logical_ring_init(dev, ring);
> -}
> -
> -static int logical_bsd2_ring_init(struct drm_device *dev)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[VCS2];
> -
> - ring->name = "bds2 ring";
> - ring->id = VCS2;
> - ring->mmio_base = GEN8_BSD2_RING_BASE;
> - ring->irq_enable_mask =
> - GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT;
> - ring->irq_keep_mask =
> - GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS2_IRQ_SHIFT;
> -
> - ring->init = gen8_init_common_ring;
> - ring->get_seqno = gen8_get_seqno;
> - ring->set_seqno = gen8_set_seqno;
> - ring->emit_request = gen8_emit_request;
> - ring->emit_flush = gen8_emit_flush;
> - ring->irq_get = gen8_logical_ring_get_irq;
> - ring->irq_put = gen8_logical_ring_put_irq;
> - ring->emit_bb_start = gen8_emit_bb_start;
> -
> - return logical_ring_init(dev, ring);
> -}
> -
> -static int logical_blt_ring_init(struct drm_device *dev)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[BCS];
> -
> - ring->name = "blitter ring";
> - ring->id = BCS;
> - ring->mmio_base = BLT_RING_BASE;
> - ring->irq_enable_mask =
> - GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
> - ring->irq_keep_mask =
> - GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
> -
> - ring->init = gen8_init_common_ring;
> - ring->get_seqno = gen8_get_seqno;
> - ring->set_seqno = gen8_set_seqno;
> - ring->emit_request = gen8_emit_request;
> - ring->emit_flush = gen8_emit_flush;
> - ring->irq_get = gen8_logical_ring_get_irq;
> - ring->irq_put = gen8_logical_ring_put_irq;
> - ring->emit_bb_start = gen8_emit_bb_start;
> -
> - return logical_ring_init(dev, ring);
> -}
> -
> -static int logical_vebox_ring_init(struct drm_device *dev)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[VECS];
> -
> - ring->name = "video enhancement ring";
> - ring->id = VECS;
> - ring->mmio_base = VEBOX_RING_BASE;
> - ring->irq_enable_mask =
> - GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
> - ring->irq_keep_mask =
> - GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
> -
> - ring->init = gen8_init_common_ring;
> - ring->get_seqno = gen8_get_seqno;
> - ring->set_seqno = gen8_set_seqno;
> - ring->emit_request = gen8_emit_request;
> - ring->emit_flush = gen8_emit_flush;
> - ring->irq_get = gen8_logical_ring_get_irq;
> - ring->irq_put = gen8_logical_ring_put_irq;
> - ring->emit_bb_start = gen8_emit_bb_start;
> -
> - return logical_ring_init(dev, ring);
> -}
> -
> -/**
> - * intel_logical_rings_init() - allocate, populate and init the Engine Command Streamers
> - * @dev: DRM device.
> - *
> - * This function inits the engines for an Execlists submission style (the equivalent in the
> - * legacy ringbuffer submission world would be i915_gem_init_rings). It does it only for
> - * those engines that are present in the hardware.
> - *
> - * Return: non-zero if the initialization failed.
> - */
> -int intel_logical_rings_init(struct drm_device *dev)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - int ret;
> -
> - ret = logical_render_ring_init(dev);
> - if (ret)
> - return ret;
> -
> - if (HAS_BSD(dev)) {
> - ret = logical_bsd_ring_init(dev);
> - if (ret)
> - goto cleanup_render_ring;
> - }
> -
> - if (HAS_BLT(dev)) {
> - ret = logical_blt_ring_init(dev);
> - if (ret)
> - goto cleanup_bsd_ring;
> - }
> -
> - if (HAS_VEBOX(dev)) {
> - ret = logical_vebox_ring_init(dev);
> - if (ret)
> - goto cleanup_blt_ring;
> - }
> -
> - if (HAS_BSD2(dev)) {
> - ret = logical_bsd2_ring_init(dev);
> - if (ret)
> - goto cleanup_vebox_ring;
> - }
> -
> - ret = i915_gem_set_seqno(dev, ((u32)~0 - 0x1000));
> - if (ret)
> - goto cleanup_bsd2_ring;
> -
> - return 0;
> -
> -cleanup_bsd2_ring:
> - intel_logical_ring_cleanup(&dev_priv->ring[VCS2]);
> -cleanup_vebox_ring:
> - intel_logical_ring_cleanup(&dev_priv->ring[VECS]);
> -cleanup_blt_ring:
> - intel_logical_ring_cleanup(&dev_priv->ring[BCS]);
> -cleanup_bsd_ring:
> - intel_logical_ring_cleanup(&dev_priv->ring[VCS]);
> -cleanup_render_ring:
> - intel_logical_ring_cleanup(&dev_priv->ring[RCS]);
> -
> - return ret;
> -}
> -
> -int intel_lr_context_render_state_init(struct intel_engine_cs *ring,
> - struct intel_context *ctx)
> -{
> - struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
> - struct render_state so;
> - struct drm_i915_file_private *file_priv = ctx->file_priv;
> - struct drm_file *file = file_priv ? file_priv->file : NULL;
> - int ret;
> -
> - ret = i915_gem_render_state_prepare(ring, &so);
> - if (ret)
> - return ret;
> -
> - if (so.rodata == NULL)
> - return 0;
> -
> - ret = ring->emit_bb_start(ringbuf,
> - so.ggtt_offset,
> - I915_DISPATCH_SECURE);
> - if (ret)
> - goto out;
> -
> - i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
> -
> - ret = __i915_add_request(ring, file, so.obj, NULL);
> - /* intel_logical_ring_add_request moves object to inactive if it
> - * fails */
> -out:
> - i915_gem_render_state_fini(&so);
> - return ret;
> -}
> -
> -static int
> -populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_obj,
> - struct intel_engine_cs *ring, struct intel_ringbuffer *ringbuf)
> -{
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct drm_i915_gem_object *ring_obj = ringbuf->obj;
> - struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
> - struct page *page;
> - uint32_t *reg_state;
> - int ret;
> -
> - if (!ppgtt)
> - ppgtt = dev_priv->mm.aliasing_ppgtt;
> -
> - ret = i915_gem_object_set_to_cpu_domain(ctx_obj, true);
> - if (ret) {
> - DRM_DEBUG_DRIVER("Could not set to CPU domain\n");
> - return ret;
> - }
> -
> - ret = i915_gem_object_get_pages(ctx_obj);
> - if (ret) {
> - DRM_DEBUG_DRIVER("Could not get object pages\n");
> - return ret;
> - }
> -
> - i915_gem_object_pin_pages(ctx_obj);
> -
> - /* The second page of the context object contains some fields which must
> - * be set up prior to the first execution. */
> - page = i915_gem_object_get_page(ctx_obj, 1);
> - reg_state = kmap_atomic(page);
> -
> - /* A context is actually a big batch buffer with several MI_LOAD_REGISTER_IMM
> - * commands followed by (reg, value) pairs. The values we are setting here are
> - * only for the first context restore: on a subsequent save, the GPU will
> - * recreate this batchbuffer with new values (including all the missing
> - * MI_LOAD_REGISTER_IMM commands that we are not initializing here). */
> - if (ring->id == RCS)
> - reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(14);
> - else
> - reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(11);
> - reg_state[CTX_LRI_HEADER_0] |= MI_LRI_FORCE_POSTED;
> - reg_state[CTX_CONTEXT_CONTROL] = RING_CONTEXT_CONTROL(ring);
> - reg_state[CTX_CONTEXT_CONTROL+1] =
> - _MASKED_BIT_ENABLE((1<<3) | MI_RESTORE_INHIBIT);
> - reg_state[CTX_RING_HEAD] = RING_HEAD(ring->mmio_base);
> - reg_state[CTX_RING_HEAD+1] = 0;
> - reg_state[CTX_RING_TAIL] = RING_TAIL(ring->mmio_base);
> - reg_state[CTX_RING_TAIL+1] = 0;
> - reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base);
> - reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj);
> - reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base);
> - reg_state[CTX_RING_BUFFER_CONTROL+1] =
> - ((ringbuf->size - PAGE_SIZE) & RING_NR_PAGES) | RING_VALID;
> - reg_state[CTX_BB_HEAD_U] = ring->mmio_base + 0x168;
> - reg_state[CTX_BB_HEAD_U+1] = 0;
> - reg_state[CTX_BB_HEAD_L] = ring->mmio_base + 0x140;
> - reg_state[CTX_BB_HEAD_L+1] = 0;
> - reg_state[CTX_BB_STATE] = ring->mmio_base + 0x110;
> - reg_state[CTX_BB_STATE+1] = (1<<5);
> - reg_state[CTX_SECOND_BB_HEAD_U] = ring->mmio_base + 0x11c;
> - reg_state[CTX_SECOND_BB_HEAD_U+1] = 0;
> - reg_state[CTX_SECOND_BB_HEAD_L] = ring->mmio_base + 0x114;
> - reg_state[CTX_SECOND_BB_HEAD_L+1] = 0;
> - reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118;
> - reg_state[CTX_SECOND_BB_STATE+1] = 0;
> - if (ring->id == RCS) {
> - /* TODO: according to BSpec, the register state context
> - * for CHV does not have these. OTOH, these registers do
> - * exist in CHV. I'm waiting for a clarification */
> - reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0;
> - reg_state[CTX_BB_PER_CTX_PTR+1] = 0;
> - reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4;
> - reg_state[CTX_RCS_INDIRECT_CTX+1] = 0;
> - reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + 0x1c8;
> - reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0;
> - }
> - reg_state[CTX_LRI_HEADER_1] = MI_LOAD_REGISTER_IMM(9);
> - reg_state[CTX_LRI_HEADER_1] |= MI_LRI_FORCE_POSTED;
> - reg_state[CTX_CTX_TIMESTAMP] = ring->mmio_base + 0x3a8;
> - reg_state[CTX_CTX_TIMESTAMP+1] = 0;
> - reg_state[CTX_PDP3_UDW] = GEN8_RING_PDP_UDW(ring, 3);
> - reg_state[CTX_PDP3_LDW] = GEN8_RING_PDP_LDW(ring, 3);
> - reg_state[CTX_PDP2_UDW] = GEN8_RING_PDP_UDW(ring, 2);
> - reg_state[CTX_PDP2_LDW] = GEN8_RING_PDP_LDW(ring, 2);
> - reg_state[CTX_PDP1_UDW] = GEN8_RING_PDP_UDW(ring, 1);
> - reg_state[CTX_PDP1_LDW] = GEN8_RING_PDP_LDW(ring, 1);
> - reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
> - reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);
> - reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->pd_dma_addr[3]);
> - reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->pd_dma_addr[3]);
> - reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->pd_dma_addr[2]);
> - reg_state[CTX_PDP2_LDW+1] = lower_32_bits(ppgtt->pd_dma_addr[2]);
> - reg_state[CTX_PDP1_UDW+1] = upper_32_bits(ppgtt->pd_dma_addr[1]);
> - reg_state[CTX_PDP1_LDW+1] = lower_32_bits(ppgtt->pd_dma_addr[1]);
> - reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->pd_dma_addr[0]);
> - reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->pd_dma_addr[0]);
> - if (ring->id == RCS) {
> - reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
> - reg_state[CTX_R_PWR_CLK_STATE] = 0x20c8;
> - reg_state[CTX_R_PWR_CLK_STATE+1] = 0;
> - }
> -
> - kunmap_atomic(reg_state);
> -
> - ctx_obj->dirty = 1;
> - set_page_dirty(page);
> - i915_gem_object_unpin_pages(ctx_obj);
> -
> - return 0;
> -}
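
To make the "big batch buffer with several MI_LOAD_REGISTER_IMM commands
followed by (reg, value) pairs" comment above concrete, here is a tiny
standalone illustration of that layout. The opcode/length encoding and the
register offsets are my reading of the MI_LOAD_REGISTER_IMM / RING_HEAD /
RING_TAIL definitions, so treat them as illustrative rather than
authoritative:

#include <stdint.h>
#include <stdio.h>

/* MI_INSTR-style encoding as I understand it: opcode 0x22, length 2n-1 */
#define MI_LOAD_REGISTER_IMM(n)	((0x22u << 23) | (2 * (n) - 1))

int main(void)
{
	uint32_t state[1 + 2 * 2];	/* header + two (reg, value) pairs */
	unsigned int i = 0, n;

	state[i++] = MI_LOAD_REGISTER_IMM(2);
	state[i++] = 0x2034;	/* e.g. RING_HEAD of the render ring */
	state[i++] = 0;		/* initial head */
	state[i++] = 0x2030;	/* e.g. RING_TAIL of the render ring */
	state[i++] = 0;		/* initial tail */

	for (n = 0; n < sizeof(state) / sizeof(state[0]); n++)
		printf("dword %u: 0x%08x\n", n, (unsigned int)state[n]);
	return 0;
}
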
> -
> -/**
> - * intel_lr_context_free() - free the LRC specific bits of a context
> - * @ctx: the LR context to free.
> - *
> - * The real context freeing is done in i915_gem_context_free: this only
> - * takes care of the bits that are LRC related: the per-engine backing
> - * objects and the logical ringbuffer.
> - */
> -void intel_lr_context_free(struct intel_context *ctx)
> -{
> - int i;
> -
> - for (i = 0; i < I915_NUM_RINGS; i++) {
> - struct drm_i915_gem_object *ctx_obj = ctx->engine[i].state;
> - struct intel_ringbuffer *ringbuf = ctx->engine[i].ringbuf;
> -
> - if (ctx_obj) {
> - intel_destroy_ringbuffer_obj(ringbuf);
> - kfree(ringbuf);
> - i915_gem_object_ggtt_unpin(ctx_obj);
> - drm_gem_object_unreference(&ctx_obj->base);
> - }
> - }
> -}
> -
> -static uint32_t get_lr_context_size(struct intel_engine_cs *ring)
> -{
> - int ret = 0;
> -
> - WARN_ON(INTEL_INFO(ring->dev)->gen != 8);
> -
> - switch (ring->id) {
> - case RCS:
> - ret = GEN8_LR_CONTEXT_RENDER_SIZE;
> - break;
> - case VCS:
> - case BCS:
> - case VECS:
> - case VCS2:
> - ret = GEN8_LR_CONTEXT_OTHER_SIZE;
> - break;
> - }
> -
> - return ret;
> -}
> -
> -/**
> - * intel_lr_context_deferred_create() - create the LRC specific bits of a context
> - * @ctx: LR context to create.
> - * @ring: engine to be used with the context.
> - *
> - * This function can be called more than once, with different engines, if we plan
> - * to use the context with them. The context backing objects and the ringbuffers
> - * (specially the ringbuffer backing objects) suck a lot of memory up, and that's why
> - * the creation is a deferred call: it's better to make sure first that we need to use
> - * a given ring with the context.
> - *
> - * Return: non-zero on eror.
> - */
> -int intel_lr_context_deferred_create(struct intel_context *ctx,
> - struct intel_engine_cs *ring)
> -{
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_gem_object *ctx_obj;
> - uint32_t context_size;
> - struct intel_ringbuffer *ringbuf;
> - int ret;
> -
> - WARN_ON(ctx->legacy_hw_ctx.rcs_state != NULL);
> - if (ctx->engine[ring->id].state)
> - return 0;
> -
> - context_size = round_up(get_lr_context_size(ring), 4096);
> -
> - ctx_obj = i915_gem_alloc_context_obj(dev, context_size);
> - if (IS_ERR(ctx_obj)) {
> - ret = PTR_ERR(ctx_obj);
> - DRM_DEBUG_DRIVER("Alloc LRC backing obj failed: %d\n", ret);
> - return ret;
> - }
> -
> - ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN, 0);
> - if (ret) {
> - DRM_DEBUG_DRIVER("Pin LRC backing obj failed: %d\n", ret);
> - drm_gem_object_unreference(&ctx_obj->base);
> - return ret;
> - }
> -
> - ringbuf = kzalloc(sizeof(*ringbuf), GFP_KERNEL);
> - if (!ringbuf) {
> - DRM_DEBUG_DRIVER("Failed to allocate ringbuffer %s\n",
> - ring->name);
> - i915_gem_object_ggtt_unpin(ctx_obj);
> - drm_gem_object_unreference(&ctx_obj->base);
> - ret = -ENOMEM;
> - return ret;
> - }
> -
> - ringbuf->ring = ring;
> - ringbuf->FIXME_lrc_ctx = ctx;
> -
> - ringbuf->size = 32 * PAGE_SIZE;
> - ringbuf->effective_size = ringbuf->size;
> - ringbuf->head = 0;
> - ringbuf->tail = 0;
> - ringbuf->space = ringbuf->size;
> - ringbuf->last_retired_head = -1;
> -
> - /* TODO: For now we put this in the mappable region so that we can reuse
> - * the existing ringbuffer code which ioremaps it. When we start
> - * creating many contexts, this will no longer work and we must switch
> - * to a kmapish interface.
> - */
> - ret = intel_alloc_ringbuffer_obj(dev, ringbuf);
> - if (ret) {
> - DRM_DEBUG_DRIVER("Failed to allocate ringbuffer obj %s: %d\n",
> - ring->name, ret);
> - goto error;
> - }
> -
> - ret = populate_lr_context(ctx, ctx_obj, ring, ringbuf);
> - if (ret) {
> - DRM_DEBUG_DRIVER("Failed to populate LRC: %d\n", ret);
> - intel_destroy_ringbuffer_obj(ringbuf);
> - goto error;
> - }
> -
> - ctx->engine[ring->id].ringbuf = ringbuf;
> - ctx->engine[ring->id].state = ctx_obj;
> -
> - if (ctx == ring->default_context) {
> - /* The status page is offset 0 from the default context object
> - * in LRC mode. */
> - ring->status_page.gfx_addr = i915_gem_obj_ggtt_offset(ctx_obj);
> - ring->status_page.page_addr =
> - kmap(sg_page(ctx_obj->pages->sgl));
> - if (ring->status_page.page_addr == NULL)
> - return -ENOMEM;
> - ring->status_page.obj = ctx_obj;
> - }
> -
> - if (ring->id == RCS && !ctx->rcs_initialized) {
> - ret = intel_lr_context_render_state_init(ring, ctx);
> - if (ret) {
> - DRM_ERROR("Init render state failed: %d\n", ret);
> - ctx->engine[ring->id].ringbuf = NULL;
> - ctx->engine[ring->id].state = NULL;
> - intel_destroy_ringbuffer_obj(ringbuf);
> - goto error;
> - }
> - ctx->rcs_initialized = true;
> - }
> -
> - return 0;
> -
> -error:
> - kfree(ringbuf);
> - i915_gem_object_ggtt_unpin(ctx_obj);
> - drm_gem_object_unreference(&ctx_obj->base);
> - return ret;
> -}
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> index 33c3b4bf28c5..8b9f5b164ef0 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -31,84 +31,8 @@
> #define RING_CONTEXT_STATUS_BUF(ring) ((ring)->mmio_base+0x370)
> #define RING_CONTEXT_STATUS_PTR(ring) ((ring)->mmio_base+0x3a0)
>
> -/* Logical Rings */
> -void intel_logical_ring_stop(struct intel_engine_cs *ring);
> -void intel_logical_ring_cleanup(struct intel_engine_cs *ring);
> -int intel_logical_rings_init(struct drm_device *dev);
> -
> -int logical_ring_flush_all_caches(struct intel_ringbuffer *ringbuf);
> -void intel_logical_ring_advance_and_submit(struct intel_ringbuffer *ringbuf);
> -/**
> - * intel_logical_ring_advance() - advance the ringbuffer tail
> - * @ringbuf: Ringbuffer to advance.
> - *
> - * The tail is only updated in our logical ringbuffer struct.
> - */
> -static inline void intel_logical_ring_advance(struct intel_ringbuffer *ringbuf)
> -{
> - ringbuf->tail &= ringbuf->size - 1;
> -}
> -/**
> - * intel_logical_ring_emit() - write a DWORD to the ringbuffer.
> - * @ringbuf: Ringbuffer to write to.
> - * @data: DWORD to write.
> - */
> -static inline void intel_logical_ring_emit(struct intel_ringbuffer *ringbuf,
> - u32 data)
> -{
> - iowrite32(data, ringbuf->virtual_start + ringbuf->tail);
> - ringbuf->tail += 4;
> -}
> -int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf, int num_dwords);
> -
> -/* Logical Ring Contexts */
> -int intel_lr_context_render_state_init(struct intel_engine_cs *ring,
> - struct intel_context *ctx);
> -void intel_lr_context_free(struct intel_context *ctx);
> -int intel_lr_context_deferred_create(struct intel_context *ctx,
> - struct intel_engine_cs *ring);
> -
> /* Execlists */
> -int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists);
> -int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
> - struct intel_engine_cs *ring,
> - struct intel_context *ctx,
> - struct drm_i915_gem_execbuffer2 *args,
> - struct list_head *vmas,
> - struct drm_i915_gem_object *batch_obj,
> - u64 exec_start, u32 flags);
> -u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj);
> -
> -/**
> - * struct intel_ctx_submit_request - queued context submission request
> - * @ctx: Context to submit to the ELSP.
> - * @ring: Engine to submit it to.
> - * @tail: how far in the context's ringbuffer this request goes to.
> - * @execlist_link: link in the submission queue.
> - * @work: workqueue for processing this request in a bottom half.
> - * @elsp_submitted: no. of times this request has been sent to the ELSP.
> - *
> - * The ELSP only accepts two elements at a time, so we queue context/tail
> - * pairs on a given queue (ring->execlist_queue) until the hardware is
> - * available. The queue serves a double purpose: we also use it to keep track
> - * of the up to 2 contexts currently in the hardware (usually one in execution
> - * and the other queued up by the GPU): We only remove elements from the head
> - * of the queue when the hardware informs us that an element has been
> - * completed.
> - *
> - * All accesses to the queue are mediated by a spinlock (ring->execlist_lock).
> - */
> -struct intel_ctx_submit_request {
> - struct intel_context *ctx;
> - struct intel_engine_cs *ring;
> - u32 tail;
> -
> - struct list_head execlist_link;
> - struct work_struct work;
> -
> - int elsp_submitted;
> -};
> -
> -void intel_execlists_handle_ctx_events(struct intel_engine_cs *ring);
> +int intel_engine_enable_execlists(struct intel_engine_cs *engine);
> +void intel_execlists_irq_handler(struct intel_engine_cs *engine);
>
> #endif /* _INTEL_LRC_H_ */
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index dc2f4f26c961..ae0e5771f730 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -182,7 +182,7 @@ struct intel_overlay {
> u32 flip_addr;
> struct drm_i915_gem_object *reg_bo;
> /* flip handling */
> - uint32_t last_flip_req;
> + struct i915_gem_request *flip_request;
> void (*flip_tail)(struct intel_overlay *);
> };
>
> @@ -208,53 +208,86 @@ static void intel_overlay_unmap_regs(struct intel_overlay *overlay,
> io_mapping_unmap(regs);
> }
>
> -static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
> - void (*tail)(struct intel_overlay *))
> +/* recover from an interruption due to a signal
> + * We have to be careful not to repeat work forever and make forward progress. */
> +static int intel_overlay_recover_from_interrupt(struct intel_overlay *overlay)
> {
> - struct drm_device *dev = overlay->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[RCS];
> int ret;
>
> - BUG_ON(overlay->last_flip_req);
> - ret = i915_add_request(ring, &overlay->last_flip_req);
> - if (ret)
> - return ret;
> + if (overlay->flip_request == NULL)
> + return 0;
>
> - overlay->flip_tail = tail;
> - ret = i915_wait_seqno(ring, overlay->last_flip_req);
> + ret = i915_request_wait(overlay->flip_request);
> if (ret)
> return ret;
> - i915_gem_retire_requests(dev);
>
> - overlay->last_flip_req = 0;
> + i915_request_put(overlay->flip_request);
> + overlay->flip_request = NULL;
> +
> + i915_gem_retire_requests(overlay->dev);
> +
> + if (overlay->flip_tail)
> + overlay->flip_tail(overlay);
> +
> return 0;
> }
>
> +static int intel_overlay_add_request(struct intel_overlay *overlay,
> + struct i915_gem_request *rq,
> + void (*tail)(struct intel_overlay *))
> +{
> + BUG_ON(overlay->flip_request);
> + overlay->flip_request = rq;
> + overlay->flip_tail = tail;
> +
> + return i915_request_commit(rq);
> +}
> +
> +static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
> + struct i915_gem_request *rq,
> + void (*tail)(struct intel_overlay *))
> +{
> + intel_overlay_add_request(overlay, rq, tail);
> + return intel_overlay_recover_from_interrupt(overlay);
> +}
> +
> +static struct i915_gem_request *
> +intel_overlay_alloc_request(struct intel_overlay *overlay)
> +{
> + struct drm_i915_private *i915 = to_i915(overlay->dev);
> + return intel_engine_alloc_request(RCS_ENGINE(i915),
> + RCS_ENGINE(i915)->default_context);
> +}
> +
> /* overlay needs to be disable in OCMD reg */
> static int intel_overlay_on(struct intel_overlay *overlay)
> {
> struct drm_device *dev = overlay->dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[RCS];
> - int ret;
> + struct i915_gem_request *rq;
> + struct intel_ringbuffer *ring;
>
> BUG_ON(overlay->active);
> overlay->active = 1;
>
> WARN_ON(IS_I830(dev) && !(dev_priv->quirks & QUIRK_PIPEA_FORCE));
>
> - ret = intel_ring_begin(ring, 4);
> - if (ret)
> - return ret;
> + rq = intel_overlay_alloc_request(overlay);
> + if (IS_ERR(rq))
> + return PTR_ERR(rq);
> +
> + ring = intel_ring_begin(rq, 3);
> + if (IS_ERR(ring)) {
> + i915_request_put(rq);
> + return PTR_ERR(ring);
> + }
>
> intel_ring_emit(ring, MI_OVERLAY_FLIP | MI_OVERLAY_ON);
> intel_ring_emit(ring, overlay->flip_addr | OFC_UPDATE);
> intel_ring_emit(ring, MI_WAIT_FOR_EVENT | MI_WAIT_FOR_OVERLAY_FLIP);
> - intel_ring_emit(ring, MI_NOOP);
> intel_ring_advance(ring);
>
> - return intel_overlay_do_wait_request(overlay, NULL);
> + return intel_overlay_do_wait_request(overlay, rq, NULL);
> }
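
One pattern worth calling out for other reviewers: every overlay path now
allocates a request, calls intel_ring_begin() on it, emits, and then either
hands the request over (intel_overlay_do_wait_request/add_request) or drops
its reference with i915_request_put() when ring_begin fails. A trivial
standalone model of that ownership rule, with names and refcounting of my
own, only to illustrate why the error paths above need the put:

#include <stdio.h>
#include <stdlib.h>

struct request { int refcount; };

static struct request *request_alloc(void)
{
	struct request *rq = malloc(sizeof(*rq));

	if (!rq)
		abort();
	rq->refcount = 1;
	return rq;
}

static void request_put(struct request *rq)
{
	if (--rq->refcount == 0)
		free(rq);
}

static int do_flip(int ring_begin_fails)
{
	struct request *rq = request_alloc();

	if (ring_begin_fails) {		/* intel_ring_begin() gave ERR_PTR */
		request_put(rq);	/* we still own the only reference */
		return -1;
	}

	/* ...emit + advance would go here... */
	request_put(rq);		/* stands in for commit + completion */
	return 0;
}

int main(void)
{
	printf("error path returns %d\n", do_flip(1));
	printf("happy path returns %d\n", do_flip(0));
	return 0;
}
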
>
> /* overlay needs to be enabled in OCMD reg */
> @@ -263,10 +296,10 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
> {
> struct drm_device *dev = overlay->dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[RCS];
> u32 flip_addr = overlay->flip_addr;
> + struct i915_gem_request *rq;
> + struct intel_ringbuffer *ring;
> u32 tmp;
> - int ret;
>
> BUG_ON(!overlay->active);
>
> @@ -278,21 +311,30 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
> if (tmp & (1 << 17))
> DRM_DEBUG("overlay underrun, DOVSTA: %x\n", tmp);
>
> - ret = intel_ring_begin(ring, 2);
> - if (ret)
> - return ret;
> + rq = intel_overlay_alloc_request(overlay);
> + if (IS_ERR(rq))
> + return PTR_ERR(rq);
> +
> + ring = intel_ring_begin(rq, 2);
> + if (IS_ERR(ring)) {
> + i915_request_put(rq);
> + return PTR_ERR(ring);
> + }
>
> intel_ring_emit(ring, MI_OVERLAY_FLIP | MI_OVERLAY_CONTINUE);
> intel_ring_emit(ring, flip_addr);
> intel_ring_advance(ring);
>
> - return i915_add_request(ring, &overlay->last_flip_req);
> + return intel_overlay_add_request(overlay, rq, NULL);
> }
>
> static void intel_overlay_release_old_vid_tail(struct intel_overlay *overlay)
> {
> struct drm_i915_gem_object *obj = overlay->old_vid_bo;
>
> + i915_gem_track_fb(obj, NULL,
> + INTEL_FRONTBUFFER_OVERLAY(overlay->crtc->pipe));
> +
> i915_gem_object_ggtt_unpin(obj);
> drm_gem_object_unreference(&obj->base);
>
> @@ -319,10 +361,10 @@ static void intel_overlay_off_tail(struct intel_overlay *overlay)
> static int intel_overlay_off(struct intel_overlay *overlay)
> {
> struct drm_device *dev = overlay->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[RCS];
> u32 flip_addr = overlay->flip_addr;
> - int ret;
> + struct i915_gem_request *rq;
> + struct intel_ringbuffer *ring;
> + int len;
>
> BUG_ON(!overlay->active);
>
> @@ -332,53 +374,36 @@ static int intel_overlay_off(struct intel_overlay *overlay)
> * of the hw. Do it in both cases */
> flip_addr |= OFC_UPDATE;
>
> - ret = intel_ring_begin(ring, 6);
> - if (ret)
> - return ret;
> + rq = intel_overlay_alloc_request(overlay);
> + if (IS_ERR(rq))
> + return PTR_ERR(rq);
> +
> + len = 3;
> + if (!IS_I830(dev))
> + len += 3;
> +
> + ring = intel_ring_begin(rq, len);
> + if (IS_ERR(ring)) {
> + i915_request_put(rq);
> + return PTR_ERR(ring);
> + }
>
> /* wait for overlay to go idle */
> intel_ring_emit(ring, MI_OVERLAY_FLIP | MI_OVERLAY_CONTINUE);
> intel_ring_emit(ring, flip_addr);
> intel_ring_emit(ring, MI_WAIT_FOR_EVENT | MI_WAIT_FOR_OVERLAY_FLIP);
> - /* turn overlay off */
> - if (IS_I830(dev)) {
> - /* Workaround: Don't disable the overlay fully, since otherwise
> - * it dies on the next OVERLAY_ON cmd. */
> - intel_ring_emit(ring, MI_NOOP);
> - intel_ring_emit(ring, MI_NOOP);
> - intel_ring_emit(ring, MI_NOOP);
> - } else {
> + /* turn overlay off
> + * Workaround for i830: Don't disable the overlay fully, since
> + * otherwise it dies on the next OVERLAY_ON cmd.
> + */
> + if (!IS_I830(dev)) {
> intel_ring_emit(ring, MI_OVERLAY_FLIP | MI_OVERLAY_OFF);
> intel_ring_emit(ring, flip_addr);
> intel_ring_emit(ring, MI_WAIT_FOR_EVENT | MI_WAIT_FOR_OVERLAY_FLIP);
> }
> intel_ring_advance(ring);
>
> - return intel_overlay_do_wait_request(overlay, intel_overlay_off_tail);
> -}
> -
> -/* recover from an interruption due to a signal
> - * We have to be careful not to repeat work forever an make forward progess. */
> -static int intel_overlay_recover_from_interrupt(struct intel_overlay *overlay)
> -{
> - struct drm_device *dev = overlay->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[RCS];
> - int ret;
> -
> - if (overlay->last_flip_req == 0)
> - return 0;
> -
> - ret = i915_wait_seqno(ring, overlay->last_flip_req);
> - if (ret)
> - return ret;
> - i915_gem_retire_requests(dev);
> -
> - if (overlay->flip_tail)
> - overlay->flip_tail(overlay);
> -
> - overlay->last_flip_req = 0;
> - return 0;
> + return intel_overlay_do_wait_request(overlay, rq, intel_overlay_off_tail);
> }
>
> /* Wait for pending overlay flip and release old frame.
> @@ -387,10 +412,8 @@ static int intel_overlay_recover_from_interrupt(struct intel_overlay *overlay)
> */
> static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
> {
> - struct drm_device *dev = overlay->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[RCS];
> - int ret;
> + struct drm_i915_private *dev_priv = to_i915(overlay->dev);
> + int ret = 0;
>
> /* Only wait if there is actually an old frame to release to
> * guarantee forward progress.
> @@ -399,27 +422,29 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
> return 0;
>
> if (I915_READ(ISR) & I915_OVERLAY_PLANE_FLIP_PENDING_INTERRUPT) {
> + struct i915_gem_request *rq;
> + struct intel_ringbuffer *ring;
> +
> + rq = intel_overlay_alloc_request(overlay);
> + if (IS_ERR(rq))
> + return PTR_ERR(rq);
> +
> /* synchronous slowpath */
> - ret = intel_ring_begin(ring, 2);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 1);
> + if (IS_ERR(ring)) {
> + i915_request_put(rq);
> + return PTR_ERR(ring);
> + }
>
> intel_ring_emit(ring, MI_WAIT_FOR_EVENT | MI_WAIT_FOR_OVERLAY_FLIP);
> - intel_ring_emit(ring, MI_NOOP);
> intel_ring_advance(ring);
>
> - ret = intel_overlay_do_wait_request(overlay,
> + ret = intel_overlay_do_wait_request(overlay, rq,
> intel_overlay_release_old_vid_tail);
> - if (ret)
> - return ret;
> - }
> -
> - intel_overlay_release_old_vid_tail(overlay);
> -
> + } else
> + intel_overlay_release_old_vid_tail(overlay);
>
> - i915_gem_track_fb(overlay->old_vid_bo, NULL,
> - INTEL_FRONTBUFFER_OVERLAY(overlay->crtc->pipe));
> - return 0;
> + return ret;
> }
>
> struct put_image_params {
> @@ -821,12 +846,7 @@ int intel_overlay_switch_off(struct intel_overlay *overlay)
> 	iowrite32(0, &regs->OCMD);
> intel_overlay_unmap_regs(overlay, regs);
>
> - ret = intel_overlay_off(overlay);
> - if (ret != 0)
> - return ret;
> -
> - intel_overlay_off_tail(overlay);
> - return 0;
> + return intel_overlay_off(overlay);
> }
>
> static int check_overlay_possible_on_crtc(struct intel_overlay *overlay,
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 45f71e6dc544..46e7cbb5e4d8 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3628,9 +3628,11 @@ static int sanitize_rc6_option(const struct drm_device *dev, int enable_rc6)
> return enable_rc6 & mask;
> }
>
> - /* Disable RC6 on Ironlake */
> - if (INTEL_INFO(dev)->gen == 5)
> +#ifdef CONFIG_INTEL_IOMMU
> + /* Ironlake + RC6 + VT-d empirically blows up */
> + if (IS_GEN5(dev) && intel_iommu_gfx_mapped)
> return 0;
> +#endif
>
> if (IS_IVYBRIDGE(dev))
> return (INTEL_RC6_ENABLE | INTEL_RC6p_ENABLE);
> @@ -3781,7 +3783,7 @@ void bdw_software_turbo(struct drm_device *dev)
> static void gen8_enable_rps(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> uint32_t rc6_mask = 0, rp_state_cap;
> uint32_t threshold_up_pct, threshold_down_pct;
> uint32_t ei_up, ei_down; /* up and down evaluation interval */
> @@ -3808,8 +3810,8 @@ static void gen8_enable_rps(struct drm_device *dev)
> I915_WRITE(GEN6_RC6_WAKE_RATE_LIMIT, 40 << 16);
> I915_WRITE(GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */
> I915_WRITE(GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */
> - for_each_ring(ring, dev_priv, unused)
> - I915_WRITE(RING_MAX_IDLE(ring->mmio_base), 10);
> + for_each_engine(engine, dev_priv, unused)
> + I915_WRITE(RING_MAX_IDLE(engine->mmio_base), 10);
> I915_WRITE(GEN6_RC_SLEEP, 0);
> if (IS_BROADWELL(dev))
> I915_WRITE(GEN6_RC6_THRESHOLD, 625); /* 800us/1.28 for TO */
> @@ -3909,7 +3911,7 @@ static void gen8_enable_rps(struct drm_device *dev)
> static void gen6_enable_rps(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> u32 rp_state_cap;
> u32 rc6vids, pcu_mbox = 0, rc6_mask = 0;
> u32 gtfifodbg;
> @@ -3947,8 +3949,8 @@ static void gen6_enable_rps(struct drm_device *dev)
> I915_WRITE(GEN6_RC_EVALUATION_INTERVAL, 125000);
> I915_WRITE(GEN6_RC_IDLE_HYSTERSIS, 25);
>
> - for_each_ring(ring, dev_priv, i)
> - I915_WRITE(RING_MAX_IDLE(ring->mmio_base), 10);
> + for_each_engine(engine, dev_priv, i)
> + I915_WRITE(RING_MAX_IDLE(engine->mmio_base), 10);
>
> I915_WRITE(GEN6_RC_SLEEP, 0);
> I915_WRITE(GEN6_RC1e_THRESHOLD, 1000);
> @@ -4408,7 +4410,7 @@ static void valleyview_cleanup_gt_powersave(struct drm_device *dev)
> static void cherryview_enable_rps(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> u32 gtfifodbg, val, rc6_mode = 0, pcbr;
> int i;
>
> @@ -4432,8 +4434,8 @@ static void cherryview_enable_rps(struct drm_device *dev)
> I915_WRITE(GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */
> I915_WRITE(GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */
>
> - for_each_ring(ring, dev_priv, i)
> - I915_WRITE(RING_MAX_IDLE(ring->mmio_base), 10);
> + for_each_engine(engine, dev_priv, i)
> + I915_WRITE(RING_MAX_IDLE(engine->mmio_base), 10);
> I915_WRITE(GEN6_RC_SLEEP, 0);
>
> I915_WRITE(GEN6_RC6_THRESHOLD, 50000); /* 50/125ms per EI */
> @@ -4500,7 +4502,7 @@ static void cherryview_enable_rps(struct drm_device *dev)
> static void valleyview_enable_rps(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> u32 gtfifodbg, val, rc6_mode = 0;
> int i;
>
> @@ -4537,8 +4539,8 @@ static void valleyview_enable_rps(struct drm_device *dev)
> I915_WRITE(GEN6_RC_EVALUATION_INTERVAL, 125000);
> I915_WRITE(GEN6_RC_IDLE_HYSTERSIS, 25);
>
> - for_each_ring(ring, dev_priv, i)
> - I915_WRITE(RING_MAX_IDLE(ring->mmio_base), 10);
> + for_each_engine(engine, dev_priv, i)
> + I915_WRITE(RING_MAX_IDLE(engine->mmio_base), 10);
>
> I915_WRITE(GEN6_RC6_THRESHOLD, 0x557);
>
> @@ -4581,12 +4583,6 @@ void ironlake_teardown_rc6(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
>
> - if (dev_priv->ips.renderctx) {
> - i915_gem_object_ggtt_unpin(dev_priv->ips.renderctx);
> - drm_gem_object_unreference(&dev_priv->ips.renderctx->base);
> - dev_priv->ips.renderctx = NULL;
> - }
> -
> if (dev_priv->ips.pwrctx) {
> i915_gem_object_ggtt_unpin(dev_priv->ips.pwrctx);
> drm_gem_object_unreference(&dev_priv->ips.pwrctx->base);
> @@ -4616,11 +4612,6 @@ static int ironlake_setup_rc6(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
>
> - if (dev_priv->ips.renderctx == NULL)
> - dev_priv->ips.renderctx = intel_alloc_context_page(dev);
> - if (!dev_priv->ips.renderctx)
> - return -ENOMEM;
> -
> if (dev_priv->ips.pwrctx == NULL)
> dev_priv->ips.pwrctx = intel_alloc_context_page(dev);
> if (!dev_priv->ips.pwrctx) {
> @@ -4634,9 +4625,6 @@ static int ironlake_setup_rc6(struct drm_device *dev)
> static void ironlake_enable_rc6(struct drm_device *dev)
> {
> struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[RCS];
> - bool was_interruptible;
> - int ret;
>
> /* rc6 disabled by default due to repeated reports of hanging during
> * boot and resume.
> @@ -4646,46 +4634,8 @@ static void ironlake_enable_rc6(struct drm_device *dev)
>
> WARN_ON(!mutex_is_locked(&dev->struct_mutex));
>
> - ret = ironlake_setup_rc6(dev);
> - if (ret)
> - return;
> -
> - was_interruptible = dev_priv->mm.interruptible;
> - dev_priv->mm.interruptible = false;
> -
> - /*
> - * GPU can automatically power down the render unit if given a page
> - * to save state.
> - */
> - ret = intel_ring_begin(ring, 6);
> - if (ret) {
> - ironlake_teardown_rc6(dev);
> - dev_priv->mm.interruptible = was_interruptible;
> - return;
> - }
> -
> - intel_ring_emit(ring, MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN);
> - intel_ring_emit(ring, MI_SET_CONTEXT);
> - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(dev_priv->ips.renderctx) |
> - MI_MM_SPACE_GTT |
> - MI_SAVE_EXT_STATE_EN |
> - MI_RESTORE_EXT_STATE_EN |
> - MI_RESTORE_INHIBIT);
> - intel_ring_emit(ring, MI_SUSPEND_FLUSH);
> - intel_ring_emit(ring, MI_NOOP);
> - intel_ring_emit(ring, MI_FLUSH);
> - intel_ring_advance(ring);
> -
> - /*
> - * Wait for the command parser to advance past MI_SET_CONTEXT. The HW
> - * does an implicit flush, combined with MI_FLUSH above, it should be
> - * safe to assume that renderctx is valid
> - */
> - ret = intel_ring_idle(ring);
> - dev_priv->mm.interruptible = was_interruptible;
> - if (ret) {
> + if (ironlake_setup_rc6(dev)) {
> DRM_ERROR("failed to enable ironlake power savings\n");
> - ironlake_teardown_rc6(dev);
> return;
> }
>
> @@ -5144,7 +5094,7 @@ EXPORT_SYMBOL_GPL(i915_gpu_lower);
> bool i915_gpu_busy(void)
> {
> struct drm_i915_private *dev_priv;
> - struct intel_engine_cs *ring;
> + struct intel_engine_cs *engine;
> bool ret = false;
> int i;
>
> @@ -5153,8 +5103,8 @@ bool i915_gpu_busy(void)
> goto out_unlock;
> dev_priv = i915_mch_dev;
>
> - for_each_ring(ring, dev_priv, i)
> - ret |= !list_empty(&ring->request_list);
> + for_each_engine(engine, dev_priv, i)
> + ret |= engine->last_request != NULL;
>
> out_unlock:
> spin_unlock_irq(&mchdev_lock);
> diff --git a/drivers/gpu/drm/i915/intel_renderstate.h b/drivers/gpu/drm/i915/intel_renderstate.h
> index 6c792d3a9c9c..fd4f66231d30 100644
> --- a/drivers/gpu/drm/i915/intel_renderstate.h
> +++ b/drivers/gpu/drm/i915/intel_renderstate.h
> @@ -24,7 +24,13 @@
> #ifndef _INTEL_RENDERSTATE_H
> #define _INTEL_RENDERSTATE_H
>
> -#include "i915_drv.h"
> +#include <linux/types.h>
> +
> +struct intel_renderstate_rodata {
> + const u32 *reloc;
> + const u32 *batch;
> + const u32 batch_items;
> +};
>
> extern const struct intel_renderstate_rodata gen6_null_state;
> extern const struct intel_renderstate_rodata gen7_null_state;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 1b1180797851..ae02b1757745 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -33,86 +33,34 @@
> #include "i915_trace.h"
> #include "intel_drv.h"
>
> -bool
> -intel_ring_initialized(struct intel_engine_cs *ring)
> -{
> - struct drm_device *dev = ring->dev;
> -
> - if (!dev)
> - return false;
> -
> - if (i915.enable_execlists) {
> - struct intel_context *dctx = ring->default_context;
> - struct intel_ringbuffer *ringbuf = dctx->engine[ring->id].ringbuf;
> -
> - return ringbuf->obj;
> - } else
> - return ring->buffer && ring->buffer->obj;
> -}
> -
> -int __intel_ring_space(int head, int tail, int size)
> -{
> - int space = head - (tail + I915_RING_FREE_SPACE);
> - if (space < 0)
> - space += size;
> - return space;
> -}
> -
> -int intel_ring_space(struct intel_ringbuffer *ringbuf)
> -{
> - return __intel_ring_space(ringbuf->head & HEAD_ADDR,
> - ringbuf->tail, ringbuf->size);
> -}
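
While reviewing the removal of these helpers I found it handy to write the
space computation down on its own: free space wraps modulo the ring size
and keeps a small reserve (I915_RING_FREE_SPACE in the driver; the 64 below
is just a plausible stand-in) so the tail never quite catches the head:

#include <stdio.h>

static int ring_space(int head, int tail, int size, int reserve)
{
	int space = head - (tail + reserve);

	if (space < 0)
		space += size;
	return space;
}

int main(void)
{
	/* e.g. a 4096-byte ring with head at 128 and tail at 4000 */
	printf("%d bytes available\n", ring_space(128, 4000, 4096, 64));
	return 0;
}
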
> -
> -bool intel_ring_stopped(struct intel_engine_cs *ring)
> -{
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> - return dev_priv->gpu_error.stop_rings & intel_ring_flag(ring);
> -}
> -
> -void __intel_ring_advance(struct intel_engine_cs *ring)
> -{
> - struct intel_ringbuffer *ringbuf = ring->buffer;
> - ringbuf->tail &= ringbuf->size - 1;
> - if (intel_ring_stopped(ring))
> - return;
> - ring->write_tail(ring, ringbuf->tail);
> -}
> -
> static int
> -gen2_render_ring_flush(struct intel_engine_cs *ring,
> - u32 invalidate_domains,
> - u32 flush_domains)
> +gen2_emit_flush(struct i915_gem_request *rq, u32 flags)
> {
> + struct intel_ringbuffer *ring;
> u32 cmd;
> - int ret;
>
> cmd = MI_FLUSH;
> - if (((invalidate_domains|flush_domains) & I915_GEM_DOMAIN_RENDER) == 0)
> + if ((flags & (I915_FLUSH_CACHES | I915_INVALIDATE_CACHES)) == 0)
> cmd |= MI_NO_WRITE_FLUSH;
>
> - if (invalidate_domains & I915_GEM_DOMAIN_SAMPLER)
> + if (flags & I915_INVALIDATE_CACHES)
> cmd |= MI_READ_FLUSH;
>
> - ret = intel_ring_begin(ring, 2);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 1);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring, cmd);
> - intel_ring_emit(ring, MI_NOOP);
> intel_ring_advance(ring);
>
> return 0;
> }
>
> static int
> -gen4_render_ring_flush(struct intel_engine_cs *ring,
> - u32 invalidate_domains,
> - u32 flush_domains)
> +gen4_emit_flush(struct i915_gem_request *rq, u32 flags)
> {
> - struct drm_device *dev = ring->dev;
> + struct intel_ringbuffer *ring;
> u32 cmd;
> - int ret;
>
> /*
> * read/write caches:
> @@ -142,22 +90,20 @@ gen4_render_ring_flush(struct intel_engine_cs *ring,
> * are flushed at any MI_FLUSH.
> */
>
> - cmd = MI_FLUSH | MI_NO_WRITE_FLUSH;
> - if ((invalidate_domains|flush_domains) & I915_GEM_DOMAIN_RENDER)
> - cmd &= ~MI_NO_WRITE_FLUSH;
> - if (invalidate_domains & I915_GEM_DOMAIN_INSTRUCTION)
> + cmd = MI_FLUSH;
> + if ((flags & (I915_FLUSH_CACHES | I915_INVALIDATE_CACHES)) == 0)
> + cmd |= MI_NO_WRITE_FLUSH;
> + if (flags & I915_INVALIDATE_CACHES) {
> cmd |= MI_EXE_FLUSH;
> + if (IS_G4X(rq->i915) || IS_GEN5(rq->i915))
> + cmd |= MI_INVALIDATE_ISP;
> + }
>
> - if (invalidate_domains & I915_GEM_DOMAIN_COMMAND &&
> - (IS_G4X(dev) || IS_GEN5(dev)))
> - cmd |= MI_INVALIDATE_ISP;
> -
> - ret = intel_ring_begin(ring, 2);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 1);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring, cmd);
> - intel_ring_emit(ring, MI_NOOP);
> intel_ring_advance(ring);
>
> return 0;
> @@ -201,100 +147,89 @@ gen4_render_ring_flush(struct intel_engine_cs *ring,
> * really our business. That leaves only stall at scoreboard.
> */
> static int
> -intel_emit_post_sync_nonzero_flush(struct intel_engine_cs *ring)
> +gen6_emit_post_sync_nonzero_flush(struct i915_gem_request *rq)
> {
> - u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
> - int ret;
> + const u32 scratch = rq->engine->scratch.gtt_offset + 2*CACHELINE_BYTES;
> + struct intel_ringbuffer *ring;
>
> + ring = intel_ring_begin(rq, 8);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> - ret = intel_ring_begin(ring, 6);
> - if (ret)
> - return ret;
> -
> - intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(5));
> + intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
> intel_ring_emit(ring, PIPE_CONTROL_CS_STALL |
> PIPE_CONTROL_STALL_AT_SCOREBOARD);
> - intel_ring_emit(ring, scratch_addr | PIPE_CONTROL_GLOBAL_GTT); /* address */
> - intel_ring_emit(ring, 0); /* low dword */
> - intel_ring_emit(ring, 0); /* high dword */
> - intel_ring_emit(ring, MI_NOOP);
> - intel_ring_advance(ring);
> -
> - ret = intel_ring_begin(ring, 6);
> - if (ret)
> - return ret;
> + intel_ring_emit(ring, scratch | PIPE_CONTROL_GLOBAL_GTT);
> + intel_ring_emit(ring, 0);
>
> - intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(5));
> + intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
> intel_ring_emit(ring, PIPE_CONTROL_QW_WRITE);
> - intel_ring_emit(ring, scratch_addr | PIPE_CONTROL_GLOBAL_GTT); /* address */
> - intel_ring_emit(ring, 0);
> + intel_ring_emit(ring, scratch | PIPE_CONTROL_GLOBAL_GTT);
> intel_ring_emit(ring, 0);
> - intel_ring_emit(ring, MI_NOOP);
> - intel_ring_advance(ring);
>
> + intel_ring_advance(ring);
> return 0;
> }
>
> static int
> -gen6_render_ring_flush(struct intel_engine_cs *ring,
> - u32 invalidate_domains, u32 flush_domains)
> +gen6_render_emit_flush(struct i915_gem_request *rq, u32 flags)
> {
> - u32 flags = 0;
> - u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
> + const u32 scratch = rq->engine->scratch.gtt_offset + 2*CACHELINE_BYTES;
> + struct intel_ringbuffer *ring;
> + u32 cmd = 0;
> int ret;
>
> - /* Force SNB workarounds for PIPE_CONTROL flushes */
> - ret = intel_emit_post_sync_nonzero_flush(ring);
> - if (ret)
> - return ret;
> + if (flags & I915_FLUSH_CACHES) {
> + /* Force SNB workarounds for PIPE_CONTROL flushes */
> + ret = gen6_emit_post_sync_nonzero_flush(rq);
> + if (ret)
> + return ret;
>
> - /* Just flush everything. Experiments have shown that reducing the
> - * number of bits based on the write domains has little performance
> - * impact.
> - */
> - if (flush_domains) {
> - flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> - flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
> - /*
> - * Ensure that any following seqno writes only happen
> - * when the render cache is indeed flushed.
> - */
> - flags |= PIPE_CONTROL_CS_STALL;
> - }
> - if (invalidate_domains) {
> - flags |= PIPE_CONTROL_TLB_INVALIDATE;
> - flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
> + cmd |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> + cmd |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
> + }
> + if (flags & I915_INVALIDATE_CACHES) {
> + cmd |= PIPE_CONTROL_TLB_INVALIDATE;
> + cmd |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
> + cmd |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
> + cmd |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
> + cmd |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
> + cmd |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
> /*
> * TLB invalidate requires a post-sync write.
> */
> - flags |= PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_CS_STALL;
> + cmd |= PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_CS_STALL;
> }
> + if (flags & I915_COMMAND_BARRIER)
> + /*
> + * Ensure that any following seqno writes only happen
> + * when the render cache is indeed flushed.
> + */
> + cmd |= PIPE_CONTROL_CS_STALL;
>
> - ret = intel_ring_begin(ring, 4);
> - if (ret)
> - return ret;
> + if (cmd) {
> + ring = intel_ring_begin(rq, 4);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> - intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
> - intel_ring_emit(ring, flags);
> - intel_ring_emit(ring, scratch_addr | PIPE_CONTROL_GLOBAL_GTT);
> - intel_ring_emit(ring, 0);
> - intel_ring_advance(ring);
> + intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
> + intel_ring_emit(ring, cmd);
> + intel_ring_emit(ring, scratch | PIPE_CONTROL_GLOBAL_GTT);
> + intel_ring_emit(ring, 0);
> + intel_ring_advance(ring);
> + }
>
> return 0;
> }
>
> static int
> -gen7_render_ring_cs_stall_wa(struct intel_engine_cs *ring)
> +gen7_render_ring_cs_stall_wa(struct i915_gem_request *rq)
> {
> - int ret;
> + struct intel_ringbuffer *ring;
>
> - ret = intel_ring_begin(ring, 4);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 4);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
> intel_ring_emit(ring, PIPE_CONTROL_CS_STALL |
> @@ -306,35 +241,32 @@ gen7_render_ring_cs_stall_wa(struct intel_engine_cs *ring)
> return 0;
> }
>
> -static int gen7_ring_fbc_flush(struct intel_engine_cs *ring, u32 value)
> +static int gen7_ring_fbc_flush(struct i915_gem_request *rq, u32 value)
> {
> - int ret;
> + struct intel_ringbuffer *ring;
>
> - if (!ring->fbc_dirty)
> - return 0;
> + ring = intel_ring_begin(rq, 6);
> + if (IS_ERR(ring))
> +		return PTR_ERR(ring);
>
> - ret = intel_ring_begin(ring, 6);
> - if (ret)
> - return ret;
> /* WaFbcNukeOn3DBlt:ivb/hsw */
> intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> intel_ring_emit(ring, MSG_FBC_REND_STATE);
> intel_ring_emit(ring, value);
> intel_ring_emit(ring, MI_STORE_REGISTER_MEM(1) | MI_SRM_LRM_GLOBAL_GTT);
> intel_ring_emit(ring, MSG_FBC_REND_STATE);
> - intel_ring_emit(ring, ring->scratch.gtt_offset + 256);
> + intel_ring_emit(ring, rq->engine->scratch.gtt_offset + 256);
> intel_ring_advance(ring);
>
> - ring->fbc_dirty = false;
> return 0;
> }
>
> static int
> -gen7_render_ring_flush(struct intel_engine_cs *ring,
> - u32 invalidate_domains, u32 flush_domains)
> +gen7_render_emit_flush(struct i915_gem_request *rq, u32 flags)
> {
> - u32 flags = 0;
> - u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
> + const u32 scratch_addr = rq->engine->scratch.gtt_offset + 2 * CACHELINE_BYTES;
> + struct intel_ringbuffer *ring;
> + u32 cmd = 0;
> int ret;
>
> /*
> @@ -345,63 +277,71 @@ gen7_render_ring_flush(struct intel_engine_cs *ring,
> * read-cache invalidate bits set) must have the CS_STALL bit set. We
> * don't try to be clever and just set it unconditionally.
> */
> - flags |= PIPE_CONTROL_CS_STALL;
> + cmd |= PIPE_CONTROL_CS_STALL;
>
> /* Just flush everything. Experiments have shown that reducing the
> * number of bits based on the write domains has little performance
> * impact.
> */
> - if (flush_domains) {
> - flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> - flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
> - }
> - if (invalidate_domains) {
> - flags |= PIPE_CONTROL_TLB_INVALIDATE;
> - flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
> + if (flags & I915_FLUSH_CACHES) {
> + cmd |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> + cmd |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
> + }
> + if (flags & I915_INVALIDATE_CACHES) {
> + cmd |= PIPE_CONTROL_TLB_INVALIDATE;
> + cmd |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
> + cmd |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
> + cmd |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
> + cmd |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
> + cmd |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
> /*
> * TLB invalidate requires a post-sync write.
> */
> - flags |= PIPE_CONTROL_QW_WRITE;
> - flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
> + cmd |= PIPE_CONTROL_QW_WRITE;
> + cmd |= PIPE_CONTROL_GLOBAL_GTT_IVB;
>
> /* Workaround: we must issue a pipe_control with CS-stall bit
> * set before a pipe_control command that has the state cache
> * invalidate bit set. */
> - gen7_render_ring_cs_stall_wa(ring);
> + ret = gen7_render_ring_cs_stall_wa(rq);
> + if (ret)
> + return ret;
> }
>
> - ret = intel_ring_begin(ring, 4);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 4);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
> - intel_ring_emit(ring, flags);
> + intel_ring_emit(ring, cmd);
> intel_ring_emit(ring, scratch_addr);
> intel_ring_emit(ring, 0);
> intel_ring_advance(ring);
>
> - if (!invalidate_domains && flush_domains)
> - return gen7_ring_fbc_flush(ring, FBC_REND_NUKE);
> + if (flags & I915_KICK_FBC) {
> + ret = gen7_ring_fbc_flush(rq, FBC_REND_NUKE);
> + if (ret)
> + return ret;
> + }
>
> return 0;
> }
>
> static int
> -gen8_emit_pipe_control(struct intel_engine_cs *ring,
> - u32 flags, u32 scratch_addr)
> +gen8_emit_pipe_control(struct i915_gem_request *rq,
> + u32 cmd, u32 scratch_addr)
> {
> - int ret;
> + struct intel_ringbuffer *ring;
>
> - ret = intel_ring_begin(ring, 6);
> - if (ret)
> - return ret;
> + if (cmd == 0)
> + return 0;
> +
> + ring = intel_ring_begin(rq, 6);
> +	if (IS_ERR(ring))
> +		return PTR_ERR(ring);
>
> intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(6));
> - intel_ring_emit(ring, flags);
> + intel_ring_emit(ring, cmd);
> intel_ring_emit(ring, scratch_addr);
> intel_ring_emit(ring, 0);
> intel_ring_emit(ring, 0);
> @@ -412,31 +352,29 @@ gen8_emit_pipe_control(struct intel_engine_cs *ring,
> }
>
> static int
> -gen8_render_ring_flush(struct intel_engine_cs *ring,
> - u32 invalidate_domains, u32 flush_domains)
> +gen8_render_emit_flush(struct i915_gem_request *rq,
> + u32 flags)
> {
> - u32 flags = 0;
> - u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
> + const u32 scratch_addr = rq->engine->scratch.gtt_offset + 2 * CACHELINE_BYTES;
> + u32 cmd = 0;
> int ret;
>
> - flags |= PIPE_CONTROL_CS_STALL;
> -
> - if (flush_domains) {
> - flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> - flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
> + if (flags & I915_FLUSH_CACHES) {
> + cmd |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> + cmd |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
> }
> - if (invalidate_domains) {
> - flags |= PIPE_CONTROL_TLB_INVALIDATE;
> - flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
> - flags |= PIPE_CONTROL_QW_WRITE;
> - flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
> + if (flags & I915_INVALIDATE_CACHES) {
> + cmd |= PIPE_CONTROL_TLB_INVALIDATE;
> + cmd |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
> + cmd |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
> + cmd |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
> + cmd |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
> + cmd |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
> + cmd |= PIPE_CONTROL_QW_WRITE;
> + cmd |= PIPE_CONTROL_GLOBAL_GTT_IVB;
>
> /* WaCsStallBeforeStateCacheInvalidate:bdw,chv */
> - ret = gen8_emit_pipe_control(ring,
> + ret = gen8_emit_pipe_control(rq,
> PIPE_CONTROL_CS_STALL |
> PIPE_CONTROL_STALL_AT_SCOREBOARD,
> 0);
> @@ -444,304 +382,419 @@ gen8_render_ring_flush(struct intel_engine_cs *ring,
> return ret;
> }
>
> - ret = gen8_emit_pipe_control(ring, flags, scratch_addr);
> + if (flags & I915_COMMAND_BARRIER)
> + cmd |= PIPE_CONTROL_CS_STALL;
> +
> + ret = gen8_emit_pipe_control(rq, cmd, scratch_addr);
> if (ret)
> return ret;
>
> - if (!invalidate_domains && flush_domains)
> - return gen7_ring_fbc_flush(ring, FBC_REND_NUKE);
> + if (flags & I915_KICK_FBC) {
> + ret = gen7_ring_fbc_flush(rq, FBC_REND_NUKE);
> + if (ret)
> + return ret;
> + }
>
> return 0;
> }
>
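
For reference while reading these hunks: the old invalidate_domains/
flush_domains pair collapses into one flags word, where I915_FLUSH_CACHES
drives the write-cache flushes, I915_INVALIDATE_CACHES the read-cache/TLB
invalidates, I915_COMMAND_BARRIER the CS stall that orders the following
breadcrumb write, and I915_KICK_FBC the FBC nuke. A sketch of how a caller
might combine them when closing out a render batch (the call site below is
my guess for illustration, not something shown in this patch):

	/* hypothetical caller, for illustration only: flush the render
	 * caches, order the breadcrumb write behind them, and nuke FBC so
	 * it recompresses once the rendering has landed */
	static int example_finish_render_batch(struct i915_gem_request *rq)
	{
		return gen8_render_emit_flush(rq,
					      I915_FLUSH_CACHES |
					      I915_COMMAND_BARRIER |
					      I915_KICK_FBC);
	}
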
> -static void ring_write_tail(struct intel_engine_cs *ring,
> +static void ring_write_tail(struct intel_engine_cs *engine,
> u32 value)
> {
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> - I915_WRITE_TAIL(ring, value);
> + struct drm_i915_private *dev_priv = engine->i915;
> + I915_WRITE_TAIL(engine, value);
> }
>
> -u64 intel_ring_get_active_head(struct intel_engine_cs *ring)
> +u64 intel_engine_get_active_head(struct intel_engine_cs *engine)
> {
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> u64 acthd;
>
> - if (INTEL_INFO(ring->dev)->gen >= 8)
> - acthd = I915_READ64_2x32(RING_ACTHD(ring->mmio_base),
> - RING_ACTHD_UDW(ring->mmio_base));
> - else if (INTEL_INFO(ring->dev)->gen >= 4)
> - acthd = I915_READ(RING_ACTHD(ring->mmio_base));
> + if (INTEL_INFO(dev_priv)->gen >= 8)
> + acthd = I915_READ64_2x32(RING_ACTHD(engine->mmio_base),
> + RING_ACTHD_UDW(engine->mmio_base));
> + else if (INTEL_INFO(dev_priv)->gen >= 4)
> + acthd = I915_READ(RING_ACTHD(engine->mmio_base));
> else
> acthd = I915_READ(ACTHD);
>
> return acthd;
> }
>
> -static void ring_setup_phys_status_page(struct intel_engine_cs *ring)
> -{
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> - u32 addr;
> -
> - addr = dev_priv->status_page_dmah->busaddr;
> - if (INTEL_INFO(ring->dev)->gen >= 4)
> - addr |= (dev_priv->status_page_dmah->busaddr >> 28) & 0xf0;
> - I915_WRITE(HWS_PGA, addr);
> -}
> -
> -static bool stop_ring(struct intel_engine_cs *ring)
> +static bool engine_stop(struct intel_engine_cs *engine)
> {
> - struct drm_i915_private *dev_priv = to_i915(ring->dev);
> + struct drm_i915_private *dev_priv = engine->i915;
>
> - if (!IS_GEN2(ring->dev)) {
> - I915_WRITE_MODE(ring, _MASKED_BIT_ENABLE(STOP_RING));
> - if (wait_for((I915_READ_MODE(ring) & MODE_IDLE) != 0, 1000)) {
> - DRM_ERROR("%s : timed out trying to stop ring\n", ring->name);
> + if (!IS_GEN2(dev_priv)) {
> + I915_WRITE_MODE(engine, _MASKED_BIT_ENABLE(STOP_RING));
> + if (wait_for((I915_READ_MODE(engine) & MODE_IDLE) != 0, 1000)) {
> + DRM_ERROR("%s : timed out trying to stop ring\n", engine->name);
> /* Sometimes we observe that the idle flag is not
> * set even though the ring is empty. So double
> * check before giving up.
> */
> - if (I915_READ_HEAD(ring) != I915_READ_TAIL(ring))
> + if (I915_READ_HEAD(engine) != I915_READ_TAIL(engine))
> return false;
> }
> }
>
> - I915_WRITE_CTL(ring, 0);
> - I915_WRITE_HEAD(ring, 0);
> - ring->write_tail(ring, 0);
> + I915_WRITE_CTL(engine, 0);
> + I915_WRITE_HEAD(engine, 0);
> + engine->write_tail(engine, 0);
>
> - if (!IS_GEN2(ring->dev)) {
> - (void)I915_READ_CTL(ring);
> - I915_WRITE_MODE(ring, _MASKED_BIT_DISABLE(STOP_RING));
> + if (!IS_GEN2(dev_priv)) {
> + (void)I915_READ_CTL(engine);
> + I915_WRITE_MODE(engine, _MASKED_BIT_DISABLE(STOP_RING));
> }
>
> - return (I915_READ_HEAD(ring) & HEAD_ADDR) == 0;
> + return (I915_READ_HEAD(engine) & HEAD_ADDR) == 0;
> +}
> +
> +static int engine_suspend(struct intel_engine_cs *engine)
> +{
> + return engine_stop(engine) ? 0 : -EIO;
> }
>
> -static int init_ring_common(struct intel_engine_cs *ring)
> +static int enable_status_page(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_ringbuffer *ringbuf = ring->buffer;
> - struct drm_i915_gem_object *obj = ringbuf->obj;
> + struct drm_i915_private *dev_priv = engine->i915;
> + u32 mmio, addr;
> int ret = 0;
>
> - gen6_gt_force_wake_get(dev_priv, FORCEWAKE_ALL);
> + if (!I915_NEED_GFX_HWS(dev_priv)) {
> + addr = dev_priv->status_page_dmah->busaddr;
> + if (INTEL_INFO(dev_priv)->gen >= 4)
> + addr |= (dev_priv->status_page_dmah->busaddr >> 28) & 0xf0;
> + mmio = HWS_PGA;
> + } else {
> + addr = engine->status_page.gfx_addr;
> + /* The ring status page addresses are no longer next to the rest of
> + * the ring registers as of gen7.
> + */
> + if (IS_GEN7(dev_priv)) {
> + switch (engine->id) {
> + default:
> + case RCS:
> + mmio = RENDER_HWS_PGA_GEN7;
> + break;
> + case BCS:
> + mmio = BLT_HWS_PGA_GEN7;
> + break;
> + /*
> + * VCS2 actually doesn't exist on Gen7. Only shut up
> + * gcc switch check warning
> + */
> + case VCS2:
> + case VCS:
> + mmio = BSD_HWS_PGA_GEN7;
> + break;
> + case VECS:
> + mmio = VEBOX_HWS_PGA_GEN7;
> + break;
> + }
> + } else if (IS_GEN6(dev_priv)) {
> + mmio = RING_HWS_PGA_GEN6(engine->mmio_base);
> + } else {
> + /* XXX: gen8 returns to sanity */
> + mmio = RING_HWS_PGA(engine->mmio_base);
> + }
> + }
> +
> + I915_WRITE(mmio, addr);
> + POSTING_READ(mmio);
> +
> + /*
> + * Flush the TLB for this page
> + *
> + * FIXME: These two bits have disappeared on gen8, so a question
> + * arises: do we still need this and if so how should we go about
> + * invalidating the TLB?
> + */
> + if (INTEL_INFO(dev_priv)->gen >= 6 && INTEL_INFO(dev_priv)->gen < 8) {
> + u32 reg = RING_INSTPM(engine->mmio_base);
> +
> + /* ring should be idle before issuing a sync flush*/
> + WARN_ON((I915_READ_MODE(engine) & MODE_IDLE) == 0);
>
> - if (!stop_ring(ring)) {
> + I915_WRITE(reg,
> + _MASKED_BIT_ENABLE(INSTPM_TLB_INVALIDATE |
> + INSTPM_SYNC_FLUSH));
> + if (wait_for((I915_READ(reg) & INSTPM_SYNC_FLUSH) == 0,
> + 1000)) {
> + DRM_ERROR("%s: wait for SyncFlush to complete for TLB invalidation timed out\n",
> + engine->name);
> + ret = -EIO;
> + }
> + }
> +
> + return ret;
> +}
> +
> +static struct intel_ringbuffer *
> +engine_get_ring(struct intel_engine_cs *engine,
> + struct intel_context *ctx)
> +{
> + struct drm_i915_private *dev_priv = engine->i915;
> + struct intel_ringbuffer *ring;
> + int ret = 0;
> +
> + ring = engine->legacy_ring;
> + if (ring)
> + return ring;
> +
> + ring = intel_engine_alloc_ring(engine, ctx, 32 * PAGE_SIZE);
> + if (IS_ERR(ring)) {
> + DRM_ERROR("Failed to allocate ringbuffer for %s: %ld\n", engine->name, PTR_ERR(ring));
> + return ERR_CAST(ring);
> + }
> +
> + gen6_gt_force_wake_get(dev_priv, FORCEWAKE_ALL);
> + if (!engine_stop(engine)) {
> /* G45 ring initialization often fails to reset head to zero */
> DRM_DEBUG_KMS("%s head not reset to zero "
> "ctl %08x head %08x tail %08x start %08x\n",
> - ring->name,
> - I915_READ_CTL(ring),
> - I915_READ_HEAD(ring),
> - I915_READ_TAIL(ring),
> - I915_READ_START(ring));
> -
> - if (!stop_ring(ring)) {
> + engine->name,
> + I915_READ_CTL(engine),
> + I915_READ_HEAD(engine),
> + I915_READ_TAIL(engine),
> + I915_READ_START(engine));
> + if (!engine_stop(engine)) {
> DRM_ERROR("failed to set %s head to zero "
> "ctl %08x head %08x tail %08x start %08x\n",
> - ring->name,
> - I915_READ_CTL(ring),
> - I915_READ_HEAD(ring),
> - I915_READ_TAIL(ring),
> - I915_READ_START(ring));
> + engine->name,
> + I915_READ_CTL(engine),
> + I915_READ_HEAD(engine),
> + I915_READ_TAIL(engine),
> + I915_READ_START(engine));
> ret = -EIO;
> - goto out;
> }
> }
> + gen6_gt_force_wake_put(dev_priv, FORCEWAKE_ALL);
>
> - if (I915_NEED_GFX_HWS(dev))
> - intel_ring_setup_status_page(ring);
> - else
> - ring_setup_phys_status_page(ring);
> + if (ret == 0) {
> + engine->legacy_ring = ring;
> + } else {
> + intel_ring_free(ring);
> + ring = ERR_PTR(ret);
> + }
> +
> + return ring;
> +}
> +
> +static int engine_resume(struct intel_engine_cs *engine)
> +{
> + struct drm_i915_private *dev_priv = engine->i915;
> + struct intel_ringbuffer *ring = engine->legacy_ring;
> + int retry = 3, ret;
> +
> + if (WARN_ON(ring == NULL))
> + return -ENODEV;
> +
> + gen6_gt_force_wake_get(dev_priv, FORCEWAKE_ALL);
> +
> + ret = enable_status_page(engine);
>
> +reset:
> /* Enforce ordering by reading HEAD register back */
> - I915_READ_HEAD(ring);
> + engine->write_tail(engine, ring->tail);
> + I915_WRITE_HEAD(engine, ring->head);
> + (void)I915_READ_HEAD(engine);
>
> /* Initialize the ring. This must happen _after_ we've cleared the ring
> * registers with the above sequence (the readback of the HEAD registers
> * also enforces ordering), otherwise the hw might lose the new ring
> * register values. */
> - I915_WRITE_START(ring, i915_gem_obj_ggtt_offset(obj));
> + I915_WRITE_START(engine, i915_gem_obj_ggtt_offset(ring->obj));
>
> /* WaClearRingBufHeadRegAtInit:ctg,elk */
> - if (I915_READ_HEAD(ring))
> + if (I915_READ_HEAD(engine) != ring->head)
> DRM_DEBUG("%s initialization failed [head=%08x], fudging\n",
> - ring->name, I915_READ_HEAD(ring));
> - I915_WRITE_HEAD(ring, 0);
> - (void)I915_READ_HEAD(ring);
> -
> - I915_WRITE_CTL(ring,
> - ((ringbuf->size - PAGE_SIZE) & RING_NR_PAGES)
> - | RING_VALID);
> -
> - /* If the head is still not zero, the ring is dead */
> - if (wait_for((I915_READ_CTL(ring) & RING_VALID) != 0 &&
> - I915_READ_START(ring) == i915_gem_obj_ggtt_offset(obj) &&
> - (I915_READ_HEAD(ring) & HEAD_ADDR) == 0, 50)) {
> + engine->name, I915_READ_HEAD(engine));
> + I915_WRITE_HEAD(engine, ring->head);
> + (void)I915_READ_HEAD(engine);
> +
> + I915_WRITE_CTL(engine,
> + ((ring->size - PAGE_SIZE) & RING_NR_PAGES)
> + | RING_VALID);
> +
> + if (wait_for((I915_READ_CTL(engine) & RING_VALID) != 0, 50)) {
> + if (retry-- && engine_stop(engine))
> + goto reset;
> + }
> +
> + if ((I915_READ_CTL(engine) & RING_VALID) == 0 ||
> + I915_READ_START(engine) != i915_gem_obj_ggtt_offset(ring->obj)) {
> DRM_ERROR("%s initialization failed "
> - "ctl %08x (valid? %d) head %08x tail %08x start %08x [expected %08lx]\n",
> - ring->name,
> - I915_READ_CTL(ring), I915_READ_CTL(ring) & RING_VALID,
> - I915_READ_HEAD(ring), I915_READ_TAIL(ring),
> - I915_READ_START(ring), (unsigned long)i915_gem_obj_ggtt_offset(obj));
> + "ctl %08x (valid? %d) head %08x [expected %08x], tail %08x [expected %08x], start %08x [expected %08lx]\n",
> + engine->name,
> + I915_READ_CTL(engine), I915_READ_CTL(engine) & RING_VALID,
> + I915_READ_HEAD(engine), ring->head,
> + I915_READ_TAIL(engine), ring->tail,
> + I915_READ_START(engine), (unsigned long)i915_gem_obj_ggtt_offset(ring->obj));
> ret = -EIO;
> - goto out;
> }
>
> - ringbuf->head = I915_READ_HEAD(ring);
> - ringbuf->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
> - ringbuf->space = intel_ring_space(ringbuf);
> - ringbuf->last_retired_head = -1;
> + gen6_gt_force_wake_put(dev_priv, FORCEWAKE_ALL);
> + return ret;
> +}
>
> - memset(&ring->hangcheck, 0, sizeof(ring->hangcheck));
> +static void engine_put_ring(struct intel_ringbuffer *ring,
> + struct intel_context *ctx)
> +{
> + if (ring->last_context == ctx) {
> + struct i915_gem_request *rq;
> + int ret = -EINVAL;
>
> -out:
> - gen6_gt_force_wake_put(dev_priv, FORCEWAKE_ALL);
> + rq = intel_engine_alloc_request(ring->engine,
> + ring->engine->default_context);
> + if (!IS_ERR(rq)) {
> + ret = i915_request_commit(rq);
> + i915_request_put(rq);
> + }
> + if (WARN_ON(ret))
> + ring->last_context = ring->engine->default_context;
> + }
> +}
>
> - return ret;
> +static int engine_add_request(struct i915_gem_request *rq)
> +{
> + rq->engine->write_tail(rq->engine, rq->tail);
> + list_add_tail(&rq->engine_list, &rq->engine->requests);
> + return 0;
> }
>
> -void
> -intel_fini_pipe_control(struct intel_engine_cs *ring)
> +static bool engine_rq_is_complete(struct i915_gem_request *rq)
> {
> - struct drm_device *dev = ring->dev;
> + return __i915_seqno_passed(rq->engine->get_seqno(rq->engine),
> + rq->seqno);
> +}
>
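
Presumably __i915_seqno_passed() keeps its usual wrap-safe form, i.e.
something like the below (my paraphrase, not part of this patch), so
engine_rq_is_complete() reduces to comparing the engine's last reported
seqno against the one assigned to the request:

	static inline bool
	__i915_seqno_passed(u32 seq1, u32 seq2)
	{
		/* the signed difference handles u32 wraparound: seq1 counts
		 * as having passed seq2 unless it lags it modulo 2^32 */
		return (s32)(seq1 - seq2) >= 0;
	}
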
> - if (ring->scratch.obj == NULL)
> +static void
> +fini_pipe_control(struct intel_engine_cs *engine)
> +{
> + if (engine->scratch.obj == NULL)
> return;
>
> - if (INTEL_INFO(dev)->gen >= 5) {
> - kunmap(sg_page(ring->scratch.obj->pages->sgl));
> - i915_gem_object_ggtt_unpin(ring->scratch.obj);
> + if (INTEL_INFO(engine->i915)->gen >= 5) {
> + kunmap(sg_page(engine->scratch.obj->pages->sgl));
> + i915_gem_object_ggtt_unpin(engine->scratch.obj);
> }
>
> - drm_gem_object_unreference(&ring->scratch.obj->base);
> - ring->scratch.obj = NULL;
> + drm_gem_object_unreference(&engine->scratch.obj->base);
> + engine->scratch.obj = NULL;
> }
>
> -int
> -intel_init_pipe_control(struct intel_engine_cs *ring)
> +static int
> +init_pipe_control(struct intel_engine_cs *engine)
> {
> int ret;
>
> - if (ring->scratch.obj)
> + if (engine->scratch.obj)
> return 0;
>
> - ring->scratch.obj = i915_gem_alloc_object(ring->dev, 4096);
> - if (ring->scratch.obj == NULL) {
> + engine->scratch.obj = i915_gem_alloc_object(engine->i915->dev, 4096);
> + if (engine->scratch.obj == NULL) {
> DRM_ERROR("Failed to allocate seqno page\n");
> ret = -ENOMEM;
> goto err;
> }
>
> - ret = i915_gem_object_set_cache_level(ring->scratch.obj, I915_CACHE_LLC);
> + ret = i915_gem_object_set_cache_level(engine->scratch.obj, I915_CACHE_LLC);
> if (ret)
> goto err_unref;
>
> - ret = i915_gem_obj_ggtt_pin(ring->scratch.obj, 4096, 0);
> + ret = i915_gem_obj_ggtt_pin(engine->scratch.obj, 4096, 0);
> if (ret)
> goto err_unref;
>
> - ring->scratch.gtt_offset = i915_gem_obj_ggtt_offset(ring->scratch.obj);
> - ring->scratch.cpu_page = kmap(sg_page(ring->scratch.obj->pages->sgl));
> - if (ring->scratch.cpu_page == NULL) {
> + engine->scratch.gtt_offset = i915_gem_obj_ggtt_offset(engine->scratch.obj);
> + engine->scratch.cpu_page = kmap(sg_page(engine->scratch.obj->pages->sgl));
> + if (engine->scratch.cpu_page == NULL) {
> ret = -ENOMEM;
> goto err_unpin;
> }
>
> DRM_DEBUG_DRIVER("%s pipe control offset: 0x%08x\n",
> - ring->name, ring->scratch.gtt_offset);
> + engine->name, engine->scratch.gtt_offset);
> return 0;
>
> err_unpin:
> - i915_gem_object_ggtt_unpin(ring->scratch.obj);
> + i915_gem_object_ggtt_unpin(engine->scratch.obj);
> err_unref:
> - drm_gem_object_unreference(&ring->scratch.obj->base);
> + drm_gem_object_unreference(&engine->scratch.obj->base);
> err:
> + engine->scratch.obj = NULL;
> return ret;
> }
>
> -static inline void intel_ring_emit_wa(struct intel_engine_cs *ring,
> - u32 addr, u32 value)
> +static int
> +emit_lri(struct i915_gem_request *rq,
> + int num_registers,
> + ...)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct intel_ringbuffer *ring;
> + va_list ap;
>
> - if (WARN_ON(dev_priv->num_wa_regs >= I915_MAX_WA_REGS))
> - return;
> + BUG_ON(num_registers > 60);
>
> - intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> - intel_ring_emit(ring, addr);
> - intel_ring_emit(ring, value);
> + ring = intel_ring_begin(rq, 2*num_registers + 1);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> - dev_priv->intel_wa_regs[dev_priv->num_wa_regs].addr = addr;
> - dev_priv->intel_wa_regs[dev_priv->num_wa_regs].mask = value & 0xFFFF;
> - /* value is updated with the status of remaining bits of this
> - * register when it is read from debugfs file
> - */
> - dev_priv->intel_wa_regs[dev_priv->num_wa_regs].value = value;
> - dev_priv->num_wa_regs++;
> + intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(num_registers));
> + va_start(ap, num_registers);
> + while (num_registers--) {
> + intel_ring_emit(ring, va_arg(ap, u32));
> + intel_ring_emit(ring, va_arg(ap, u32));
> + }
> + va_end(ap);
> + intel_ring_advance(ring);
>
> - return;
> + return 0;
> }
>
> -static int bdw_init_workarounds(struct intel_engine_cs *ring)
> +static int bdw_render_init_context(struct i915_gem_request *rq)
> {
> int ret;
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
>
> - /*
> - * workarounds applied in this fn are part of register state context,
> - * they need to be re-initialized followed by gpu reset, suspend/resume,
> - * module reload.
> - */
> - dev_priv->num_wa_regs = 0;
> - memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
> -
> - /*
> - * update the number of dwords required based on the
> - * actual number of workarounds applied
> - */
> - ret = intel_ring_begin(ring, 24);
> - if (ret)
> - return ret;
> + ret = emit_lri(rq, 8,
>
> + /* FIXME: Unclear whether we really need this on production bdw. */
> + GEN8_ROW_CHICKEN,
> /* WaDisablePartialInstShootdown:bdw */
> + _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE) |
> /* WaDisableThreadStallDopClockGating:bdw */
> - /* FIXME: Unclear whether we really need this on production bdw. */
> - intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
> - _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE
> - | STALL_DOP_GATING_DISABLE));
> + _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE),
>
> + GEN7_ROW_CHICKEN2,
> /* WaDisableDopClockGating:bdw May not be needed for production */
> - intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
> - _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
> + _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE),
>
> /*
> * This GEN8_CENTROID_PIXEL_OPT_DIS W/A is only needed for
> * pre-production hardware
> */
> - intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
> - _MASKED_BIT_ENABLE(GEN8_CENTROID_PIXEL_OPT_DIS
> - | GEN8_SAMPLER_POWER_BYPASS_DIS));
> + HALF_SLICE_CHICKEN3,
> + _MASKED_BIT_ENABLE(GEN8_CENTROID_PIXEL_OPT_DIS) |
> + _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS),
>
> - intel_ring_emit_wa(ring, GEN7_HALF_SLICE_CHICKEN1,
> - _MASKED_BIT_ENABLE(GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE));
> + GEN7_HALF_SLICE_CHICKEN1,
> + _MASKED_BIT_ENABLE(GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE),
>
> - intel_ring_emit_wa(ring, COMMON_SLICE_CHICKEN2,
> - _MASKED_BIT_ENABLE(GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE));
> + COMMON_SLICE_CHICKEN2,
> + _MASKED_BIT_ENABLE(GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE),
>
> /* Use Force Non-Coherent whenever executing a 3D context. This is a
> * workaround for for a possible hang in the unlikely event a TLB
> * invalidation occurs during a PSD flush.
> */
> - intel_ring_emit_wa(ring, HDC_CHICKEN0,
> - _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT));
> + HDC_CHICKEN0,
> + _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT),
>
> + CACHE_MODE_1,
> /* Wa4x4STCOptimizationDisable:bdw */
> - intel_ring_emit_wa(ring, CACHE_MODE_1,
> - _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE));
> + _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE),
>
> /*
> * BSpec recommends 8x4 when MSAA is used,
> @@ -751,66 +804,51 @@ static int bdw_init_workarounds(struct intel_engine_cs *ring)
> * disable bit, which we don't touch here, but it's good
> * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
> */
> - intel_ring_emit_wa(ring, GEN7_GT_MODE,
> - GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
> -
> - intel_ring_advance(ring);
> -
> - DRM_DEBUG_DRIVER("Number of Workarounds applied: %d\n",
> - dev_priv->num_wa_regs);
> + GEN7_GT_MODE,
> + GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
> + if (ret)
> + return ret;
>
> - return 0;
> + return i915_gem_render_state_init(rq);
> }
>
> -static int chv_init_workarounds(struct intel_engine_cs *ring)
> +static int chv_render_init_context(struct i915_gem_request *rq)
> {
> int ret;
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> -
> - /*
> - * workarounds applied in this fn are part of register state context,
> - * they need to be re-initialized followed by gpu reset, suspend/resume,
> - * module reload.
> - */
> - dev_priv->num_wa_regs = 0;
> - memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
>
> - ret = intel_ring_begin(ring, 12);
> - if (ret)
> - return ret;
> +	ret = emit_lri(rq, 3,
>
> + GEN8_ROW_CHICKEN,
> /* WaDisablePartialInstShootdown:chv */
> - intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
> - _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE));
> -
> + _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE) |
> /* WaDisableThreadStallDopClockGating:chv */
> - intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
> - _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE));
> + _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE),
>
> /* WaDisableDopClockGating:chv (pre-production hw) */
> - intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
> - _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
> + GEN7_ROW_CHICKEN2,
> + _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE),
>
> /* WaDisableSamplerPowerBypass:chv (pre-production hw) */
> - intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
> - _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS));
> + HALF_SLICE_CHICKEN3,
> + _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS));
>
> - intel_ring_advance(ring);
> + if (ret)
> + return ret;
>
> - return 0;
> + return i915_gem_render_state_init(rq);
> }
>
> -static int init_render_ring(struct intel_engine_cs *ring)
> +static int render_resume(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - int ret = init_ring_common(ring);
> + struct drm_i915_private *dev_priv = engine->i915;
> + int ret;
> +
> + ret = engine_resume(engine);
> if (ret)
> return ret;
>
> /* WaTimedSingleVertexDispatch:cl,bw,ctg,elk,ilk,snb */
> - if (INTEL_INFO(dev)->gen >= 4 && INTEL_INFO(dev)->gen < 7)
> + if (INTEL_INFO(dev_priv)->gen >= 4 && INTEL_INFO(dev_priv)->gen < 7)
> I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
>
> /* We need to disable the AsyncFlip performance optimisations in order
> @@ -819,28 +857,22 @@ static int init_render_ring(struct intel_engine_cs *ring)
> *
> * WaDisableAsyncFlipPerfMode:snb,ivb,hsw,vlv,bdw,chv
> */
> - if (INTEL_INFO(dev)->gen >= 6)
> + if (INTEL_INFO(dev_priv)->gen >= 6)
> I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(ASYNC_FLIP_PERF_DISABLE));
>
> /* Required for the hardware to program scanline values for waiting */
> /* WaEnableFlushTlbInvalidationMode:snb */
> - if (INTEL_INFO(dev)->gen == 6)
> + if (INTEL_INFO(dev_priv)->gen == 6)
> I915_WRITE(GFX_MODE,
> _MASKED_BIT_ENABLE(GFX_TLB_INVALIDATE_EXPLICIT));
>
> /* WaBCSVCSTlbInvalidationMode:ivb,vlv,hsw */
> - if (IS_GEN7(dev))
> + if (IS_GEN7(dev_priv))
> I915_WRITE(GFX_MODE_GEN7,
> _MASKED_BIT_ENABLE(GFX_TLB_INVALIDATE_EXPLICIT) |
> _MASKED_BIT_ENABLE(GFX_REPLAY_MODE));
>
> - if (INTEL_INFO(dev)->gen >= 5) {
> - ret = intel_init_pipe_control(ring);
> - if (ret)
> - return ret;
> - }
> -
> - if (IS_GEN6(dev)) {
> + if (IS_GEN6(dev_priv)) {
> /* From the Sandybridge PRM, volume 1 part 3, page 24:
> * "If this bit is set, STCunit will have LRA as replacement
> * policy. [...] This bit must be reset. LRA replacement
> @@ -850,19 +882,40 @@ static int init_render_ring(struct intel_engine_cs *ring)
> _MASKED_BIT_DISABLE(CM0_STC_EVICT_DISABLE_LRA_SNB));
> }
>
> - if (INTEL_INFO(dev)->gen >= 6)
> + if (INTEL_INFO(dev_priv)->gen >= 6)
> I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
>
> - if (HAS_L3_DPF(dev))
> - I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
> + return 0;
> +}
>
> - return ret;
> +static void cleanup_status_page(struct intel_engine_cs *engine)
> +{
> + struct drm_i915_gem_object *obj;
> +
> + obj = engine->status_page.obj;
> + if (obj == NULL)
> + return;
> +
> + kunmap(sg_page(obj->pages->sgl));
> + i915_gem_object_ggtt_unpin(obj);
> + drm_gem_object_unreference(&obj->base);
> + engine->status_page.obj = NULL;
> +}
> +
> +static void engine_cleanup(struct intel_engine_cs *engine)
> +{
> + if (engine->legacy_ring)
> + intel_ring_free(engine->legacy_ring);
> +
> + cleanup_status_page(engine);
> + i915_cmd_parser_fini_engine(engine);
> }
>
> -static void render_ring_cleanup(struct intel_engine_cs *ring)
> +static void render_cleanup(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> +
> + engine_cleanup(engine);
>
> if (dev_priv->semaphore_obj) {
> i915_gem_object_ggtt_unpin(dev_priv->semaphore_obj);
> @@ -870,154 +923,82 @@ static void render_ring_cleanup(struct intel_engine_cs *ring)
> dev_priv->semaphore_obj = NULL;
> }
>
> - intel_fini_pipe_control(ring);
> + fini_pipe_control(engine);
> }
>
> -static int gen8_rcs_signal(struct intel_engine_cs *signaller,
> - unsigned int num_dwords)
> +static int
> +gen8_rcs_emit_signal(struct i915_gem_request *rq, int id)
> {
> -#define MBOX_UPDATE_DWORDS 8
> - struct drm_device *dev = signaller->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *waiter;
> - int i, ret, num_rings;
> -
> - num_rings = hweight32(INTEL_INFO(dev)->ring_mask);
> - num_dwords += (num_rings-1) * MBOX_UPDATE_DWORDS;
> -#undef MBOX_UPDATE_DWORDS
> -
> - ret = intel_ring_begin(signaller, num_dwords);
> - if (ret)
> - return ret;
> + u64 offset = GEN8_SEMAPHORE_OFFSET(rq->i915, rq->engine->id, id);
> + struct intel_ringbuffer *ring;
>
> - for_each_ring(waiter, dev_priv, i) {
> - u64 gtt_offset = signaller->semaphore.signal_ggtt[i];
> - if (gtt_offset == MI_SEMAPHORE_SYNC_INVALID)
> - continue;
> + ring = intel_ring_begin(rq, 8);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> - intel_ring_emit(signaller, GFX_OP_PIPE_CONTROL(6));
> - intel_ring_emit(signaller, PIPE_CONTROL_GLOBAL_GTT_IVB |
> - PIPE_CONTROL_QW_WRITE |
> - PIPE_CONTROL_FLUSH_ENABLE);
> - intel_ring_emit(signaller, lower_32_bits(gtt_offset));
> - intel_ring_emit(signaller, upper_32_bits(gtt_offset));
> - intel_ring_emit(signaller, signaller->outstanding_lazy_seqno);
> - intel_ring_emit(signaller, 0);
> - intel_ring_emit(signaller, MI_SEMAPHORE_SIGNAL |
> - MI_SEMAPHORE_TARGET(waiter->id));
> - intel_ring_emit(signaller, 0);
> - }
> + intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(6));
> + intel_ring_emit(ring,
> + PIPE_CONTROL_GLOBAL_GTT_IVB |
> + PIPE_CONTROL_QW_WRITE |
> + PIPE_CONTROL_FLUSH_ENABLE);
> + intel_ring_emit(ring, lower_32_bits(offset));
> + intel_ring_emit(ring, upper_32_bits(offset));
> + intel_ring_emit(ring, rq->seqno);
> + intel_ring_emit(ring, 0);
> + intel_ring_emit(ring,
> + MI_SEMAPHORE_SIGNAL |
> + MI_SEMAPHORE_TARGET(id));
> + intel_ring_emit(ring, 0);
> + intel_ring_advance(ring);
>
> return 0;
> }
>
> -static int gen8_xcs_signal(struct intel_engine_cs *signaller,
> - unsigned int num_dwords)
> +static int
> +gen8_xcs_emit_signal(struct i915_gem_request *rq, int id)
> {
> -#define MBOX_UPDATE_DWORDS 6
> - struct drm_device *dev = signaller->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *waiter;
> - int i, ret, num_rings;
> + u64 offset = GEN8_SEMAPHORE_OFFSET(rq->i915, rq->engine->id, id);
> + struct intel_ringbuffer *ring;
>
> - num_rings = hweight32(INTEL_INFO(dev)->ring_mask);
> - num_dwords += (num_rings-1) * MBOX_UPDATE_DWORDS;
> -#undef MBOX_UPDATE_DWORDS
> + ring = intel_ring_begin(rq, 6);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> - ret = intel_ring_begin(signaller, num_dwords);
> - if (ret)
> - return ret;
> -
> - for_each_ring(waiter, dev_priv, i) {
> - u64 gtt_offset = signaller->semaphore.signal_ggtt[i];
> - if (gtt_offset == MI_SEMAPHORE_SYNC_INVALID)
> - continue;
> -
> - intel_ring_emit(signaller, (MI_FLUSH_DW + 1) |
> - MI_FLUSH_DW_OP_STOREDW);
> - intel_ring_emit(signaller, lower_32_bits(gtt_offset) |
> - MI_FLUSH_DW_USE_GTT);
> - intel_ring_emit(signaller, upper_32_bits(gtt_offset));
> - intel_ring_emit(signaller, signaller->outstanding_lazy_seqno);
> - intel_ring_emit(signaller, MI_SEMAPHORE_SIGNAL |
> - MI_SEMAPHORE_TARGET(waiter->id));
> - intel_ring_emit(signaller, 0);
> - }
> -
> - return 0;
> -}
> -
> -static int gen6_signal(struct intel_engine_cs *signaller,
> - unsigned int num_dwords)
> -{
> - struct drm_device *dev = signaller->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *useless;
> - int i, ret, num_rings;
> -
> -#define MBOX_UPDATE_DWORDS 3
> - num_rings = hweight32(INTEL_INFO(dev)->ring_mask);
> - num_dwords += round_up((num_rings-1) * MBOX_UPDATE_DWORDS, 2);
> -#undef MBOX_UPDATE_DWORDS
> -
> - ret = intel_ring_begin(signaller, num_dwords);
> - if (ret)
> - return ret;
> -
> - for_each_ring(useless, dev_priv, i) {
> - u32 mbox_reg = signaller->semaphore.mbox.signal[i];
> - if (mbox_reg != GEN6_NOSYNC) {
> - intel_ring_emit(signaller, MI_LOAD_REGISTER_IMM(1));
> - intel_ring_emit(signaller, mbox_reg);
> - intel_ring_emit(signaller, signaller->outstanding_lazy_seqno);
> - }
> - }
> -
> - /* If num_dwords was rounded, make sure the tail pointer is correct */
> - if (num_rings % 2 == 0)
> - intel_ring_emit(signaller, MI_NOOP);
> + intel_ring_emit(ring,
> + MI_FLUSH_DW |
> + MI_FLUSH_DW_OP_STOREDW |
> + (4 - 2));
> + intel_ring_emit(ring,
> + lower_32_bits(offset) |
> + MI_FLUSH_DW_USE_GTT);
> + intel_ring_emit(ring, upper_32_bits(offset));
> + intel_ring_emit(ring, rq->seqno);
> + intel_ring_emit(ring,
> + MI_SEMAPHORE_SIGNAL |
> + MI_SEMAPHORE_TARGET(id));
> + intel_ring_emit(ring, 0);
> + intel_ring_advance(ring);
>
> return 0;
> }
>
> -/**
> - * gen6_add_request - Update the semaphore mailbox registers
> - *
> - * @ring - ring that is adding a request
> - * @seqno - return seqno stuck into the ring
> - *
> - * Update the mailbox registers in the *other* rings with the current seqno.
> - * This acts like a signal in the canonical semaphore.
> - */
> static int
> -gen6_add_request(struct intel_engine_cs *ring)
> +gen6_emit_signal(struct i915_gem_request *rq, int id)
> {
> - int ret;
> -
> - if (ring->semaphore.signal)
> - ret = ring->semaphore.signal(ring, 4);
> - else
> - ret = intel_ring_begin(ring, 4);
> + struct intel_ringbuffer *ring;
>
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 3);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> - intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
> - intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
> - intel_ring_emit(ring, ring->outstanding_lazy_seqno);
> - intel_ring_emit(ring, MI_USER_INTERRUPT);
> - __intel_ring_advance(ring);
> + intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> + intel_ring_emit(ring, rq->engine->semaphore.mbox.signal[id]);
> + intel_ring_emit(ring, rq->seqno);
> + intel_ring_advance(ring);
>
> return 0;
> }
>
> -static inline bool i915_gem_has_seqno_wrapped(struct drm_device *dev,
> - u32 seqno)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - return dev_priv->last_seqno < seqno;
> -}
> -
> /**
> * intel_ring_sync - sync the waiter to the signaller on seqno
> *
> @@ -1027,66 +1008,52 @@ static inline bool i915_gem_has_seqno_wrapped(struct drm_device *dev,
> */
>
> static int
> -gen8_ring_sync(struct intel_engine_cs *waiter,
> - struct intel_engine_cs *signaller,
> - u32 seqno)
> +gen8_emit_wait(struct i915_gem_request *waiter,
> + struct i915_gem_request *signaller)
> {
> - struct drm_i915_private *dev_priv = waiter->dev->dev_private;
> - int ret;
> + u64 offset = GEN8_SEMAPHORE_OFFSET(waiter->i915, signaller->engine->id, waiter->engine->id);
> + struct intel_ringbuffer *ring;
>
> - ret = intel_ring_begin(waiter, 4);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(waiter, 4);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> - intel_ring_emit(waiter, MI_SEMAPHORE_WAIT |
> - MI_SEMAPHORE_GLOBAL_GTT |
> - MI_SEMAPHORE_POLL |
> - MI_SEMAPHORE_SAD_GTE_SDD);
> - intel_ring_emit(waiter, seqno);
> - intel_ring_emit(waiter,
> - lower_32_bits(GEN8_WAIT_OFFSET(waiter, signaller->id)));
> - intel_ring_emit(waiter,
> - upper_32_bits(GEN8_WAIT_OFFSET(waiter, signaller->id)));
> - intel_ring_advance(waiter);
> + intel_ring_emit(ring,
> + MI_SEMAPHORE_WAIT |
> + MI_SEMAPHORE_GLOBAL_GTT |
> + MI_SEMAPHORE_POLL |
> + MI_SEMAPHORE_SAD_GTE_SDD);
> + intel_ring_emit(ring, signaller->breadcrumb[waiter->engine->id]);
> + intel_ring_emit(ring, lower_32_bits(offset));
> + intel_ring_emit(ring, upper_32_bits(offset));
> + intel_ring_advance(ring);
> return 0;
> }
>
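
For anyone cross-checking the semaphore addressing: both sides index the
same per-pair slot. If I'm reading the macro arguments right, the RCS
signalling the VCS writes GEN8_SEMAPHORE_OFFSET(i915, RCS, VCS) in
gen8_rcs_emit_signal(), and the VCS waiting upon the RCS polls
GEN8_SEMAPHORE_OFFSET(i915, RCS, VCS) in gen8_emit_wait(), since there the
signaller is the RCS and the waiter the VCS. Every (signaller, waiter)
combination therefore gets its own qword and the stored breadcrumb is
unambiguous.
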
> static int
> -gen6_ring_sync(struct intel_engine_cs *waiter,
> - struct intel_engine_cs *signaller,
> - u32 seqno)
> +gen6_emit_wait(struct i915_gem_request *waiter,
> + struct i915_gem_request *signaller)
> {
> u32 dw1 = MI_SEMAPHORE_MBOX |
> MI_SEMAPHORE_COMPARE |
> MI_SEMAPHORE_REGISTER;
> - u32 wait_mbox = signaller->semaphore.mbox.wait[waiter->id];
> - int ret;
> + u32 wait_mbox = signaller->engine->semaphore.mbox.wait[waiter->engine->id];
> + struct intel_ringbuffer *ring;
> +
> + WARN_ON(wait_mbox == MI_SEMAPHORE_SYNC_INVALID);
> +
> + ring = intel_ring_begin(waiter, 3);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> + intel_ring_emit(ring, dw1 | wait_mbox);
> /* Throughout all of the GEM code, seqno passed implies our current
> * seqno is >= the last seqno executed. However for hardware the
> * comparison is strictly greater than.
> */
> - seqno -= 1;
> -
> - WARN_ON(wait_mbox == MI_SEMAPHORE_SYNC_INVALID);
> -
> - ret = intel_ring_begin(waiter, 4);
> - if (ret)
> - return ret;
> -
> - /* If seqno wrap happened, omit the wait with no-ops */
> - if (likely(!i915_gem_has_seqno_wrapped(waiter->dev, seqno))) {
> - intel_ring_emit(waiter, dw1 | wait_mbox);
> - intel_ring_emit(waiter, seqno);
> - intel_ring_emit(waiter, 0);
> - intel_ring_emit(waiter, MI_NOOP);
> - } else {
> - intel_ring_emit(waiter, MI_NOOP);
> - intel_ring_emit(waiter, MI_NOOP);
> - intel_ring_emit(waiter, MI_NOOP);
> - intel_ring_emit(waiter, MI_NOOP);
> - }
> - intel_ring_advance(waiter);
> + intel_ring_emit(ring, signaller->breadcrumb[waiter->engine->id] - 1);
> + intel_ring_emit(ring, 0);
> + intel_ring_advance(ring);
>
> return 0;
> }
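
To spell the off-by-one out with an example: the MI_SEMAPHORE_MBOX compare
is strictly greater-than while the driver convention is greater-or-equal,
so if the waiter needs breadcrumb 42 to have executed, the wait is armed
against 41 and the signaller's mailbox write of 42 is what releases it.
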
> @@ -1101,10 +1068,10 @@ do { \
> } while (0)
>
> static int
> -pc_render_add_request(struct intel_engine_cs *ring)
> +gen5_emit_breadcrumb(struct i915_gem_request *rq)
> {
> - u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
> - int ret;
> + u32 scratch_addr = rq->engine->scratch.gtt_offset + 2 * CACHELINE_BYTES;
> + struct intel_ringbuffer *ring;
>
> /* For Ironlake, MI_USER_INTERRUPT was deprecated and apparently
> * incoherent with writes to memory, i.e. completely fubar,
> @@ -1114,16 +1081,17 @@ pc_render_add_request(struct intel_engine_cs *ring)
> * incoherence by flushing the 6 PIPE_NOTIFY buffers out to
> * memory before requesting an interrupt.
> */
> - ret = intel_ring_begin(ring, 32);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 32);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4) | PIPE_CONTROL_QW_WRITE |
> PIPE_CONTROL_WRITE_FLUSH |
> PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE);
> - intel_ring_emit(ring, ring->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT);
> - intel_ring_emit(ring, ring->outstanding_lazy_seqno);
> + intel_ring_emit(ring, rq->engine->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT);
> + intel_ring_emit(ring, rq->seqno);
> intel_ring_emit(ring, 0);
> +
> PIPE_CONTROL_FLUSH(ring, scratch_addr);
> scratch_addr += 2 * CACHELINE_BYTES; /* write to separate cachelines */
> PIPE_CONTROL_FLUSH(ring, scratch_addr);
> @@ -1140,96 +1108,80 @@ pc_render_add_request(struct intel_engine_cs *ring)
> PIPE_CONTROL_WRITE_FLUSH |
> PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
> PIPE_CONTROL_NOTIFY);
> - intel_ring_emit(ring, ring->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT);
> - intel_ring_emit(ring, ring->outstanding_lazy_seqno);
> + intel_ring_emit(ring, rq->engine->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT);
> + intel_ring_emit(ring, rq->seqno);
> intel_ring_emit(ring, 0);
> - __intel_ring_advance(ring);
> -
> - return 0;
> -}
>
> -static u32
> -gen6_ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
> -{
> - /* Workaround to force correct ordering between irq and seqno writes on
> - * ivb (and maybe also on snb) by reading from a CS register (like
> - * ACTHD) before reading the status page. */
> - if (!lazy_coherency) {
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> - POSTING_READ(RING_ACTHD(ring->mmio_base));
> - }
> + intel_ring_advance(ring);
>
> - return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
> + return 0;
> }
>
> static u32
> -ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
> +ring_get_seqno(struct intel_engine_cs *engine)
> {
> - return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
> + return intel_read_status_page(engine, I915_GEM_HWS_INDEX);
> }
>
> static void
> -ring_set_seqno(struct intel_engine_cs *ring, u32 seqno)
> +ring_set_seqno(struct intel_engine_cs *engine, u32 seqno)
> {
> - intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno);
> + intel_write_status_page(engine, I915_GEM_HWS_INDEX, seqno);
> }
>
> static u32
> -pc_render_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
> +gen5_render_get_seqno(struct intel_engine_cs *engine)
> {
> - return ring->scratch.cpu_page[0];
> + return engine->scratch.cpu_page[0];
> }
>
> static void
> -pc_render_set_seqno(struct intel_engine_cs *ring, u32 seqno)
> +gen5_render_set_seqno(struct intel_engine_cs *engine, u32 seqno)
> {
> - ring->scratch.cpu_page[0] = seqno;
> + engine->scratch.cpu_page[0] = seqno;
> }
>
> static bool
> -gen5_ring_get_irq(struct intel_engine_cs *ring)
> +gen5_irq_get(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_private *i915 = engine->i915;
> unsigned long flags;
>
> - if (!dev->irq_enabled)
> + if (!i915->dev->irq_enabled)
> return false;
>
> - spin_lock_irqsave(&dev_priv->irq_lock, flags);
> - if (ring->irq_refcount++ == 0)
> - gen5_enable_gt_irq(dev_priv, ring->irq_enable_mask);
> - spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
> + spin_lock_irqsave(&i915->irq_lock, flags);
> + if (engine->irq_refcount++ == 0)
> + gen5_enable_gt_irq(i915, engine->irq_enable_mask);
> + spin_unlock_irqrestore(&i915->irq_lock, flags);
>
> return true;
> }
>
> static void
> -gen5_ring_put_irq(struct intel_engine_cs *ring)
> +gen5_irq_put(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_private *i915 = engine->i915;
> unsigned long flags;
>
> - spin_lock_irqsave(&dev_priv->irq_lock, flags);
> - if (--ring->irq_refcount == 0)
> - gen5_disable_gt_irq(dev_priv, ring->irq_enable_mask);
> - spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
> + spin_lock_irqsave(&i915->irq_lock, flags);
> + if (--engine->irq_refcount == 0)
> + gen5_disable_gt_irq(i915, engine->irq_enable_mask);
> + spin_unlock_irqrestore(&i915->irq_lock, flags);
> }
>
> static bool
> -i9xx_ring_get_irq(struct intel_engine_cs *ring)
> +i9xx_irq_get(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> unsigned long flags;
>
> - if (!dev->irq_enabled)
> + if (!dev_priv->dev->irq_enabled)
> return false;
>
> spin_lock_irqsave(&dev_priv->irq_lock, flags);
> - if (ring->irq_refcount++ == 0) {
> - dev_priv->irq_mask &= ~ring->irq_enable_mask;
> + if (engine->irq_refcount++ == 0) {
> + dev_priv->irq_mask &= ~engine->irq_enable_mask;
> I915_WRITE(IMR, dev_priv->irq_mask);
> POSTING_READ(IMR);
> }
> @@ -1239,15 +1191,14 @@ i9xx_ring_get_irq(struct intel_engine_cs *ring)
> }
>
> static void
> -i9xx_ring_put_irq(struct intel_engine_cs *ring)
> +i9xx_irq_put(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> unsigned long flags;
>
> spin_lock_irqsave(&dev_priv->irq_lock, flags);
> - if (--ring->irq_refcount == 0) {
> - dev_priv->irq_mask |= ring->irq_enable_mask;
> + if (--engine->irq_refcount == 0) {
> + dev_priv->irq_mask |= engine->irq_enable_mask;
> I915_WRITE(IMR, dev_priv->irq_mask);
> POSTING_READ(IMR);
> }
> @@ -1255,18 +1206,17 @@ i9xx_ring_put_irq(struct intel_engine_cs *ring)
> }
>
> static bool
> -i8xx_ring_get_irq(struct intel_engine_cs *ring)
> +i8xx_irq_get(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> unsigned long flags;
>
> - if (!dev->irq_enabled)
> + if (!dev_priv->dev->irq_enabled)
> return false;
>
> spin_lock_irqsave(&dev_priv->irq_lock, flags);
> - if (ring->irq_refcount++ == 0) {
> - dev_priv->irq_mask &= ~ring->irq_enable_mask;
> + if (engine->irq_refcount++ == 0) {
> + dev_priv->irq_mask &= ~engine->irq_enable_mask;
> I915_WRITE16(IMR, dev_priv->irq_mask);
> POSTING_READ16(IMR);
> }
> @@ -1276,175 +1226,120 @@ i8xx_ring_get_irq(struct intel_engine_cs *ring)
> }
>
> static void
> -i8xx_ring_put_irq(struct intel_engine_cs *ring)
> +i8xx_irq_put(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> unsigned long flags;
>
> spin_lock_irqsave(&dev_priv->irq_lock, flags);
> - if (--ring->irq_refcount == 0) {
> - dev_priv->irq_mask |= ring->irq_enable_mask;
> + if (--engine->irq_refcount == 0) {
> + dev_priv->irq_mask |= engine->irq_enable_mask;
> I915_WRITE16(IMR, dev_priv->irq_mask);
> POSTING_READ16(IMR);
> }
> spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
> }
>
> -void intel_ring_setup_status_page(struct intel_engine_cs *ring)
> -{
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> - u32 mmio = 0;
> -
> - /* The ring status page addresses are no longer next to the rest of
> - * the ring registers as of gen7.
> - */
> - if (IS_GEN7(dev)) {
> - switch (ring->id) {
> - case RCS:
> - mmio = RENDER_HWS_PGA_GEN7;
> - break;
> - case BCS:
> - mmio = BLT_HWS_PGA_GEN7;
> - break;
> - /*
> - * VCS2 actually doesn't exist on Gen7. Only shut up
> - * gcc switch check warning
> - */
> - case VCS2:
> - case VCS:
> - mmio = BSD_HWS_PGA_GEN7;
> - break;
> - case VECS:
> - mmio = VEBOX_HWS_PGA_GEN7;
> - break;
> - }
> - } else if (IS_GEN6(ring->dev)) {
> - mmio = RING_HWS_PGA_GEN6(ring->mmio_base);
> - } else {
> - /* XXX: gen8 returns to sanity */
> - mmio = RING_HWS_PGA(ring->mmio_base);
> - }
> -
> - I915_WRITE(mmio, (u32)ring->status_page.gfx_addr);
> - POSTING_READ(mmio);
> -
> - /*
> - * Flush the TLB for this page
> - *
> - * FIXME: These two bits have disappeared on gen8, so a question
> - * arises: do we still need this and if so how should we go about
> - * invalidating the TLB?
> - */
> - if (INTEL_INFO(dev)->gen >= 6 && INTEL_INFO(dev)->gen < 8) {
> - u32 reg = RING_INSTPM(ring->mmio_base);
> -
> - /* ring should be idle before issuing a sync flush*/
> - WARN_ON((I915_READ_MODE(ring) & MODE_IDLE) == 0);
> -
> - I915_WRITE(reg,
> - _MASKED_BIT_ENABLE(INSTPM_TLB_INVALIDATE |
> - INSTPM_SYNC_FLUSH));
> - if (wait_for((I915_READ(reg) & INSTPM_SYNC_FLUSH) == 0,
> - 1000))
> - DRM_ERROR("%s: wait for SyncFlush to complete for TLB invalidation timed out\n",
> - ring->name);
> - }
> -}
> -
> static int
> -bsd_ring_flush(struct intel_engine_cs *ring,
> - u32 invalidate_domains,
> - u32 flush_domains)
> +bsd_emit_flush(struct i915_gem_request *rq,
> + u32 flags)
> {
> - int ret;
> + struct intel_ringbuffer *ring;
>
> - ret = intel_ring_begin(ring, 2);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 1);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring, MI_FLUSH);
> - intel_ring_emit(ring, MI_NOOP);
> intel_ring_advance(ring);
> return 0;
> }
>
> static int
> -i9xx_add_request(struct intel_engine_cs *ring)
> +i9xx_emit_breadcrumb(struct i915_gem_request *rq)
> {
> - int ret;
> + struct intel_ringbuffer *ring;
>
> - ret = intel_ring_begin(ring, 4);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 5);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
> intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
> - intel_ring_emit(ring, ring->outstanding_lazy_seqno);
> + intel_ring_emit(ring, rq->seqno);
> intel_ring_emit(ring, MI_USER_INTERRUPT);
> - __intel_ring_advance(ring);
> + intel_ring_emit(ring, MI_NOOP);
> + intel_ring_advance(ring);
>
> return 0;
> }
>
> static bool
> -gen6_ring_get_irq(struct intel_engine_cs *ring)
> +gen6_irq_get(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> unsigned long flags;
>
> - if (!dev->irq_enabled)
> + if (!dev_priv->dev->irq_enabled)
> return false;
>
> spin_lock_irqsave(&dev_priv->irq_lock, flags);
> - if (ring->irq_refcount++ == 0) {
> - if (HAS_L3_DPF(dev) && ring->id == RCS)
> - I915_WRITE_IMR(ring,
> - ~(ring->irq_enable_mask |
> - GT_PARITY_ERROR(dev)));
> - else
> - I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
> - gen5_enable_gt_irq(dev_priv, ring->irq_enable_mask);
> + if (engine->irq_refcount++ == 0) {
> + I915_WRITE_IMR(engine,
> + ~(engine->irq_enable_mask |
> + engine->irq_keep_mask));
> + gen5_enable_gt_irq(dev_priv, engine->irq_enable_mask);
> }
> spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
>
> + /* Keep the device awake to save expensive CPU cycles when
> + * reading the registers.
> + */
> + gen6_gt_force_wake_get(dev_priv, engine->power_domains);
> return true;
> }
>
> static void
> -gen6_ring_put_irq(struct intel_engine_cs *ring)
> +gen6_irq_barrier(struct intel_engine_cs *engine)
> +{
> + /* w/a for lax serialisation of GPU writes with IRQs */
> + struct drm_i915_private *dev_priv = engine->i915;
> + (void)I915_READ(RING_ACTHD(engine->mmio_base));
> +}
> +
> +static void
> +gen6_irq_put(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> unsigned long flags;
>
> + gen6_gt_force_wake_put(dev_priv, engine->power_domains);
> +
> spin_lock_irqsave(&dev_priv->irq_lock, flags);
> - if (--ring->irq_refcount == 0) {
> - if (HAS_L3_DPF(dev) && ring->id == RCS)
> - I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
> - else
> - I915_WRITE_IMR(ring, ~0);
> - gen5_disable_gt_irq(dev_priv, ring->irq_enable_mask);
> + if (--engine->irq_refcount == 0) {
> + I915_WRITE_IMR(engine, ~engine->irq_keep_mask);
> + gen5_disable_gt_irq(dev_priv, engine->irq_enable_mask);
> }
> spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
> }
>
> static bool
> -hsw_vebox_get_irq(struct intel_engine_cs *ring)
> +hsw_vebox_irq_get(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> unsigned long flags;
>
> - if (!dev->irq_enabled)
> + if (!dev_priv->dev->irq_enabled)
> return false;
>
> + gen6_gt_force_wake_get(dev_priv, engine->power_domains);
> +
> spin_lock_irqsave(&dev_priv->irq_lock, flags);
> - if (ring->irq_refcount++ == 0) {
> - I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
> - gen6_enable_pm_irq(dev_priv, ring->irq_enable_mask);
> + if (engine->irq_refcount++ == 0) {
> + I915_WRITE_IMR(engine,
> + ~(engine->irq_enable_mask |
> + engine->irq_keep_mask));
> + gen6_enable_pm_irq(dev_priv, engine->irq_enable_mask);
> }
> spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
>
> @@ -1452,43 +1347,36 @@ hsw_vebox_get_irq(struct intel_engine_cs *ring)
> }
>
> static void
> -hsw_vebox_put_irq(struct intel_engine_cs *ring)
> +hsw_vebox_irq_put(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> unsigned long flags;
>
> - if (!dev->irq_enabled)
> - return;
> -
> spin_lock_irqsave(&dev_priv->irq_lock, flags);
> - if (--ring->irq_refcount == 0) {
> - I915_WRITE_IMR(ring, ~0);
> - gen6_disable_pm_irq(dev_priv, ring->irq_enable_mask);
> + if (--engine->irq_refcount == 0) {
> + I915_WRITE_IMR(engine, ~engine->irq_keep_mask);
> + gen6_disable_pm_irq(dev_priv, engine->irq_enable_mask);
> }
> spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
> +
> + gen6_gt_force_wake_put(dev_priv, engine->power_domains);
> }
>
> static bool
> -gen8_ring_get_irq(struct intel_engine_cs *ring)
> +gen8_irq_get(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> unsigned long flags;
>
> - if (!dev->irq_enabled)
> + if (!dev_priv->dev->irq_enabled)
> return false;
>
> spin_lock_irqsave(&dev_priv->irq_lock, flags);
> - if (ring->irq_refcount++ == 0) {
> - if (HAS_L3_DPF(dev) && ring->id == RCS) {
> - I915_WRITE_IMR(ring,
> - ~(ring->irq_enable_mask |
> - GT_RENDER_L3_PARITY_ERROR_INTERRUPT));
> - } else {
> - I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
> - }
> - POSTING_READ(RING_IMR(ring->mmio_base));
> + if (engine->irq_refcount++ == 0) {
> + I915_WRITE_IMR(engine,
> + ~(engine->irq_enable_mask |
> + engine->irq_keep_mask));
> + POSTING_READ(RING_IMR(engine->mmio_base));
> }
> spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
>
> @@ -1496,35 +1384,29 @@ gen8_ring_get_irq(struct intel_engine_cs *ring)
> }
>
> static void
> -gen8_ring_put_irq(struct intel_engine_cs *ring)
> +gen8_irq_put(struct intel_engine_cs *engine)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> + struct drm_i915_private *dev_priv = engine->i915;
> unsigned long flags;
>
> spin_lock_irqsave(&dev_priv->irq_lock, flags);
> - if (--ring->irq_refcount == 0) {
> - if (HAS_L3_DPF(dev) && ring->id == RCS) {
> - I915_WRITE_IMR(ring,
> - ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
> - } else {
> - I915_WRITE_IMR(ring, ~0);
> - }
> - POSTING_READ(RING_IMR(ring->mmio_base));
> + if (--engine->irq_refcount == 0) {
> + I915_WRITE_IMR(engine, ~engine->irq_keep_mask);
> + POSTING_READ(RING_IMR(engine->mmio_base));
> }
> spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
> }
>
> static int
> -i965_dispatch_execbuffer(struct intel_engine_cs *ring,
> - u64 offset, u32 length,
> - unsigned flags)
> +i965_emit_batchbuffer(struct i915_gem_request *rq,
> + u64 offset, u32 length,
> + unsigned flags)
> {
> - int ret;
> + struct intel_ringbuffer *ring;
>
> - ret = intel_ring_begin(ring, 2);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 2);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring,
> MI_BATCH_BUFFER_START |
> @@ -1539,31 +1421,31 @@ i965_dispatch_execbuffer(struct intel_engine_cs *ring,
> /* Just userspace ABI convention to limit the wa batch bo to a reasonable size */
> #define I830_BATCH_LIMIT (256*1024)
> static int
> -i830_dispatch_execbuffer(struct intel_engine_cs *ring,
> - u64 offset, u32 len,
> - unsigned flags)
> +i830_emit_batchbuffer(struct i915_gem_request *rq,
> + u64 offset, u32 len,
> + unsigned flags)
> {
> - int ret;
> + struct intel_ringbuffer *ring;
>
> if (flags & I915_DISPATCH_PINNED) {
> - ret = intel_ring_begin(ring, 4);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 3);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring, MI_BATCH_BUFFER);
> intel_ring_emit(ring, offset | (flags & I915_DISPATCH_SECURE ? 0 : MI_BATCH_NON_SECURE));
> intel_ring_emit(ring, offset + len - 8);
> - intel_ring_emit(ring, MI_NOOP);
> intel_ring_advance(ring);
> } else {
> - u32 cs_offset = ring->scratch.gtt_offset;
> + u32 cs_offset = rq->engine->scratch.gtt_offset;
>
> if (len > I830_BATCH_LIMIT)
> return -ENOSPC;
>
> - ret = intel_ring_begin(ring, 9+3);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 9+3);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
> +
> /* Blit the batch (which has now all relocs applied) to the stable batch
> * scratch bo area (so that the CS never stumbles over its tlb
> * invalidation bug) ... */
> @@ -1590,15 +1472,15 @@ i830_dispatch_execbuffer(struct intel_engine_cs *ring,
> }
>
> static int
> -i915_dispatch_execbuffer(struct intel_engine_cs *ring,
> - u64 offset, u32 len,
> - unsigned flags)
> +i915_emit_batchbuffer(struct i915_gem_request *rq,
> + u64 offset, u32 len,
> + unsigned flags)
> {
> - int ret;
> + struct intel_ringbuffer *ring;
>
> - ret = intel_ring_begin(ring, 2);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 2);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring, MI_BATCH_BUFFER_START | MI_BATCH_GTT);
> intel_ring_emit(ring, offset | (flags & I915_DISPATCH_SECURE ? 0 : MI_BATCH_NON_SECURE));
> @@ -1607,492 +1489,232 @@ i915_dispatch_execbuffer(struct intel_engine_cs *ring,
> return 0;
> }
>
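For readers tracking the API change rather than the mechanics: the hunks above all switch from the old "reserve space, check an int" convention to one where intel_ring_begin() takes the request and hands back the ringbuffer to emit into, or an ERR_PTR() on failure. A minimal sketch of the new shape, with a made-up emitter name and dword count (illustrative only, not part of the patch):

    static int example_emit_two_noops(struct i915_gem_request *rq)
    {
            struct intel_ringbuffer *ring;

            /* reserve 2 dwords in the ring backing this request */
            ring = intel_ring_begin(rq, 2);
            if (IS_ERR(ring))
                    return PTR_ERR(ring);

            intel_ring_emit(ring, MI_NOOP);
            intel_ring_emit(ring, MI_NOOP);
            intel_ring_advance(ring);
            return 0;
    }

Callers that used to test an int return now test IS_ERR() and propagate PTR_ERR() instead.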
> -static void cleanup_status_page(struct intel_engine_cs *ring)
> -{
> - struct drm_i915_gem_object *obj;
> -
> - obj = ring->status_page.obj;
> - if (obj == NULL)
> - return;
> -
> - kunmap(sg_page(obj->pages->sgl));
> - i915_gem_object_ggtt_unpin(obj);
> - drm_gem_object_unreference(&obj->base);
> - ring->status_page.obj = NULL;
> -}
> -
> -static int init_status_page(struct intel_engine_cs *ring)
> +static int setup_status_page(struct intel_engine_cs *engine)
> {
> struct drm_i915_gem_object *obj;
> + unsigned flags;
> + int ret;
>
> - if ((obj = ring->status_page.obj) == NULL) {
> - unsigned flags;
> - int ret;
> + obj = i915_gem_alloc_object(engine->i915->dev, 4096);
> + if (obj == NULL) {
> + DRM_ERROR("Failed to allocate status page\n");
> + return -ENOMEM;
> + }
>
> - obj = i915_gem_alloc_object(ring->dev, 4096);
> - if (obj == NULL) {
> - DRM_ERROR("Failed to allocate status page\n");
> - return -ENOMEM;
> - }
> + ret = i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> + if (ret)
> + goto err_unref;
>
> - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
> - if (ret)
> - goto err_unref;
> -
> - flags = 0;
> - if (!HAS_LLC(ring->dev))
> - /* On g33, we cannot place HWS above 256MiB, so
> - * restrict its pinning to the low mappable arena.
> - * Though this restriction is not documented for
> - * gen4, gen5, or byt, they also behave similarly
> - * and hang if the HWS is placed at the top of the
> - * GTT. To generalise, it appears that all !llc
> - * platforms have issues with us placing the HWS
> - * above the mappable region (even though we never
> - * actually map it).
> - */
> - flags |= PIN_MAPPABLE;
> - ret = i915_gem_obj_ggtt_pin(obj, 4096, flags);
> - if (ret) {
> + flags = 0;
> + if (!HAS_LLC(engine->i915))
> + /* On g33, we cannot place HWS above 256MiB, so
> + * restrict its pinning to the low mappable arena.
> + * Though this restriction is not documented for
> + * gen4, gen5, or byt, they also behave similarly
> + * and hang if the HWS is placed at the top of the
> + * GTT. To generalise, it appears that all !llc
> + * platforms have issues with us placing the HWS
> + * above the mappable region (even though we never
> + * actually map it).
> + */
> + flags |= PIN_MAPPABLE;
> + ret = i915_gem_obj_ggtt_pin(obj, 4096, flags);
> + if (ret) {
> err_unref:
> - drm_gem_object_unreference(&obj->base);
> - return ret;
> - }
> -
> - ring->status_page.obj = obj;
> + drm_gem_object_unreference(&obj->base);
> + return ret;
> }
>
> - ring->status_page.gfx_addr = i915_gem_obj_ggtt_offset(obj);
> - ring->status_page.page_addr = kmap(sg_page(obj->pages->sgl));
> - memset(ring->status_page.page_addr, 0, PAGE_SIZE);
> + engine->status_page.obj = obj;
>
> - DRM_DEBUG_DRIVER("%s hws offset: 0x%08x\n",
> - ring->name, ring->status_page.gfx_addr);
> + engine->status_page.gfx_addr = i915_gem_obj_ggtt_offset(obj);
> + engine->status_page.page_addr = kmap(sg_page(obj->pages->sgl));
> + memset(engine->status_page.page_addr, 0, PAGE_SIZE);
>
> + DRM_DEBUG_DRIVER("%s hws offset: 0x%08x\n",
> + engine->name, engine->status_page.gfx_addr);
> return 0;
> }
>
> -static int init_phys_status_page(struct intel_engine_cs *ring)
> +static int setup_phys_status_page(struct intel_engine_cs *engine)
> {
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> + struct drm_i915_private *i915 = engine->i915;
>
> - if (!dev_priv->status_page_dmah) {
> - dev_priv->status_page_dmah =
> - drm_pci_alloc(ring->dev, PAGE_SIZE, PAGE_SIZE);
> - if (!dev_priv->status_page_dmah)
> - return -ENOMEM;
> - }
> + i915->status_page_dmah =
> + drm_pci_alloc(i915->dev, PAGE_SIZE, PAGE_SIZE);
> + if (!i915->status_page_dmah)
> + return -ENOMEM;
>
> - ring->status_page.page_addr = dev_priv->status_page_dmah->vaddr;
> - memset(ring->status_page.page_addr, 0, PAGE_SIZE);
> + engine->status_page.page_addr = i915->status_page_dmah->vaddr;
> + memset(engine->status_page.page_addr, 0, PAGE_SIZE);
>
> return 0;
> }
>
> -void intel_destroy_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
> +void intel_ring_free(struct intel_ringbuffer *ring)
> {
> - if (!ringbuf->obj)
> - return;
> + if (ring->obj) {
> + iounmap(ring->virtual_start);
> + i915_gem_object_ggtt_unpin(ring->obj);
> + drm_gem_object_unreference(&ring->obj->base);
> + }
>
> - iounmap(ringbuf->virtual_start);
> - i915_gem_object_ggtt_unpin(ringbuf->obj);
> - drm_gem_object_unreference(&ringbuf->obj->base);
> - ringbuf->obj = NULL;
> + list_del(&ring->engine_list);
> + kfree(ring);
> }
>
> -int intel_alloc_ringbuffer_obj(struct drm_device *dev,
> - struct intel_ringbuffer *ringbuf)
> +struct intel_ringbuffer *
> +intel_engine_alloc_ring(struct intel_engine_cs *engine,
> + struct intel_context *ctx,
> + int size)
> {
> - struct drm_i915_private *dev_priv = to_i915(dev);
> + struct drm_i915_private *i915 = engine->i915;
> + struct intel_ringbuffer *ring;
> struct drm_i915_gem_object *obj;
> int ret;
>
> - if (ringbuf->obj)
> - return 0;
> + DRM_DEBUG("creating ringbuffer for %s, size %d\n", engine->name, size);
> +
> + if (WARN_ON(!is_power_of_2(size)))
> + return ERR_PTR(-EINVAL);
> +
> + ring = kzalloc(sizeof(*ring), GFP_KERNEL);
> + if (ring == NULL)
> + return ERR_PTR(-ENOMEM);
> +
> + ring->engine = engine;
> + ring->ctx = ctx;
>
> - obj = NULL;
> - if (!HAS_LLC(dev))
> - obj = i915_gem_object_create_stolen(dev, ringbuf->size);
> + obj = i915_gem_object_create_stolen(i915->dev, size);
> if (obj == NULL)
> - obj = i915_gem_alloc_object(dev, ringbuf->size);
> + obj = i915_gem_alloc_object(i915->dev, size);
> if (obj == NULL)
> - return -ENOMEM;
> + return ERR_PTR(-ENOMEM);
>
> /* mark ring buffers as read-only from GPU side by default */
> obj->gt_ro = 1;
>
> ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, PIN_MAPPABLE);
> - if (ret)
> + if (ret) {
> + DRM_ERROR("failed pin ringbuffer into GGTT\n");
> goto err_unref;
> + }
>
> ret = i915_gem_object_set_to_gtt_domain(obj, true);
> - if (ret)
> + if (ret) {
> + DRM_ERROR("failed mark ringbuffer for GTT writes\n");
> goto err_unpin;
> + }
>
> - ringbuf->virtual_start =
> - ioremap_wc(dev_priv->gtt.mappable_base + i915_gem_obj_ggtt_offset(obj),
> - ringbuf->size);
> - if (ringbuf->virtual_start == NULL) {
> + ring->virtual_start =
> + ioremap_wc(i915->gtt.mappable_base + i915_gem_obj_ggtt_offset(obj),
> + size);
> + if (ring->virtual_start == NULL) {
> + DRM_ERROR("failed to map ringbuffer through GTT\n");
> ret = -EINVAL;
> goto err_unpin;
> }
>
> - ringbuf->obj = obj;
> - return 0;
> -
> -err_unpin:
> - i915_gem_object_ggtt_unpin(obj);
> -err_unref:
> - drm_gem_object_unreference(&obj->base);
> - return ret;
> -}
> -
> -static int intel_init_ring_buffer(struct drm_device *dev,
> - struct intel_engine_cs *ring)
> -{
> - struct intel_ringbuffer *ringbuf = ring->buffer;
> - int ret;
> -
> - if (ringbuf == NULL) {
> - ringbuf = kzalloc(sizeof(*ringbuf), GFP_KERNEL);
> - if (!ringbuf)
> - return -ENOMEM;
> - ring->buffer = ringbuf;
> - }
> -
> - ring->dev = dev;
> - INIT_LIST_HEAD(&ring->active_list);
> - INIT_LIST_HEAD(&ring->request_list);
> - INIT_LIST_HEAD(&ring->execlist_queue);
> - ringbuf->size = 32 * PAGE_SIZE;
> - ringbuf->ring = ring;
> - memset(ring->semaphore.sync_seqno, 0, sizeof(ring->semaphore.sync_seqno));
> -
> - init_waitqueue_head(&ring->irq_queue);
> -
> - if (I915_NEED_GFX_HWS(dev)) {
> - ret = init_status_page(ring);
> - if (ret)
> - goto error;
> - } else {
> - BUG_ON(ring->id != RCS);
> - ret = init_phys_status_page(ring);
> - if (ret)
> - goto error;
> - }
> -
> - ret = intel_alloc_ringbuffer_obj(dev, ringbuf);
> - if (ret) {
> - DRM_ERROR("Failed to allocate ringbuffer %s: %d\n", ring->name, ret);
> - goto error;
> - }
> + ring->obj = obj;
> + ring->size = size;
>
> /* Workaround an erratum on the i830 which causes a hang if
> * the TAIL pointer points to within the last 2 cachelines
> * of the buffer.
> */
> - ringbuf->effective_size = ringbuf->size;
> - if (IS_I830(dev) || IS_845G(dev))
> - ringbuf->effective_size -= 2 * CACHELINE_BYTES;
> + ring->effective_size = size;
> + if (IS_I830(i915) || IS_845G(i915))
> + ring->effective_size -= 2 * CACHELINE_BYTES;
>
> - ret = i915_cmd_parser_init_ring(ring);
> - if (ret)
> - goto error;
> + ring->space = intel_ring_space(ring);
> + ring->retired_head = -1;
>
> - ret = ring->init(ring);
> - if (ret)
> - goto error;
> + INIT_LIST_HEAD(&ring->requests);
> + INIT_LIST_HEAD(&ring->breadcrumbs);
> + list_add_tail(&ring->engine_list, &engine->rings);
>
> - return 0;
> + return ring;
>
> -error:
> - kfree(ringbuf);
> - ring->buffer = NULL;
> - return ret;
> +err_unpin:
> + i915_gem_object_ggtt_unpin(obj);
> +err_unref:
> + drm_gem_object_unreference(&obj->base);
> + return ERR_PTR(ret);
> +}
> +
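The is_power_of_2() check above is not just pedantry: with a power-of-two size, free-space accounting in the ring reduces to a subtraction and a mask, which is what the __intel_ring_space() calls later in the patch rely on. A rough, generic sketch of that arithmetic (my own helper and parameter names, not the driver's):

    /* GPU consumes at 'head', CPU produces at 'tail'; 'rsvd' keeps a
     * few bytes back so the tail never wraps onto the head.  Only
     * valid when size is a power of two.
     */
    static inline u32 example_ring_space(u32 head, u32 tail,
                                         u32 size, u32 rsvd)
    {
            return (head - tail - rsvd) & (size - 1);
    }

The effective_size trim in the same function is a separate workaround for the i830 TAIL erratum (keeping the tail out of the last two cachelines), not part of this space calculation.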
> +static void
> +nop_irq_barrier(struct intel_engine_cs *engine)
> +{
> }
>
> -void intel_cleanup_ring_buffer(struct intel_engine_cs *ring)
> +static int intel_engine_init(struct intel_engine_cs *engine,
> + struct drm_i915_private *i915)
> {
> - struct drm_i915_private *dev_priv = to_i915(ring->dev);
> - struct intel_ringbuffer *ringbuf = ring->buffer;
> + int ret;
>
> - if (!intel_ring_initialized(ring))
> - return;
> + engine->i915 = i915;
>
> - intel_stop_ring_buffer(ring);
> - WARN_ON(!IS_GEN2(ring->dev) && (I915_READ_MODE(ring) & MODE_IDLE) == 0);
> + INIT_LIST_HEAD(&engine->rings);
> + INIT_LIST_HEAD(&engine->read_list);
> + INIT_LIST_HEAD(&engine->write_list);
> + INIT_LIST_HEAD(&engine->requests);
> + INIT_LIST_HEAD(&engine->pending);
> + INIT_LIST_HEAD(&engine->submitted);
>
> - intel_destroy_ringbuffer_obj(ringbuf);
> - ring->preallocated_lazy_request = NULL;
> - ring->outstanding_lazy_seqno = 0;
> + spin_lock_init(&engine->lock);
> + spin_lock_init(&engine->irqlock);
>
> - if (ring->cleanup)
> - ring->cleanup(ring);
> + engine->suspend = engine_suspend;
> + engine->resume = engine_resume;
> + engine->cleanup = engine_cleanup;
>
> - cleanup_status_page(ring);
> + engine->get_seqno = ring_get_seqno;
> + engine->set_seqno = ring_set_seqno;
>
> - i915_cmd_parser_fini_ring(ring);
> + engine->irq_barrier = nop_irq_barrier;
>
> - kfree(ringbuf);
> - ring->buffer = NULL;
> -}
> + engine->get_ring = engine_get_ring;
> + engine->put_ring = engine_put_ring;
>
> -static int intel_ring_wait_request(struct intel_engine_cs *ring, int n)
> -{
> - struct intel_ringbuffer *ringbuf = ring->buffer;
> - struct drm_i915_gem_request *request;
> - u32 seqno = 0;
> - int ret;
> + engine->semaphore.wait = NULL;
>
> - if (ringbuf->last_retired_head != -1) {
> - ringbuf->head = ringbuf->last_retired_head;
> - ringbuf->last_retired_head = -1;
> + engine->add_request = engine_add_request;
> + engine->write_tail = ring_write_tail;
> + engine->is_complete = engine_rq_is_complete;
>
> - ringbuf->space = intel_ring_space(ringbuf);
> - if (ringbuf->space >= n)
> - return 0;
> - }
> + init_waitqueue_head(&engine->irq_queue);
>
> - list_for_each_entry(request, &ring->request_list, list) {
> - if (__intel_ring_space(request->tail, ringbuf->tail,
> - ringbuf->size) >= n) {
> - seqno = request->seqno;
> - break;
> - }
> + if (I915_NEED_GFX_HWS(i915)) {
> + ret = setup_status_page(engine);
> + } else {
> + BUG_ON(engine->id != RCS);
> + ret = setup_phys_status_page(engine);
> }
> -
> - if (seqno == 0)
> - return -ENOSPC;
> -
> - ret = i915_wait_seqno(ring, seqno);
> if (ret)
> return ret;
>
> - i915_gem_retire_requests_ring(ring);
> - ringbuf->head = ringbuf->last_retired_head;
> - ringbuf->last_retired_head = -1;
> + ret = i915_cmd_parser_init_engine(engine);
> + if (ret)
> + return ret;
>
> - ringbuf->space = intel_ring_space(ringbuf);
> return 0;
> }
>
> -static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
> +static void gen6_bsd_ring_write_tail(struct intel_engine_cs *engine,
> + u32 value)
> {
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_ringbuffer *ringbuf = ring->buffer;
> - unsigned long end;
> - int ret;
> -
> - ret = intel_ring_wait_request(ring, n);
> - if (ret != -ENOSPC)
> - return ret;
> + struct drm_i915_private *dev_priv = engine->i915;
>
> - /* force the tail write in case we have been skipping them */
> - __intel_ring_advance(ring);
> + /* Every tail move must follow the sequence below */
>
> - /* With GEM the hangcheck timer should kick us out of the loop,
> - * leaving it early runs the risk of corrupting GEM state (due
> - * to running on almost untested codepaths). But on resume
> - * timers don't work yet, so prevent a complete hang in that
> - * case by choosing an insanely large timeout. */
> - end = jiffies + 60 * HZ;
> + /* Disable notification that the ring is IDLE. The GT
> + * will then assume that it is busy and bring it out of rc6.
> + */
> + I915_WRITE(GEN6_BSD_SLEEP_PSMI_CONTROL,
> + _MASKED_BIT_ENABLE(GEN6_BSD_SLEEP_MSG_DISABLE));
>
> - trace_i915_ring_wait_begin(ring);
> - do {
> - ringbuf->head = I915_READ_HEAD(ring);
> - ringbuf->space = intel_ring_space(ringbuf);
> - if (ringbuf->space >= n) {
> - ret = 0;
> - break;
> - }
> -
> - msleep(1);
> -
> - if (dev_priv->mm.interruptible && signal_pending(current)) {
> - ret = -ERESTARTSYS;
> - break;
> - }
> -
> - ret = i915_gem_check_wedge(&dev_priv->gpu_error,
> - dev_priv->mm.interruptible);
> - if (ret)
> - break;
> -
> - if (time_after(jiffies, end)) {
> - ret = -EBUSY;
> - break;
> - }
> - } while (1);
> - trace_i915_ring_wait_end(ring);
> - return ret;
> -}
> -
> -static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
> -{
> - uint32_t __iomem *virt;
> - struct intel_ringbuffer *ringbuf = ring->buffer;
> - int rem = ringbuf->size - ringbuf->tail;
> -
> - if (ringbuf->space < rem) {
> - int ret = ring_wait_for_space(ring, rem);
> - if (ret)
> - return ret;
> - }
> -
> - virt = ringbuf->virtual_start + ringbuf->tail;
> - rem /= 4;
> - while (rem--)
> - iowrite32(MI_NOOP, virt++);
> -
> - ringbuf->tail = 0;
> - ringbuf->space = intel_ring_space(ringbuf);
> -
> - return 0;
> -}
> -
> -int intel_ring_idle(struct intel_engine_cs *ring)
> -{
> - u32 seqno;
> - int ret;
> -
> - /* We need to add any requests required to flush the objects and ring */
> - if (ring->outstanding_lazy_seqno) {
> - ret = i915_add_request(ring, NULL);
> - if (ret)
> - return ret;
> - }
> -
> - /* Wait upon the last request to be completed */
> - if (list_empty(&ring->request_list))
> - return 0;
> -
> - seqno = list_entry(ring->request_list.prev,
> - struct drm_i915_gem_request,
> - list)->seqno;
> -
> - return i915_wait_seqno(ring, seqno);
> -}
> -
> -static int
> -intel_ring_alloc_seqno(struct intel_engine_cs *ring)
> -{
> - if (ring->outstanding_lazy_seqno)
> - return 0;
> -
> - if (ring->preallocated_lazy_request == NULL) {
> - struct drm_i915_gem_request *request;
> -
> - request = kmalloc(sizeof(*request), GFP_KERNEL);
> - if (request == NULL)
> - return -ENOMEM;
> -
> - ring->preallocated_lazy_request = request;
> - }
> -
> - return i915_gem_get_seqno(ring->dev, &ring->outstanding_lazy_seqno);
> -}
> -
> -static int __intel_ring_prepare(struct intel_engine_cs *ring,
> - int bytes)
> -{
> - struct intel_ringbuffer *ringbuf = ring->buffer;
> - int ret;
> -
> - if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
> - ret = intel_wrap_ring_buffer(ring);
> - if (unlikely(ret))
> - return ret;
> - }
> -
> - if (unlikely(ringbuf->space < bytes)) {
> - ret = ring_wait_for_space(ring, bytes);
> - if (unlikely(ret))
> - return ret;
> - }
> -
> - return 0;
> -}
> -
> -int intel_ring_begin(struct intel_engine_cs *ring,
> - int num_dwords)
> -{
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> - int ret;
> -
> - ret = i915_gem_check_wedge(&dev_priv->gpu_error,
> - dev_priv->mm.interruptible);
> - if (ret)
> - return ret;
> -
> - ret = __intel_ring_prepare(ring, num_dwords * sizeof(uint32_t));
> - if (ret)
> - return ret;
> -
> - /* Preallocate the olr before touching the ring */
> - ret = intel_ring_alloc_seqno(ring);
> - if (ret)
> - return ret;
> -
> - ring->buffer->space -= num_dwords * sizeof(uint32_t);
> - return 0;
> -}
> -
> -/* Align the ring tail to a cacheline boundary */
> -int intel_ring_cacheline_align(struct intel_engine_cs *ring)
> -{
> - int num_dwords = (ring->buffer->tail & (CACHELINE_BYTES - 1)) / sizeof(uint32_t);
> - int ret;
> -
> - if (num_dwords == 0)
> - return 0;
> -
> - num_dwords = CACHELINE_BYTES / sizeof(uint32_t) - num_dwords;
> - ret = intel_ring_begin(ring, num_dwords);
> - if (ret)
> - return ret;
> -
> - while (num_dwords--)
> - intel_ring_emit(ring, MI_NOOP);
> -
> - intel_ring_advance(ring);
> -
> - return 0;
> -}
> -
> -void intel_ring_init_seqno(struct intel_engine_cs *ring, u32 seqno)
> -{
> - struct drm_device *dev = ring->dev;
> - struct drm_i915_private *dev_priv = dev->dev_private;
> -
> - BUG_ON(ring->outstanding_lazy_seqno);
> -
> - if (INTEL_INFO(dev)->gen == 6 || INTEL_INFO(dev)->gen == 7) {
> - I915_WRITE(RING_SYNC_0(ring->mmio_base), 0);
> - I915_WRITE(RING_SYNC_1(ring->mmio_base), 0);
> - if (HAS_VEBOX(dev))
> - I915_WRITE(RING_SYNC_2(ring->mmio_base), 0);
> - }
> -
> - ring->set_seqno(ring, seqno);
> - ring->hangcheck.seqno = seqno;
> -}
> -
> -static void gen6_bsd_ring_write_tail(struct intel_engine_cs *ring,
> - u32 value)
> -{
> - struct drm_i915_private *dev_priv = ring->dev->dev_private;
> -
> - /* Every tail move must follow the sequence below */
> -
> - /* Disable notification that the ring is IDLE. The GT
> - * will then assume that it is busy and bring it out of rc6.
> - */
> - I915_WRITE(GEN6_BSD_SLEEP_PSMI_CONTROL,
> - _MASKED_BIT_ENABLE(GEN6_BSD_SLEEP_MSG_DISABLE));
> -
> - /* Clear the context id. Here be magic! */
> - I915_WRITE64(GEN6_BSD_RNCID, 0x0);
> + /* Clear the context id. Here be magic! */
> + I915_WRITE64(GEN6_BSD_RNCID, 0x0);
>
> /* Wait for the ring not to be idle, i.e. for it to wake up. */
> if (wait_for((I915_READ(GEN6_BSD_SLEEP_PSMI_CONTROL) &
> @@ -2101,8 +1723,8 @@ static void gen6_bsd_ring_write_tail(struct intel_engine_cs *ring,
> DRM_ERROR("timed out waiting for the BSD ring to wake up\n");
>
> /* Now that the ring is fully powered up, update the tail */
> - I915_WRITE_TAIL(ring, value);
> - POSTING_READ(RING_TAIL(ring->mmio_base));
> + I915_WRITE_TAIL(engine, value);
> + POSTING_READ(RING_TAIL(engine->mmio_base));
>
> /* Let the ring send IDLE messages to the GT again,
> * and so let it sleep to conserve power when idle.
> @@ -2111,73 +1733,72 @@ static void gen6_bsd_ring_write_tail(struct intel_engine_cs *ring,
> _MASKED_BIT_DISABLE(GEN6_BSD_SLEEP_MSG_DISABLE));
> }
>
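A note on the _MASKED_BIT_ENABLE/_MASKED_BIT_DISABLE writes above: GEN6_BSD_SLEEP_PSMI_CONTROL is one of the "masked" registers where the top 16 bits select which of the low 16 bits a write actually updates, so individual bits can be flipped without a read-modify-write. To the best of my recollection the macros expand along these lines (treat as an assumption and check the headers):

    /* update bit N by also setting write-enable bit N+16 */
    #define EXAMPLE_MASKED_BIT_ENABLE(a)   (((a) << 16) | (a))
    #define EXAMPLE_MASKED_BIT_DISABLE(a)  ((a) << 16)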
> -static int gen6_bsd_ring_flush(struct intel_engine_cs *ring,
> - u32 invalidate, u32 flush)
> +static int gen6_bsd_emit_flush(struct i915_gem_request *rq,
> + u32 flags)
> {
> + struct intel_ringbuffer *ring;
> uint32_t cmd;
> - int ret;
> -
> - ret = intel_ring_begin(ring, 4);
> - if (ret)
> - return ret;
>
> - cmd = MI_FLUSH_DW;
> - if (INTEL_INFO(ring->dev)->gen >= 8)
> + cmd = 3;
> + if (INTEL_INFO(rq->i915)->gen >= 8)
> cmd += 1;
> +
> + ring = intel_ring_begin(rq, cmd);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
> +
> /*
> * Bspec vol 1c.5 - video engine command streamer:
> * "If ENABLED, all TLBs will be invalidated once the flush
> * operation is complete. This bit is only valid when the
> * Post-Sync Operation field is a value of 1h or 3h."
> */
> - if (invalidate & I915_GEM_GPU_DOMAINS)
> - cmd |= MI_INVALIDATE_TLB | MI_INVALIDATE_BSD |
> - MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW;
> + cmd = MI_FLUSH_DW | (cmd - 2);
> + if (flags & I915_INVALIDATE_CACHES)
> + cmd |= (MI_INVALIDATE_TLB |
> + MI_INVALIDATE_BSD |
> + MI_FLUSH_DW_STORE_INDEX |
> + MI_FLUSH_DW_OP_STOREDW);
> intel_ring_emit(ring, cmd);
> intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
> - if (INTEL_INFO(ring->dev)->gen >= 8) {
> + if (INTEL_INFO(rq->i915)->gen >= 8)
> intel_ring_emit(ring, 0); /* upper addr */
> - intel_ring_emit(ring, 0); /* value */
> - } else {
> - intel_ring_emit(ring, 0);
> - intel_ring_emit(ring, MI_NOOP);
> - }
> + intel_ring_emit(ring, 0); /* value */
> intel_ring_advance(ring);
> return 0;
> }
>
> static int
> -gen8_ring_dispatch_execbuffer(struct intel_engine_cs *ring,
> - u64 offset, u32 len,
> - unsigned flags)
> +gen8_emit_batchbuffer(struct i915_gem_request *rq,
> + u64 offset, u32 len,
> + unsigned flags)
> {
> - bool ppgtt = USES_PPGTT(ring->dev) && !(flags & I915_DISPATCH_SECURE);
> - int ret;
> + struct intel_ringbuffer *ring;
> + bool ppgtt = USES_PPGTT(rq->i915) && !(flags & I915_DISPATCH_SECURE);
>
> - ret = intel_ring_begin(ring, 4);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 3);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> /* FIXME(BDW): Address space and security selectors. */
> intel_ring_emit(ring, MI_BATCH_BUFFER_START_GEN8 | (ppgtt<<8));
> intel_ring_emit(ring, lower_32_bits(offset));
> intel_ring_emit(ring, upper_32_bits(offset));
> - intel_ring_emit(ring, MI_NOOP);
> intel_ring_advance(ring);
>
> return 0;
> }
>
> static int
> -hsw_ring_dispatch_execbuffer(struct intel_engine_cs *ring,
> - u64 offset, u32 len,
> - unsigned flags)
> +hsw_emit_batchbuffer(struct i915_gem_request *rq,
> + u64 offset, u32 len,
> + unsigned flags)
> {
> - int ret;
> + struct intel_ringbuffer *ring;
>
> - ret = intel_ring_begin(ring, 2);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 2);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring,
> MI_BATCH_BUFFER_START | MI_BATCH_PPGTT_HSW |
> @@ -2190,15 +1811,15 @@ hsw_ring_dispatch_execbuffer(struct intel_engine_cs *ring,
> }
>
> static int
> -gen6_ring_dispatch_execbuffer(struct intel_engine_cs *ring,
> - u64 offset, u32 len,
> - unsigned flags)
> +gen6_emit_batchbuffer(struct i915_gem_request *rq,
> + u64 offset, u32 len,
> + unsigned flags)
> {
> - int ret;
> + struct intel_ringbuffer *ring;
>
> - ret = intel_ring_begin(ring, 2);
> - if (ret)
> - return ret;
> + ring = intel_ring_begin(rq, 2);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
>
> intel_ring_emit(ring,
> MI_BATCH_BUFFER_START |
> @@ -2212,60 +1833,102 @@ gen6_ring_dispatch_execbuffer(struct intel_engine_cs *ring,
>
> /* Blitter support (SandyBridge+) */
>
> -static int gen6_ring_flush(struct intel_engine_cs *ring,
> - u32 invalidate, u32 flush)
> +static int gen6_blt_emit_flush(struct i915_gem_request *rq,
> + u32 flags)
> {
> - struct drm_device *dev = ring->dev;
> + struct intel_ringbuffer *ring;
> uint32_t cmd;
> - int ret;
>
> - ret = intel_ring_begin(ring, 4);
> - if (ret)
> - return ret;
> -
> - cmd = MI_FLUSH_DW;
> - if (INTEL_INFO(ring->dev)->gen >= 8)
> + cmd = 3;
> + if (INTEL_INFO(rq->i915)->gen >= 8)
> cmd += 1;
> +
> + ring = intel_ring_begin(rq, cmd);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
> +
> /*
> * Bspec vol 1c.3 - blitter engine command streamer:
> * "If ENABLED, all TLBs will be invalidated once the flush
> * operation is complete. This bit is only valid when the
> * Post-Sync Operation field is a value of 1h or 3h."
> */
> - if (invalidate & I915_GEM_DOMAIN_RENDER)
> - cmd |= MI_INVALIDATE_TLB | MI_FLUSH_DW_STORE_INDEX |
> - MI_FLUSH_DW_OP_STOREDW;
> + cmd = MI_FLUSH_DW | (cmd - 2);
> + if (flags & I915_INVALIDATE_CACHES)
> + cmd |= (MI_INVALIDATE_TLB |
> + MI_FLUSH_DW_STORE_INDEX |
> + MI_FLUSH_DW_OP_STOREDW);
> intel_ring_emit(ring, cmd);
> intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
> - if (INTEL_INFO(ring->dev)->gen >= 8) {
> + if (INTEL_INFO(rq->i915)->gen >= 8)
> intel_ring_emit(ring, 0); /* upper addr */
> - intel_ring_emit(ring, 0); /* value */
> - } else {
> - intel_ring_emit(ring, 0);
> - intel_ring_emit(ring, MI_NOOP);
> - }
> + intel_ring_emit(ring, 0); /* value */
> intel_ring_advance(ring);
>
> - if (IS_GEN7(dev) && !invalidate && flush)
> - return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> + if (IS_GEN7(rq->i915) && flags & I915_KICK_FBC)
> + return gen7_ring_fbc_flush(rq, FBC_REND_CACHE_CLEAN);
>
> return 0;
> }
>
> -int intel_init_render_ring_buffer(struct drm_device *dev)
> +static void gen8_engine_init_semaphore(struct intel_engine_cs *engine)
> +{
> + if (engine->i915->semaphore_obj == NULL)
> + return;
> +
> + engine->semaphore.wait = gen8_emit_wait;
> + engine->semaphore.signal =
> + engine->id == RCS ? gen8_rcs_emit_signal : gen8_xcs_emit_signal;
> +}
> +
> +static bool semaphores_enabled(struct drm_i915_private *dev_priv)
> +{
> + if (INTEL_INFO(dev_priv)->gen < 6)
> + return false;
> +
> + if (i915.semaphores >= 0)
> + return i915.semaphores;
> +
> + /* Until we get further testing... */
> + if (IS_GEN8(dev_priv))
> + return false;
> +
> +#ifdef CONFIG_INTEL_IOMMU
> + /* Enable semaphores on SNB when IO remapping is off */
> + if (INTEL_INFO(dev_priv)->gen == 6 && intel_iommu_gfx_mapped)
> + return false;
> +#endif
> +
> + return true;
> +}
> +
> +int intel_init_render_engine(struct drm_i915_private *dev_priv)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[RCS];
> + struct intel_engine_cs *engine = &dev_priv->engine[RCS];
> struct drm_i915_gem_object *obj;
> int ret;
>
> - ring->name = "render ring";
> - ring->id = RCS;
> - ring->mmio_base = RENDER_RING_BASE;
> + ret = intel_engine_init(engine, dev_priv);
> + if (ret)
> + return ret;
> +
> + engine->name = "render ring";
> + engine->id = RCS;
> + engine->power_domains = FORCEWAKE_RENDER;
> + engine->mmio_base = RENDER_RING_BASE;
> +
> + engine->init_context = i915_gem_render_state_init;
> +
> + if (HAS_L3_DPF(dev_priv)) {
> + if (INTEL_INFO(dev_priv)->gen >= 8)
> + engine->irq_keep_mask |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
> + else
> + engine->irq_keep_mask |= GT_PARITY_ERROR(dev_priv);
> + }
>
> - if (INTEL_INFO(dev)->gen >= 8) {
> - if (i915_semaphore_is_enabled(dev)) {
> - obj = i915_gem_alloc_object(dev, 4096);
> + if (INTEL_INFO(dev_priv)->gen >= 8) {
> + if (semaphores_enabled(dev_priv)) {
> + obj = i915_gem_alloc_object(dev_priv->dev, 4096);
> if (obj == NULL) {
> DRM_ERROR("Failed to allocate semaphore bo. Disabling semaphores\n");
> i915.semaphores = 0;
> @@ -2280,36 +1943,28 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
> dev_priv->semaphore_obj = obj;
> }
> }
> - if (IS_CHERRYVIEW(dev))
> - ring->init_context = chv_init_workarounds;
> + if (IS_CHERRYVIEW(dev_priv))
> + engine->init_context = chv_render_init_context;
> else
> - ring->init_context = bdw_init_workarounds;
> - ring->add_request = gen6_add_request;
> - ring->flush = gen8_render_ring_flush;
> - ring->irq_get = gen8_ring_get_irq;
> - ring->irq_put = gen8_ring_put_irq;
> - ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
> - ring->get_seqno = gen6_ring_get_seqno;
> - ring->set_seqno = ring_set_seqno;
> - if (i915_semaphore_is_enabled(dev)) {
> - WARN_ON(!dev_priv->semaphore_obj);
> - ring->semaphore.sync_to = gen8_ring_sync;
> - ring->semaphore.signal = gen8_rcs_signal;
> - GEN8_RING_SEMAPHORE_INIT;
> - }
> - } else if (INTEL_INFO(dev)->gen >= 6) {
> - ring->add_request = gen6_add_request;
> - ring->flush = gen7_render_ring_flush;
> - if (INTEL_INFO(dev)->gen == 6)
> - ring->flush = gen6_render_ring_flush;
> - ring->irq_get = gen6_ring_get_irq;
> - ring->irq_put = gen6_ring_put_irq;
> - ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
> - ring->get_seqno = gen6_ring_get_seqno;
> - ring->set_seqno = ring_set_seqno;
> - if (i915_semaphore_is_enabled(dev)) {
> - ring->semaphore.sync_to = gen6_ring_sync;
> - ring->semaphore.signal = gen6_signal;
> + engine->init_context = bdw_render_init_context;
> + engine->emit_breadcrumb = i9xx_emit_breadcrumb;
> + engine->emit_flush = gen8_render_emit_flush;
> + engine->irq_get = gen8_irq_get;
> + engine->irq_put = gen8_irq_put;
> + engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
> + gen8_engine_init_semaphore(engine);
> + } else if (INTEL_INFO(dev_priv)->gen >= 6) {
> + engine->emit_breadcrumb = i9xx_emit_breadcrumb;
> + engine->emit_flush = gen7_render_emit_flush;
> + if (INTEL_INFO(dev_priv)->gen == 6)
> + engine->emit_flush = gen6_render_emit_flush;
> + engine->irq_get = gen6_irq_get;
> + engine->irq_barrier = gen6_irq_barrier;
> + engine->irq_put = gen6_irq_put;
> + engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
> + if (semaphores_enabled(dev_priv)) {
> + engine->semaphore.wait = gen6_emit_wait;
> + engine->semaphore.signal = gen6_emit_signal;
> /*
> * The current semaphore is only applied on pre-gen8
> * platform. And there is no VCS2 ring on the pre-gen8
> @@ -2317,63 +1972,62 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
> * initialized as INVALID. Gen8 will initialize the
> * sema between VCS2 and RCS later.
> */
> - ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
> - ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_RV;
> - ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_RB;
> - ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_RVE;
> - ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> - ring->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
> - ring->semaphore.mbox.signal[VCS] = GEN6_VRSYNC;
> - ring->semaphore.mbox.signal[BCS] = GEN6_BRSYNC;
> - ring->semaphore.mbox.signal[VECS] = GEN6_VERSYNC;
> - ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> + engine->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_INVALID;
> + engine->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_RV;
> + engine->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_RB;
> + engine->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_RVE;
> + engine->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> + engine->semaphore.mbox.signal[RCS] = GEN6_NOSYNC;
> + engine->semaphore.mbox.signal[VCS] = GEN6_VRSYNC;
> + engine->semaphore.mbox.signal[BCS] = GEN6_BRSYNC;
> + engine->semaphore.mbox.signal[VECS] = GEN6_VERSYNC;
> + engine->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> }
> - } else if (IS_GEN5(dev)) {
> - ring->add_request = pc_render_add_request;
> - ring->flush = gen4_render_ring_flush;
> - ring->get_seqno = pc_render_get_seqno;
> - ring->set_seqno = pc_render_set_seqno;
> - ring->irq_get = gen5_ring_get_irq;
> - ring->irq_put = gen5_ring_put_irq;
> - ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT |
> - GT_RENDER_PIPECTL_NOTIFY_INTERRUPT;
> + } else if (IS_GEN5(dev_priv)) {
> + engine->emit_breadcrumb = gen5_emit_breadcrumb;
> + engine->emit_flush = gen4_emit_flush;
> + engine->get_seqno = gen5_render_get_seqno;
> + engine->set_seqno = gen5_render_set_seqno;
> + engine->irq_get = gen5_irq_get;
> + engine->irq_put = gen5_irq_put;
> + engine->irq_enable_mask =
> + GT_RENDER_USER_INTERRUPT |
> + GT_RENDER_PIPECTL_NOTIFY_INTERRUPT;
> } else {
> - ring->add_request = i9xx_add_request;
> - if (INTEL_INFO(dev)->gen < 4)
> - ring->flush = gen2_render_ring_flush;
> + engine->emit_breadcrumb = i9xx_emit_breadcrumb;
> + if (INTEL_INFO(dev_priv)->gen < 4)
> + engine->emit_flush = gen2_emit_flush;
> else
> - ring->flush = gen4_render_ring_flush;
> - ring->get_seqno = ring_get_seqno;
> - ring->set_seqno = ring_set_seqno;
> - if (IS_GEN2(dev)) {
> - ring->irq_get = i8xx_ring_get_irq;
> - ring->irq_put = i8xx_ring_put_irq;
> + engine->emit_flush = gen4_emit_flush;
> + if (IS_GEN2(dev_priv)) {
> + engine->irq_get = i8xx_irq_get;
> + engine->irq_put = i8xx_irq_put;
> } else {
> - ring->irq_get = i9xx_ring_get_irq;
> - ring->irq_put = i9xx_ring_put_irq;
> + engine->irq_get = i9xx_irq_get;
> + engine->irq_put = i9xx_irq_put;
> }
> - ring->irq_enable_mask = I915_USER_INTERRUPT;
> - }
> - ring->write_tail = ring_write_tail;
> -
> - if (IS_HASWELL(dev))
> - ring->dispatch_execbuffer = hsw_ring_dispatch_execbuffer;
> - else if (IS_GEN8(dev))
> - ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
> - else if (INTEL_INFO(dev)->gen >= 6)
> - ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
> - else if (INTEL_INFO(dev)->gen >= 4)
> - ring->dispatch_execbuffer = i965_dispatch_execbuffer;
> - else if (IS_I830(dev) || IS_845G(dev))
> - ring->dispatch_execbuffer = i830_dispatch_execbuffer;
> + engine->irq_enable_mask = I915_USER_INTERRUPT;
> + }
> +
> + if (IS_GEN8(dev_priv))
> + engine->emit_batchbuffer = gen8_emit_batchbuffer;
> + else if (IS_HASWELL(dev_priv))
> + engine->emit_batchbuffer = hsw_emit_batchbuffer;
> + else if (INTEL_INFO(dev_priv)->gen >= 6)
> + engine->emit_batchbuffer = gen6_emit_batchbuffer;
> + else if (INTEL_INFO(dev_priv)->gen >= 4)
> + engine->emit_batchbuffer = i965_emit_batchbuffer;
> + else if (IS_I830(dev_priv) || IS_845G(dev_priv))
> + engine->emit_batchbuffer = i830_emit_batchbuffer;
> else
> - ring->dispatch_execbuffer = i915_dispatch_execbuffer;
> - ring->init = init_render_ring;
> - ring->cleanup = render_ring_cleanup;
> + engine->emit_batchbuffer = i915_emit_batchbuffer;
> +
> + engine->resume = render_resume;
> + engine->cleanup = render_cleanup;
>
> /* Workaround batchbuffer to combat CS tlb bug. */
> - if (HAS_BROKEN_CS_TLB(dev)) {
> - obj = i915_gem_alloc_object(dev, I830_BATCH_LIMIT);
> + if (HAS_BROKEN_CS_TLB(dev_priv)) {
> + obj = i915_gem_alloc_object(dev_priv->dev, I830_BATCH_LIMIT);
> if (obj == NULL) {
> DRM_ERROR("Failed to allocate batch bo\n");
> return -ENOMEM;
> @@ -2386,158 +2040,155 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
> return ret;
> }
>
> - ring->scratch.obj = obj;
> - ring->scratch.gtt_offset = i915_gem_obj_ggtt_offset(obj);
> + engine->scratch.obj = obj;
> + engine->scratch.gtt_offset = i915_gem_obj_ggtt_offset(obj);
> + }
> +
> + if (INTEL_INFO(dev_priv)->gen >= 5) {
> + ret = init_pipe_control(engine);
> + if (ret)
> + return ret;
> }
>
> - return intel_init_ring_buffer(dev, ring);
> + return intel_engine_enable_execlists(engine);
> }
>
> -int intel_init_bsd_ring_buffer(struct drm_device *dev)
> +int intel_init_bsd_engine(struct drm_i915_private *dev_priv)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[VCS];
> + struct intel_engine_cs *engine = &dev_priv->engine[VCS];
> + int ret;
> +
> + ret = intel_engine_init(engine, dev_priv);
> + if (ret)
> + return ret;
> +
> + engine->name = "bsd ring";
> + engine->id = VCS;
> + engine->power_domains = FORCEWAKE_MEDIA;
>
> - ring->name = "bsd ring";
> - ring->id = VCS;
> + if (INTEL_INFO(dev_priv)->gen >= 6) {
> + engine->mmio_base = GEN6_BSD_RING_BASE;
>
> - ring->write_tail = ring_write_tail;
> - if (INTEL_INFO(dev)->gen >= 6) {
> - ring->mmio_base = GEN6_BSD_RING_BASE;
> /* gen6 bsd needs a special wa for tail updates */
> - if (IS_GEN6(dev))
> - ring->write_tail = gen6_bsd_ring_write_tail;
> - ring->flush = gen6_bsd_ring_flush;
> - ring->add_request = gen6_add_request;
> - ring->get_seqno = gen6_ring_get_seqno;
> - ring->set_seqno = ring_set_seqno;
> - if (INTEL_INFO(dev)->gen >= 8) {
> - ring->irq_enable_mask =
> + if (IS_GEN6(dev_priv))
> + engine->write_tail = gen6_bsd_ring_write_tail;
> + engine->emit_flush = gen6_bsd_emit_flush;
> + engine->emit_breadcrumb = i9xx_emit_breadcrumb;
> + if (INTEL_INFO(dev_priv)->gen >= 8) {
> + engine->irq_enable_mask =
> GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
> - ring->irq_get = gen8_ring_get_irq;
> - ring->irq_put = gen8_ring_put_irq;
> - ring->dispatch_execbuffer =
> - gen8_ring_dispatch_execbuffer;
> - if (i915_semaphore_is_enabled(dev)) {
> - ring->semaphore.sync_to = gen8_ring_sync;
> - ring->semaphore.signal = gen8_xcs_signal;
> - GEN8_RING_SEMAPHORE_INIT;
> - }
> + engine->irq_get = gen8_irq_get;
> + engine->irq_put = gen8_irq_put;
> + engine->emit_batchbuffer = gen8_emit_batchbuffer;
> + gen8_engine_init_semaphore(engine);
> } else {
> - ring->irq_enable_mask = GT_BSD_USER_INTERRUPT;
> - ring->irq_get = gen6_ring_get_irq;
> - ring->irq_put = gen6_ring_put_irq;
> - ring->dispatch_execbuffer =
> - gen6_ring_dispatch_execbuffer;
> - if (i915_semaphore_is_enabled(dev)) {
> - ring->semaphore.sync_to = gen6_ring_sync;
> - ring->semaphore.signal = gen6_signal;
> - ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VR;
> - ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
> - ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VB;
> - ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_VVE;
> - ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> - ring->semaphore.mbox.signal[RCS] = GEN6_RVSYNC;
> - ring->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
> - ring->semaphore.mbox.signal[BCS] = GEN6_BVSYNC;
> - ring->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
> - ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> + engine->irq_enable_mask = GT_BSD_USER_INTERRUPT;
> + engine->irq_get = gen6_irq_get;
> + engine->irq_barrier = gen6_irq_barrier;
> + engine->irq_put = gen6_irq_put;
> + engine->emit_batchbuffer = gen6_emit_batchbuffer;
> + if (semaphores_enabled(dev_priv)) {
> + engine->semaphore.wait = gen6_emit_wait;
> + engine->semaphore.signal = gen6_emit_signal;
> + engine->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VR;
> + engine->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_INVALID;
> + engine->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VB;
> + engine->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_VVE;
> + engine->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> + engine->semaphore.mbox.signal[RCS] = GEN6_RVSYNC;
> + engine->semaphore.mbox.signal[VCS] = GEN6_NOSYNC;
> + engine->semaphore.mbox.signal[BCS] = GEN6_BVSYNC;
> + engine->semaphore.mbox.signal[VECS] = GEN6_VEVSYNC;
> + engine->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> }
> }
> } else {
> - ring->mmio_base = BSD_RING_BASE;
> - ring->flush = bsd_ring_flush;
> - ring->add_request = i9xx_add_request;
> - ring->get_seqno = ring_get_seqno;
> - ring->set_seqno = ring_set_seqno;
> - if (IS_GEN5(dev)) {
> - ring->irq_enable_mask = ILK_BSD_USER_INTERRUPT;
> - ring->irq_get = gen5_ring_get_irq;
> - ring->irq_put = gen5_ring_put_irq;
> + engine->mmio_base = BSD_RING_BASE;
> +
> + engine->emit_flush = bsd_emit_flush;
> + engine->emit_breadcrumb = i9xx_emit_breadcrumb;
> + if (IS_GEN5(dev_priv)) {
> + engine->irq_enable_mask = ILK_BSD_USER_INTERRUPT;
> + engine->irq_get = gen5_irq_get;
> + engine->irq_put = gen5_irq_put;
> } else {
> - ring->irq_enable_mask = I915_BSD_USER_INTERRUPT;
> - ring->irq_get = i9xx_ring_get_irq;
> - ring->irq_put = i9xx_ring_put_irq;
> + engine->irq_enable_mask = I915_BSD_USER_INTERRUPT;
> + engine->irq_get = i9xx_irq_get;
> + engine->irq_put = i9xx_irq_put;
> }
> - ring->dispatch_execbuffer = i965_dispatch_execbuffer;
> + engine->emit_batchbuffer = i965_emit_batchbuffer;
> }
> - ring->init = init_ring_common;
>
> - return intel_init_ring_buffer(dev, ring);
> + return intel_engine_enable_execlists(engine);
> }
>
> /**
> * Initialize the second BSD ring for Broadwell GT3.
> * It is noted that this only exists on Broadwell GT3.
> */
> -int intel_init_bsd2_ring_buffer(struct drm_device *dev)
> +int intel_init_bsd2_engine(struct drm_i915_private *dev_priv)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[VCS2];
> + struct intel_engine_cs *engine = &dev_priv->engine[VCS2];
> + int ret;
>
> - if ((INTEL_INFO(dev)->gen != 8)) {
> + if ((INTEL_INFO(dev_priv)->gen != 8)) {
> DRM_ERROR("No dual-BSD ring on non-BDW machine\n");
> return -EINVAL;
> }
>
> - ring->name = "bsd2 ring";
> - ring->id = VCS2;
> + ret = intel_engine_init(engine, dev_priv);
> + if (ret)
> + return ret;
> +
> + engine->name = "bsd2 ring";
> + engine->id = VCS2;
> + engine->power_domains = FORCEWAKE_MEDIA;
> + engine->mmio_base = GEN8_BSD2_RING_BASE;
>
> - ring->write_tail = ring_write_tail;
> - ring->mmio_base = GEN8_BSD2_RING_BASE;
> - ring->flush = gen6_bsd_ring_flush;
> - ring->add_request = gen6_add_request;
> - ring->get_seqno = gen6_ring_get_seqno;
> - ring->set_seqno = ring_set_seqno;
> - ring->irq_enable_mask =
> + engine->emit_flush = gen6_bsd_emit_flush;
> + engine->emit_breadcrumb = i9xx_emit_breadcrumb;
> + engine->emit_batchbuffer = gen8_emit_batchbuffer;
> + engine->irq_enable_mask =
> GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT;
> - ring->irq_get = gen8_ring_get_irq;
> - ring->irq_put = gen8_ring_put_irq;
> - ring->dispatch_execbuffer =
> - gen8_ring_dispatch_execbuffer;
> - if (i915_semaphore_is_enabled(dev)) {
> - ring->semaphore.sync_to = gen8_ring_sync;
> - ring->semaphore.signal = gen8_xcs_signal;
> - GEN8_RING_SEMAPHORE_INIT;
> - }
> - ring->init = init_ring_common;
> + engine->irq_get = gen8_irq_get;
> + engine->irq_put = gen8_irq_put;
> + gen8_engine_init_semaphore(engine);
>
> - return intel_init_ring_buffer(dev, ring);
> + return intel_engine_enable_execlists(engine);
> }
>
> -int intel_init_blt_ring_buffer(struct drm_device *dev)
> +int intel_init_blt_engine(struct drm_i915_private *dev_priv)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[BCS];
> + struct intel_engine_cs *engine = &dev_priv->engine[BCS];
> + int ret;
> +
> + ret = intel_engine_init(engine, dev_priv);
> + if (ret)
> + return ret;
>
> - ring->name = "blitter ring";
> - ring->id = BCS;
> + engine->name = "blitter ring";
> + engine->id = BCS;
> + engine->power_domains = FORCEWAKE_MEDIA;
> + engine->mmio_base = BLT_RING_BASE;
>
> - ring->mmio_base = BLT_RING_BASE;
> - ring->write_tail = ring_write_tail;
> - ring->flush = gen6_ring_flush;
> - ring->add_request = gen6_add_request;
> - ring->get_seqno = gen6_ring_get_seqno;
> - ring->set_seqno = ring_set_seqno;
> - if (INTEL_INFO(dev)->gen >= 8) {
> - ring->irq_enable_mask =
> + engine->emit_flush = gen6_blt_emit_flush;
> + engine->emit_breadcrumb = i9xx_emit_breadcrumb;
> + if (INTEL_INFO(dev_priv)->gen >= 8) {
> + engine->irq_enable_mask =
> GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
> - ring->irq_get = gen8_ring_get_irq;
> - ring->irq_put = gen8_ring_put_irq;
> - ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
> - if (i915_semaphore_is_enabled(dev)) {
> - ring->semaphore.sync_to = gen8_ring_sync;
> - ring->semaphore.signal = gen8_xcs_signal;
> - GEN8_RING_SEMAPHORE_INIT;
> - }
> + engine->irq_get = gen8_irq_get;
> + engine->irq_put = gen8_irq_put;
> + engine->emit_batchbuffer = gen8_emit_batchbuffer;
> + gen8_engine_init_semaphore(engine);
> } else {
> - ring->irq_enable_mask = GT_BLT_USER_INTERRUPT;
> - ring->irq_get = gen6_ring_get_irq;
> - ring->irq_put = gen6_ring_put_irq;
> - ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
> - if (i915_semaphore_is_enabled(dev)) {
> - ring->semaphore.signal = gen6_signal;
> - ring->semaphore.sync_to = gen6_ring_sync;
> + engine->irq_enable_mask = GT_BLT_USER_INTERRUPT;
> + engine->irq_get = gen6_irq_get;
> + engine->irq_barrier = gen6_irq_barrier;
> + engine->irq_put = gen6_irq_put;
> + engine->emit_batchbuffer = gen6_emit_batchbuffer;
> + if (semaphores_enabled(dev_priv)) {
> + engine->semaphore.signal = gen6_emit_signal;
> + engine->semaphore.wait = gen6_emit_wait;
> /*
> * The current semaphore is only applied on pre-gen8
> * platform. And there is no VCS2 ring on the pre-gen8
> @@ -2545,124 +2196,510 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
> * initialized as INVALID. Gen8 will initialize the
> * sema between BCS and VCS2 later.
> */
> - ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_BR;
> - ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_BV;
> - ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
> - ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_BVE;
> - ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> - ring->semaphore.mbox.signal[RCS] = GEN6_RBSYNC;
> - ring->semaphore.mbox.signal[VCS] = GEN6_VBSYNC;
> - ring->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
> - ring->semaphore.mbox.signal[VECS] = GEN6_VEBSYNC;
> - ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> + engine->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_BR;
> + engine->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_BV;
> + engine->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_INVALID;
> + engine->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_BVE;
> + engine->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> + engine->semaphore.mbox.signal[RCS] = GEN6_RBSYNC;
> + engine->semaphore.mbox.signal[VCS] = GEN6_VBSYNC;
> + engine->semaphore.mbox.signal[BCS] = GEN6_NOSYNC;
> + engine->semaphore.mbox.signal[VECS] = GEN6_VEBSYNC;
> + engine->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> }
> }
> - ring->init = init_ring_common;
>
> - return intel_init_ring_buffer(dev, ring);
> + return intel_engine_enable_execlists(engine);
> }
>
> -int intel_init_vebox_ring_buffer(struct drm_device *dev)
> +int intel_init_vebox_engine(struct drm_i915_private *dev_priv)
> {
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - struct intel_engine_cs *ring = &dev_priv->ring[VECS];
> + struct intel_engine_cs *engine = &dev_priv->engine[VECS];
> + int ret;
>
> - ring->name = "video enhancement ring";
> - ring->id = VECS;
> + ret = intel_engine_init(engine, dev_priv);
> + if (ret)
> + return ret;
>
> - ring->mmio_base = VEBOX_RING_BASE;
> - ring->write_tail = ring_write_tail;
> - ring->flush = gen6_ring_flush;
> - ring->add_request = gen6_add_request;
> - ring->get_seqno = gen6_ring_get_seqno;
> - ring->set_seqno = ring_set_seqno;
> + engine->name = "video enhancement ring";
> + engine->id = VECS;
> + engine->power_domains = FORCEWAKE_MEDIA;
> + engine->mmio_base = VEBOX_RING_BASE;
>
> - if (INTEL_INFO(dev)->gen >= 8) {
> - ring->irq_enable_mask =
> + engine->emit_flush = gen6_blt_emit_flush;
> + engine->emit_breadcrumb = i9xx_emit_breadcrumb;
> +
> + if (INTEL_INFO(dev_priv)->gen >= 8) {
> + engine->irq_enable_mask =
> GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
> - ring->irq_get = gen8_ring_get_irq;
> - ring->irq_put = gen8_ring_put_irq;
> - ring->dispatch_execbuffer = gen8_ring_dispatch_execbuffer;
> - if (i915_semaphore_is_enabled(dev)) {
> - ring->semaphore.sync_to = gen8_ring_sync;
> - ring->semaphore.signal = gen8_xcs_signal;
> - GEN8_RING_SEMAPHORE_INIT;
> - }
> + engine->irq_get = gen8_irq_get;
> + engine->irq_put = gen8_irq_put;
> + engine->emit_batchbuffer = gen8_emit_batchbuffer;
> + gen8_engine_init_semaphore(engine);
> } else {
> - ring->irq_enable_mask = PM_VEBOX_USER_INTERRUPT;
> - ring->irq_get = hsw_vebox_get_irq;
> - ring->irq_put = hsw_vebox_put_irq;
> - ring->dispatch_execbuffer = gen6_ring_dispatch_execbuffer;
> - if (i915_semaphore_is_enabled(dev)) {
> - ring->semaphore.sync_to = gen6_ring_sync;
> - ring->semaphore.signal = gen6_signal;
> - ring->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VER;
> - ring->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_VEV;
> - ring->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VEB;
> - ring->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
> - ring->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> - ring->semaphore.mbox.signal[RCS] = GEN6_RVESYNC;
> - ring->semaphore.mbox.signal[VCS] = GEN6_VVESYNC;
> - ring->semaphore.mbox.signal[BCS] = GEN6_BVESYNC;
> - ring->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
> - ring->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> + engine->irq_enable_mask = PM_VEBOX_USER_INTERRUPT;
> + engine->irq_get = hsw_vebox_irq_get;
> + engine->irq_barrier = gen6_irq_barrier;
> + engine->irq_put = hsw_vebox_irq_put;
> + engine->emit_batchbuffer = gen6_emit_batchbuffer;
> + if (semaphores_enabled(dev_priv)) {
> + engine->semaphore.wait = gen6_emit_wait;
> + engine->semaphore.signal = gen6_emit_signal;
> + engine->semaphore.mbox.wait[RCS] = MI_SEMAPHORE_SYNC_VER;
> + engine->semaphore.mbox.wait[VCS] = MI_SEMAPHORE_SYNC_VEV;
> + engine->semaphore.mbox.wait[BCS] = MI_SEMAPHORE_SYNC_VEB;
> + engine->semaphore.mbox.wait[VECS] = MI_SEMAPHORE_SYNC_INVALID;
> + engine->semaphore.mbox.wait[VCS2] = MI_SEMAPHORE_SYNC_INVALID;
> + engine->semaphore.mbox.signal[RCS] = GEN6_RVESYNC;
> + engine->semaphore.mbox.signal[VCS] = GEN6_VVESYNC;
> + engine->semaphore.mbox.signal[BCS] = GEN6_BVESYNC;
> + engine->semaphore.mbox.signal[VECS] = GEN6_NOSYNC;
> + engine->semaphore.mbox.signal[VCS2] = GEN6_NOSYNC;
> }
> }
> - ring->init = init_ring_common;
>
> - return intel_init_ring_buffer(dev, ring);
> + return intel_engine_enable_execlists(engine);
> }
>
> int
> -intel_ring_flush_all_caches(struct intel_engine_cs *ring)
> +intel_engine_flush(struct intel_engine_cs *engine,
> + struct intel_context *ctx)
> {
> + struct i915_gem_request *rq;
> int ret;
>
> - if (!ring->gpu_caches_dirty)
> + rq = intel_engine_alloc_request(engine, ctx);
> + if (IS_ERR(rq))
> + return PTR_ERR(rq);
> +
> + ret = i915_request_emit_breadcrumb(rq);
> + if (ret == 0)
> + ret = i915_request_commit(rq);
> + i915_request_put(rq);
> +
> + return ret;
> +}
> +
> +int intel_engine_sync(struct intel_engine_cs *engine)
> +{
> + /* Wait upon the last request to be completed */
> + if (engine->last_request == NULL)
> return 0;
>
> - ret = ring->flush(ring, 0, I915_GEM_GPU_DOMAINS);
> + return i915_request_wait(engine->last_request);
> +}
> +
> +static u32
> +next_seqno(struct drm_i915_private *i915)
> +{
> + /* reserve 0 for non-seqno */
> + if (++i915->next_seqno == 0)
> + ++i915->next_seqno;
> + return i915->next_seqno;
> +}
> +
> +struct i915_gem_request *
> +intel_engine_alloc_request(struct intel_engine_cs *engine,
> + struct intel_context *ctx)
> +{
> + struct intel_ringbuffer *ring;
> + struct i915_gem_request *rq;
> + int ret, n;
> +
> + ring = ctx->ring[engine->id].ring;
> + if (ring == NULL) {
> + ring = engine->get_ring(engine, ctx);
> + if (IS_ERR(ring))
> + return ERR_CAST(ring);
> +
> + ctx->ring[engine->id].ring = ring;
> + }
> +
> + rq = kzalloc(sizeof(*rq), GFP_KERNEL);
> + if (rq == NULL)
> + return ERR_PTR(-ENOMEM);
> +
> + kref_init(&rq->kref);
> + INIT_LIST_HEAD(&rq->vmas);
> + INIT_LIST_HEAD(&rq->breadcrumb_link);
> +
> + rq->i915 = engine->i915;
> + rq->ring = ring;
> + rq->engine = engine;
> +
> + rq->reset_counter = atomic_read(&rq->i915->gpu_error.reset_counter);
> + if (rq->reset_counter & (I915_RESET_IN_PROGRESS_FLAG | I915_WEDGED)) {
> + ret = rq->reset_counter & I915_WEDGED ? -EIO : -EAGAIN;
> + goto err;
> + }
> +
> + rq->seqno = next_seqno(rq->i915);
> + memcpy(rq->semaphore, engine->semaphore.sync, sizeof(rq->semaphore));
> + for (n = 0; n < ARRAY_SIZE(rq->semaphore); n++)
> + if (__i915_seqno_passed(rq->semaphore[n], rq->seqno))
> + rq->semaphore[n] = 0;
> + rq->head = ring->tail;
> + rq->outstanding = true;
> + rq->pending_flush = ring->pending_flush;
> +
> + rq->ctx = ctx;
> + i915_gem_context_reference(rq->ctx);
> +
> + ret = i915_request_switch_context(rq);
> if (ret)
> - return ret;
> + goto err_ctx;
> +
> + return rq;
> +
> +err_ctx:
> + i915_gem_context_unreference(ctx);
> +err:
> + kfree(rq);
> + return ERR_PTR(ret);
> +}
> +
> +struct i915_gem_request *
> +intel_engine_seqno_to_request(struct intel_engine_cs *engine,
> + u32 seqno)
> +{
> + struct i915_gem_request *rq;
> +
> + list_for_each_entry(rq, &engine->requests, engine_list) {
> + if (rq->seqno == seqno)
> + return rq;
> +
> + if (__i915_seqno_passed(rq->seqno, seqno))
> + break;
> + }
> +
> + return NULL;
> +}
> +
> +void intel_engine_cleanup(struct intel_engine_cs *engine)
> +{
> + WARN_ON(engine->last_request);
> +
> + if (engine->cleanup)
> + engine->cleanup(engine);
> +}
> +
> +static void intel_engine_clear_rings(struct intel_engine_cs *engine)
> +{
> + struct intel_ringbuffer *ring;
> +
> + list_for_each_entry(ring, &engine->rings, engine_list) {
> + if (ring->retired_head != -1) {
> + ring->head = ring->retired_head;
> + ring->retired_head = -1;
> +
> + ring->space = intel_ring_space(ring);
> + }
> +
> + if (ring->last_context != NULL) {
> + struct drm_i915_gem_object *obj;
> +
> + obj = ring->last_context->ring[engine->id].state;
> + if (obj)
> + i915_gem_object_ggtt_unpin(obj);
> +
> + ring->last_context = NULL;
> + }
> + }
> +}
> +
> +int intel_engine_suspend(struct intel_engine_cs *engine)
> +{
> + struct drm_i915_private *dev_priv = engine->i915;
> + int ret = 0;
> +
> + if (WARN_ON(!intel_engine_initialized(engine)))
> + return 0;
> +
> + I915_WRITE_IMR(engine, ~0);
> +
> + if (engine->suspend)
> + ret = engine->suspend(engine);
> +
> + intel_engine_clear_rings(engine);
> +
> + return ret;
> +}
> +
> +int intel_engine_resume(struct intel_engine_cs *engine)
> +{
> + struct drm_i915_private *dev_priv = engine->i915;
> + int ret = 0;
> +
> + if (WARN_ON(!intel_engine_initialized(engine)))
> + return 0;
> +
> + if (engine->resume)
> + ret = engine->resume(engine);
> +
> + I915_WRITE_IMR(engine, ~engine->irq_keep_mask);
> + return ret;
> +}
> +
> +int intel_engine_retire(struct intel_engine_cs *engine,
> + u32 seqno)
> +{
> + int count;
> +
> + if (engine->retire)
> + engine->retire(engine, seqno);
> +
> + count = 0;
> + while (!list_empty(&engine->requests)) {
> + struct i915_gem_request *rq;
> +
> + rq = list_first_entry(&engine->requests,
> + struct i915_gem_request,
> + engine_list);
> +
> + if (!__i915_seqno_passed(seqno, rq->seqno))
> + break;
> +
> + i915_request_retire(rq);
> + count++;
> + }
> +
> + if (unlikely(engine->trace_irq_seqno &&
> + __i915_seqno_passed(seqno, engine->trace_irq_seqno))) {
> + engine->irq_put(engine);
> + engine->trace_irq_seqno = 0;
> + }
> +
> + return count;
> +}
> +
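(Presumably driven from the retire path with the engine's current breadcrumb, along the lines of this assumed call site:)

    u32 seqno = engine->get_seqno(engine);

    intel_engine_retire(engine, seqno);
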
> +static struct i915_gem_request *
> +find_active_batch(struct list_head *list)
> +{
> + struct i915_gem_request *rq, *last = NULL;
> +
> + list_for_each_entry(rq, list, engine_list) {
> + if (rq->batch == NULL)
> + continue;
>
> - trace_i915_gem_ring_flush(ring, 0, I915_GEM_GPU_DOMAINS);
> + if (!__i915_request_complete__wa(rq))
> + return rq;
> +
> + last = rq;
> + }
> +
> + return last;
> +}
> +
> +static bool context_is_banned(const struct intel_context *ctx,
> + unsigned long now)
> +{
> + const struct i915_ctx_hang_stats *hs = &ctx->hang_stats;
> +
> + if (hs->banned)
> + return true;
> +
> + if (hs->ban_period_seconds == 0)
> + return false;
> +
> + if (now - hs->guilty_ts <= hs->ban_period_seconds) {
> + if (!i915_gem_context_is_default(ctx)) {
> + DRM_DEBUG("context hanging too fast, banning!\n");
> + return true;
> + } else if (i915_stop_ring_allow_ban(ctx->i915)) {
> + if (i915_stop_ring_allow_warn(ctx->i915))
> + DRM_ERROR("gpu hanging too fast, banning!\n");
> + return true;
> + }
> + }
> +
> + return false;
> +}
> +
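(A worked instance of the ban window above, values purely illustrative:)

    /* With hs->ban_period_seconds == 6 and hs->guilty_ts == 100:
     *   a second hang at now == 104 falls inside the window, so a
     *   non-default context is banned (the default context only if
     *   i915_stop_ring_allow_ban() permits it);
     *   a hang at now == 110 is outside the window and not banned.
     */
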
> +static void
> +intel_engine_hangstats(struct intel_engine_cs *engine)
> +{
> + struct i915_ctx_hang_stats *hs;
> + struct i915_gem_request *rq;
> +
> + rq = find_active_batch(&engine->requests);
> + if (rq == NULL)
> + return;
> +
> + hs = &rq->ctx->hang_stats;
> + if (engine->hangcheck.score >= HANGCHECK_SCORE_RING_HUNG) {
> + unsigned long now = get_seconds();
> + hs->banned = context_is_banned(rq->ctx, now);
> + hs->guilty_ts = now;
> + hs->batch_active++;
> + } else
> + hs->batch_pending++;
> +
> + list_for_each_entry_continue(rq, &engine->requests, engine_list) {
> + if (rq->batch == NULL)
> + continue;
> +
> + if (__i915_request_complete__wa(rq))
> + continue;
> +
> + rq->ctx->hang_stats.batch_pending++;
> + }
> +}
> +
> +void intel_engine_reset(struct intel_engine_cs *engine)
> +{
> + if (WARN_ON(!intel_engine_initialized(engine)))
> + return;
> +
> + if (engine->reset)
> + engine->reset(engine);
> +
> + memset(&engine->hangcheck, 0, sizeof(engine->hangcheck));
> + intel_engine_hangstats(engine);
> +
> + intel_engine_retire(engine, engine->i915->next_seqno);
> + intel_engine_clear_rings(engine);
> +}
> +
> +static int ring_wait(struct intel_ringbuffer *ring, int n)
> +{
> + int ret;
> +
> + trace_intel_ringbuffer_wait(ring, n);
> +
> + do {
> + struct i915_gem_request *rq;
> +
> + i915_gem_retire_requests__engine(ring->engine);
> + if (ring->retired_head != -1) {
> + ring->head = ring->retired_head;
> + ring->retired_head = -1;
> +
> + ring->space = intel_ring_space(ring);
> + if (ring->space >= n)
> + return 0;
> + }
> +
> + list_for_each_entry(rq, &ring->breadcrumbs, breadcrumb_link)
> + if (__intel_ring_space(rq->tail, ring->tail,
> + ring->size, I915_RING_RSVD) >= n)
> + break;
> +
> + if (WARN_ON(&rq->breadcrumb_link == &ring->breadcrumbs))
> + return -EDEADLK;
> +
> + ret = i915_request_wait(rq);
> + } while (ret == 0);
> +
> + return ret;
> +}
> +
> +static int ring_wrap(struct intel_ringbuffer *ring, int bytes)
> +{
> + uint32_t __iomem *virt;
> + int rem;
> +
> + rem = ring->size - ring->tail;
> + if (unlikely(ring->space < rem)) {
> + rem = ring_wait(ring, rem);
> + if (rem)
> + return rem;
> + }
> +
> + trace_intel_ringbuffer_wrap(ring, rem);
> +
> + virt = ring->virtual_start + ring->tail;
> + rem = ring->size - ring->tail;
> +
> + ring->space -= rem;
> + ring->tail = 0;
> +
> + rem /= 4;
> + while (rem--)
> + iowrite32(MI_NOOP, virt++);
>
> - ring->gpu_caches_dirty = false;
> return 0;
> }
>
> -int
> -intel_ring_invalidate_all_caches(struct intel_engine_cs *ring)
> +static int __intel_ring_prepare(struct intel_ringbuffer *ring,
> + int bytes)
> {
> - uint32_t flush_domains;
> int ret;
>
> - flush_domains = 0;
> - if (ring->gpu_caches_dirty)
> - flush_domains = I915_GEM_GPU_DOMAINS;
> + trace_intel_ringbuffer_begin(ring, bytes);
>
> - ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, flush_domains);
> - if (ret)
> - return ret;
> + if (unlikely(ring->tail + bytes > ring->effective_size)) {
> + ret = ring_wrap(ring, bytes);
> + if (unlikely(ret))
> + return ret;
> + }
>
> - trace_i915_gem_ring_flush(ring, I915_GEM_GPU_DOMAINS, flush_domains);
> + if (unlikely(ring->space < bytes)) {
> + ret = ring_wait(ring, bytes);
> + if (unlikely(ret))
> + return ret;
> + }
>
> - ring->gpu_caches_dirty = false;
> return 0;
> }
>
> -void
> -intel_stop_ring_buffer(struct intel_engine_cs *ring)
> +struct intel_ringbuffer *
> +intel_ring_begin(struct i915_gem_request *rq,
> + int num_dwords)
> {
> + struct intel_ringbuffer *ring = rq->ring;
> int ret;
>
> - if (!intel_ring_initialized(ring))
> - return;
> + /* TAIL updates must be aligned to a qword, so make sure we
> + * reserve space for any implicit padding required for this
> + * command.
> + */
> + ret = __intel_ring_prepare(ring,
> + ALIGN(num_dwords, 2) * sizeof(uint32_t));
> + if (ret)
> + return ERR_PTR(ret);
> +
> + ring->space -= num_dwords * sizeof(uint32_t);
> +
> + return ring;
> +}
> +
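(A quick worked example of the qword-alignment reservation described above, numbers illustrative:)

    /* intel_ring_begin(rq, 5):
     *   __intel_ring_prepare() waits for ALIGN(5, 2) * 4 = 24 bytes,
     *   but ring->space is only charged 5 * 4 = 20 bytes; the spare
     *   dword is the implicit MI_NOOP pad that keeps the eventual
     *   TAIL write qword-aligned.
     */
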
> +/* Align the ring tail to a cacheline boundary */
> +int intel_ring_cacheline_align(struct i915_gem_request *rq)
> +{
> + struct intel_ringbuffer *ring;
> + int tail, num_dwords;
> +
> + do {
> + tail = rq->ring->tail;
> + num_dwords = (tail & (CACHELINE_BYTES - 1)) / sizeof(uint32_t);
> + if (num_dwords == 0)
> + return 0;
> +
> + num_dwords = CACHELINE_BYTES / sizeof(uint32_t) - num_dwords;
> + ring = intel_ring_begin(rq, num_dwords);
> + if (IS_ERR(ring))
> + return PTR_ERR(ring);
> + } while (tail != rq->ring->tail);
> +
> + while (num_dwords--)
> + intel_ring_emit(ring, MI_NOOP);
> +
> + intel_ring_advance(ring);
> +
> + return 0;
> +}
> +
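(A concrete example of the alignment, assuming CACHELINE_BYTES is 64 as elsewhere in the driver:)

    /* ring->tail == 0x104: 0x104 & 63 == 4 bytes == 1 dword into the
     * cacheline, so 16 - 1 == 15 MI_NOOPs pad the tail up to 0x140.
     * The do/while re-reads tail because intel_ring_begin() may have
     * had to wrap the ring and moved it.
     */
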
> +struct i915_gem_request *
> +intel_engine_find_active_batch(struct intel_engine_cs *engine)
> +{
> + struct i915_gem_request *rq;
> + unsigned long flags;
>
> - ret = intel_ring_idle(ring);
> - if (ret && !i915_reset_in_progress(&to_i915(ring->dev)->gpu_error))
> - DRM_ERROR("failed to quiesce %s whilst cleaning up: %d\n",
> - ring->name, ret);
> + spin_lock_irqsave(&engine->irqlock, flags);
> + rq = find_active_batch(&engine->submitted);
> + spin_unlock_irqrestore(&engine->irqlock, flags);
> + if (rq)
> + return rq;
>
> - stop_ring(ring);
> + return find_active_batch(&engine->requests);
> }
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index d689bac5c84f..46c8d2288821 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -20,61 +20,47 @@
> * "If the Ring Buffer Head Pointer and the Tail Pointer are on the same
> * cacheline, the Head Pointer must not be greater than the Tail
> * Pointer."
> + *
> + * To also accommodate errata on 830/845 which make the last pair of cachelines
> + * in the ringbuffer unavailable, reduce the available space further.
> */
> -#define I915_RING_FREE_SPACE 64
> +#define I915_RING_RSVD (2*CACHELINE_BYTES)
>
> -struct intel_hw_status_page {
> +struct intel_hw_status_page {
> u32 *page_addr;
> unsigned int gfx_addr;
> struct drm_i915_gem_object *obj;
> };
>
> -#define I915_READ_TAIL(ring) I915_READ(RING_TAIL((ring)->mmio_base))
> -#define I915_WRITE_TAIL(ring, val) I915_WRITE(RING_TAIL((ring)->mmio_base), val)
> +#define I915_READ_TAIL(engine) I915_READ(RING_TAIL((engine)->mmio_base))
> +#define I915_WRITE_TAIL(engine, val) I915_WRITE(RING_TAIL((engine)->mmio_base), val)
>
> -#define I915_READ_START(ring) I915_READ(RING_START((ring)->mmio_base))
> -#define I915_WRITE_START(ring, val) I915_WRITE(RING_START((ring)->mmio_base), val)
> +#define I915_READ_START(engine) I915_READ(RING_START((engine)->mmio_base))
> +#define I915_WRITE_START(engine, val) I915_WRITE(RING_START((engine)->mmio_base), val)
>
> -#define I915_READ_HEAD(ring) I915_READ(RING_HEAD((ring)->mmio_base))
> -#define I915_WRITE_HEAD(ring, val) I915_WRITE(RING_HEAD((ring)->mmio_base), val)
> +#define I915_READ_HEAD(engine) I915_READ(RING_HEAD((engine)->mmio_base))
> +#define I915_WRITE_HEAD(engine, val) I915_WRITE(RING_HEAD((engine)->mmio_base), val)
>
> -#define I915_READ_CTL(ring) I915_READ(RING_CTL((ring)->mmio_base))
> -#define I915_WRITE_CTL(ring, val) I915_WRITE(RING_CTL((ring)->mmio_base), val)
> +#define I915_READ_CTL(engine) I915_READ(RING_CTL((engine)->mmio_base))
> +#define I915_WRITE_CTL(engine, val) I915_WRITE(RING_CTL((engine)->mmio_base), val)
>
> -#define I915_READ_IMR(ring) I915_READ(RING_IMR((ring)->mmio_base))
> -#define I915_WRITE_IMR(ring, val) I915_WRITE(RING_IMR((ring)->mmio_base), val)
> +#define I915_READ_IMR(engine) I915_READ(RING_IMR((engine)->mmio_base))
> +#define I915_WRITE_IMR(engine, val) I915_WRITE(RING_IMR((engine)->mmio_base), val)
>
> -#define I915_READ_MODE(ring) I915_READ(RING_MI_MODE((ring)->mmio_base))
> -#define I915_WRITE_MODE(ring, val) I915_WRITE(RING_MI_MODE((ring)->mmio_base), val)
> +#define I915_READ_MODE(engine) I915_READ(RING_MI_MODE((engine)->mmio_base))
> +#define I915_WRITE_MODE(engine, val) I915_WRITE(RING_MI_MODE((engine)->mmio_base), val)
>
> /* seqno size is actually only a uint32, but since we plan to use MI_FLUSH_DW to
> * do the writes, and that must have qw aligned offsets, simply pretend it's 8b.
> */
> #define i915_semaphore_seqno_size sizeof(uint64_t)
> -#define GEN8_SIGNAL_OFFSET(__ring, to) \
> - (i915_gem_obj_ggtt_offset(dev_priv->semaphore_obj) + \
> - ((__ring)->id * I915_NUM_RINGS * i915_semaphore_seqno_size) + \
> - (i915_semaphore_seqno_size * (to)))
> -
> -#define GEN8_WAIT_OFFSET(__ring, from) \
> - (i915_gem_obj_ggtt_offset(dev_priv->semaphore_obj) + \
> - ((from) * I915_NUM_RINGS * i915_semaphore_seqno_size) + \
> - (i915_semaphore_seqno_size * (__ring)->id))
> -
> -#define GEN8_RING_SEMAPHORE_INIT do { \
> - if (!dev_priv->semaphore_obj) { \
> - break; \
> - } \
> - ring->semaphore.signal_ggtt[RCS] = GEN8_SIGNAL_OFFSET(ring, RCS); \
> - ring->semaphore.signal_ggtt[VCS] = GEN8_SIGNAL_OFFSET(ring, VCS); \
> - ring->semaphore.signal_ggtt[BCS] = GEN8_SIGNAL_OFFSET(ring, BCS); \
> - ring->semaphore.signal_ggtt[VECS] = GEN8_SIGNAL_OFFSET(ring, VECS); \
> - ring->semaphore.signal_ggtt[VCS2] = GEN8_SIGNAL_OFFSET(ring, VCS2); \
> - ring->semaphore.signal_ggtt[ring->id] = MI_SEMAPHORE_SYNC_INVALID; \
> - } while(0)
> -
> -enum intel_ring_hangcheck_action {
> +#define GEN8_SEMAPHORE_OFFSET(__dp, __from, __to) \
> + (i915_gem_obj_ggtt_offset((__dp)->semaphore_obj) + \
> + ((__from) * I915_NUM_ENGINES + (__to)) * i915_semaphore_seqno_size)
> +
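(So the semaphore page becomes a simple I915_NUM_ENGINES x I915_NUM_ENGINES table of qwords indexed [from][to], e.g.:)

    /* GEN8_SEMAPHORE_OFFSET(dp, VCS, BCS)
     *	== ggtt_offset(semaphore_obj) + (1 * 5 + 2) * 8
     *	== ggtt_offset(semaphore_obj) + 56
     */
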
> +enum intel_engine_hangcheck_action {
> HANGCHECK_IDLE = 0,
> + HANGCHECK_IDLE_WAITERS,
> HANGCHECK_WAIT,
> HANGCHECK_ACTIVE,
> HANGCHECK_ACTIVE_LOOP,
> @@ -84,47 +70,61 @@ enum intel_ring_hangcheck_action {
>
> #define HANGCHECK_SCORE_RING_HUNG 31
>
> -struct intel_ring_hangcheck {
> +struct intel_engine_hangcheck {
> u64 acthd;
> u64 max_acthd;
> u32 seqno;
> + u32 interrupts;
> int score;
> - enum intel_ring_hangcheck_action action;
> + enum intel_engine_hangcheck_action action;
> int deadlock;
> };
>
> +struct i915_gem_request;
> +struct intel_context;
> +struct intel_engine_cs;
> +
> struct intel_ringbuffer {
> + struct intel_context *last_context;
> +
> + struct intel_engine_cs *engine;
> + struct intel_context *ctx;
> + struct list_head engine_list;
> +
> struct drm_i915_gem_object *obj;
> void __iomem *virtual_start;
>
> - struct intel_engine_cs *ring;
> -
> - /*
> - * FIXME: This backpointer is an artifact of the history of how the
> - * execlist patches came into being. It will get removed once the basic
> - * code has landed.
> + /**
> + * List of breadcrumbs associated with GPU requests currently
> + * outstanding.
> */
> - struct intel_context *FIXME_lrc_ctx;
> + struct list_head requests;
> + struct list_head breadcrumbs;
>
> - u32 head;
> - u32 tail;
> + int head;
> + int tail;
> int space;
> +
> int size;
> int effective_size;
>
> /** We track the position of the requests in the ring buffer, and
> - * when each is retired we increment last_retired_head as the GPU
> + * when each is retired we increment retired_head as the GPU
> * must have finished processing the request and so we know we
> * can advance the ringbuffer up to that position.
> *
> - * last_retired_head is set to -1 after the value is consumed so
> + * retired_head is set to -1 after the value is consumed so
> * we can detect new retirements.
> */
> - u32 last_retired_head;
> + int retired_head;
> + int breadcrumb_tail;
> +
> + unsigned pending_flush:4;
> };
>
> -struct intel_engine_cs {
> - const char *name;
> +struct intel_engine_cs {
> + struct drm_i915_private *i915;
> + const char *name;
> enum intel_ring_id {
> RCS = 0x0,
> VCS,
> @@ -132,46 +132,82 @@ struct intel_engine_cs {
> VECS,
> VCS2
> } id;
> -#define I915_NUM_RINGS 5
> +#define I915_NUM_ENGINES 5
> +#define I915_NUM_ENGINE_BITS 4
> #define LAST_USER_RING (VECS + 1)
> - u32 mmio_base;
> - struct drm_device *dev;
> - struct intel_ringbuffer *buffer;
> + u32 mmio_base;
> + u32 power_domains;
>
> - struct intel_hw_status_page status_page;
> + /* protects requests against hangcheck */
> + spinlock_t lock;
> + /* protects execlists: pending + submitted */
> + spinlock_t irqlock;
>
> - unsigned irq_refcount; /* protected by dev_priv->irq_lock */
> - u32 irq_enable_mask; /* bitmask to enable ring interrupt */
> - u32 trace_irq_seqno;
> - bool __must_check (*irq_get)(struct intel_engine_cs *ring);
> - void (*irq_put)(struct intel_engine_cs *ring);
> + atomic_t interrupts;
> + u32 breadcrumb[I915_NUM_ENGINES];
> + u16 tag, next_tag;
> +
> + struct list_head rings;
> + struct list_head requests;
> + struct list_head pending, submitted;
> + struct i915_gem_request *last_request;
>
> - int (*init)(struct intel_engine_cs *ring);
> + struct intel_hw_status_page status_page;
>
> - int (*init_context)(struct intel_engine_cs *ring);
> + struct intel_ringbuffer *legacy_ring;
> +
> + unsigned irq_refcount; /* protected by i915->irq_lock */
> + u32 irq_enable_mask; /* bitmask to enable ring interrupt */
> + u32 irq_keep_mask; /* never mask these interrupts */
> + u32 trace_irq_seqno;
> + bool __must_check (*irq_get)(struct intel_engine_cs *engine);
> + void (*irq_barrier)(struct intel_engine_cs *engine);
> + void (*irq_put)(struct intel_engine_cs *engine);
> +
> + struct intel_ringbuffer *
> + (*get_ring)(struct intel_engine_cs *engine,
> + struct intel_context *ctx);
> + void (*put_ring)(struct intel_ringbuffer *ring,
> + struct intel_context *ctx);
> +
> + void (*retire)(struct intel_engine_cs *engine,
> + u32 seqno);
> + void (*reset)(struct intel_engine_cs *engine);
> + int (*suspend)(struct intel_engine_cs *engine);
> + int (*resume)(struct intel_engine_cs *engine);
> + void (*cleanup)(struct intel_engine_cs *engine);
>
> - void (*write_tail)(struct intel_engine_cs *ring,
> - u32 value);
> - int __must_check (*flush)(struct intel_engine_cs *ring,
> - u32 invalidate_domains,
> - u32 flush_domains);
> - int (*add_request)(struct intel_engine_cs *ring);
> /* Some chipsets are not quite as coherent as advertised and need
> * an expensive kick to force a true read of the up-to-date seqno.
> * However, the up-to-date seqno is not always required and the last
> * seen value is good enough. Note that the seqno will always be
> * monotonic, even if not coherent.
> */
> - u32 (*get_seqno)(struct intel_engine_cs *ring,
> - bool lazy_coherency);
> - void (*set_seqno)(struct intel_engine_cs *ring,
> + u32 (*get_seqno)(struct intel_engine_cs *engine);
> + void (*set_seqno)(struct intel_engine_cs *engine,
> u32 seqno);
> - int (*dispatch_execbuffer)(struct intel_engine_cs *ring,
> - u64 offset, u32 length,
> - unsigned flags);
> +
> + int (*init_context)(struct i915_gem_request *rq);
> +
> + int __must_check (*emit_flush)(struct i915_gem_request *rq,
> + u32 domains);
> +#define I915_FLUSH_CACHES 0x1
> +#define I915_INVALIDATE_CACHES 0x2
> +#define I915_KICK_FBC 0x4
> +#define I915_COMMAND_BARRIER 0x8
> + int __must_check (*emit_batchbuffer)(struct i915_gem_request *rq,
> + u64 offset, u32 length,
> + unsigned flags);
> + int __must_check (*emit_breadcrumb)(struct i915_gem_request *rq);
> +
> + int __must_check (*add_request)(struct i915_gem_request *rq);
> + void (*write_tail)(struct intel_engine_cs *engine,
> + u32 value);
> +
> + bool (*is_complete)(struct i915_gem_request *rq);
> +
> #define I915_DISPATCH_SECURE 0x1
> #define I915_DISPATCH_PINNED 0x2
> - void (*cleanup)(struct intel_engine_cs *ring);
>
> /* GEN8 signal/wait table - never trust comments!
> * signal to signal to signal to signal to signal to
> @@ -211,38 +247,24 @@ struct intel_engine_cs {
> * ie. transpose of f(x, y)
> */
> struct {
> - u32 sync_seqno[I915_NUM_RINGS-1];
> -
> - union {
> - struct {
> - /* our mbox written by others */
> - u32 wait[I915_NUM_RINGS];
> - /* mboxes this ring signals to */
> - u32 signal[I915_NUM_RINGS];
> - } mbox;
> - u64 signal_ggtt[I915_NUM_RINGS];
> - };
> -
> - /* AKA wait() */
> - int (*sync_to)(struct intel_engine_cs *ring,
> - struct intel_engine_cs *to,
> - u32 seqno);
> - int (*signal)(struct intel_engine_cs *signaller,
> - /* num_dwords needed by caller */
> - unsigned int num_dwords);
> + struct {
> + /* our mbox written by others */
> + u32 wait[I915_NUM_ENGINES];
> + /* mboxes this ring signals to */
> + u32 signal[I915_NUM_ENGINES];
> + } mbox;
> +
> + int (*wait)(struct i915_gem_request *waiter,
> + struct i915_gem_request *signaller);
> + int (*signal)(struct i915_gem_request *rq, int id);
> +
> + u32 sync[I915_NUM_ENGINES];
> } semaphore;
>
> /* Execlists */
> - spinlock_t execlist_lock;
> - struct list_head execlist_queue;
> + bool execlists_enabled;
> + u32 execlists_submitted;
> u8 next_context_status_buffer;
> - u32 irq_keep_mask; /* bitmask for interrupts that should not be masked */
> - int (*emit_request)(struct intel_ringbuffer *ringbuf);
> - int (*emit_flush)(struct intel_ringbuffer *ringbuf,
> - u32 invalidate_domains,
> - u32 flush_domains);
> - int (*emit_bb_start)(struct intel_ringbuffer *ringbuf,
> - u64 offset, unsigned flags);
>
> /**
> * List of objects currently involved in rendering from the
> @@ -254,28 +276,13 @@ struct intel_engine_cs {
> *
> * A reference is held on the buffer while on this list.
> */
> - struct list_head active_list;
> -
> - /**
> - * List of breadcrumbs associated with GPU requests currently
> - * outstanding.
> - */
> - struct list_head request_list;
> -
> - /**
> - * Do we have some not yet emitted requests outstanding?
> - */
> - struct drm_i915_gem_request *preallocated_lazy_request;
> - u32 outstanding_lazy_seqno;
> - bool gpu_caches_dirty;
> - bool fbc_dirty;
> + struct list_head read_list, write_list, fence_list;
>
> wait_queue_head_t irq_queue;
>
> struct intel_context *default_context;
> - struct intel_context *last_context;
>
> - struct intel_ring_hangcheck hangcheck;
> + struct intel_engine_hangcheck hangcheck;
>
> struct {
> struct drm_i915_gem_object *obj;
> @@ -317,49 +324,32 @@ struct intel_engine_cs {
> u32 (*get_cmd_length_mask)(u32 cmd_header);
> };
>
> -bool intel_ring_initialized(struct intel_engine_cs *ring);
> -
> -static inline unsigned
> -intel_ring_flag(struct intel_engine_cs *ring)
> +static inline bool
> +intel_engine_initialized(struct intel_engine_cs *engine)
> {
> - return 1 << ring->id;
> + return engine->default_context;
> }
>
> -static inline u32
> -intel_ring_sync_index(struct intel_engine_cs *ring,
> - struct intel_engine_cs *other)
> +static inline unsigned
> +intel_engine_flag(struct intel_engine_cs *engine)
> {
> - int idx;
> -
> - /*
> - * rcs -> 0 = vcs, 1 = bcs, 2 = vecs, 3 = vcs2;
> - * vcs -> 0 = bcs, 1 = vecs, 2 = vcs2, 3 = rcs;
> - * bcs -> 0 = vecs, 1 = vcs2. 2 = rcs, 3 = vcs;
> - * vecs -> 0 = vcs2, 1 = rcs, 2 = vcs, 3 = bcs;
> - * vcs2 -> 0 = rcs, 1 = vcs, 2 = bcs, 3 = vecs;
> - */
> -
> - idx = (other - ring) - 1;
> - if (idx < 0)
> - idx += I915_NUM_RINGS;
> -
> - return idx;
> + return 1 << engine->id;
> }
>
> static inline u32
> -intel_read_status_page(struct intel_engine_cs *ring,
> +intel_read_status_page(struct intel_engine_cs *engine,
> int reg)
> {
> /* Ensure that the compiler doesn't optimize away the load. */
> barrier();
> - return ring->status_page.page_addr[reg];
> + return engine->status_page.page_addr[reg];
> }
>
> static inline void
> -intel_write_status_page(struct intel_engine_cs *ring,
> +intel_write_status_page(struct intel_engine_cs *engine,
> int reg, u32 value)
> {
> - ring->status_page.page_addr[reg] = value;
> + engine->status_page.page_addr[reg] = value;
> }
>
> /**
> @@ -381,64 +371,77 @@ intel_write_status_page(struct intel_engine_cs *ring,
> #define I915_GEM_HWS_SCRATCH_INDEX 0x30
> #define I915_GEM_HWS_SCRATCH_ADDR (I915_GEM_HWS_SCRATCH_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
>
> -void intel_destroy_ringbuffer_obj(struct intel_ringbuffer *ringbuf);
> -int intel_alloc_ringbuffer_obj(struct drm_device *dev,
> - struct intel_ringbuffer *ringbuf);
> +struct intel_ringbuffer *
> +intel_engine_alloc_ring(struct intel_engine_cs *engine,
> + struct intel_context *ctx,
> + int size);
> +void intel_ring_free(struct intel_ringbuffer *ring);
>
> -void intel_stop_ring_buffer(struct intel_engine_cs *ring);
> -void intel_cleanup_ring_buffer(struct intel_engine_cs *ring);
> -
> -int __must_check intel_ring_begin(struct intel_engine_cs *ring, int n);
> -int __must_check intel_ring_cacheline_align(struct intel_engine_cs *ring);
> -static inline void intel_ring_emit(struct intel_engine_cs *ring,
> +struct intel_ringbuffer *__must_check
> +intel_ring_begin(struct i915_gem_request *rq, int n);
> +int __must_check intel_ring_cacheline_align(struct i915_gem_request *rq);
> +static inline void intel_ring_emit(struct intel_ringbuffer *ring,
> u32 data)
> {
> - struct intel_ringbuffer *ringbuf = ring->buffer;
> - iowrite32(data, ringbuf->virtual_start + ringbuf->tail);
> - ringbuf->tail += 4;
> + iowrite32(data, ring->virtual_start + ring->tail);
> + ring->tail += 4;
> }
> -static inline void intel_ring_advance(struct intel_engine_cs *ring)
> +static inline void intel_ring_advance(struct intel_ringbuffer *ring)
> {
> - struct intel_ringbuffer *ringbuf = ring->buffer;
> - ringbuf->tail &= ringbuf->size - 1;
> + ring->tail &= ring->size - 1;
> }
> -int __intel_ring_space(int head, int tail, int size);
> -int intel_ring_space(struct intel_ringbuffer *ringbuf);
> -bool intel_ring_stopped(struct intel_engine_cs *ring);
> -void __intel_ring_advance(struct intel_engine_cs *ring);
> -
> -int __must_check intel_ring_idle(struct intel_engine_cs *ring);
> -void intel_ring_init_seqno(struct intel_engine_cs *ring, u32 seqno);
> -int intel_ring_flush_all_caches(struct intel_engine_cs *ring);
> -int intel_ring_invalidate_all_caches(struct intel_engine_cs *ring);
> -
> -void intel_fini_pipe_control(struct intel_engine_cs *ring);
> -int intel_init_pipe_control(struct intel_engine_cs *ring);
> -
> -int intel_init_render_ring_buffer(struct drm_device *dev);
> -int intel_init_bsd_ring_buffer(struct drm_device *dev);
> -int intel_init_bsd2_ring_buffer(struct drm_device *dev);
> -int intel_init_blt_ring_buffer(struct drm_device *dev);
> -int intel_init_vebox_ring_buffer(struct drm_device *dev);
>
> -u64 intel_ring_get_active_head(struct intel_engine_cs *ring);
> -void intel_ring_setup_status_page(struct intel_engine_cs *ring);
> -
> -static inline u32 intel_ring_get_tail(struct intel_ringbuffer *ringbuf)
> +static inline int __intel_ring_space(int head, int tail, int size, int rsvd)
> {
> - return ringbuf->tail;
> + int space = head - (tail + 8);
> + if (space < 0)
> + space += size;
> + return space - rsvd;
> }
>
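(For example, a freshly reset 4096-byte ring with head == tail == 0 reports:)

    /* __intel_ring_space(0, 0, 4096, I915_RING_RSVD)
     *	== 0 - (0 + 8) + 4096 - 2 * 64
     *	== 3960 bytes usable;
     * the 8-byte gap stops tail from fully catching head (a full ring
     * would otherwise look empty), and the reserved cachelines cover
     * the 830/845 errata noted above.
     */
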
> -static inline u32 intel_ring_get_seqno(struct intel_engine_cs *ring)
> +static inline int intel_ring_space(struct intel_ringbuffer *ring)
> {
> - BUG_ON(ring->outstanding_lazy_seqno == 0);
> - return ring->outstanding_lazy_seqno;
> + return __intel_ring_space(ring->head, ring->tail,
> + ring->size, I915_RING_RSVD);
> }
>
> -static inline void i915_trace_irq_get(struct intel_engine_cs *ring, u32 seqno)
> +
> +struct i915_gem_request * __must_check __attribute__((nonnull))
> +intel_engine_alloc_request(struct intel_engine_cs *engine,
> + struct intel_context *ctx);
> +
> +struct i915_gem_request *
> +intel_engine_find_active_batch(struct intel_engine_cs *engine);
> +
> +struct i915_gem_request *
> +intel_engine_seqno_to_request(struct intel_engine_cs *engine,
> + u32 seqno);
> +
> +int intel_init_render_engine(struct drm_i915_private *i915);
> +int intel_init_bsd_engine(struct drm_i915_private *i915);
> +int intel_init_bsd2_engine(struct drm_i915_private *i915);
> +int intel_init_blt_engine(struct drm_i915_private *i915);
> +int intel_init_vebox_engine(struct drm_i915_private *i915);
> +
> +#define intel_engine_hang(engine) \
> + (engine->i915->gpu_error.stop_rings & intel_engine_flag(engine))
> +int __must_check intel_engine_sync(struct intel_engine_cs *engine);
> +int __must_check intel_engine_flush(struct intel_engine_cs *engine,
> + struct intel_context *ctx);
> +
> +int intel_engine_retire(struct intel_engine_cs *engine, u32 seqno);
> +void intel_engine_reset(struct intel_engine_cs *engine);
> +int intel_engine_suspend(struct intel_engine_cs *engine);
> +int intel_engine_resume(struct intel_engine_cs *engine);
> +void intel_engine_cleanup(struct intel_engine_cs *engine);
> +
> +
> +u64 intel_engine_get_active_head(struct intel_engine_cs *engine);
> +
> +static inline void i915_trace_irq_get(struct intel_engine_cs *engine, u32 seqno)
> {
> - if (ring->trace_irq_seqno == 0 && ring->irq_get(ring))
> - ring->trace_irq_seqno = seqno;
> + if (engine->trace_irq_seqno || engine->irq_get(engine))
> + engine->trace_irq_seqno = seqno;
> }
>
> #endif /* _INTEL_RINGBUFFER_H_ */