[Intel-gfx] [PATCH 1/4] drm/i915: Replace global_seqno with a hangcheck heartbeat seqno
Chris Wilson
chris at chris-wilson.co.uk
Mon Feb 25 18:40:56 UTC 2019
Quoting Tvrtko Ursulin (2019-02-25 17:59:40)
>
> On 25/02/2019 16:23, Chris Wilson wrote:
> > static inline struct i915_priolist *to_priolist(struct rb_node *rb)
> > {
> > return rb_entry(rb, struct i915_priolist, node);
> > @@ -2206,6 +2212,10 @@ static u32 *gen8_emit_fini_breadcrumb(struct i915_request *request, u32 *cs)
> > request->fence.seqno,
> > request->timeline->hwsp_offset);
> >
> > + cs = gen8_emit_ggtt_write(cs,
> > + intel_engine_next_hangcheck_seqno(request->engine),
> > + intel_hws_hangcheck_address(request->engine));
> > +
> > cs = gen8_emit_ggtt_write(cs,
> > request->global_seqno,
> > intel_hws_seqno_address(request->engine));
> > @@ -2230,6 +2240,11 @@ static u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *request, u32 *cs)
> > PIPE_CONTROL_FLUSH_ENABLE |
> > PIPE_CONTROL_CS_STALL);
> >
> > + cs = gen8_emit_ggtt_write_rcs(cs,
> > + intel_engine_next_hangcheck_seqno(request->engine),
> > + intel_hws_hangcheck_address(request->engine),
> > + PIPE_CONTROL_CS_STALL);
>
> Is CS_STALL needed on both writes, or would only the last one be
> enough? Or even, should all the flushes be moved to the last pipe
> control?
The CS_STALL on the first write is overkill, as there's no requirement
for it to land before the global_seqno write, but the convenience and
ease of reasoning win out.
> > +
> > cs = gen8_emit_ggtt_write_rcs(cs,
> > request->global_seqno,
> > intel_hws_seqno_address(request->engine),
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 7f841dba87b3..870184bbd169 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -43,6 +43,12 @@
> > */
> > #define LEGACY_REQUEST_SIZE 200
> >
> > +static inline u32 hws_hangcheck_address(struct intel_engine_cs *engine)
> > +{
> > + return (i915_ggtt_offset(engine->status_page.vma) +
> > + I915_GEM_HWS_HANGCHECK_ADDR);
> > +}
> > +
>
> You can consolidate by putting the previous copy in a header.
Inline spaghetti meant it didn't go where I wanted, and I purposely
moved these address computations next to their users so that I can kill
them off one by one. That is the plan even for the new hangcheck seqno.
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> > index 5d45ad4ecca9..2869aaa9d225 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> > @@ -6,6 +6,7 @@
> >
> > #include <linux/hashtable.h>
> > #include <linux/irq_work.h>
> > +#include <linux/random.h>
> > #include <linux/seqlock.h>
> >
> > #include "i915_gem_batch_pool.h"
> > @@ -119,7 +120,8 @@ struct intel_instdone {
> >
> > struct intel_engine_hangcheck {
> > u64 acthd;
> > - u32 seqno;
> > + u32 last_seqno;
> > + u32 next_seqno;
>
> Reading the code I got the impression:
>
> s/last_seqno/hangcheck_seqno/
> s/next_seqno/last_seqno/
>
> Could be closer to reality. But your choice.
hangcheck.last_seqno,
hangcheck.next_seqno
hangcheck.hangcheck_seqno? Nah.
-Chris