[Intel-gfx] [PATCH 1/4] drm/i915: Replace global_seqno with a hangcheck heartbeat seqno

Tue Feb 26 07:34:37 UTC 2019

On 25/02/2019 18:40, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-02-25 17:59:40)
>>
>> On 25/02/2019 16:23, Chris Wilson wrote:
>>>    static inline struct i915_priolist *to_priolist(struct rb_node *rb)
>>>    {
>>>        return rb_entry(rb, struct i915_priolist, node);
>>> @@ -2206,6 +2212,10 @@ static u32 *gen8_emit_fini_breadcrumb(struct i915_request *request, u32 *cs)
>>>                                  request->fence.seqno,
>>>                                  request->timeline->hwsp_offset);
>>>    
>>> +     cs = gen8_emit_ggtt_write(cs,
>>> +                               intel_engine_next_hangcheck_seqno(request->engine),
>>> +                               intel_hws_hangcheck_address(request->engine));
>>> +
>>>        cs = gen8_emit_ggtt_write(cs,
>>>                                  request->global_seqno,
>>>                                  intel_hws_seqno_address(request->engine));
>>> @@ -2230,6 +2240,11 @@ static u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *request, u32 *cs)
>>>                                      PIPE_CONTROL_FLUSH_ENABLE |
>>>                                      PIPE_CONTROL_CS_STALL);
>>>    
>>> +     cs = gen8_emit_ggtt_write_rcs(cs,
>>> +                                   intel_engine_next_hangcheck_seqno(request->engine),
>>> +                                   intel_hws_hangcheck_address(request->engine),
>>> +                                   PIPE_CONTROL_CS_STALL);
>>
>> Are CS_STALL needed on two writes or only last one would be enough? Or
>> even, should all flushes be moved to the last pipe control?
> 
> The CS_STALL is overkill as there's no requirement for it to be before
> the global_seqno, but the convenience and ease to reason over win.
> 
>>> +
>>>        cs = gen8_emit_ggtt_write_rcs(cs,
>>>                                      request->global_seqno,
>>>                                      intel_hws_seqno_address(request->engine),
>>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> index 7f841dba87b3..870184bbd169 100644
>>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> @@ -43,6 +43,12 @@
>>>     */
>>>    #define LEGACY_REQUEST_SIZE 200
>>>    
>>> +static inline u32 hws_hangcheck_address(struct intel_engine_cs *engine)
>>> +{
>>> +     return (i915_ggtt_offset(engine->status_page.vma) +
>>> +             I915_GEM_HWS_HANGCHECK_ADDR);
>>> +}
>>> +
>>
>> You can consolidate by putting the previous copy in a header.
> 
> Inline spaghetti means it didn't go where I wanted and I purposely moved
> these address computation to their users so that I can kill them off,
> one by one. As is the plan even for the new hangcheck seqno.
>   
>>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> index 5d45ad4ecca9..2869aaa9d225 100644
>>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> @@ -6,6 +6,7 @@
>>>    
>>>    #include <linux/hashtable.h>
>>>    #include <linux/irq_work.h>
>>> +#include <linux/random.h>
>>>    #include <linux/seqlock.h>
>>>    
>>>    #include "i915_gem_batch_pool.h"
>>> @@ -119,7 +120,8 @@ struct intel_instdone {
>>>    
>>>    struct intel_engine_hangcheck {
>>>        u64 acthd;
>>> -     u32 seqno;
>>> +     u32 last_seqno;
>>> +     u32 next_seqno;
>>
>> Reading the code I got the impression:
>>
>> s/last_seqno/hangcheck_seqno/
>> s/next_seqno/last_seqno/
>>
>> Could be closer to reality. But your choice.
> 
> hangcheck.last_seqno,
> hangcheck.next_seqno
> 
> hangcheck.hangcheck_seqno? Nah.

Ok have at it.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

Regards,

Tvrtko