[Intel-gfx] [PATCH 21/31] drm/i915: Convert engine->write_tail to operate on a request

Thu Jul 28 15:05:28 UTC 2016

On 27/07/16 13:29, Chris Wilson wrote:
> On Wed, Jul 27, 2016 at 12:53:25PM +0100, Dave Gordon wrote:
>> On 25/07/16 08:44, Chris Wilson wrote:
>>> If we rewrite the I915_WRITE_TAIL specialisation for the legacy
>>> ringbuffer as submitting the request onto the ringbuffer, we can unify
>>> the vfunc with both execlists and GuC in the next patch.
>>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_gem_request.c |  8 ++---
>>> drivers/gpu/drm/i915/intel_lrc.c        |  2 +-
>>> drivers/gpu/drm/i915/intel_ringbuffer.c | 53 +++++++++++++++++----------------
>>> drivers/gpu/drm/i915/intel_ringbuffer.h |  3 +-
>>> 4 files changed, 32 insertions(+), 34 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
>>> index 1c185e293bf0..8814e9c5266b 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
>>> @@ -467,15 +467,13 @@ void __i915_add_request(struct drm_i915_gem_request *request,
>>> 	 */
>>> 	request->postfix = ring->tail;
>>>
>>> -	if (i915.enable_execlists) {
>>> +	if (i915.enable_execlists)
>>> 		ret = engine->emit_request(request);
>>> -	} else {
>>> +	else
>>> 		ret = engine->add_request(request);
>>> -
>>> -		request->tail = ring->tail;
>>> -	}
>>> 	/* Not allowed to fail! */
>>> 	WARN(ret, "emit|add_request failed: %d!\n", ret);
>>> +
>>> 	/* Sanity check that the reserved size was large enough. */
>>> 	ret = ring->tail - request_start;
>>> 	if (ret < 0)
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>> index 567d94de3300..250edb2bcef7 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -373,7 +373,7 @@ static void execlists_update_context(struct drm_i915_gem_request *rq)
>>> 	struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
>>> 	uint32_t *reg_state = rq->ctx->engine[engine->id].lrc_reg_state;
>>>
>>> -	reg_state[CTX_RING_TAIL+1] = rq->tail;
>>> +	reg_state[CTX_RING_TAIL+1] = rq->tail % (rq->ring->size - 1);
>>
>> mod ringsize-1 ?
>>
>> Surely tail % ringsize, or tail & (ringsize-1).
>>
>> But it's redundant anyway, rq->tail cannot exceed ring->size,
>> so the original code was correct.
>
> No, rq->tail can be equal to ring->size which leads to a GPU hang.
> (Observed on the older gen at least, I'd rather have the same paranoia
> here.)
> -Chris

Even if it's not redundant, it's still the wrong number. The code above 
would result in tail (==size) being converted to 1 rather than 0.

If it's a % operation, it should be ringsize not ringsize-1. Or convert 
to an & operation with ringsize-1.
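
To make the arithmetic concrete, here is a minimal standalone sketch. It assumes a power-of-two ring size (4096 is just an example value), and the helper names are hypothetical, not taken from the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; none of these exist in i915. */

/* What the patch as posted computes: modulo (size - 1). */
static inline uint32_t wrap_mod_size_minus_1(uint32_t tail, uint32_t size)
{
	return tail % (size - 1);
}

/* Modulo the ring size: tail == size wraps to 0. */
static inline uint32_t wrap_mod_size(uint32_t tail, uint32_t size)
{
	return tail % size;
}

/* Mask with (size - 1): same result as above, but only valid
 * when size is a power of two (assumed here). */
static inline uint32_t wrap_mask(uint32_t tail, uint32_t size)
{
	return tail & (size - 1);
}

static inline void check_wrap_examples(void)
{
	const uint32_t size = 4096; /* assumed example ring size */

	/* tail == size: the posted code yields 1, not 0. */
	assert(wrap_mod_size_minus_1(size, size) == 1);
	assert(wrap_mod_size(size, size) == 0);
	assert(wrap_mask(size, size) == 0);

	/* An in-range tail of size - 1 is also altered by % (size - 1). */
	assert(wrap_mod_size_minus_1(size - 1, size) == 0);
	assert(wrap_mod_size(size - 1, size) == size - 1);
	assert(wrap_mask(size - 1, size) == size - 1);
}
```

Note that % (size - 1) also changes a valid tail of size - 1 into 0, so it is not merely off by one at the wrap point.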

.Dave.