[Intel-gfx] [PATCH 09/11] drm/i915/execlists: Refactor out can_merge_rq()
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Jan 31 09:19:18 UTC 2019
On 30/01/2019 18:14, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-01-30 18:05:42)
>>
>> On 30/01/2019 02:19, Chris Wilson wrote:
>>> In the next patch, we add another user that wants to check whether
>>> requests can be merged into a single HW execution, and in the future we
>>> want to add more conditions under which requests from the same context
>>> cannot be merged. In preparation, extract out can_merge_rq().
>>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> ---
>>> drivers/gpu/drm/i915/intel_lrc.c | 30 +++++++++++++++++++-----------
>>> 1 file changed, 19 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>> index 2616b0b3e8d5..e97ce54138d3 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -285,12 +285,11 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
>>> }
>>>
>>> __maybe_unused static inline bool
>>> -assert_priority_queue(const struct intel_engine_execlists *execlists,
>>> - const struct i915_request *prev,
>>> +assert_priority_queue(const struct i915_request *prev,
>>> const struct i915_request *next)
>>> {
>>> - if (!prev)
>>> - return true;
>>> + const struct intel_engine_execlists *execlists =
>>> + &prev->engine->execlists;
>>>
>>> /*
>>> * Without preemption, the prev may refer to the still active element
>>> @@ -601,6 +600,17 @@ static bool can_merge_ctx(const struct intel_context *prev,
>>> return true;
>>> }
>>>
>>> +static bool can_merge_rq(const struct i915_request *prev,
>>> + const struct i915_request *next)
>>> +{
>>> + GEM_BUG_ON(!assert_priority_queue(prev, next));
>>> +
>>> + if (!can_merge_ctx(prev->hw_context, next->hw_context))
>>> + return false;
>>> +
>>> + return true;
>>
>> I'll assume you'll be adding here in the future as the reason this is
>> not simply "return can_merge_ctx(...)"?
>
> Yes, raison d'etre of making the change.
>
>>> static void port_assign(struct execlist_port *port, struct i915_request *rq)
>>> {
>>> GEM_BUG_ON(rq == port_request(port));
>>> @@ -753,8 +763,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>>> int i;
>>>
>>> priolist_for_each_request_consume(rq, rn, p, i) {
>>> - GEM_BUG_ON(!assert_priority_queue(execlists, last, rq));
>>> -
>>> /*
>>> * Can we combine this request with the current port?
>>> * It has to be the same context/ringbuffer and not
>>> @@ -766,8 +774,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>>> * second request, and so we never need to tell the
>>> * hardware about the first.
>>> */
>>> - if (last &&
>>> - !can_merge_ctx(rq->hw_context, last->hw_context)) {
>>> + if (last && !can_merge_rq(last, rq)) {
>>> + if (last->hw_context == rq->hw_context)
>>> + goto done;
>>
>> I don't get this added check. AFAICS it will only trigger with GVT
>> making it not consider filling both ports if possible.
>
> Because we are preparing for can_merge_rq() deciding not to merge the
> same context. If we do that we can't continue on to the next port and
> must terminate the loop, violating the trick with the hint in the
> process.
>
> This changes due to the next patch, per-context freq and probably more
> that I've forgotten.
After a second look, I noticed the existing GVT comment a bit lower down
which avoids populating port1 already.
Maybe one thing which would make sense is to re-arrange these checks in
the order of "priority", like:
	if (last && !can_merge_rq(...)) {
		// naturally highest prio since it is impossible
		if (port == last_port)
			goto done;
		// 2nd highest to account for programming limitation
		else if (last->hw_context == rq->hw_context)
			goto done;
		// GVT check simplified (I think - since we know last is either
		// different ctx or single submit)
		else if (ctx_single_port_submission(rq->hw_context))
			goto done;
	}
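
To make the suggested ordering easier to reason about outside the driver, here is a minimal standalone sketch. Everything prefixed toy_ is hypothetical and only stands in for the i915 types; the body of toy_can_merge_rq() is an assumption about what can_merge_ctx() checks (same context, not single-submit), and the simplified GVT check is the unverified "I think" variant from above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for intel_context and i915_request. */
struct toy_ctx {
	bool single_port_submission;	/* models ctx_single_port_submission() */
};

struct toy_rq {
	const struct toy_ctx *hw_context;
};

enum toy_step {
	TOY_MERGE,		/* fold rq into the current port */
	TOY_NEXT_PORT,		/* start filling the next port */
	TOY_DONE_LAST_PORT,	/* no port left to move to */
	TOY_DONE_SAME_CTX,	/* same context must not span two ports */
	TOY_DONE_SINGLE_PORT,	/* single port submission (GVT) */
};

/* Assumed behaviour of can_merge_ctx(): same context and not single-submit. */
static bool toy_can_merge_rq(const struct toy_rq *prev,
			     const struct toy_rq *next)
{
	if (prev->hw_context != next->hw_context)
		return false;

	return !prev->hw_context->single_port_submission;
}

/* Applies the checks in the suggested order of "priority". */
static enum toy_step toy_next_step(const struct toy_rq *last,
				   const struct toy_rq *rq,
				   bool on_last_port)
{
	assert(rq != NULL);

	if (!last || toy_can_merge_rq(last, rq))
		return TOY_MERGE;

	if (on_last_port)
		return TOY_DONE_LAST_PORT;
	else if (last->hw_context == rq->hw_context)
		return TOY_DONE_SAME_CTX;
	else if (rq->hw_context->single_port_submission)
		return TOY_DONE_SINGLE_PORT;

	return TOY_NEXT_PORT;
}
```

In this model the same-context case only triggers for a single-submit context today, which matches the GVT observation earlier in the thread; any extra veto added to can_merge_rq() later would reach the same branch.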
>
>>> +
>>> /*
>>> * If we are on the second port and cannot
>>> * combine this request with the last, then we
>>> @@ -787,7 +797,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>>> ctx_single_port_submission(rq->hw_context))
>>> goto done;
>>>
>>> - GEM_BUG_ON(last->hw_context == rq->hw_context);
>>
>> This is related to the previous comment. Rebase error?
>
> Previous if check, so it's clear at this point that we can't be using
> the same.
Yep.
>
>>> @@ -827,8 +836,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>>> * request triggering preemption on the next dequeue (or subsequent
>>> * interrupt for secondary ports).
>>> */
>>> - execlists->queue_priority_hint =
>>> - port != execlists->port ? rq_prio(last) : INT_MIN;
>>> + execlists->queue_priority_hint = queue_prio(execlists);
>>
>> This shouldn't be in this patch.
>
> If we terminate the loop early, we need to look at the head of the
> queue.
Why is it different from ending early for any other (existing) reason?
Although I concede better management of queue_priority_hint is exactly
what I was suggesting. Oops. The consequences are not entirely
straightforward though. If we decide not to submit all of a single
context, or leave port1 empty, currently we would hint scheduling the
tasklet for any new submission. With this change that happens only after
a CS or if a higher priority ctx is submitted. Which is what makes me
feel it should be a separate patch for a behaviour change (since a high
prio, higher than INT_MIN, is potentially head of the queue).
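
A toy model of the behaviour change, to illustrate the concern. The toy_ names are hypothetical. toy_hint_before() follows the removed lines of the hunk, while toy_hint_after() assumes queue_prio() yields the priority at the head of the queue and INT_MIN when the queue is empty. toy_would_kick() assumes a new submission schedules the tasklet only when its priority is above the hint.

```c
#include <limits.h>
#include <stdbool.h>

/* Hint as computed by the lines the patch removes. */
static int toy_hint_before(bool advanced_past_first_port, int last_prio)
{
	return advanced_past_first_port ? last_prio : INT_MIN;
}

/* Assumed model of queue_prio(): head of the queue, or INT_MIN if empty. */
static int toy_hint_after(bool queue_empty, int head_prio)
{
	return queue_empty ? INT_MIN : head_prio;
}

/* Assumed submit rule: only a priority above the hint kicks the tasklet. */
static bool toy_would_kick(int new_prio, int hint)
{
	return new_prio > hint;
}
```

With port1 left empty and a request of priority 5 still queued, the old hint is INT_MIN, so any new submission kicks the tasklet. The new hint is 5, so only a submission above 5 does.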
Regards,
Tvrtko