[PATCH 2/2] drm/i915/gvt: Audit and shadow workload during ELSP writing
Gao, Ping A
ping.a.gao at intel.com
Tue Jun 6 02:38:56 UTC 2017
On 2017/6/6 9:35, Wang, Zhi A wrote:
> On 06/06/17 09:29, Gao, Ping A wrote:
>> On 2017/6/5 16:38, Wang, Zhi A wrote:
>>> On 05/24/17 20:49, Ping Gao wrote:
>>>> Audit and shadow the workload ahead of vGPU scheduling; this
>>>> eliminates GPU idle time and improves performance for multi-VM.
>>>>
>>>> The performance of Heaven running simultaneously in 3 VMs
>>>> improved by 20% with this patch.
>>>>
>>>> Signed-off-by: Ping Gao <ping.a.gao at intel.com>
>>>> ---
>>>> drivers/gpu/drm/i915/gvt/execlist.c | 12 ++++++++++++
>>>> drivers/gpu/drm/i915/gvt/scheduler.c | 7 +++++++
>>>> drivers/gpu/drm/i915/gvt/scheduler.h | 1 +
>>>> 3 files changed, 20 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gvt/execlist.c b/drivers/gpu/drm/i915/gvt/execlist.c
>>>> index dca989e..a14cf29 100644
>>>> --- a/drivers/gpu/drm/i915/gvt/execlist.c
>>>> +++ b/drivers/gpu/drm/i915/gvt/execlist.c
>>>> @@ -605,6 +605,7 @@ static int submit_context(struct intel_vgpu *vgpu, int ring_id,
>>>> struct list_head *q = workload_q_head(vgpu, ring_id);
>>>> struct intel_vgpu_workload *last_workload = get_last_workload(q);
>>>> struct intel_vgpu_workload *workload = NULL;
>>>> + struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
>>>> u64 ring_context_gpa;
>>>> u32 head, tail, start, ctl, ctx_ctl, per_ctx, indirect_ctx;
>>>> int ret;
>>>> @@ -668,6 +669,7 @@ static int submit_context(struct intel_vgpu *vgpu, int ring_id,
>>>> workload->complete = complete_execlist_workload;
>>>> workload->status = -EINPROGRESS;
>>>> workload->emulate_schedule_in = emulate_schedule_in;
>>>> + workload->shadowed = false;
>>>>
>>>> if (ring_id == RCS) {
>>>> intel_gvt_hypervisor_read_gpa(vgpu, ring_context_gpa +
>>>> @@ -701,6 +703,16 @@ static int submit_context(struct intel_vgpu *vgpu, int ring_id,
>>>> return ret;
>>>> }
>>>>
>>>> + /* Only audit and shadow the first workload in the queue
>>>> + * as there is only one pre-allocated buf-obj for shadow.
>>>> + */
>>>> + if (vgpu->gvt->scheduler.current_vgpu != vgpu &&
>>>> + list_empty(workload_q_head(vgpu, ring_id))) {
>>>> + mutex_lock(&dev_priv->drm.struct_mutex);
>>>> + intel_gvt_audit_and_shadow_workload(workload);
>>>> + mutex_unlock(&dev_priv->drm.struct_mutex);
>>>> + }
>>>> +
>>> Is there any reason why a pre-shadow cannot happen when current_vgpu ==
>>> vgpu and workload q is empty?
>> As there is only one pre-allocated buf-obj for shadowing, only the
>> first workload in the queue can be shadowed. The queue being empty
>> before enqueueing means the current workload will become the first
>> node of the queue.
>>
>> The current_vgpu != vgpu check tries to emphasize that performance
>> only improves when pre-shadow happens under this condition; that is
>> why we need pre-shadow. Logically it can be removed, but then the
>> purpose of pre-shadow would be less clear when reading the code.
> I got the background in our previous talk :P I'm just curious. From my
> point of view, the gap comes from the workload scheduler thread. If we
> pre-shadow workloads as much as possible, that would be nicer. Did you
> observe any performance drop/gain after removing current_vgpu != vgpu? :P
There is no benefit to doing pre-shadow when current_vgpu == vgpu, so
removing the current_vgpu != vgpu check should have no performance
impact, I think.
I just noticed that this condition should be removed anyway: the
scheduler can change current_vgpu at any moment, so keeping the check
would make the current vGPU miss some chances to do pre-shadow. I will
remove it in the next version :)
>>>> queue_workload(workload);
>>>> return 0;
>>>> }
>>>> diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
>>>> index d6bfdfe..4261bd1 100644
>>>> --- a/drivers/gpu/drm/i915/gvt/scheduler.c
>>>> +++ b/drivers/gpu/drm/i915/gvt/scheduler.c
>>>> @@ -184,6 +184,11 @@ int intel_gvt_audit_and_shadow_workload(struct intel_vgpu_workload *workload)
>>>> struct intel_ring *ring;
>>>> int ret;
>>>>
>>>> + if (workload->shadowed) {
>>>> + ret = workload->status;
>>>> + return ret;
>>>> + }
>>>> +
>>>> shadow_ctx->desc_template &= ~(0x3 << GEN8_CTX_ADDRESSING_MODE_SHIFT);
>>>> shadow_ctx->desc_template |= workload->ctx_desc.addressing_mode <<
>>>> GEN8_CTX_ADDRESSING_MODE_SHIFT;
>>>> @@ -230,6 +235,8 @@ int intel_gvt_audit_and_shadow_workload(struct intel_vgpu_workload *workload)
>>>> if (ret)
>>>> goto out;
>>>>
>>>> + workload->shadowed = true;
>>>> +
>>>> out:
>>>> workload->status = ret;
>>>> return ret;
>>>> diff --git a/drivers/gpu/drm/i915/gvt/scheduler.h b/drivers/gpu/drm/i915/gvt/scheduler.h
>>>> index 2cd725c..575659e 100644
>>>> --- a/drivers/gpu/drm/i915/gvt/scheduler.h
>>>> +++ b/drivers/gpu/drm/i915/gvt/scheduler.h
>>>> @@ -78,6 +78,7 @@ struct intel_vgpu_workload {
>>>> struct drm_i915_gem_request *req;
>>>> /* if this workload has been dispatched to i915? */
>>>> bool dispatched;
>>>> + bool shadowed;
>>>> int status;
>>>>
>>>> struct intel_vgpu_mm *shadow_mm;
>>>>