[PATCH 2/2] drm/i915/gvt: Audit and shadow workload during ELSP writing
Zhi Wang
zhi.a.wang at intel.com
Tue Jun 6 01:35:17 UTC 2017
On 06/06/17 09:29, Gao, Ping A wrote:
> On 2017/6/5 16:38, Wang, Zhi A wrote:
>> On 05/24/17 20:49, Ping Gao wrote:
>>> Audit and shadow the workload ahead of vGPU scheduling; this
>>> eliminates GPU idle time and improves performance for multi-VM.
>>>
>>> The performance of Heaven running simultaneously in 3 VMs
>>> improves by 20% with this patch.
>>>
>>> Signed-off-by: Ping Gao <ping.a.gao at intel.com>
>>> ---
>>> drivers/gpu/drm/i915/gvt/execlist.c | 12 ++++++++++++
>>> drivers/gpu/drm/i915/gvt/scheduler.c | 7 +++++++
>>> drivers/gpu/drm/i915/gvt/scheduler.h | 1 +
>>> 3 files changed, 20 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gvt/execlist.c b/drivers/gpu/drm/i915/gvt/execlist.c
>>> index dca989e..a14cf29 100644
>>> --- a/drivers/gpu/drm/i915/gvt/execlist.c
>>> +++ b/drivers/gpu/drm/i915/gvt/execlist.c
>>> @@ -605,6 +605,7 @@ static int submit_context(struct intel_vgpu *vgpu, int ring_id,
>>> struct list_head *q = workload_q_head(vgpu, ring_id);
>>> struct intel_vgpu_workload *last_workload = get_last_workload(q);
>>> struct intel_vgpu_workload *workload = NULL;
>>> + struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
>>> u64 ring_context_gpa;
>>> u32 head, tail, start, ctl, ctx_ctl, per_ctx, indirect_ctx;
>>> int ret;
>>> @@ -668,6 +669,7 @@ static int submit_context(struct intel_vgpu *vgpu, int ring_id,
>>> workload->complete = complete_execlist_workload;
>>> workload->status = -EINPROGRESS;
>>> workload->emulate_schedule_in = emulate_schedule_in;
>>> + workload->shadowed = false;
>>>
>>> if (ring_id == RCS) {
>>> intel_gvt_hypervisor_read_gpa(vgpu, ring_context_gpa +
>>> @@ -701,6 +703,16 @@ static int submit_context(struct intel_vgpu *vgpu, int ring_id,
>>> return ret;
>>> }
>>>
>>> + /* Only audit and shadow the first workload in the queue
>>> + * as there is only one pre-allocated buf-obj for shadow.
>>> + */
>>> + if (vgpu->gvt->scheduler.current_vgpu != vgpu &&
>>> + list_empty(workload_q_head(vgpu, ring_id))) {
>>> + mutex_lock(&dev_priv->drm.struct_mutex);
>>> + intel_gvt_audit_and_shadow_workload(workload);
>>> + mutex_unlock(&dev_priv->drm.struct_mutex);
>>> + }
>>> +
>> Is there any reason why pre-shadowing cannot happen when current_vgpu ==
>> vgpu and the workload queue is empty?
> As there is only one pre-allocated buf-obj for shadowing, only the
> first workload in the queue can be shadowed. If the queue is empty
> before enqueueing, the current workload will become the first node
> of the queue.
>
> The current_vgpu != vgpu check is meant to emphasize that the
> performance only improves when pre-shadowing happens under this
> condition; that's why we need pre-shadowing. Logically it can be
> removed, but then the purpose of pre-shadowing would be less clear
> when reading the code.
I got the background in our previous talk :P I'm just curious. From my
point of view, the gap comes from the workload scheduler thread; if we
pre-shadow workloads as much as possible, that would be nicer. Did you
observe any performance drop/gain after removing the current_vgpu !=
vgpu check? :P
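
For illustration, here is a minimal sketch of the relaxed check being
discussed (hypothetical, not part of the posted patch): the
current_vgpu test is dropped, while the empty-queue guard stays
because only one buf-obj is pre-allocated for shadowing.

	/* Hypothetical variant, for comparison only: pre-shadow
	 * whenever this workload will become the queue head, no
	 * matter which vGPU the scheduler is currently running.
	 * The empty-queue guard must stay: there is only one
	 * pre-allocated buf-obj for shadowing.
	 */
	if (list_empty(workload_q_head(vgpu, ring_id))) {
		mutex_lock(&dev_priv->drm.struct_mutex);
		intel_gvt_audit_and_shadow_workload(workload);
		mutex_unlock(&dev_priv->drm.struct_mutex);
	}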
>>> queue_workload(workload);
>>> return 0;
>>> }
>>> diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
>>> index d6bfdfe..4261bd1 100644
>>> --- a/drivers/gpu/drm/i915/gvt/scheduler.c
>>> +++ b/drivers/gpu/drm/i915/gvt/scheduler.c
>>> @@ -184,6 +184,11 @@ int intel_gvt_audit_and_shadow_workload(struct intel_vgpu_workload *workload)
>>> struct intel_ring *ring;
>>> int ret;
>>>
>>> + if (workload->shadowed) {
>>> + ret = workload->status;
>>> + return ret;
>>> + }
>>> +
>>> shadow_ctx->desc_template &= ~(0x3 << GEN8_CTX_ADDRESSING_MODE_SHIFT);
>>> shadow_ctx->desc_template |= workload->ctx_desc.addressing_mode <<
>>> GEN8_CTX_ADDRESSING_MODE_SHIFT;
>>> @@ -230,6 +235,8 @@ int intel_gvt_audit_and_shadow_workload(struct intel_vgpu_workload *workload)
>>> if (ret)
>>> goto out;
>>>
>>> + workload->shadowed = true;
>>> +
>>> out:
>>> workload->status = ret;
>>> return ret;
>>> diff --git a/drivers/gpu/drm/i915/gvt/scheduler.h b/drivers/gpu/drm/i915/gvt/scheduler.h
>>> index 2cd725c..575659e 100644
>>> --- a/drivers/gpu/drm/i915/gvt/scheduler.h
>>> +++ b/drivers/gpu/drm/i915/gvt/scheduler.h
>>> @@ -78,6 +78,7 @@ struct intel_vgpu_workload {
>>> struct drm_i915_gem_request *req;
>>> /* if this workload has been dispatched to i915? */
>>> bool dispatched;
>>> + bool shadowed;
>>> int status;
>>>
>>> struct intel_vgpu_mm *shadow_mm;
>>>
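For context, a minimal sketch of how the new shadowed flag is expected
to interact with the later dispatch path (assumed shape; the actual
dispatch code in scheduler.c may differ):

	/* Sketch, assuming the dispatch path calls the same entry
	 * point: if submit_context() already pre-shadowed this
	 * workload, the shadowed check at the top of
	 * intel_gvt_audit_and_shadow_workload() makes the second
	 * call a cheap no-op that returns the recorded status.
	 */
	static int dispatch_workload(struct intel_vgpu_workload *workload)
	{
		struct drm_i915_private *dev_priv =
			workload->vgpu->gvt->dev_priv;
		int ret;

		mutex_lock(&dev_priv->drm.struct_mutex);
		ret = intel_gvt_audit_and_shadow_workload(workload);
		mutex_unlock(&dev_priv->drm.struct_mutex);
		if (ret)
			return ret; /* audit/shadow failed, do not submit */

		/* ... submit the shadowed request to the i915 backend ... */
		return 0;
	}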