[Intel-gfx] [PATCH v3 2/8] drm/i915: Adds graphic address space ballooning logic

Tue Dec 16 05:41:34 PST 2014

On 12/16/2014 01:22 PM, Yu, Zhang wrote:
> On 12/12/2014 9:00 PM, Tvrtko Ursulin wrote:
>>
>> On 11/13/2014 12:02 PM, Yu Zhang wrote:
>>> With Intel GVT-g, the global graphic memory space is partitioned by
>>> multiple vGPU instances in different VMs. The ballooning code is called
>>> in i915_gem_setup_global_gtt(), utilizing the drm mm allocator APIs to
>>> mark the graphic address space which are partitioned out to other vGPUs
>>> as reserved.
>>>
>>> v2:
>>> take Chris and Daniel's comments:
>>>     - no guard page between different VMs
>>>     - use drm_mm_reserve_node() to do the reservation for ballooning,
>>>     instead of the previous drm_mm_insert_node_in_range_generic()
>>>
>>> v3:
>>> take Daniel's comments:
>>>     - move ballooning functions into i915_vgpu.c
>>>     - add kerneldoc to ballooning functions
>>>
>>> Signed-off-by: Yu Zhang <yu.c.zhang at linux.intel.com>
>>> Signed-off-by: Jike Song <jike.song at intel.com>
>>> Signed-off-by: Zhi Wang <zhi.a.wang at intel.com>
>>> Signed-off-by: Eddie Dong <eddie.dong at intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_gem_gtt.c |  17 +++-
>>>   drivers/gpu/drm/i915/i915_vgpu.c    | 149
>>> ++++++++++++++++++++++++++++++++++++
>>>   drivers/gpu/drm/i915/i915_vgpu.h    |   2 +
>>>   3 files changed, 165 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c
>>> b/drivers/gpu/drm/i915/i915_gem_gtt.c
>>> index de12017..2dfac13 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>>> @@ -27,6 +27,7 @@
>>>   #include <drm/drmP.h>
>>>   #include <drm/i915_drm.h>
>>>   #include "i915_drv.h"
>>> +#include "i915_vgpu.h"
>>>   #include "i915_trace.h"
>>>   #include "intel_drv.h"
>>>
>>> @@ -1683,6 +1684,16 @@ int i915_gem_setup_global_gtt(struct drm_device
>>> *dev,
>>>
>>>       /* Subtract the guard page ... */
>>>       drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
>>> +
>>> +    dev_priv->gtt.base.start = start;
>>> +    dev_priv->gtt.base.total = end - start;
>>> +
>>> +    if (intel_vgpu_active(dev)) {
>>> +        ret = intel_vgt_balloon(dev);
>>> +        if (ret)
>>> +            return ret;
>>> +    }
>>> +
>>
>> Out of curiosity, what will be the mechanism to prevent a vGPU instance
>> from ignoring the ballooning data? Must be something in the hypervisor
>> blocking pass-through access to such domains?
> Well, although we have range check logic in the host side(which checks
> the legality of a GM address), the correctness of a GM from vGPU side

Oh, I thought that part is direct for performance reasons (avoiding 
translation). It's not then?

> are supposed to be guaranteed by the drm mm allocator - all those
> ballooned out spaces are marked as reserved.

I was thinking about a malicious i915 driver instance, or simply a bug 
in DRM or i915 which breaks the blocked ranges assumption.

If that is possible then it sounds pretty bad if one VM could snoop the 
surfaces from another.

>> And probably GPU reset should also be disallowed for vGPU instances?
>>
>>>       if (!HAS_LLC(dev))
>>>           dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
>>>
>>> @@ -1702,9 +1713,6 @@ int i915_gem_setup_global_gtt(struct drm_device
>>> *dev,
>>>           vma->bound |= GLOBAL_BIND;
>>>       }
>>>
>>> -    dev_priv->gtt.base.start = start;
>>> -    dev_priv->gtt.base.total = end - start;
>>> -
>>>       /* Clear any non-preallocated blocks */
>>>       drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
>>>           DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
>>> @@ -1756,6 +1764,9 @@ void i915_global_gtt_cleanup(struct drm_device
>>> *dev)
>>>       }
>>>
>>>       if (drm_mm_initialized(&vm->mm)) {
>>> +        if (intel_vgpu_active(dev))
>>> +            intel_vgt_deballoon();
>>> +
>>>           drm_mm_takedown(&vm->mm);
>>>           list_del(&vm->global_link);
>>>       }
>>> diff --git a/drivers/gpu/drm/i915/i915_vgpu.c
>>> b/drivers/gpu/drm/i915/i915_vgpu.c
>>> index 3f6b797..ff5fba3 100644
>>> --- a/drivers/gpu/drm/i915/i915_vgpu.c
>>> +++ b/drivers/gpu/drm/i915/i915_vgpu.c
>>> @@ -83,3 +83,152 @@ void i915_check_vgpu(struct drm_device *dev)
>>>       dev_priv->vgpu.active = true;
>>>       DRM_INFO("Virtual GPU for Intel GVT-g detected.\n");
>>>   }
>>> +
>>> +struct _balloon_info_ {
>>> +    /*
>>> +     * There are up to 2 regions per low/high graphic memory that
>>> +     * might be ballooned. Here, index 0/1 is for low
>>> +     * graphic memory, 2/3 for high graphic memory.
>>> +     */
>>> +    struct drm_mm_node space[4];
>>> +} bl_info;
>>
>> This should be static I think.
> Yes, you are right.
>>
>>> +/**
>>> + * intel_vgt_deballoon - deballoon reserved graphics address trunks
>>> + *
>>> + * This function is called to deallocate the ballooned-out graphic
>>> memory, when
>>> + * driver is unloaded or when ballooning fails.
>>> + */
>>> +void intel_vgt_deballoon(void)
>>> +{
>>> +    int i;
>>> +
>>> +    DRM_INFO("VGT deballoon.\n");
>>
>> Would debug be more appropriate? I don't see much value of saying this
>> on driver unload - it's not that it is optional at this point.
> OK. :)
>>
>> Also for all logging, is intended human readable name VGT or vGT? If the
>> latter it would be nicer to log it in that form.
> Well, I prefer VGT. :)

So it's vGPU and VGT.. OK... :) (and GVT-d) :)

>>> +
>>> +    for (i = 0; i < 4; i++) {
>>> +        if (bl_info.space[i].allocated)
>>> +            drm_mm_remove_node(&bl_info.space[i]);
>>> +    }
>>> +
>>> +    memset(&bl_info, 0, sizeof(bl_info));
>>> +}
>>> +
>>> +static int vgt_balloon_space(struct drm_mm *mm,
>>> +                 struct drm_mm_node *node,
>>> +                 unsigned long start, unsigned long end)
>>> +{
>>> +    unsigned long size = end - start;
>>> +
>>> +    if (start == end)
>>> +        return -EINVAL;
>>> +
>>> +    DRM_INFO("balloon space: range [ 0x%lx - 0x%lx ] %lu KB.\n",
>>> +         start, end, size / 1024);
>>
>> KiB ?
> Yes. Thanks. :)
>>
>>> +    node->start = start;
>>> +    node->size = size;
>>> +
>>> +    return drm_mm_reserve_node(mm, node);
>>> +}
>>> +
>>> +/**
>>> + * intel_vgt_balloon - balloon out reserved graphics address trunks
>>> + * @dev: drm device
>>> + *
>>> + * This function is called at the initialization stage, to balloon
>>> out the
>>> + * graphic address space allocated to other VMs, by marking these
>>> spaces as
>>> + * reserved.
>>> + *
>>> + * The ballooning related knowledges(starting address and size of the
>>> low/high
>>
>> s/knowledges\(/knowledge /
> Got it.
>>
>>> + * graphic memory) are depicted in the vgt_if structure in a reserved
>>> MMIO
>>> + * range.
>>> + *
>>> + * Returns:
>>> + * zero on success, non-zero if configuration invalid or ballooning
>>> failed
>>> + */
>>> +int intel_vgt_balloon(struct drm_device *dev)
>>> +{
>>> +    struct drm_i915_private *dev_priv = to_i915(dev);
>>> +    struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
>>> +    unsigned long ggtt_vm_end = ggtt_vm->start + ggtt_vm->total;
>>> +
>>> +    unsigned long low_gm_base, low_gm_size, low_gm_end;
>>> +    unsigned long high_gm_base, high_gm_size, high_gm_end;
>>> +    int ret;
>>> +
>>> +    low_gm_base = I915_READ(vgtif_reg(avail_rs.low_gmadr.my_base));
>>> +    low_gm_size = I915_READ(vgtif_reg(avail_rs.low_gmadr.my_size));
>>> +    high_gm_base = I915_READ(vgtif_reg(avail_rs.high_gmadr.my_base));
>>> +    high_gm_size = I915_READ(vgtif_reg(avail_rs.high_gmadr.my_size));
>>
>> Get rid of my_ prefix ?
> OK. Will do. :)
>>
>>> +
>>> +    low_gm_end = low_gm_base + low_gm_size;
>>> +    high_gm_end = high_gm_base + high_gm_size;
>>> +
>>> +    DRM_INFO("VGT ballooning configuration:\n");
>>> +    DRM_INFO("Low graphic memory: base 0x%lx size %ldKB\n",
>>> +         low_gm_base, low_gm_size / 1024);
>>> +    DRM_INFO("High graphic memory: base 0x%lx size %ldKB\n",
>>> +         high_gm_base, high_gm_size / 1024);
>>> +
>>> +    if (low_gm_base < ggtt_vm->start
>>> +        || low_gm_end > dev_priv->gtt.mappable_end
>>> +        || high_gm_base < dev_priv->gtt.mappable_end
>>> +        || high_gm_end > ggtt_vm_end) {
>>> +        DRM_ERROR("Invalid ballooning configuration!\n");
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    memset(&bl_info, 0, sizeof(bl_info));
>>
>> If bl_info is static then you don't need this memset?
> Oh, yes. :)
>>
>>> +    /* High graphic memory ballooning */
>>> +    if (high_gm_base > dev_priv->gtt.mappable_end) {
>>> +        ret = vgt_balloon_space(&ggtt_vm->mm,
>>> +                    &bl_info.space[2],
>>> +                    dev_priv->gtt.mappable_end,
>>> +                    high_gm_base);
>>> +
>>> +        if (ret)
>>> +            goto err;
>>> +    }
>>> +
>>> +    /*
>>> +     * No need to partition out the last physical page,
>>> +     * because it is reserved to the guard page.
>>> +     */
>>> +    if (high_gm_end < ggtt_vm_end - PAGE_SIZE) {
>>> +        ret = vgt_balloon_space(&ggtt_vm->mm,
>>> +                    &bl_info.space[3],
>>> +                    high_gm_end,
>>> +                    ggtt_vm_end - PAGE_SIZE);
>>> +        if (ret)
>>> +            goto err;
>>> +    }
>>> +
>>> +    /* Low graphic memory ballooning */
>>> +    if (low_gm_base > ggtt_vm->start) {
>>> +        ret = vgt_balloon_space(&ggtt_vm->mm,
>>> +                    &bl_info.space[0],
>>> +                    ggtt_vm->start, low_gm_base);
>>> +
>>> +        if (ret)
>>> +            goto err;
>>> +    }
>>> +
>>> +    if (low_gm_end < dev_priv->gtt.mappable_end) {
>>> +        ret = vgt_balloon_space(&ggtt_vm->mm,
>>> +                    &bl_info.space[1],
>>> +                    low_gm_end,
>>> +                    dev_priv->gtt.mappable_end);
>>> +
>>> +        if (ret)
>>> +            goto err;
>>> +    }
>>
>> Okay, I've figured it out. :) I suppose going back to patch 1, where it
>> says "Each VM can only have one portion of continuous area for now",
>> with the emphasis on _one_. That threw me off thinking you have two
>> "balloons" meaning forbidden areas. And then with low and high naming I
>> got the wrong idea that one balloon marks the bottom inaccessible part,
>> and the other top. I didn't figure out the whole low == mappable, high
>> == non-mappable split. I suppose it was my inexperience, but if you can
>> think of a way on how to improve the comment, even ASCII art would be
>> very nice if possible.
> Hah. Guess I need a better comments above. Let me think how to
> organize.  :)

Cool thanks!

>> Actually ballooning in the first place confuses me because I thought
>> prior use for that in virtualisation was for memory which can shrink and
>> grow at runtime. Did I get that wrong?
>>
> Well, for memory virtualization, the word 'ballooning' does mean this -
> it can shrink and grow.
> Yet for our Intel GVT-g, the 'ballooning' is used, to emphasis that we
> are not just partitioning the GM space to each vgpu, the vgpu can also
> perceive its GM address space as starting from zero(although the usable
> GM spaces may probably not).
> By now, we do not grow/shrink. :)

Unless there are plan to make it grow and shrink it seems that we agree 
it is confusing. But I suppose it is too late to change the term now?

Regards,

Tvrtko