[PATCH v2 1/1] drm/virtio: Implement device_attach

Wed Jan 31 10:20:12 UTC 2024

On 2024/1/30 22:23, Christian König wrote:
> Am 30.01.24 um 12:16 schrieb Daniel Vetter:
>> On Tue, Jan 30, 2024 at 12:10:31PM +0100, Daniel Vetter wrote:
>>> On Mon, Jan 29, 2024 at 06:31:19PM +0800, Julia Zhang wrote:
>>>> As vram objects don't have backing pages and thus can't implement
>>>> drm_gem_object_funcs.get_sg_table callback. This removes drm dma-buf
>>>> callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf()
>>>> and implement virtgpu specific map/unmap/attach callbacks to support
>>>> both of shmem objects and vram objects.
>>>>
>>>> Signed-off-by: Julia Zhang <julia.zhang at amd.com>
>>>> ---
>>>>   drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++++++++++++++++++++++---
>>>>   1 file changed, 36 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
>>>> index 44425f20d91a..b490a5343b06 100644
>>>> --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
>>>> +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
>>>> @@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
>>>>   {
>>>>       struct drm_gem_object *obj = attach->dmabuf->priv;
>>>>       struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>>>> +    struct sg_table *sgt;
>>>> +    int ret;
>>>>         if (virtio_gpu_is_vram(bo))
>>>>           return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
>>>>   -    return drm_gem_map_dma_buf(attach, dir);
>>>> +    sgt = drm_prime_pages_to_sg(obj->dev,
>>>> +                    to_drm_gem_shmem_obj(obj)->pages,
>>>> +                    obj->size >> PAGE_SHIFT);
>>>> +    if (IS_ERR(sgt))
>>>> +        return sgt;
>>>> +
>>>> +    ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
>>>> +    if (ret) {
>>>> +        sg_free_table(sgt);
>>>> +        kfree(sgt);
>>>> +        return ERR_PTR(ret);
>>>> +    }
>>>> +
>>>> +    return sgt;
>>>>   }
>>>>     static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
>>>> @@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
>>>>       struct drm_gem_object *obj = attach->dmabuf->priv;
>>>>       struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>>>>   +    if (!sgt)
>>>> +        return;
>>>> +
>>>>       if (virtio_gpu_is_vram(bo)) {
>>>>           virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
>>>> -        return;
>>>> +    } else {
>>>> +        dma_unmap_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
>>>> +        sg_free_table(sgt);
>>>> +        kfree(sgt);
>>>>       }
>>>> +}
>>>> +
>>>> +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
>>>> +                     struct dma_buf_attachment *attach)
>>>> +{
>>>> +    struct drm_gem_object *obj = attach->dmabuf->priv;
>>>> +    struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>>>> +    int ret = 0;
>>>> +
>>>> +    if (!virtio_gpu_is_vram(bo) && obj->funcs->pin)
>>>> +        ret = obj->funcs->pin(obj);
>>>>   -    drm_gem_unmap_dma_buf(attach, sgt, dir);
>>>> +    return ret;
>>> This doesn't look like what I've expected. There should be no need to
>>> change the map/unmap functions, especially not for the usual gem bo case.
>>> We should definitely keep using the exact same code for that. Instead all
>>> I expected is roughly
>>>
>>> virtgpu_gem_device_attach()
>>> {
>>>     if (virtio_gpu_is_vram(bo)) {
>>>         if (can_access_virtio_vram_directly(attach->dev)
>>>             return 0;
>>>         else
>>>             return -EBUSY;
>>>     } else {
>>>         return drm_gem_map_attach();
>>>     }
>>> }
>>>
>>> Note that I think can_access_virtio_vram_directly() needs to be
>>> implemented first. I'm not even sure it's possible, might be that all the
>>> importers need to set the attachment->peer2peer flag. Which is why this
>>> thing exists really. But that's a pile more work to do.
> 
Hi Sima, Christian,

> Yeah, that is really just speculative. All importers need to set the peer2peer flag just in case.

I see, I will modify this.

> 
> What happens under the hood is that IOMMU redirects the "VRAM" memory access to whatever address the DMA-buf on the host is pointing to (system, VRAM, doorbell, IOMMU, whatever).
> 
> I'm also not 100% sure if all the cache snooping is done correctly in all cases, but for now it seems to work.
>>>>
>>> Frankly the more I look at the original patch that added vram export
>>> support the more this just looks like a "pls revert, this is just too
>>> broken".
>> The commit I mean is this one: ea5ea3d8a117 ("drm/virtio: support mapping
>> exported vram"). The commit message definitely needs to cite that one, and
>> also needs a cc: stable because not rejecting invalid imports is a pretty
>> big deal.
> 
> Yeah, I've pointed out that commit in an internal discussion as well. I was just not aware that it's that severely broken.
> 

Yeah we have mentioned this patch before, but I don't totally understand why this is too broken. Without exporting vram objects, dGPU prime feature would not be realized.
Would you mind to explain more about it. Thanks!

Best regards,
Julia

> Regards,
> Christian.
> 
>>
>> Also adding David.
>> -Sima
>>
>>> We should definitely not open-code any functions for the gem_bo export
>>> case, which your patch seems to do? Or maybe I'm just extremely confused.
>>> -Sima
>>>
>>>>     static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
>>>> @@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
>>>>           .vmap = drm_gem_dmabuf_vmap,
>>>>           .vunmap = drm_gem_dmabuf_vunmap,
>>>>       },
>>>> -    .device_attach = drm_gem_map_attach,
>>>> +    .device_attach = virtgpu_gem_device_attach,
>>>>       .get_uuid = virtgpu_virtio_get_uuid,
>>>>   };
>>>>   -- 
>>>> 2.34.1
>>>>
>>> -- 
>>> Daniel Vetter
>>> Software Engineer, Intel Corporation
>>> http://blog.ffwll.ch
>