[PATCH 12/12] drm/amdgpu: enable foreign DMA-buf objects

Christian König deathsimple at vodafone.de
Tue Jul 4 16:39:48 UTC 2017


Am 04.07.2017 um 17:56 schrieb Felix Kuehling:
> On 17-07-04 03:32 AM, Christian König wrote:
>> Am 03.07.2017 um 23:11 schrieb Felix Kuehling:
>>> +
>>> +    list_add(&gobj->list, &bo->gem_objects);
>>> +    gobj->bo = amdgpu_bo_ref(bo);
>>> +    bo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>> It's a bit trickier than that. IIRC my original patch limited the
>> BO to GTT space as well.
> I dug out your original patch for reference (attached), which was
> originally applied on a 4.1-based KFD staging branch. Your original
> patch series also included a change for VRAM VM mappings with the system
> bit (also attached). So your original intention was clearly to also
> allow VRAM P2P access. The VM patch didn't apply after some VM changes
> (probably related to the vram manager) and was later replaced by Amber's
> patch.

OK, in this case my memory fooled me.
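
For completeness: limiting an imported foreign BO to GTT would sidestep 
the VRAM peer-to-peer question entirely and amounts to roughly the 
following on the import path. This is a sketch only; the function name is 
made up, and the field and helper names follow the 4.x amdgpu code, so 
they may differ between kernel versions.

/* Sketch only: force an imported foreign BO into GTT so peers never
 * have to reach into the exporter's VRAM BAR. Would be called with the
 * BO reserved, next to where AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED is
 * set in the hunk above.
 */
static void amdgpu_gem_prime_limit_to_gtt(struct amdgpu_bo *bo)
{
        bo->prefered_domains = AMDGPU_GEM_DOMAIN_GTT;
        bo->allowed_domains  = AMDGPU_GEM_DOMAIN_GTT;
        amdgpu_ttm_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_GTT);
}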

>> VRAM peer to peer access doesn't work with most PCIe chipsets.
>>
>> At a bare minimum we need to put this behind a config option, add a
>> whitelist for the chipset, or only enable it if
>> "pci=pcie_bus_peer2peer" is set, or something like this.
> Well we're using it without any special pci= flags.
> pci=pcie_bus_peer2peer can reduce performance, so we should not require
> it if it's not needed on all systems.
>
> There are other issues that can prevent P2P access between some pairs of
> devices. For example on Intel dual-socket boards the QPI link between
> the sockets doesn't work for P2P traffic. So P2P only works between
> devices connected to the same socket.
>
> I think it's impractical to check all those chipset-specific limitations
> at this level.

Seriously? Sorry, but that approach is a clear no-go.

Long story short, all those cases must be handled cleanly before this 
patch series can land upstream (or even be in any hybrid release).
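
At a bare minimum, an explicit opt-in would contain the damage in the 
meantime. The following is a sketch only, using a module parameter for 
illustration; the parameter name is made up and not an existing amdgpu 
option.

#include <linux/module.h>
#include <linux/moduleparam.h>

/* Sketch only: keep foreign (peer-to-peer) imports off by default
 * until the chipset issues are handled.
 */
static bool amdgpu_p2p_import; /* defaults to false */
module_param_named(p2p_import, amdgpu_p2p_import, bool, 0444);
MODULE_PARM_DESC(p2p_import,
                 "Allow importing BOs from other amdgpu devices (experimental)");

amdgpu_gem_prime_import() would then simply fall back to the normal 
drm_gem_prime_import() path whenever the parameter is clear.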

> Importing and mapping a foreign BO should be no problem
> either way. If remote access is limited, that's something the
> application can figure out on its own. In the case of KFD, this is done
> based on IO-link topology information.

I'm pretty sure that the patch as is would break A+A laptops, so pushing 
it to any branch outside some KFD testing environment is a clear NAK 
from my side.

I would also strongly discourage using it in a production system until 
those issues are sorted out. This patch series was a technical proof of 
concept, not something we can upstream as-is.

What needs to be done is work on a whitelist for chipsets and/or 
interconnects between PCIe bus systems.
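
One possible shape for such a check, purely as a sketch: the helper 
name, the empty ID table and the same-root-port rule below are all 
placeholders, not a worked-out policy.

#include <linux/pci.h>

/* Sketch only: a whitelist of root ports known to route PCIe
 * peer-to-peer traffic correctly. The table is intentionally left
 * empty; real entries would have to come from testing.
 */
static const struct pci_device_id amdgpu_p2p_whitelist[] = {
        /* { PCI_VDEVICE(INTEL, 0x0000) },  placeholder */
        { }
};

static bool amdgpu_p2p_allowed(struct pci_dev *importer,
                               struct pci_dev *exporter)
{
        struct pci_dev *root_i = pcie_find_root_port(importer);
        struct pci_dev *root_e = pcie_find_root_port(exporter);

        /* Deliberately conservative: only devices below the same root
         * port qualify, so anything that crosses sockets (QPI) or the
         * host bridge is rejected outright.
         */
        if (!root_i || root_i != root_e)
                return false;

        return pci_match_id(amdgpu_p2p_whitelist, root_i) != NULL;
}

amdgpu_gem_prime_import() could consult something like this before 
taking the foreign-BO path and otherwise fall back to 
drm_gem_prime_import().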

Regards,
Christian.

>
> Regards,
>    Felix
>
>
>> BTW: If you modify a patch as severely as that, please add your
>> Signed-off-by line as well.
>>
>> Regards,
>> Christian.
>>
>>> +
>>> +    ww_mutex_unlock(&bo->tbo.resv->lock);
>>> +
>>> +    return &gobj->base;
>>> +}
>>> +
>>> +struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
>>> +                           struct dma_buf *dma_buf)
>>> +{
>>> +    struct amdgpu_device *adev = dev->dev_private;
>>> +
>>> +    if (dma_buf->ops == &drm_gem_prime_dmabuf_ops) {
>>> +        struct drm_gem_object *obj = dma_buf->priv;
>>> +
>>> +        if (obj->dev != dev && obj->dev->driver == dev->driver) {
>>> +            /* It's an amdgpu_bo from a different driver instance */
>>> +            struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
>>> +
>>> +            return amdgpu_gem_prime_foreign_bo(adev, bo);
>>> +        }
>>> +    }
>>> +
>>> +    return drm_gem_prime_import(dev, dma_buf);
>>> +}
>>


