[PATCH 14/25] drm/amdkfd: Populate DRM render device minor

Wed Feb 14 19:01:12 UTC 2018

On 2018-02-14 01:33 PM, Christian König wrote:
> Am 14.02.2018 um 19:24 schrieb Felix Kuehling:
>> On 2018-02-14 01:15 PM, Christian König wrote:
>>> Am 14.02.2018 um 17:35 schrieb Felix Kuehling:
>>>> [SNIP]
>>>>>>>>> I don't see how you can have separate TTM objects referring to
>>>>>>>>> the
>>>>>>>>> same memory.
>>>>>>>> Well that is trivial, we do this all the time with prime and I+A
>>>>>>>> laptops.
>>>>>>> As I understand it, you use DMABuf to export/import buffers on
>>>>>>> multiple
>>>>>>> devices. I believe all devices share a single amdgpu_bo, which
>>>>>>> contains
>>>>>>> the ttm_buffer_object.
>>>>> That's incorrect as well. Normally multiple devices have multiple
>>>>> ttm_buffer_object, one for each device.
>>>>> Going a bit higher that actually makes sense because the status of
>>>>> each BO is deferent for each device. E.g. one device could have
>>>>> the BO
>>>>> in access while it could be idle on another device.
>>>> Can you point me where this is done? I'm looking at
>>>> amdgpu_gem_prime_foreign_bo. It is used if an AMDGPU BO is imported
>>>> into
>>>> a different AMDGPU device. It creates a new GEM object, with a
>>>> reference
>>>> to the same amdgpu BO (gobj->bo = amdgpu_bo_ref(bo)). To me this looks
>>>> very much like the same amdgpu_bo, and cosequently the same TTM BO
>>>> being
>>>> shared by two GEM objects and two devices.
>>> As Michel pointed out as well that stuff isn't upstream and judging
>>> from the recent requirements it will never go upstream.
>>>
>>>>> If this is enabled by any changes that break existing buffer sharing
>>>>> for
>>>>> A+A or A+I systems, please point it out to me. I'm not aware that
>>>>> this
>>>>> patch series does anything to that effect.
>>>>> As I said it completely breaks scanout with A+I systems.
>>>> Please tell me what "it" is. What in the changes I have posted is
>>>> breaking A+I systems. I don't see it.
>>> Using the same amdgpu_bo structure with multiple devices is what "it"
>>> means here.
>> That statement seems to contradict this previous statement by you:
>>>>>    What we do is map the same
>>>>> BO in VMs on other devices. It has no effect on GART mappings.
>>> Correct VM mapping is unaffected here.
>> Can you clarify that contradiction? Is it OK for us to map the same BO
>> in multiple VMs or not?
>
> By  the current requirements I have I think the answer is no.
>
>>> As I said that concept is incompatible with the requirements on A+A
>>> systems, so we need to find another solution to provide the
>>> functionality.
>> Do you mean you need to find another solution for A+A buffer sharing
>> specifically? Or is this a more general statement that includes the
>> mapping of BOs to multiple VMs on different devices?
>
> A more general statement. We need to find a solution which works for
> everybody and not just works like this in the KFD but breaks A+A
> buffer sharing and so needs to be disabled there.

Well, KFD sharing system memory BOs between GPUs doesn't break A+A.
Implementing a solution for A+A that involves DMABufs will not affect
KFD. And KFD isn't actually broken as far as I know. Once you have a
solution for A+A, maybe it will help me understand the problem and I can
evaluate whether the solution is applicable to KFD and worth adopting.
But for now I have neither a good understanding of the problem, no
evidence that there is a problem affecting KFD, and no way towards a
solution.

>
>>
>>> What's on my TODO list anyway is to extend DMA-buf to not require
>>> pinning and to be able to deal with P2P.
>> Sounds good. That said, KFD is not using DMABufs here.
>>
>>> The former is actually rather easy and already mostly done by sharing
>>> the reservation object between exporter and importer.
>>>
>>> The later is a bit more tricky because I need to create the necessary
>>> P2P infrastructure, but even that is doable in the mid term.
>> The sooner you can share your plans, the better. Right now I'm in a bit
>> of limbo. I feel you're blocking KFD upstreaming based on AMDGPU plans
>> and changes that no one has seen yet.
>
> Well as far as I understand it that is not blocking for the current
> upstreaming because you didn't planned to upstream this use case
> anyway, didn't you?

Which use case? The current patch series enables multi-GPU buffer
sharing of system memory BOs. If it is actually broken, I can reduce the
scope to single-GPU support. But I have no evidence that multi-GPU is
actually broken.

Regards,
  Felix

>
> Regards,
> Christian.
>
>>
>> Thanks,
>>    Felix
>>
>>> Regards,
>>> Christian.
>