[PATCH 14/25] drm/amdkfd: Populate DRM render device minor

Felix Kuehling felix.kuehling at amd.com
Wed Feb 14 18:12:18 UTC 2018


On 2018-02-14 11:50 AM, Michel Dänzer wrote:
> On 2018-02-14 05:35 PM, Felix Kuehling wrote:
>> On 2018-02-14 02:42 AM, Christian König wrote:
>>> Am 14.02.2018 um 00:17 schrieb Felix Kuehling:
>>>> On 2018-02-13 02:18 PM, Felix Kuehling wrote:
>>>>> On 2018-02-13 01:15 PM, Christian König wrote:
>>>>>> Am 13.02.2018 um 18:18 schrieb Felix Kuehling:
>>>>>>> I don't see how you can have separate TTM objects referring to the
>>>>>>> same memory.
>>>>>> Well that is trivial, we do this all the time with prime and I+A
>>>>>> laptops.
>>>>> As I understand it, you use DMABuf to export/import buffers on multiple
>>>>> devices. I believe all devices share a single amdgpu_bo, which contains
>>>>> the ttm_buffer_object.
>>> That's incorrect as well. Normally multiple devices have multiple
>>> ttm_buffer_object, one for each device.
>>> Going a bit higher, that actually makes sense, because the status of
>>> each BO is different for each device. E.g. one device could have the
>>> BO in use while it is idle on another device.
>> Can you point me where this is done? I'm looking at
>> amdgpu_gem_prime_foreign_bo. It is used if an AMDGPU BO is imported into
>> a different AMDGPU device. It creates a new GEM object, with a reference
>> to the same amdgpu BO (gobj->bo = amdgpu_bo_ref(bo)). To me this looks
>> very much like the same amdgpu_bo, and consequently the same TTM BO,
>> being shared by two GEM objects and two devices.
> amdgpu_gem_prime_foreign_bo doesn't exist in amd-staging-drm-next, let
> alone upstream. Even on current internal branches, it's no longer used
> for dma-buf import AFAICT.

You're right. It has existed on the KFD branch for so long that I was
taking it for granted. I see that on amd-staging-drm-next, importing a
BO from a different device, even with the same driver, results in
pinning the buffer and using SG tables.
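
For reference, the import flow I'm describing looks roughly like the
generic dma-buf path sketched below. This is a hedged sketch, not the
actual amdgpu code: example_import() is a made-up wrapper, but
dma_buf_attach() and dma_buf_map_attachment() are the real dma-buf
interfaces, and the exporter keeps the backing storage pinned for as
long as the mapping exists.

#include <linux/dma-buf.h>
#include <linux/err.h>

static struct sg_table *example_import(struct dma_buf *dmabuf,
				       struct device *dev)
{
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;

	/* Register this device as an importer of the buffer. */
	attach = dma_buf_attach(dmabuf, dev);
	if (IS_ERR(attach))
		return ERR_CAST(attach);

	/*
	 * The exporter pins the backing pages and returns an sg_table
	 * describing them; this is where the pinning comes from.
	 */
	sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
	if (IS_ERR(sgt))
		dma_buf_detach(dmabuf, attach);

	return sgt;
}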

Either way, the patch series discussed here doesn't touch any of that
code. All we do is map system memory BOs into multiple VMs on different
devices, and according to Christian that is OK. So I'm a bit perplexed
about the opposition I'm facing from him.
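
To make it concrete, what the series does amounts to the following
minimal sketch: take one system memory BO and map it into the VMs of
several devices. The helper and its parameters are invented for this
email; amdgpu_vm_bo_add() and amdgpu_vm_bo_map() are the real amdgpu VM
interfaces. Reservation locking and error unwinding are omitted for
brevity.

static int example_map_bo_to_vms(struct amdgpu_bo *bo,
				 struct amdgpu_device **adevs,
				 struct amdgpu_vm **vms, int n,
				 uint64_t va, uint64_t size,
				 uint64_t flags)
{
	int i, r;

	for (i = 0; i < n; i++) {
		/* One bo_va per (BO, VM) pair tracks that VM's mapping. */
		struct amdgpu_bo_va *bo_va =
			amdgpu_vm_bo_add(adevs[i], vms[i], bo);

		if (!bo_va)
			return -ENOMEM;

		r = amdgpu_vm_bo_map(adevs[i], bo_va, va, 0, size, flags);
		if (r)
			return r;
	}

	return 0;
}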

Maybe it's time to take a step back from discussing details that are
irrelevant to the patch series being reviewed here. I understand and I
agree that the hacks we have in the KFD branch for enabling P2P
(including amdgpu_gem_prime_foreign_bo) will not be accepted upstream.
We discussed this a few months ago with a patch series that was rejected.
Maybe your understanding has evolved beyond that since then and I'm just
catching up on that.

I also want to make it clear that none of those hacks are included in
this patch series. They are not needed for enabling multi-GPU support
with system memory for KFD. If anything slipped into this patch series
that is objectionable from your point of view, please point it out to
me, and I will gladly address it.

The constructive feedback I've gathered so far concerns:

  * Use of VMs from DRM file descriptors for KFD instead of creating our own
      o I agree with this one because it's a more efficient use of resources
  * Creating GEM objects for KFD buffers
      o I disagree with this one because we don't need GEM objects and I
        haven't seen a good reason to convince me otherwise
  * Using GEM buffer handles (and IDR) instead of our own in KFD
      o Related to the previous one. I think the overhead of KFD having
        its own per-process IDR is small, compared to the overhead of
        creating a GEM object for each KFD BO (see the sketch after this
        list). Again, I haven't seen a good argument to convince me
        otherwise
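
To put the last point in lines of code rather than words, this is
roughly all a private per-process handle allocator amounts to. The
struct and function names here are hypothetical, for illustration only;
idr_alloc() is the standard kernel IDR interface:

#include <linux/idr.h>
#include <linux/mutex.h>

struct kfd_bo_table {
	struct idr bo_idr;	/* small integer handle -> BO pointer */
	struct mutex lock;
};

static int example_alloc_bo_handle(struct kfd_bo_table *t, void *bo)
{
	int handle;

	mutex_lock(&t->lock);
	handle = idr_alloc(&t->bo_idr, bo, 0, 0, GFP_KERNEL);
	mutex_unlock(&t->lock);

	return handle;	/* >= 0 on success, negative errno on failure */
}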

Thanks,
  Felix
