Proposal to merge KFD into amdgpu

Felix Kuehling felix.kuehling at amd.com
Wed Jul 4 21:36:40 UTC 2018


Since KFD is only supported by a single GPU driver now (amdgpu), it
makes sense to merge the two. This has been raised on the amd-gfx list
before and I've been putting it off to avoid more churn while I was
working on upstreaming KFD. Now seems a good time to pick this up again.

At this stage there are some things that I don't expect to change
significantly:

  * Directory structure
  * KFD function naming conventions
  * KFD device and sysfs interfaces

This is a rough overview of the changes I have in mind. We should be
able to implement these step-by-step and minimize disruption:

1. Change the build system to build KFD into amdgpu.ko

This should make KFD similar to DAL or powerplay. It's still a mostly
separate code base and Makefile with its own directory, but gets linked
into amdgpu.ko.

In the kernel configuration HSA_AMD would become a boolean option under
DRM_AMDGPU that controls whether KFD functionality gets built into amdgpu.

Any code inside #if defined(CONFIG_HSA_AMD_MODULE) can be removed.
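
As a rough illustration of why those blocks become dead code: a boolean
Kconfig option only ever produces CONFIG_HSA_AMD, never
CONFIG_HSA_AMD_MODULE. The sketch below is stand-alone C, not kernel
code. The function names are hypothetical, and CONFIG_HSA_AMD is defined
by hand here where the build system would normally provide it.

```c
#include <assert.h>

/*
 * Assumption for this sketch: the build passes CONFIG_HSA_AMD=1 when the
 * boolean option is enabled. It is defined by hand so the example
 * compiles on its own.
 */
#ifndef CONFIG_HSA_AMD
#define CONFIG_HSA_AMD 1
#endif

/* A boolean Kconfig option never produces a _MODULE symbol. */
#if defined(CONFIG_HSA_AMD_MODULE)
#error "HSA_AMD is boolean in this sketch; the _MODULE symbol cannot be set"
#endif

#if defined(CONFIG_HSA_AMD)
/* Hypothetical KFD entry point, linked into the same module. */
int example_kfd_init(void)
{
	return 0;
}
#else
/* Stub used when KFD support is configured out. */
static inline int example_kfd_init(void)
{
	return -1;
}
#endif

/* Hypothetical amdgpu init path: calls KFD directly, no symbol lookup. */
int example_amdgpu_has_kfd(void)
{
	int ret = example_kfd_init();

	/* Either KFD initialized (0) or it was not built (-1). */
	assert(ret == 0 || ret == -1);
	return ret == 0;
}
```

With the option enabled, example_amdgpu_has_kfd() returns 1. With it
disabled, the stub is used and the caller needs no #ifdef of its own.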

2. Simplify the kfd2kgd and kgd2kfd interfaces

Function pointers in struct kgd2kfd_calls are no longer needed. These
functions can be called directly from amdgpu.

Hardware-independent function pointers in kfd2kgd_calls are no longer
needed. These functions can be called directly from amdkfd. Some of the
function pointers in kfd2kgd_calls are used for hardware abstraction
with different implementations for each GFX HW generation. These will
need to remain function pointers.

At some later stage, the HW-specific functions could be moved into
gfx_v*.c and the function pointers added to struct amdgpu_gfx. But at
this stage I think I'd leave them where they are.
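
To make the split concrete, here is a minimal stand-alone sketch of the
idea: hardware-independent helpers become plain functions, while the
per-generation hooks stay behind a small table of function pointers. All
names, the register offsets and the reduced device structure are made up
for illustration and are not the real kfd2kgd_calls members.

```c
#include <assert.h>
#include <stdint.h>

struct example_gpu_dev;

/* HW-specific hooks: one implementation per GFX generation. */
struct example_hw_ops {
	int (*program_queue)(struct example_gpu_dev *dev, uint32_t queue_id);
};

/* Hypothetical, heavily reduced device structure. */
struct example_gpu_dev {
	uint64_t vram_size;
	uint32_t last_reg;	/* records the fake register write */
	const struct example_hw_ops *ops;
};

static int gfx8_program_queue(struct example_gpu_dev *dev, uint32_t queue_id)
{
	dev->last_reg = 0x800 + queue_id;	/* made-up register offset */
	return 0;
}

static int gfx9_program_queue(struct example_gpu_dev *dev, uint32_t queue_id)
{
	dev->last_reg = 0x900 + queue_id;	/* made-up register offset */
	return 0;
}

static const struct example_hw_ops gfx8_ops = {
	.program_queue = gfx8_program_queue,
};

static const struct example_hw_ops gfx9_ops = {
	.program_queue = gfx9_program_queue,
};

/* HW-independent helper: a plain function, called directly. */
uint64_t example_get_vram_size(const struct example_gpu_dev *dev)
{
	return dev->vram_size;
}

/* Select the per-generation hooks once, at device init. */
void example_dev_init(struct example_gpu_dev *dev, int gfx_gen,
		      uint64_t vram_size)
{
	dev->vram_size = vram_size;
	dev->last_reg = 0;
	dev->ops = (gfx_gen >= 9) ? &gfx9_ops : &gfx8_ops;
}

/* KFD-side caller: direct call for generic code, pointer for HW code. */
int example_kfd_start_queue(struct example_gpu_dev *dev, uint32_t queue_id)
{
	assert(dev && dev->ops && dev->ops->program_queue);

	if (example_get_vram_size(dev) == 0)
		return -1;
	return dev->ops->program_queue(dev, queue_id);
}
```

Only the call that really differs per generation goes through a pointer.
Everything else is an ordinary function call that the compiler can check
and inline.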

3. Reduce duplicate tracking of device and BO structures

Currently KFD and AMDGPU pretend to not know each other's data
structures. If both are in the same module, we could allow KFD to access
some amdgpu structures directly (e.g. amdgpu_device and amdgpu_bo). This
way some of the duplicate tracking of devices and buffer objects could
be eliminated.

This may present opportunities to simplify some functionality that's
currently split across both modules, such as suspend/resume, memory
management and evictions.

Some interfaces that just query information from amdgpu could be removed
if KFD can access that information directly (e.g. firmware versions, CU
info, ...).
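
A small stand-alone sketch of the difference, under the assumption that
KFD currently holds an opaque handle and copies values through a query
function. The structures below are hypothetical, heavily reduced
stand-ins and not the real amdgpu_device or kfd_dev layouts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the GPU driver's device structure. */
struct example_adev {
	uint32_t fw_version;
	uint64_t vram_used;
};

/* Before: KFD sees only an opaque handle and keeps its own copy. */
struct example_kfd_before {
	void *kgd;			/* opaque handle to the device */
	uint64_t cached_vram_used;	/* duplicate that must be kept in sync */
};

/* Query interface that exists only to cross the module boundary. */
uint64_t example_query_vram_used(void *kgd)
{
	const struct example_adev *adev = kgd;

	return adev->vram_used;
}

/* After: KFD holds a typed pointer and reads the field directly. */
struct example_kfd_after {
	struct example_adev *adev;
};

uint64_t example_kfd_vram_used(const struct example_kfd_after *kfd)
{
	assert(kfd && kfd->adev);
	return kfd->adev->vram_used;
}
```

If the driver updates vram_used after KFD cached it, the "before" copy
is stale until someone refreshes it, while the "after" accessor always
sees the current value.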

Please let me know if you have any objections, suggestions, ideas ...

Regards,
  Felix

-- 
F e l i x   K u e h l i n g
PMTS Software Development Engineer | Linux Compute Kernel
1 Commerce Valley Dr. East, Markham, ON L3T 7X6 Canada
(O) +1(289)695-1597
   _     _   _   _____   _____
  / \   | \ / | |  _  \  \ _  |
 / A \  | \M/ | | |D) )  /|_| |
/_/ \_\ |_| |_| |_____/ |__/ \|   facebook.com/AMD | amd.com


