[RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem

Fri May 10 15:07:21 UTC 2019

On Fri, May 10, 2019 at 8:31 AM Christian König
<ckoenig.leichtzumerken at gmail.com> wrote:
>
> I think it is a good approach to try to add a global limit first and
> when that's working go ahead with limiting device specific resources.
What are some of the global drm resource limits/allocations that would
be useful to implement? I would be happy to dig into those.

Regards,
Kenny


> The only major issue I can see is on patch #4, see there for further
> details.
>
> Christian.
>
> On 09.05.19 at 23:04, Kenny Ho wrote:
> > This is a follow-up to the RFC I made last November to introduce a cgroup controller for the GPU/DRM subsystem [a].  The goal is to be able to provide resource management for GPU resources using things like containers.  The cover letter from v1 is copied below for reference.
> >
> > Usage examples:
> > // set limit for card1 to 1GB
> > sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
> >
> > // set limit for card0 to 512MB
> > sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
> >
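The byte values in the usage examples above are 1024 MiB and 512 MiB written
out in bytes.  A minimal shell sketch of the same arithmetic and per-line edit
follows.  It assumes, as the sed line addresses suggest, that
drm.buffer.total.max holds one limit per line, with line N+1 belonging to
cardN.  The helper names mib_to_bytes and with_limit are hypothetical, and the
demo runs against a scratch file rather than a real cgroup file.

```shell
#!/bin/sh
# Convert a size in MiB to bytes (1 MiB = 1048576 bytes).
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print the contents of file $1 with line $2 replaced by value $3.
# Assumption (inferred from the sed examples): one limit per line,
# line N+1 corresponds to cardN.
with_limit() {
    awk -v n="$2" -v v="$3" 'NR == n { print v; next } { print }' "$1"
}

# Scratch file standing in for drm.buffer.total.max (hypothetical contents:
# two cards, both limits initially 0).
demo=$(mktemp)
trap 'rm -f "$demo"' EXIT
printf '%s\n' 0 0 > "$demo"

# card1 (line 2) -> 1 GiB
with_limit "$demo" 2 "$(mib_to_bytes 1024)"
```

with_limit prints the edited contents instead of modifying the file in place,
so how the result is written back to the cgroup interface is left open here.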
> >
> > v2:
> > * Removed the vendoring concepts
> > * Add limit to total buffer allocation
> > * Add limit to the maximum size of a buffer allocation
> >
> > TODO: process migration
> > TODO: documentation
> >
> > [a]: https://lists.freedesktop.org/archives/dri-devel/2018-November/197106.html
> >
> > v1: cover letter
> >
> > The purpose of this patch series is to start a discussion for a generic cgroup
> > controller for the drm subsystem.  The design proposed here is a very early one.
> > We are hoping to engage the community as we develop the idea.
> >
> >
> > Background
> > ==========
> > Control Groups/cgroup provide a mechanism for aggregating/partitioning sets of
> > tasks, and all their future children, into hierarchical groups with specialized
> > behaviour, such as accounting/limiting the resources which processes in a cgroup
> > can access[1].  Weights, limits, protections, allocations are the main resource
> > distribution models.  Existing cgroup controllers include cpu, memory, io,
> > rdma, and more.  cgroup is one of the foundational technologies that enables the
> > popular container application deployment and management method.
> >
> > Direct Rendering Manager/drm contains code intended to support the needs of
> > complex graphics devices. Graphics drivers in the kernel may make use of DRM
> > functions to make tasks like memory management, interrupt handling and DMA
> > easier, and provide a uniform interface to applications.  The DRM has also
> > developed beyond traditional graphics applications to support compute/GPGPU
> > applications.
> >
> >
> > Motivation
> > ==========
> > As GPUs grow beyond the realm of desktop/workstation graphics into areas like
> > data center clusters and IoT, there is an increasing need to monitor and
> > regulate GPUs as a resource like cpu, memory and io.
> >
> > Matt Roper from Intel began working on a similar idea in early 2018 [2] for the
> > purpose of managing GPU priority using the cgroup hierarchy.  While that
> > particular use case may not warrant a standalone drm cgroup controller, there
> > are other use cases where having one can be useful [3].  Monitoring GPU
> > resources such as VRAM and buffers, CU (compute unit [AMD's nomenclature])/EU
> > (execution unit [Intel's nomenclature]), GPU job scheduling [4] can help
> > sysadmins better understand applications' usage profiles.  Regulating the
> > usage of the aforementioned resources can also help sysadmins
> > optimize workload deployment on limited GPU resources.
> >
> > With the increased importance of machine learning, data science and other
> > cloud-based applications, GPUs are already in production use in data centers
> > today [5,6,7].  Existing GPU resource management is very coarse-grained,
> > however, as sysadmins are only able to distribute workloads on a per-GPU
> > basis [8].  An alternative is to use GPU virtualization (with or without
> > SR-IOV), but it generally acts on the entire GPU instead of the specific
> > resources in a GPU.  With a drm cgroup controller, we can enable alternate,
> > fine-grained, sub-GPU resource management (in addition to what may be
> > available via GPU virtualization).
> >
> > In addition to production use, the DRM cgroup can also help with testing
> > graphics application robustness by providing a means to artificially limit DRM
> > resources available to the applications.
> >
> > Challenges
> > ==========
> > While there is common infrastructure in DRM that is shared across many vendors
> > (the scheduler [4] for example), there are also aspects of DRM that are
> > vendor-specific.  To accommodate this, we borrowed the mechanism used by the
> > cgroup to handle different kinds of cgroup controllers.
> >
> > Resources for DRM are also often device-specific (per GPU) instead of
> > system-specific, and a system may contain more than one GPU.  For this, we
> > borrowed some of the ideas from the RDMA cgroup controller.
> >
> > Approach
> > ========
> > To experiment with the idea of a DRM cgroup, we would like to start with basic
> > accounting and statistics, then continue to iterate and add regulating
> > mechanisms into the driver.
> >
> > [1] https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
> > [2] https://lists.freedesktop.org/archives/intel-gfx/2018-January/153156.html
> > [3] https://www.spinics.net/lists/cgroups/msg20720.html
> > [4] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler
> > [5] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
> > [6] https://blog.openshift.com/gpu-accelerated-sql-queries-with-postgresql-pg-strom-in-openshift-3-10/
> > [7] https://github.com/RadeonOpenCompute/k8s-device-plugin
> > [8] https://github.com/kubernetes/kubernetes/issues/52757
> >
> > Kenny Ho (5):
> >    cgroup: Introduce cgroup for drm subsystem
> >    cgroup: Add mechanism to register DRM devices
> >    drm/amdgpu: Register AMD devices for DRM cgroup
> >    drm, cgroup: Add total GEM buffer allocation limit
> >    drm, cgroup: Add peak GEM buffer allocation limit
> >
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    |   4 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |   4 +
> >   drivers/gpu/drm/drm_gem.c                  |   7 +
> >   drivers/gpu/drm/drm_prime.c                |   9 +
> >   include/drm/drm_cgroup.h                   |  54 +++
> >   include/drm/drm_gem.h                      |  11 +
> >   include/linux/cgroup_drm.h                 |  47 ++
> >   include/linux/cgroup_subsys.h              |   4 +
> >   init/Kconfig                               |   5 +
> >   kernel/cgroup/Makefile                     |   1 +
> >   kernel/cgroup/drm.c                        | 497 +++++++++++++++++++++
> >   11 files changed, 643 insertions(+)
> >   create mode 100644 include/drm/drm_cgroup.h
> >   create mode 100644 include/linux/cgroup_drm.h
> >   create mode 100644 kernel/cgroup/drm.c
> >
>