[RFC PATCH 0/5] cgroup support for GPU devices
Tejun Heo
tj at kernel.org
Mon May 6 15:26:43 UTC 2019
Hello,
On Wed, May 01, 2019 at 10:04:33AM -0400, Brian Welty wrote:
> The patch series enables device drivers to use cgroups to control the
> following resources within a GPU (or other accelerator device):
> * control allocation of device memory (reuse of memcg)
> and with future work, we could extend to:
> * track and control share of GPU time (reuse of cpu/cpuacct)
> * apply mask of allowed execution engines (reuse of cpusets)
>
> Instead of introducing a new cgroup subsystem for GPU devices, a new
> framework is proposed to allow devices to register with existing cgroup
> controllers, which creates per-device cgroup_subsys_state within the
> cgroup. This gives device drivers their own private cgroup controls
> (such as memory limits or other parameters) to be applied to device
> resources instead of host system resources.
> Device drivers (GPU or other) are then able to reuse the existing cgroup
> controls, instead of inventing similar ones.
I'm really skeptical about this approach. When creating resource
controllers, I think the most important and challenging part is
establishing the resource model - what the resources are and how they
can be distributed. This patchset is going the other way around -
building out core infrastructure for boilerplate at a significant risk
of mixing up resource models across different types of resources.
IO controllers already implement per-device controls. I'd suggest
following the same interface conventions and implementing a dedicated
controller for the subsystem.
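
To illustrate the convention being referred to: cgroup v2 IO interface
files such as io.max hold one line per device, keyed by MAJ:MIN device
number and followed by key=value pairs, for example
"8:16 rbps=2097152 wbps=max riops=max wiops=120". Below is a minimal
sketch, in Python purely for illustration, of parsing one such line.
The helper name parse_device_line is hypothetical and is not kernel
code. How a dedicated GPU controller would key its own files is not
specified here and is left as an assumption.

```python
# Hypothetical sketch: parse one line of a cgroup v2 per-device file such
# as io.max, where each line has the form "MAJ:MIN key=value key=value ...".

def parse_device_line(line):
    """Return ((major, minor), {key: limit}) for one per-device line."""
    dev, *pairs = line.split()
    major, minor = (int(part) for part in dev.split(":"))
    limits = {}
    for pair in pairs:
        key, value = pair.split("=", 1)
        # "max" means no limit is set; represent it as None.
        limits[key] = None if value == "max" else int(value)
    return (major, minor), limits


# Example line in the io.max format (device 8:16).
example = "8:16 rbps=2097152 wbps=max riops=max wiops=120"
device, limits = parse_device_line(example)
```

Each device gets its own line, so limits for different devices never
share a single knob.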
Thanks.
--
tejun