[PATCH RFC v4 00/16] new cgroup controller for gpu/drm subsystem
Tejun Heo
tj at kernel.org
Sat Aug 31 04:28:57 UTC 2019
Hello,
I just glanced through the interface and don't have enough context to
give any kind of detailed review yet. I'll try to read up and
understand more, and would greatly appreciate it if you could give me
some pointers to read up on the resources being controlled and what
the actual use cases would look like. That said, I have some basic
concerns.
* The TTM vs. GEM distinction seems to be an internal implementation
detail rather than anything relating to the underlying physical
resources. Provided that's the case, I'm afraid using these internal
constructs as primary resource control objects likely isn't the right
approach. Whether a given driver uses one or the other internal
abstraction layer shouldn't determine how resources are represented
at the userland interface layer.
* While breaking up and applying control to different types of
internal objects may seem attractive to folks who work day in and
day out with the subsystem, the resulting knobs aren't all that
useful to users, and the siloed controls are likely to make the whole
mechanism a lot less useful. We had the same problem with cgroup1
memcg - putting control of different uses of memory under separate
knobs. It made the whole thing pretty useless. e.g. if you constrain
all knobs tight enough to control the overall usage, overall
utilization suffers, but if you don't, you really don't have control
over actual usage. For memcg, what has to be allocated and controlled
is physical memory, no matter how it's used. It's not like you can
go buy more "socket" memory. At least from the looks of it, I'm
afraid the gpu controller is repeating the same mistakes.
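
To make the siloed-knob trade-off concrete, here is a minimal sketch
with made-up numbers. Nothing in it comes from the patch series: the
object types "type_a" and "type_b", the 100-unit device, and both
helper functions are hypothetical and only illustrate the argument.

```python
# Hypothetical device with 100 units of physical memory.
TOTAL = 100

def siloed_allows(usage, limits):
    """True if every per-type usage is within its own separate knob."""
    return all(usage[kind] <= limits[kind] for kind in usage)

def unified_allows(usage, limit):
    """True if the summed usage is within a single physical limit."""
    return sum(usage.values()) <= limit

# Tight knobs: the per-type limits add up to exactly the physical total.
tight = {"type_a": 50, "type_b": 50}
# Loose knobs: each type alone may use the whole device.
loose = {"type_a": 100, "type_b": 100}

# A workload that fits physically (90 <= 100) but is skewed to one type.
skewed = {"type_a": 80, "type_b": 10}
# A workload that does not fit physically (180 > 100).
oversized = {"type_a": 90, "type_b": 90}

# Tight knobs reject the skewed workload even though 10 units sit idle.
print(siloed_allows(skewed, tight), unified_allows(skewed, TOTAL))
# Loose knobs admit the oversized workload, so actual usage is unbounded.
print(siloed_allows(oversized, loose), unified_allows(oversized, TOTAL))
```

With tight knobs the skewed workload is refused although it fits,
which is the utilization loss; with loose knobs the oversized workload
passes every per-type check, which is the loss of control. A single
limit on the physical resource gets both cases right.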
Thanks.
--
tejun