[PATCH RFC v4 00/16] new cgroup controller for gpu/drm subsystem
Daniel Vetter
daniel at ffwll.ch
Tue Sep 3 07:55:50 UTC 2019
On Fri, Aug 30, 2019 at 09:28:57PM -0700, Tejun Heo wrote:
> Hello,
>
> I just glanced through the interface and don't have enough context to
> give any kind of detailed review yet. I'll try to read up and
> understand more and would greatly appreciate if you can give me some
> pointers to read up on the resources being controlled and how the
> actual use cases would look like. That said, I have some basic
> concerns.
>
> * TTM vs. GEM distinction seems to be internal implementation detail
>   rather than anything relating to underlying physical resources.
>   Provided that's the case, I'm afraid these internal constructs being
>   used as primary resource control objects likely isn't the right
>   approach. Whether a given driver uses one or the other internal
>   abstraction layer shouldn't determine how resources are represented
>   at the userland interface layer.
Yeah, there's another RFC series from Brian Welty to abstract this away as
a memory region concept for GPUs.
> * While breaking up and applying control to different types of
>   internal objects may seem attractive to folks who work day in and
>   day out with the subsystem, they aren't all that useful to users and
>   the siloed controls are likely to make the whole mechanism a lot
>   less useful. We had the same problem with cgroup1 memcg - putting
>   control of different uses of memory under separate knobs. It made
>   the whole thing pretty useless. e.g. if you constrain all knobs
>   tight enough to control the overall usage, overall utilization
>   suffers, but if you don't, you really don't have control over actual
>   usage. For memcg, what has to be allocated and controlled is
>   physical memory, no matter how they're used. It's not like you can
>   go buy more "socket" memory. At least from the looks of it, I'm
>   afraid gpu controller is repeating the same mistakes.
We do have quite a pile of different memories and ranges, so I don't
think we're making the same mistake here. But it is maybe a bit too
complicated, and exposes stuff that most users really don't care about.
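
As a rough illustration of the siloed-knob concern quoted above, here is a
toy model in Python. It is only a sketch and has nothing to do with the
actual patches: the pool names, the limits, and the admit_* helpers are
all made up for the example, and it models the cgroup1 memcg situation
(one physical resource split under several knobs), not a GPU with
genuinely separate memories.

```python
# Toy model only: pool names and sizes are hypothetical.
TOTAL = 100  # units of one physical resource we want a cgroup to stay under


def admit_siloed(limits, usage, pool, amount):
    """Per-pool knobs: a request succeeds only if its own pool has room."""
    return usage.get(pool, 0) + amount <= limits[pool]


def admit_unified(limit, usage, amount):
    """Single knob: a request succeeds if total usage stays within the limit."""
    return sum(usage.values()) + amount <= limit


# Current state: pool_a is at 50 units, pool_b is idle.
usage = {"pool_a": 50, "pool_b": 0}

# Tight silos: the limits sum to TOTAL, so overall usage is bounded,
# but pool_a cannot use the half that pool_b leaves idle.
tight = {"pool_a": 50, "pool_b": 50}
tight_ok = admit_siloed(tight, usage, "pool_a", 10)

# Loose silos: pool_a can grow, but the worst case is now the sum of
# all limits, which exceeds TOTAL.
loose = {"pool_a": 100, "pool_b": 100}
loose_ok = admit_siloed(loose, usage, "pool_a", 10)
loose_worst_case = sum(loose.values())

# One limit on the physical resource: the request fits and the bound holds.
unified_ok = admit_unified(TOTAL, usage, 10)
```

With tight silos the request is refused even though half the resource sits
idle; with loose silos it is admitted but nothing bounds the total to
TOTAL any more. That trade-off only applies where the knobs carve up one
physical resource.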
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch