[RFC PATCH v3 09/11] drm, cgroup: Add per cgroup bw measure and control

Daniel Vetter daniel at ffwll.ch
Tue Jul 2 13:20:03 UTC 2019

On Fri, Jun 28, 2019 at 03:49:28PM -0400, Kenny Ho wrote:
> On Thu, Jun 27, 2019 at 2:11 AM Daniel Vetter <daniel at ffwll.ch> wrote:
> > I feel like a better approach would be to add a cgroup for the various
> > engines on the gpu, and then also account all the sdma (or whatever the
> > name of the amd copy engines is again) usage by ttm_bo moves to the right
> > cgroup.  I think that's a more meaningful limitation. For direct thrashing
> > control I think there's both not enough information available in the
> > kernel (you'd need some performance counters to watch how much bandwidth
> > userspace batches/CS are wasting), and I don't think the ttm eviction
> > logic is ready to step over all the priority inversion issues this will
> > bring up. Managing sdma usage otoh will be a lot more straightforward (but
> > still has all the priority inversion problems, but in the scheduler that
> > might be easier to fix perhaps with the explicit dependency graph - in the
> > i915 scheduler we already have priority boosting afaiui).
> My concern with hooking into the engine/lower level is that the
> engine may not be process/cgroup aware.  So the bandwidth tracking is

Why is the engine not process aware? Thus far all command submission I'm
aware of is done by a real process from userspace ... we should be able to
track these with cgroups perfectly.
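
A minimal sketch of what that could look like at CS time, assuming
hypothetical drmcg_from_task() and drmcg_charge_engine() helpers (neither
exists in the kernel today, the names are invented for illustration):

	/*
	 * Hypothetical sketch only: charge the submitting process's cgroup
	 * at command-submission time.  drmcg_from_task() and
	 * drmcg_charge_engine() are made-up names; no such APIs exist.
	 */
	#include <linux/sched.h>
	#include <linux/types.h>

	struct drmcg;				/* per-cgroup DRM state (assumed) */

	struct drmcg *drmcg_from_task(struct task_struct *tsk);	/* hypothetical */
	int drmcg_charge_engine(struct drmcg *cg, unsigned int engine,
				u64 est_cycles);		/* hypothetical */

	static int my_cs_account(unsigned int engine, u64 est_cycles)
	{
		/* CS ioctls always run in process context, so "current"
		 * tells us exactly which cgroup to charge */
		struct drmcg *cg = drmcg_from_task(current);

		/* fail (or throttle) the submission if over budget */
		return drmcg_charge_engine(cg, engine, est_cycles);
	}

The point being: since every submission goes through an ioctl in some
task's context, the cgroup is always known at that point.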

> per device.  I am also wondering if this is potentially a case of the
> perfect getting in the way of the good.  While ttm_bo_handle_move_mem
> may not track everything, it is still a key function for a lot of the
> memory operations.  Also, if the programming model is designed to
> bypass the kernel then I am not sure there is anything the kernel
> can do.  (Things like kernel-bypass network stacks come to mind.)  All
> that said, I will certainly dig deeper into the topic.

The problem is that there's no full bypass of the kernel; any reasonable
workload will need both. But if you only control one side of the bandwidth
usage, you're not really controlling anything.
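
As a rough illustration of covering the kernel side too: charging the
bytes copied by a kernel-initiated buffer move to the buffer's owning
cgroup could look something like the sketch below. drmcg_charge_copy()
and the owner_cg field are invented names, not existing TTM or cgroup
APIs:

	/*
	 * Hypothetical sketch only: charge the bytes copied by a
	 * kernel-initiated move (sdma/copy engine, e.g. from the ttm_bo
	 * move path) to the cgroup that owns the buffer.
	 */
	#include <linux/types.h>

	struct drmcg;				/* per-cgroup DRM state (assumed) */

	int drmcg_charge_copy(struct drmcg *cg, u64 bytes);	/* hypothetical */

	struct my_bo {
		struct drmcg *owner_cg;	/* recorded at BO creation (assumed) */
		u64 size;
	};

	/* called from the driver's move path, next to ttm_bo_handle_move_mem() */
	static int my_bo_account_move(struct my_bo *bo)
	{
		/* kernel moves count against the same budget as userspace
		 * CS, so neither side is left unaccounted */
		return drmcg_charge_copy(bo->owner_cg, bo->size);
	}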

Also, this is uapi: perfect is pretty much the bar we need to clear; any
mistake will hurt us for the next 10 years at least :-)

btw if you haven't read it yet: The lwn article about the new block io
controller is pretty interesting. I think you're trying to solve a similar
problem here:


Cheers, Daniel
Daniel Vetter
Software Engineer, Intel Corporation
