Proposal to report GPU private memory allocations with sysfs nodes

Yiwei Zhang zzyiwei at google.com
Fri Dec 13 22:09:32 UTC 2019


Hi folks,

Would it be feasible to track the following for each graphics kmd:
(1) Global total memory
(2) Per-process total memory
(3) Per-process total memory not mapped to userland -> memory that is
mapped already shows up in RSS, so this is to help complete the
per-process picture that RSS gives

Would it be better reported under each kmd's device node, or in proc/
or sys/? Any draft ideas or concerns are very welcome!
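
To make (1)-(3) a bit more concrete, here is a minimal sketch of how a
kmd could expose the global counter as a per-device sysfs attribute.
Everything here is illustrative: "struct my_gpu", the counter fields and
the node name are made-up placeholders, not an existing driver API.

    #include <linux/device.h>
    #include <linux/atomic.h>
    #include <linux/mm.h>
    #include <linux/sysfs.h>

    /* illustrative driver-private bookkeeping, not an existing structure */
    struct my_gpu {
            atomic64_t mem_total;    /* (1) all GPU-private memory, in bytes */
            atomic64_t mem_unmapped; /* (3) portion not mapped to userland */
    };

    /* e.g. /sys/class/.../gpu0/mem_total, read-only */
    static ssize_t mem_total_show(struct device *dev,
                                  struct device_attribute *attr, char *buf)
    {
            struct my_gpu *gpu = dev_get_drvdata(dev);

            return scnprintf(buf, PAGE_SIZE, "%lld\n",
                             atomic64_read(&gpu->mem_total));
    }
    static DEVICE_ATTR_RO(mem_total);

The per-process numbers in (2) and (3) don't fit a per-device sysfs
attribute as naturally, which is part of why I'm asking about proc/
above.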

As for the previously discussed detailed tracking of userland contexts,
on downstream Android we'll add a HAL to report memory data for those
detailed categories.

Thanks for all the info and comments so far! Looking forward to more
good ideas as well!

Best regards,
Yiwei


On Thu, Nov 14, 2019 at 5:02 PM Yiwei Zhang <zzyiwei at google.com> wrote:
>
> Thanks for all the comments and feedback; they are all very valuable to me.
>
> Let me summarize the main concerns so far here:
> (1) Open source drivers never specify which API created a GEM object
> (OpenGL, Vulkan, ...) nor for what purpose (transient, shader, ...).
> (2) There is an ioctl to attach a label to a BO, and the label can
> change over the BO's lifetime: https://patchwork.kernel.org/patch/11185643/
> (3) BOs are not attached to pids, but to files, and can be shared.
>
> Besides the discussions here, there was also a lot of internal
> discussion about this proposal. The general principle is that I'll
> align my proposal with what already exists upstream, to help the
> Android common kernel stay close to the upstream kernel for the sake
> of future graphics driver architecture.
>
> I think tracking BOs per process would be a good thing upstream as
> well. Some GPU-addressable memory may already be mapped into userspace
> and is therefore visible in RSS. However, tools consuming RSS data can
> benefit even more from knowing the amount of GPU memory that is not
> mapped. It's a good thing for per-process memory accounting.
>
> BOs upstream are not the same as what exists on Android today. Android
> GPU memory objects are purely private and thus indexed by pid, and
> shared memory is allocated through the ion/dmabuf interface. ion/dmabuf
> is similar to an upstream BO, except that a GEM BO may just be an anon
> inode without an fd before it is shared. For Android ion/dmabuf
> accounting, there was already an effort to improve dma-buf
> tracking (https://patchwork.kernel.org/cover/10831029/), and there's a
> userspace API built on top of the "/proc/<pid>/fdinfo"
> node (https://android.googlesource.com/platform/system/core/+/refs/heads/master/libmeminfo/libdmabufinfo/include/dmabufinfo/dmabufinfo.h#103).
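>
> Just to illustrate how that fdinfo-based accounting gets consumed, here
> is a rough userspace sketch that sums the dma-buf sizes a process holds
> open, by scanning /proc/<pid>/fdinfo. The "exp_name:" and "size:" field
> names are my assumption of what the dma-buf fdinfo exposes; treat this
> as a sketch, not the actual libdmabufinfo implementation.
>
>     #include <dirent.h>
>     #include <stdio.h>
>     #include <stdlib.h>
>     #include <string.h>
>
>     /* sum of dma-buf sizes referenced by the open fds of one process */
>     static unsigned long long dmabuf_bytes(int pid)
>     {
>             char dirpath[64], line[256];
>             unsigned long long total = 0;
>             struct dirent *de;
>             DIR *dir;
>
>             snprintf(dirpath, sizeof(dirpath), "/proc/%d/fdinfo", pid);
>             dir = opendir(dirpath);
>             if (!dir)
>                     return 0;
>
>             while ((de = readdir(dir)) != NULL) {
>                     char path[192];
>                     unsigned long long size = 0;
>                     int is_dmabuf = 0;
>                     FILE *f;
>
>                     if (de->d_name[0] == '.')
>                             continue;
>                     snprintf(path, sizeof(path), "%s/%s", dirpath, de->d_name);
>                     f = fopen(path, "r");
>                     if (!f)
>                             continue;
>                     while (fgets(line, sizeof(line), f)) {
>                             if (!strncmp(line, "exp_name:", 9))
>                                     is_dmabuf = 1;  /* this fd is a dma-buf */
>                             else if (!strncmp(line, "size:", 5))
>                                     size = strtoull(line + 5, NULL, 10);
>                     }
>                     fclose(f);
>                     if (is_dmabuf)
>                             total += size;
>             }
>             closedir(dir);
>             return total;
>     }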
>
> Is it reasonable to add another ioctl, or something equivalent, to
> label a BO with the PID that made the allocation? When the BO gets
> shared with other processes, that information also needs to be
> bookkept somewhere for tracking. Basically I wonder whether upstream
> could track BOs in a way similar to how Android tracks dmabufs. Then
> there could be a node, implemented by cgroup, in proc listing all the
> BOs per process with information like label, refcount, etc. Android
> GPU vendors could then implement the same nodes, which would stay
> compatible even if they later adopt the drm subsystem.
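>
> To make the ioctl question concrete, something along these lines is
> what I have in mind. The struct, the ioctl number and the "mygpu"
> driver name are all hypothetical placeholders, not an existing uapi:
>
>     #include <linux/ioctl.h>
>     #include <linux/types.h>
>
>     #define MYGPU_BO_LABEL_LEN 32
>
>     /* attach a short label to a BO; the kmd records the calling pid
>      * as the allocator and bookkeeps further pids on share/import */
>     struct drm_mygpu_bo_set_label {
>             __u32 handle;                    /* GEM handle of the BO */
>             __u32 pad;
>             char  label[MYGPU_BO_LABEL_LEN]; /* e.g. "vk_device_memory" */
>     };
>
>     /* driver-private ioctl range starts at DRM_COMMAND_BASE (0x40) */
>     #define DRM_IOCTL_MYGPU_BO_SET_LABEL \
>             _IOW('d', 0x40, struct drm_mygpu_bo_set_label)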
>
> So my sketch idea for the nodes is:
> (1) /proc/gpu0_meminfo, /proc/gpu1_meminfo
> A list of all BOs, the pids holding a reference to each, and each BO's
> current label.
> (2) /proc/<pid>/gpu0_meminfo, /proc/<pid>/gpu1_meminfo
> A list of all BOs this process holds a reference to (a rough kmd-side
> sketch of such a node is below).
> (3) Is it reasonable to implement additional nodes for {total,
> total_unmapped} counters, or just to surface them through /proc/meminfo?
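>
> For (2), here is a rough sketch of how a kmd could emit such a
> per-process node via seq_file. "struct gpu_bo", "struct gpu_proc_ctx"
> and the column layout are illustrative only, not a proposed format:
>
>     #include <linux/seq_file.h>
>     #include <linux/list.h>
>     #include <linux/kref.h>
>     #include <linux/types.h>
>
>     /* illustrative bookkeeping: BOs referenced by one process */
>     struct gpu_bo {
>             struct list_head proc_link;
>             u32 handle;
>             size_t size;
>             struct kref refcount;
>             char label[32];
>     };
>
>     struct gpu_proc_ctx {
>             struct list_head bo_list;
>     };
>
>     /* one line per BO: handle, size, refcount, current label */
>     static int gpu0_meminfo_show(struct seq_file *m, void *unused)
>     {
>             struct gpu_proc_ctx *ctx = m->private;
>             struct gpu_bo *bo;
>
>             list_for_each_entry(bo, &ctx->bo_list, proc_link)
>                     seq_printf(m, "%u %zu %u %s\n", bo->handle, bo->size,
>                                kref_read(&bo->refcount), bo->label);
>             return 0;
>     }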
>
> Many thanks for the feedback!
> Yiwei
>
>
> On Tue, Nov 12, 2019 at 12:18 PM Jerome Glisse <jglisse at redhat.com> wrote:
> >
> > On Tue, Nov 12, 2019 at 10:17:10AM -0800, Yiwei Zhang wrote:
> > > Hi folks,
> > >
> > > What do you think about:
> > > > For the sysfs approach, I'm assuming the upstream vendors still need
> > > > to provide a pair of UMD and KMD, and this ioctl to label the BO is
> > > > kept as a driver-private ioctl. Then will each driver just define its
> > > > own set of "label"s, with the KMD only consuming the corresponding
> > > > ones, so that the sysfs nodes won't change at all? Report zero if
> > > > there's no allocation or re-use under a particular "label".
> >
> > To me this looks like a way to abuse the kernel into providing a specific
> > message passing API between processes only for GPU. It would be better to
> > use existing kernel/userspace APIs to pass messages between processes than
> > to add a new one just for a special case.
> >
> > Note that I believe listing GPU allocations for a process might be
> > useful, but only if it is a generic thing across all GPUs (for upstream
> > GPU drivers we do not care about non-upstream ones).
> >
> > Cheers,
> > Jérôme
> >

