Proposal to report GPU private memory allocations with sysfs nodes

Daniel Vetter daniel at ffwll.ch
Mon Nov 4 10:30:18 UTC 2019


On Mon, Oct 28, 2019 at 11:07:28AM -0400, Sean Paul wrote:
> On Mon, Oct 28, 2019 at 10:56 AM Jordan Crouse <jcrouse at codeaurora.org> wrote:
> >
> > On Mon, Oct 28, 2019 at 08:47:58AM -0600, Jordan Crouse wrote:
> > > On Wed, Oct 23, 2019 at 11:00:58AM -0700, Yiwei Zhang wrote:
> > > > Hi folks,
> > > >
> > > > This is Yiwei from the Android Platform Graphics team. In downstream
> > > > Android, vendors used to report GPU private memory allocations with debugfs
> > > > nodes in their own formats. However, debugfs nodes are getting deprecated
> > > > in the next Android release, so we are taking the chance to unify all the
> > > > vendors to migrate their existing debugfs nodes into a standardized sysfs
> > > > node structure. Then the platform is able to do a bunch of useful
> > > > things: memory profiling, system health coverage, field metrics, local
> > > > shell dump, in-app api, etc.
> > > >
> > > > Some vendors do a lot of upstream work, so we are also exploring whether
> > > > this could go upstream instead of being an Android-only thing.
> > > >
> > > > Attached are screenshots for the node structure we drafted and an example
> > > > for that.
> > > >
> > > > For the top level root, vendors can choose their own names by setting
> > > > the ro.gfx.sysfs.0 property.
> > > >
> > > >    - For the multiple gpu driver cases, we can use ro.gfx.sysfs.1,
> > > >    ro.gfx.sysfs.2 for the 2nd and 3rd KMDs.
> > > >    - The ro.gfx.sysfs.<channel> property may also contain a sub-dir, for
> > > >    example "kgsl/gpu_mem" or "mali0/gpu_mem", if the root name under
> > > >    /sys/devices/ is already created and used for other purposes.
> > > >
> > > >
> > > > For the 2nd level pids, there are usually just a couple of them per
> > > > snapshot, since we only take snapshots of the active ones.
> > > >
> > > > For the 3rd level types, the type name will be one of the GPU memory object
> > > > types in lower case, and the value will be a comma separated sequence of
> > > > size values for all the allocations under that specific type.
> > > >
> > > >    - For the GPU memory object types, we defined 9 different types for
> > > >    Android:
> > > >       -     // not accounted for in any other category
> > > >           UNKNOWN = 0;
> > > >           // shader binaries
> > > >           SHADER = 1;
> > > >           // allocations which have a lifetime similar to a VkCommandBuffer
> > > >           COMMAND = 2;
> > > >           // backing for VkDeviceMemory
> > > >           VULKAN = 3;
> > > >           // GL Texture and RenderBuffer
> > > >           GL_TEXTURE = 4;
> > > >           // GL Buffer
> > > >           GL_BUFFER = 5;
> > > >           // backing for query
> > > >           QUERY = 6;
> > > >           // allocations which have a lifetime similar to a VkDescriptorSet
> > > >           DESCRIPTOR = 7;
> > > >           // random transient things that the driver needs
> > > >           TRANSIENT = 8;
> > > >       - We are wondering if those type enumerations make sense to the
> > > >       upstream side as well, or whether we should just deal with our own
> > > >       type sets. On the Android side, we'll just read the nodes named
> > > >       after the types we defined in the sysfs node structure.
> > > >    - The node value can be: 4096,81920,...,4096
> > > >
> > > >
> > > > Looking forward to any concerns/comments/suggestions!
> > >
> > > Hi Yiwei.
> > >
> > > This is an important discussion that I think we need to have but many of us use
> > > text based email clients and PNG attachments are clumsy. It might help move the
> > > discussion along if you described the suggested interfaces in text (bonus: this
> > > could be the start of the .rst documentation) or provided a link to a cloud
> > > document that we could peruse.
> >
> > Which you have already done.  Sorry about that, maybe I should go through all my
> > inbox before opening my mouth. :)
> >
> 
> Fwiw, both copies hit my spam folder, I'm guessing because @google.com
> has DMARC enforced. So your email was a good reminder to check my
> spam.

Hm not finding the original mail at all, so replying here.

I think whatever we come up with should be integrated into the
dma-buf/gem object naming we've just done. Maybe just dump all the objects
somewhere (atm it's debugfs) and let userspace tally it up? With the
current approach of free-form strings, at least, tallying things up in the
kernel would be a real pain.

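To make the userspace-tally idea concrete, here is a minimal sketch of how a
tool could sum up the nodes from the quoted proposal, assuming the layout is
<root>/<pid>/<type> and each node holds a comma separated list of sizes in
bytes. The function names and the tuple layout are made up for illustration,
and the root path is passed in by the caller, since the real location depends
on what the vendor puts in ro.gfx.sysfs.<channel>.

```python
import os
import tempfile


def parse_sizes(text):
    """Parse a node value such as "4096,81920,4096" into a list of ints."""
    return [int(tok) for tok in text.strip().split(",") if tok.strip()]


def tally(root):
    """Return {pid: {type_name: (allocation_count, total_bytes)}}.

    Assumes (hypothetically) that every numeric directory under `root`
    is a pid and every file inside it is a per-type node. Entries that
    are not numeric directories are skipped.
    """
    result = {}
    for pid in sorted(os.listdir(root)):
        pid_dir = os.path.join(root, pid)
        if not (pid.isdigit() and os.path.isdir(pid_dir)):
            continue
        per_type = {}
        for type_name in sorted(os.listdir(pid_dir)):
            with open(os.path.join(pid_dir, type_name)) as f:
                sizes = parse_sizes(f.read())
            per_type[type_name] = (len(sizes), sum(sizes))
        result[int(pid)] = per_type
    return result


if __name__ == "__main__":
    # Build a throwaway tree that mimics the proposed layout, then tally it.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "1234"))
        with open(os.path.join(root, "1234", "gl_texture"), "w") as f:
            f.write("4096,81920,4096\n")
        print(tally(root))
```

Because the type is just the file name here, nothing in the sketch depends on
a fixed enum: whatever strings a driver picks simply show up as keys.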

Also: Since we've gone with strings that drivers can pick however they
want to, not sure the enum thing is a workable idea.

Finally: Would need at least some mesa drivers using this, for
upstreaming.
-Daniel

> 
> Sean
> 
> > Jordan
> >
> > --
> > The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> > a Linux Foundation Collaborative Project
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

