Proposal to report GPU private memory allocations with sysfs nodes

Rob Clark robdclark at gmail.com
Wed Nov 6 16:55:34 UTC 2019


On Tue, Nov 5, 2019 at 1:47 AM Daniel Vetter <daniel at ffwll.ch> wrote:
>
> On Mon, Nov 04, 2019 at 11:34:33AM -0800, Yiwei Zhang wrote:
> > Hi folks,
> >
> > (Daniel, I just moved you to this thread)
> >
> > Below are the latest thoughts based on all the feedback and comments.
> >
> > First, I need to clarify the GPU memory object type enumeration. We
> > don't want to enforce those enumerations across upstream and Android;
> > they should just be left configurable and flexible.
> >
> > Second, we want to make this effort also useful to other memory
> > accounting tools such as PSS. At minimum, an additional node is
> > needed for the part of the GPU private allocation that is not mapped
> > to userspace (and thus invisible to PSS). This is especially critical
> > for downstream Android, so that the low-memory killer (lmkd) can be
> > aware of the actual total memory for a process and will know how much
> > gets freed up if it kills that process. This is an effort to
> > demystify the "lost RAM".
> >
> > Given the above, the new node structure would look like this:
> >
> > Global nodes:
> > /sys/devices/<root>/gpu_mem/global/total /* Total private allocation,
> > kept coherent with the per-type nodes; this should also include
> > anonymous memory allocated in the KMD */
> > /sys/devices/<root>/gpu_mem/global/total_unmapped /* Accounts for
> > private allocations not mapped to userspace (not visible to PSS);
> > need not be coherent with the "total" node. lmkd or an equivalent
> > service that already reads PSS only needs to read this node in
> > addition. */
> > /sys/devices/<root>/gpu_mem/global/<type1> /* One total value per
> > type; this should also include anonymous memory allocated in the
> > KMD (or perhaps a separate anonymous type for the global nodes) */
> > /sys/devices/<root>/gpu_mem/global/<type2> /* One total value per type */
> > ...
> > /sys/devices/<root>/gpu_mem/global/<typeN> /* One total value per type */
> >
> > Per-process nodes:
> > /sys/devices/<root>/gpu_mem/proc/<pid>/total /* Total private
> > allocation, kept coherent with the per-type nodes */
> > /sys/devices/<root>/gpu_mem/proc/<pid>/total_unmapped /* Accounts for
> > private allocations not mapped to userspace (not visible to PSS);
> > need not be coherent with the "total" node. lmkd or an equivalent
> > service that already reads PSS only needs to read this node in
> > addition. */
> > /sys/devices/<root>/gpu_mem/proc/<pid>/<type1> /* One total value per type */
> > /sys/devices/<root>/gpu_mem/proc/<pid>/<type2> /* One total value per type */
> > ...
> > /sys/devices/<root>/gpu_mem/proc/<pid>/<typeN> /* One total value per type */
> >
> > For downstream Android, type1 through typeN will be the enumerations
> > I mentioned in the original email: unknown, shader, ..., transient.
> > For upstream, those can be the labeled BOs or any other customized
> > types.
> >
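> > To make this concrete, here is a minimal sketch (not a finished
> > design) of how a KMD could back the global "total" node with a
> > read-only device attribute. All names here are invented for
> > illustration, and the nested global/ and proc/<pid>/ levels would
> > need explicit kobjects rather than a plain attribute group:
> >
> > #include <linux/device.h>
> > #include <linux/sysfs.h>
> > #include <linux/atomic.h>
> >
> > /* Driver-wide counter, updated on every private alloc/free. */
> > static atomic64_t gpu_mem_total_bytes = ATOMIC64_INIT(0);
> >
> > static ssize_t total_show(struct device *dev,
> >                           struct device_attribute *attr, char *buf)
> > {
> >         return scnprintf(buf, PAGE_SIZE, "%llu\n",
> >                          (unsigned long long)
> >                          atomic64_read(&gpu_mem_total_bytes));
> > }
> > static DEVICE_ATTR_RO(total);
> >
> > static struct attribute *gpu_mem_attrs[] = {
> >         &dev_attr_total.attr,
> >         NULL,
> > };
> >
> > /* .name creates the gpu_mem/ subdirectory under the device. */
> > static const struct attribute_group gpu_mem_group = {
> >         .name  = "gpu_mem",
> >         .attrs = gpu_mem_attrs,
> > };
> >
> > /* In probe: sysfs_create_group(&dev->kobj, &gpu_mem_group); */
> >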
> > Looking forward to your comments and feedback!
>
> I don't think this will work well, at least for upstream:
>
> - The labels are currently free-form; baking them back into your
>   structure would mean we'd need lots of hot add/remove of sysfs
>   directory trees, which sounds like a really bad idea :-/

Also, a BO's label can change over time if it is reused for a
different purpose. I'm not sure what the overhead of sysfs add/remove
is, but I don't think I want that overhead in the bo_reuse path.

(Maybe that matters less for vk, where we aren't using a userspace BO
cache.)
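
To illustrate what I mean, a rough sketch of the relabel path I'd be
ok with, assuming a fixed label enum like the Android proposal (all
names made up): two atomic counter updates, no sysfs directory churn:

#include <linux/atomic.h>

enum bo_label { BO_UNKNOWN, BO_SHADER, BO_TRANSIENT, BO_NUM_LABELS };

struct dev_mem_stats {
        atomic64_t bytes[BO_NUM_LABELS];    /* one counter per label */
};

struct my_bo {
        size_t size;
        enum bo_label label;
        struct dev_mem_stats *stats;
};

/* Cheap enough for the bo_reuse path: move the BO's size from the
 * old label's counter to the new one. */
static void my_bo_set_label(struct my_bo *bo, enum bo_label new_label)
{
        atomic64_sub(bo->size, &bo->stats->bytes[bo->label]);
        atomic64_add(bo->size, &bo->stats->bytes[new_label]);
        bo->label = new_label;
}

With free-form labels this instead becomes a lookup plus a sysfs
directory add/remove on every relabel, which is the overhead I'm
worried about.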

BR,
-R

> - Buffer objects aren't attached to pids, but to files, and files can
>   be shared. If we want to list this somewhere outside of debugfs, we
>   need to tie it into the files somehow (so proc), except the
>   underlying files are all anon inodes, so I think this gets really
>   tricky to make work well. Rough sketch below.
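>
>   Here is what per-open-file (not per-pid) accounting could look
>   like, with invented names - the counter itself is easy, it's
>   exposing it to userspace that's the hard part:
>
>   #include <linux/atomic.h>
>   #include <drm/drm_file.h>
>
>   struct my_file_priv {
>           atomic64_t gpu_mem_bytes;
>   };
>
>   /* Called with +size on alloc and -size on free. The counter
>    * follows the drm_file, so for shared BOs you still need a policy
>    * for which file gets charged. */
>   static void my_account(struct drm_file *file, s64 delta)
>   {
>           struct my_file_priv *priv = file->driver_priv;
>
>           atomic64_add(delta, &priv->gpu_mem_bytes);
>   }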
>
> Cheers, Daniel
>
> >
> > Best regards,
> > Yiwei
> >
> >
> >
> >
> > On Fri, Nov 1, 2019 at 1:37 AM Pekka Paalanen <ppaalanen at gmail.com> wrote:
> > >
> > > On Thu, 31 Oct 2019 13:57:00 -0400
> > > Kenny Ho <y2kenny at gmail.com> wrote:
> > >
> > > > Hi Yiwei,
> > > >
> > > > This is the latest series:
> > > > https://patchwork.kernel.org/cover/11120371/
> > > >
> > > > (I still need to reply to some of the feedback.)
> > > >
> > > > Regards,
> > > > Kenny
> > > >
> > > > On Thu, Oct 31, 2019 at 12:59 PM Yiwei Zhang <zzyiwei at google.com> wrote:
> > > > >
> > > > > Hi Kenny,
> > > > >
> > > > > Thanks for the info. Do you mind forwarding the existing discussion to me or having me cc'ed on that thread?
> > > > >
> > > > > Best,
> > > > > Yiwei
> > > > >
> > > > > On Wed, Oct 30, 2019 at 10:23 PM Kenny Ho <y2kenny at gmail.com> wrote:
> > > > >>
> > > > >> Hi Yiwei,
> > > > >>
> > > > >> I am not sure if you are aware, but there is an ongoing RFC on
> > > > >> adding drm support in cgroup for the purpose of resource
> > > > >> tracking.  One of the resources is GPU memory.  It's not
> > > > >> exactly the same as what you are proposing (it doesn't track
> > > > >> API usage, but it tracks the type of GPU memory from the KMD's
> > > > >> perspective), but perhaps it would be of interest to you.
> > > > >> There is no consensus on it at this point.
> > >
> > > Hi Yiwei,
> > >
> > > I'd like to point out an effort to have drivers label BOs for debugging
> > > purposes: https://lists.freedesktop.org/archives/dri-devel/2019-October/239727.html
> > >
> > > I don't know if it would work, but an obvious idea might be to use
> > > those labels for tracking the kinds of buffers - a piece of UAPI which I
> > > believe you are still missing.
> > >
> > >
> > > Thanks,
> > > pq
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch