Proposal to report GPU private memory allocations with sysfs nodes

Yiwei Zhang zzyiwei at google.com
Mon Nov 4 19:34:33 UTC 2019


Hi folks,

(Daniel, I just moved you to this thread)

Below are the latest thoughts based on all the feedback and comments.

First, I need to clarify the gpu memory object type enumeration. We
don't want to enforce a fixed set of enumerations across upstream and
Android; those should stay configurable and flexible.

Second, we want this effort to also be useful to other memory
accounting mechanisms such as PSS. At least one additional node is
needed for the part of the gpu private allocation that is not mapped
to userspace (and thus invisible to PSS). This is especially critical
for downstream Android, so that the low-memory killer daemon (lmkd)
can be aware of the actual total memory of a process and know how much
would be freed up if it kills that process. This is an effort to
demystify the "lost ram".

Given the above, the new node structure would look like this:

Global nodes:
/sys/devices/<root>/gpu_mem/global/total /* Total private allocation;
for coherency, this should also include the anonymous memory allocated
in the kmd */
/sys/devices/<root>/gpu_mem/global/total_unmapped /* Accounts for the
private allocation not mapped to userspace (not visible to PSS); this
does not need to be coherent with the "total" node. lmkd or an
equivalent service looking at PSS will only look at this node in
addition. */
/sys/devices/<root>/gpu_mem/global/<type1> /* One total value per
type; this should also include the anonymous memory allocated in the
kmd (or maybe a separate anonymous type for the global nodes) */
/sys/devices/<root>/gpu_mem/global/<type2> /* One total value per type */
...
/sys/devices/<root>/gpu_mem/global/<typeN> /* One total value per type */

Per-process nodes:
/sys/devices/<root>/gpu_mem/proc/<pid>/total /* Total private
allocation; this is the coherent total for the process */
/sys/devices/<root>/gpu_mem/proc/<pid>/total_unmapped /* Accounts for
the private allocation not mapped to userspace (not visible to PSS);
this does not need to be coherent with the "total" node. lmkd or an
equivalent service looking at PSS will only look at this node in
addition. */
/sys/devices/<root>/gpu_mem/proc/<pid>/<type1> /* One total value per type */
/sys/devices/<root>/gpu_mem/proc/<pid>/<type2> /* One total value per type */
...
/sys/devices/<root>/gpu_mem/proc/<pid>/<typeN> /* One total value per type */
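
To make the shape of this concrete, here is a minimal sketch of how a
kmd could back the two global nodes with plain kobject/sysfs
attributes. Everything in it (the counter variables, the init helper,
the error handling) is illustrative and not taken from any existing
driver:

/*
 * Hypothetical sketch only: exposes gpu_mem/global/total and
 * gpu_mem/global/total_unmapped under the GPU's device directory.
 */
#include <linux/atomic.h>
#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/kobject.h>
#include <linux/mm.h>
#include <linux/sysfs.h>

/* Updated by the driver's allocation/free paths (illustrative). */
static atomic64_t gpu_mem_total;          /* all private allocations */
static atomic64_t gpu_mem_total_unmapped; /* never mapped to userspace */

static ssize_t total_show(struct kobject *kobj,
                          struct kobj_attribute *attr, char *buf)
{
        return scnprintf(buf, PAGE_SIZE, "%llu\n",
                         (unsigned long long)atomic64_read(&gpu_mem_total));
}

static ssize_t total_unmapped_show(struct kobject *kobj,
                                   struct kobj_attribute *attr, char *buf)
{
        return scnprintf(buf, PAGE_SIZE, "%llu\n",
                         (unsigned long long)
                         atomic64_read(&gpu_mem_total_unmapped));
}

static struct kobj_attribute total_attr = __ATTR_RO(total);
static struct kobj_attribute total_unmapped_attr = __ATTR_RO(total_unmapped);

static struct attribute *gpu_mem_global_attrs[] = {
        &total_attr.attr,
        &total_unmapped_attr.attr,
        NULL,
};

static const struct attribute_group gpu_mem_global_group = {
        .attrs = gpu_mem_global_attrs,
};

/* Called once from the driver's probe path; dev is the GPU device. */
static int gpu_mem_sysfs_init(struct device *dev)
{
        struct kobject *gpu_mem, *global;

        gpu_mem = kobject_create_and_add("gpu_mem", &dev->kobj);
        if (!gpu_mem)
                return -ENOMEM;

        global = kobject_create_and_add("global", gpu_mem);
        if (!global) {
                kobject_put(gpu_mem);
                return -ENOMEM;
        }

        return sysfs_create_group(global, &gpu_mem_global_group);
}

The per-type and per-process nodes would be wired up the same way,
with one kobject per <pid> under gpu_mem/proc/. lmkd or an equivalent
service then only has to read these files at its usual polling points.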

For downstream Android, type1 through typeN will be the enumerations I
mentioned in the original email: unknown, shader,..., transient. For
upstream, those can be the labeled BOs or any other customized types.
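
As a rough illustration of the Android side (the original email only
spells out the endpoints of that enumeration, so the middle entries
are deliberately elided here as well):

enum gpu_mem_type {
        GPU_MEM_TYPE_UNKNOWN,
        GPU_MEM_TYPE_SHADER,
        /* ... the remaining types from the original enumeration ... */
        GPU_MEM_TYPE_TRANSIENT,
        GPU_MEM_TYPE_COUNT, /* number of per-type sysfs nodes */
};

Upstream drivers would instead derive the per-type nodes from their BO
labels or whatever categorization they already maintain.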

Looking forward to your comments and feedback!

Best regards,
Yiwei




On Fri, Nov 1, 2019 at 1:37 AM Pekka Paalanen <ppaalanen at gmail.com> wrote:
>
> On Thu, 31 Oct 2019 13:57:00 -0400
> Kenny Ho <y2kenny at gmail.com> wrote:
>
> > Hi Yiwei,
> >
> > This is the latest series:
> > https://patchwork.kernel.org/cover/11120371/
> >
> > (I still need to reply some of the feedback.)
> >
> > Regards,
> > Kenny
> >
> > On Thu, Oct 31, 2019 at 12:59 PM Yiwei Zhang <zzyiwei at google.com> wrote:
> > >
> > > Hi Kenny,
> > >
> > > Thanks for the info. Do you mind forwarding the existing discussion to me or have me cc'ed in that thread?
> > >
> > > Best,
> > > Yiwei
> > >
> > > On Wed, Oct 30, 2019 at 10:23 PM Kenny Ho <y2kenny at gmail.com> wrote:
> > >>
> > >> Hi Yiwei,
> > >>
> > >> I am not sure if you are aware, but there is an ongoing RFC on
> > >> adding drm support in cgroup for the purpose of resource tracking.
> > >> One of the resources is GPU memory.  It's not exactly the same as
> > >> what you are proposing (it doesn't track API usage, but it tracks
> > >> the type of GPU memory from the kmd perspective), but perhaps it
> > >> would be of interest to you.  There is no consensus on it at this
> > >> point.
>
> Hi Yiwei,
>
> I'd like to point out an effort to have drivers label BOs for debugging
> purposes: https://lists.freedesktop.org/archives/dri-devel/2019-October/239727.html
>
> I don't know if it would work, but an obvious idea might be to use
> those labels for tracking the kinds of buffers - a piece of UAPI which I
> believe you are still missing.
>
>
> Thanks,
> pq

