<div dir="ltr"><div dir="ltr">On Mon, Oct 28, 2019 at 8:26 AM Jerome Glisse &lt;<a href="mailto:jglisse@redhat.com">jglisse@redhat.com</a>&gt; wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Fri, Oct 25, 2019 at 11:35:32AM -0700, Yiwei Zhang wrote:<br> > Hi folks,<br> > <br> > This is the plain text version of the previous email in case that was<br> > considered as spam.<br> > <br> > --- Background ---<br> > On the downstream Android, vendors used to report GPU private memory<br> > allocations with debugfs nodes in their own formats. However, debugfs nodes<br> > are getting deprecated in the next Android release.<br> <br> Maybe explain why it is useful first ?<br></blockquote><div><br></div><div>Memory is precious on Android mobile platforms. Apps that use a large amount of</div><div>memory, such as games, tend to maintain a table for the memory on different devices with</div><div>different prediction models. 
Private GPU memory allocations are currently only partly visible</div><div>to the apps and to the platform as well.</div><div><br></div><div>With this data, the following become possible:</div><div>(1) GPU memory profiling as part of the huge Android profiler in progress.</div><div>(2) The Android system health team can enrich the performance test coverage.</div><div>(3) We can collect field metrics to detect any regression in GPU private memory</div><div>allocations in the production population.</div><div>(4) Shell users can easily dump the allocations in a uniform way across vendors.</div><div>(5) The platform can feed the data to the apps so that apps can do memory allocations</div><div>in a more predictable way.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> > <br> > --- Proposal ---<br> > We are taking the chance to unify all the vendors to migrate their existing<br> > debugfs nodes into a standardized sysfs node structure. Then the platform<br> > is able to do a bunch of useful things: memory profiling, system health<br> > coverage, field metrics, local shell dump, in-app api, etc. This proposal<br> > is better served upstream as all GPU vendors can standardize a gpu memory<br> > structure and reduce fragmentation across Android and Linux that clients<br> > can rely on.<br> > <br> > --- Detailed design ---<br> > The sysfs node structure looks like below:<br> > /sys/devices/&lt;ro.gfx.sysfs.0&gt;/&lt;pid&gt;/&lt;type_name&gt;<br> > e.g. "/sys/devices/mali0/gpu_mem/606/gl_buffer" and the gl_buffer is a node<br> > having the comma separated size values: "4096,81920,...,4096".<br> <br> How does kernel knows what API the allocation is use for ? With the<br> open source driver you never specify what API is creating a gem object<br> (opengl, vulkan, ...) 
nor what purpose (transient, shader, ...).<br></blockquote><div><br></div><div>Oh, is this a hard requirement for the open source drivers to not bookkeep any</div><div>data from userland? I think the API is just some additional metadata passed down.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> > For the top level root, vendors can choose their own names based on the<br> > value of ro.gfx.sysfs.0 the vendors set. (1) For the multiple gpu driver<br> > cases, we can use ro.gfx.sysfs.1, ro.gfx.sysfs.2 for the 2nd and 3rd KMDs.<br> > (2) It's also allowed to put some sub-dir for example "kgsl/gpu_mem" or<br> > "mali0/gpu_mem" in the ro.gfx.sysfs.&lt;channel&gt; property if the root name<br> > under /sys/devices/ is already created and used for other purposes.<br> <br> On one side you want to standardize on the other you want to give<br> complete freedom on the top level naming scheme. I would rather see a<br> consistent naming scheme (ie something more restraint and with little<br> place for interpration by individual driver)<br></blockquote><div><br></div><div>Thanks for commenting on this. We definitely need some suggestions on the root</div><div>directory. In the multi-gpu case on desktop, is there some existing consumer to</div><div>query "some data" from all the GPUs? How does the tool find all GPUs and</div><div>differentiate between them? Is this already standardized?<br><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> > For the 2nd level "pid", there are usually just a couple of them per<br> > snapshot, since we only takes snapshot for the active ones.<br> <br> ? Do not understand here, you can have any number of applications with<br> GPU objects ? And thus there is no bound on the number of PID. 
Please<br> consider desktop too, i do not know what kind of limitation android<br> impose.<br></blockquote><div><br></div><div>We are only interested in tracking *active* GPU private allocations. So yes, any</div><div>application currently holding an active GPU context will probably have a node here.</div><div>Since we want to do profiling for specific apps, the data has to be</div><div>per-application. I don't get your concerns here. If it's about the tracking overhead, it's rare</div><div>to see many applications making private GPU allocations at the same time. Could</div><div>you elaborate a bit?</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> > For the 3rd level "type_name", the type name will be one of the GPU memory<br> > object types in lower case, and the value will be a comma separated<br> > sequence of size values for all the allocations under that specific type.<br> > <br> > We especially would like some comments on this part. For the GPU memory<br> > object types, we defined 9 different types for Android:<br> > (1) UNKNOWN // not accounted for in any other category<br> > (2) SHADER // shader binaries<br> > (3) COMMAND // allocations which have a lifetime similar to a<br> > VkCommandBuffer<br> > (4) VULKAN // backing for VkDeviceMemory<br> > (5) GL_TEXTURE // GL Texture and RenderBuffer<br> > (6) GL_BUFFER // GL Buffer<br> > (7) QUERY // backing for query<br> > (8) DESCRIPTOR // allocations which have a lifetime similar to a<br> > VkDescriptorSet<br> > (9) TRANSIENT // random transient things that the driver needs<br> ><br> > We are wondering if those type enumerations make sense to the upstream side<br> > as well, or maybe we just deal with our own different type sets. 
Cuz on the<br> > Android side, we'll just read those nodes named after the types we defined<br> > in the sysfs node structure.<br> <br> See my above point of open source driver and kernel being unaware<br> of the allocation purpose and use.<br> <br> Cheers,<br> Jérôme<br> <br></blockquote><div><br></div><div>Many thanks for the reply!</div><div>Yiwei</div></div></div>