[PATCH] drm/sysfs: Add drm class-wide attribute to get active device clients
Lucas De Marchi
lucas.demarchi at intel.com
Wed May 1 16:19:48 UTC 2024
On Wed, May 01, 2024 at 04:58:05PM GMT, Tvrtko Ursulin wrote:
>
>Hi,
>
>On 24/04/2024 15:48, Adrián Larumbe wrote:
>>Hi Tvrtko,
>>
>>On 15.04.2024 13:50, Tvrtko Ursulin wrote:
>>>
>>>On 05/04/2024 18:59, Rob Clark wrote:
>>>>On Wed, Apr 3, 2024 at 11:37 AM Adrián Larumbe
>>>><adrian.larumbe at collabora.com> wrote:
>>>>>
>>>>>Up to this day, all fdinfo-based GPU profilers must traverse the entire
>>>>>/proc directory structure to find open DRM clients with fdinfo file
>>>>>descriptors. This is inefficient and time-consuming.
>>>>>
>>>>>This patch adds a new device class attribute that will install a sysfs file
>>>>>per DRM device, which can be queried by profilers to get a list of PIDs for
>>>>>their open clients. This file isn't human-readable, and it's meant to be
>>>>>queried only by GPU profilers like gputop and nvtop.
>>>>>
>>>>>Cc: Boris Brezillon <boris.brezillon at collabora.com>
>>>>>Cc: Tvrtko Ursulin <tursulin at ursulin.net>
>>>>>Cc: Christopher Healy <healych at amazon.com>
>>>>>Signed-off-by: Adrián Larumbe <adrian.larumbe at collabora.com>
>>>>
>>>>It does seem like a good idea.. idk if there is some precedent to
>>>>prefer binary vs ascii in sysfs, but having a way to avoid walking
>>>>_all_ processes is a good idea.
>>>
>>>I naturally second that it is a needed feature, but I do not think binary
>>>format is justified. AFAIR it should be used for things like hw/fw
>>>standardised tables or firmware images, not when exporting a simple list of
>>>PIDs. It also precludes easy shell/script access and the benefit of avoiding
>>>parsing a short list is I suspect completely dwarfed by needing to parse all
>>>the related fdinfo etc.
>>
>>I'd rather keep it as a binary file for the sake of easily parsing the number
>>list on the client side, in gputop or nvtop. For textual access, there's already
>>a debugfs file that presents the same information, so I thought it was best not
>>to duplicate that functionality and restrict sysfs to serving the very specific
>>use case of UM profilers having to access the DRM client list.
>>
>>I should mention I did something controversial here, which is a semantically
>>binary attribute through the regular attribute interface. I guess if I keep it
>>as a binary attribute in the end, I should switch over to the binary attribute
>>API.
>>
>>Another reason why I implemented it as a binary file is that we can only send
>>back at most a whole page. If a PID takes 4 bytes, that's usually 1024 clients
>>at most, which is probably enough for any UM profiler, but will decrease even
>>more if we turn it into an ASCII readable file.
>
>I'm afraid I still think there is no reason for a binary file, even
>less so artificially limited to 1024 clients. Any consumer will have
>to parse text fdinfo so a binary list of pids is not adding any real
>cost.
Yeah, I don't really understand why you'd want a binary number that
you'd then have to turn into a string to open /proc/<pid>/. To me it
sounds more like we want text output, with each line of that output being:
<pid>/fdinfo/<fd>
So gputop could just read this file to know where the info is.
Too bad sysfs can't symlink across filesystems, otherwise we could just
add symlinks to e.g. /sys/class/drm/card<N>/clients/*, which would then
nicely separate it per GPU too.
But see below.
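To make the text format above concrete, here is a rough userspace sketch
of what a consumer could do with one line of such a file. Nothing here
exists today: the "<pid>/fdinfo/<fd>" line format is only my suggestion,
and client_line_to_path() is a made-up helper name.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: turn one line of the proposed "<pid>/fdinfo/<fd>"
 * text output into a /proc path. Returns 0 on success, -1 on a malformed
 * line or when the output buffer is too small.
 */
static int client_line_to_path(const char *line, char *out, size_t outlen)
{
	int pid, fd, n;
	char tail;

	assert(line && out && outlen > 0);

	/* The trailing %c catches junk after <fd>; a newline is accepted. */
	n = sscanf(line, "%d/fdinfo/%d%c", &pid, &fd, &tail);
	if (n < 2 || (n == 3 && tail != '\n') || pid <= 0 || fd < 0)
		return -1;

	n = snprintf(out, outlen, "/proc/%d/fdinfo/%d", pid, fd);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

The point is that the profiler needs the path as a string anyway, so with
text output the parsing step is a single sscanf() per client.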
>
>>I did some research into sysfs binary attributes, and while some sources mention that
>>it's often used for dumping or loading of driver FW, none of them claim it cannot
>>be used for other purposes.
>>
>>>>>---
>>>>> drivers/gpu/drm/drm_internal.h | 2 +-
>>>>> drivers/gpu/drm/drm_privacy_screen.c | 2 +-
>>>>> drivers/gpu/drm/drm_sysfs.c | 89 ++++++++++++++++++++++------
>>>>> 3 files changed, 74 insertions(+), 19 deletions(-)
>>>>>
>>>>>diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
>>>>>index 2215baef9a3e..9a399b03d11c 100644
>>>>>--- a/drivers/gpu/drm/drm_internal.h
>>>>>+++ b/drivers/gpu/drm/drm_internal.h
>>>>>@@ -145,7 +145,7 @@ bool drm_master_internal_acquire(struct drm_device *dev);
>>>>> void drm_master_internal_release(struct drm_device *dev);
>>>>>
>>>>> /* drm_sysfs.c */
>>>>>-extern struct class *drm_class;
>>>>>+extern struct class drm_class;
>>>>>
>>>>> int drm_sysfs_init(void);
>>>>> void drm_sysfs_destroy(void);
>>>>>diff --git a/drivers/gpu/drm/drm_privacy_screen.c b/drivers/gpu/drm/drm_privacy_screen.c
>>>>>index 6cc39e30781f..2fbd24ba5818 100644
>>>>>--- a/drivers/gpu/drm/drm_privacy_screen.c
>>>>>+++ b/drivers/gpu/drm/drm_privacy_screen.c
>>>>>@@ -401,7 +401,7 @@ struct drm_privacy_screen *drm_privacy_screen_register(
>>>>> mutex_init(&priv->lock);
>>>>> BLOCKING_INIT_NOTIFIER_HEAD(&priv->notifier_head);
>>>>>
>>>>>- priv->dev.class = drm_class;
>>>>>+ priv->dev.class = &drm_class;
>>>>> priv->dev.type = &drm_privacy_screen_type;
>>>>> priv->dev.parent = parent;
>>>>> priv->dev.release = drm_privacy_screen_device_release;
>>>>>diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
>>>>>index a953f69a34b6..56ca9e22c720 100644
>>>>>--- a/drivers/gpu/drm/drm_sysfs.c
>>>>>+++ b/drivers/gpu/drm/drm_sysfs.c
>>>>>@@ -58,8 +58,6 @@ static struct device_type drm_sysfs_device_connector = {
>>>>> .name = "drm_connector",
>>>>> };
>>>>>
>>>>>-struct class *drm_class;
>>>>>-
>>>>> #ifdef CONFIG_ACPI
>>>>> static bool drm_connector_acpi_bus_match(struct device *dev)
>>>>> {
>>>>>@@ -128,6 +126,62 @@ static const struct component_ops typec_connector_ops = {
>>>>>
>>>>> static CLASS_ATTR_STRING(version, S_IRUGO, "drm 1.1.0 20060810");
>>>>>
>>>>>+static ssize_t clients_show(struct device *cd, struct device_attribute *attr, char *buf)
>>>>>+{
>>>>>+ struct drm_minor *minor = cd->driver_data;
>>>>>+ struct drm_device *ddev = minor->dev;
>>>>>+ struct drm_file *priv;
>>>>>+ ssize_t offset = 0;
>>>>>+ void *pid_buf;
>>>>>+
>>>>>+ if (minor->type != DRM_MINOR_RENDER)
>>>>>+ return 0;
>>>
>>>Why this?
>>
>>I return nothing in case of a non-render node because we don't want display drivers
>>to confuse UM GPU profilers.
>
>Feels too arbitrary to me. Let them handle it.
>
>>>>>+
>>>>>+ pid_buf = kvmalloc(PAGE_SIZE, GFP_KERNEL);
>>>
>>>I don't quite get the kvmalloc for just one page (or why even a temporary
>>>buffer and not write into buf directly?).
>>
>>Should've used kmalloc, you're right. Or else I could just write everything straight into 'buf'.
>>
>>>>>+ if (!pid_buf)
>>>>>+ return 0;
>>>>>+
>>>>>+ mutex_lock(&ddev->filelist_mutex);
>>>>>+ list_for_each_entry_reverse(priv, &ddev->filelist, lhead) {
>>>>>+ struct pid *pid;
>>>>>+
>>>>>+ if (drm_WARN_ON(ddev, (PAGE_SIZE - offset) < sizeof(pid_t)))
>>>>>+ break;
>>>
>>>Feels bad.. I would suggest exploring implementing a read callback (instead of
>>>show) and handling arbitrary size output.
>>
>>I think regular class attributes can only implement show() and set(). For a more complex
>>interface, I would have to turn it into an actual binary attribute, and that would be the only
>>choice if we want the list of clients to be of arbitrary size.
>
>Yeah, i915 uses that to dump the error capture file which can be huge
>and is text so it is doable.
>
>>>>>+
>>>>>+ rcu_read_lock();
>>>>>+ pid = rcu_dereference(priv->pid);
>>>>>+ (*(pid_t *)(pid_buf + offset)) = pid_vnr(pid);
>>>>>+ rcu_read_unlock();
>>>>>+
>>>>>+ offset += sizeof(pid_t);
>>>>>+ }
>>>>>+ mutex_unlock(&ddev->filelist_mutex);
>>>>>+
>>>>>+ if (offset < PAGE_SIZE)
>>>>>+ (*(pid_t *)(pid_buf + offset)) = 0;
>>>
>>>Either NULL terminated or PAGE_SIZE/sizeof(pid) entries and not NULL
>>>terminated feels weird. If I got that right.
>>>
>>>For me everything points towards going for text output.
>>
>>Yes, I know it might sound weird, but my reasoning was: either there are PAGE_SIZE/sizeof(pid) entries
>>and the file isn't NULL terminated (which should be picked up by clients as being one page worth
>>of data, the sysfs attribute maximum transfer unit), or else there aren't enough entries to fill
>>a page and after the last one there's a NULL entry.
>>
>>
>>>>>+
>>>>>+ memcpy(buf, pid_buf, offset);
>>>>>+
>>>>>+ kvfree(pid_buf);
>>>>>+
>>>>>+ return offset;
>>>>>+
>>>>>+}
>>>>>+static DEVICE_ATTR_RO(clients);
/proc/<pid>/fdinfo/ is only readable by the owner. If we report which
fds are open (or even which pids have a drm fd), we are leaking that
info to every user who can read this file. So we should probably make
this DEVICE_ATTR_ADMIN_RO.
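On the consumer side that would mean an unprivileged profiler gets an
error on open and has to fall back to walking /proc as it does today.
A rough userspace sketch, assuming nothing beyond open(2) semantics; the
helper names are made up and the attribute path is whatever the final
patch ends up exposing:

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical helper: try to open the clients attribute.
 * Returns an open fd on success, or a negative errno on failure.
 */
static int open_clients_attr(const char *path)
{
	int fd = open(path, O_RDONLY);

	return fd >= 0 ? fd : -errno;
}

/*
 * Decide whether to fall back to the /proc walk: the file is admin-only
 * (-EACCES/-EPERM) or the kernel doesn't provide it at all (-ENOENT).
 */
static int needs_proc_fallback(int err)
{
	return err == -EACCES || err == -EPERM || err == -ENOENT;
}
```

Any other error is probably worth reporting rather than silently falling
back.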
Lucas De Marchi
>>>
>>>Shouldn't BIN_ATTR_RO be used for binary files in sysfs?
>>
>>Like I said above, I sort of faked a binary attribute through the regular sysfs attr API,
>>which is most likely a bad idea.
>>
>>>Regards,
>>>
>>>Tvrtko
>>>
>>>P.S. Or maybe it is time for drmfs? Where each client gets a directory and
>>>drivers can populate files. Such as per client logging streams and whatnot.
>>
>>Yes, but maybe this is something we can discuss in depth in an RFC at a later time?
>
>Yes of course, it is just a long standing idea for flexible per client
>stuff.
>
>Regards,
>
>Tvrtko
>
>>
>>>>>+
>>>>>+static struct attribute *drm_device_attrs[] = {
>>>>>+ &dev_attr_clients.attr,
>>>>>+ NULL,
>>>>>+};
>>>>>+ATTRIBUTE_GROUPS(drm_device);
>>>>>+
>>>>>+struct class drm_class = {
>>>>>+ .name = "drm",
>>>>>+ .dev_groups = drm_device_groups,
>>>>>+};
>>>>>+
>>>>>+static bool drm_class_initialised;
>>>>>+
>>>>> /**
>>>>> * drm_sysfs_init - initialize sysfs helpers
>>>>> *
>>>>>@@ -142,18 +196,19 @@ int drm_sysfs_init(void)
>>>>> {
>>>>> int err;
>>>>>
>>>>>- drm_class = class_create("drm");
>>>>>- if (IS_ERR(drm_class))
>>>>>- return PTR_ERR(drm_class);
>>>>>+ err = class_register(&drm_class);
>>>>>+ if (err)
>>>>>+ return err;
>>>>>
>>>>>- err = class_create_file(drm_class, &class_attr_version.attr);
>>>>>+ err = class_create_file(&drm_class, &class_attr_version.attr);
>>>>> if (err) {
>>>>>- class_destroy(drm_class);
>>>>>- drm_class = NULL;
>>>>>+ class_destroy(&drm_class);
>>>>> return err;
>>>>> }
>>>>>
>>>>>- drm_class->devnode = drm_devnode;
>>>>>+ drm_class.devnode = drm_devnode;
>>>>>+
>>>>>+ drm_class_initialised = true;
>>>>>
>>>>> drm_sysfs_acpi_register();
>>>>> return 0;
>>>>>@@ -166,12 +221,12 @@ int drm_sysfs_init(void)
>>>>> */
>>>>> void drm_sysfs_destroy(void)
>>>>> {
>>>>>- if (IS_ERR_OR_NULL(drm_class))
>>>>>+ if (!drm_class_initialised)
>>>>> return;
>>>>> drm_sysfs_acpi_unregister();
>>>>>- class_remove_file(drm_class, &class_attr_version.attr);
>>>>>- class_destroy(drm_class);
>>>>>- drm_class = NULL;
>>>>>+ class_remove_file(&drm_class, &class_attr_version.attr);
>>>>>+ class_destroy(&drm_class);
>>>>>+ drm_class_initialised = false;
>>>>> }
>>>>>
>>>>> static void drm_sysfs_release(struct device *dev)
>>>>>@@ -372,7 +427,7 @@ int drm_sysfs_connector_add(struct drm_connector *connector)
>>>>> return -ENOMEM;
>>>>>
>>>>> device_initialize(kdev);
>>>>>- kdev->class = drm_class;
>>>>>+ kdev->class = &drm_class;
>>>>> kdev->type = &drm_sysfs_device_connector;
>>>>> kdev->parent = dev->primary->kdev;
>>>>> kdev->groups = connector_dev_groups;
>>>>>@@ -550,7 +605,7 @@ struct device *drm_sysfs_minor_alloc(struct drm_minor *minor)
>>>>> minor_str = "card%d";
>>>>>
>>>>> kdev->devt = MKDEV(DRM_MAJOR, minor->index);
>>>>>- kdev->class = drm_class;
>>>>>+ kdev->class = &drm_class;
>>>>> kdev->type = &drm_sysfs_device_minor;
>>>>> }
>>>>>
>>>>>@@ -579,10 +634,10 @@ struct device *drm_sysfs_minor_alloc(struct drm_minor *minor)
>>>>> */
>>>>> int drm_class_device_register(struct device *dev)
>>>>> {
>>>>>- if (!drm_class || IS_ERR(drm_class))
>>>>>+ if (!drm_class_initialised)
>>>>> return -ENOENT;
>>>>>
>>>>>- dev->class = drm_class;
>>>>>+ dev->class = &drm_class;
>>>>> return device_register(dev);
>>>>> }
>>>>> EXPORT_SYMBOL_GPL(drm_class_device_register);
>>>>>
>>>>>base-commit: 45c734fdd43db14444025910b4c59dd2b8be714a
>>>>>--
>>>>>2.44.0
>>>>>
>>
>>Adrian Larumbe