[igt-dev] [i-g-t, 1/1] Uevent listener for fake gt reset failure.
Ghimiray, Himal Prasad
himal.prasad.ghimiray at intel.com
Tue Jun 13 08:46:18 UTC 2023
Hi Kamil,
Thanks for the review.
> -----Original Message-----
> From: Kamil Konieczny <kamil.konieczny at linux.intel.com>
> Sent: 09 June 2023 15:06
> To: igt-dev at lists.freedesktop.org
> Cc: Ghimiray, Himal Prasad <himal.prasad.ghimiray at intel.com>; Kumar,
> Janga Rahul <janga.rahul.kumar at intel.com>
> Subject: Re: [igt-dev] [i-g-t, 1/1] Uevent listener for fake gt reset failure.
>
> Hi Himal,
>
> please do not put dot at the end of subject:
>
> Uevent listener for fake gt reset failure.
> -----------------------------------------^
>
> Remove this '.' from the end. Also please add at begin what part of igt you
> modify, here it is adding new xe test so:
>
> tests/xe/xe_uevent: add new test for uevent gt reset failure
Will address in next patch.
>
> On 2023-06-05 at 11:20:59 +0530, Himal Prasad Ghimiray wrote:
> > This test is to cause the fake reset failure and capture the uevent
> --------------------------- ^^^^
> Describe why we need a fake one, not real one.
Sure.
>
> > sent in case of gt reset failure.
> >
> > The sending of Uevent is taken care by series:
> > drm/xe: Notify Userspace when engine/gt reset fails.
>
Will upload the patchwork link in next patch.
> Please write here link to public patchwork or GitLab or, if you prefer to not
> include it inot this description, put it after signed-off-by line and afer "---"
> line, see below.
>
> >
> > The infra to cause fake reset failure is in series:
> > drm/xe: Add a debugfs for faking gt reset failure.
>
> Same here, please also provide link.
Will address.
>
> >
> > Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
> > ---
>
> ---
> Write here other comments or links which you do not want to be merged.
>
>
> > tests/meson.build | 1 +
> > .../xe_uevent_listener_fake_reset_failure.c | 128 ++++++++++++++++++
> > 2 files changed, 129 insertions(+)
> > create mode 100644 tests/xe/xe_uevent_listener_fake_reset_failure.c
> >
> > diff --git a/tests/meson.build b/tests/meson.build index
> > f71be1db..cc5843d7 100644
> > --- a/tests/meson.build
> > +++ b/tests/meson.build
> > @@ -268,6 +268,7 @@ xe_progs = [
> > 'xe_query',
> > 'xe_vm',
> > 'xe_waitfence',
> > + 'xe_uevent_listener_fake_reset_failure',
> ----------------- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This is too long for a name
> and also looks like one-only test, so please make it short, something like:
>
> 'xe_uevent',
>
> This way we can extend it in future for new subtests.
> Also, as noted by Rahul, keep it sorted alphabetiacally.
Sure. Will address it.
>
> > ]
> >
> > msm_progs = [
> > diff --git a/tests/xe/xe_uevent_listener_fake_reset_failure.c
> > b/tests/xe/xe_uevent_listener_fake_reset_failure.c
> > new file mode 100644
> > index 00000000..dd22e4c2
> > --- /dev/null
> > +++ b/tests/xe/xe_uevent_listener_fake_reset_failure.c
> > @@ -0,0 +1,128 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2023 Intel Corporation */
> > +
> > +/**
> > + * TEST: cause fake gt reset failure and listen uevent from KMD
> > + * SUBTEST:fake_reset_uevent_listener
> > + * Description:
> > + * Test creates uevent listener and causes fake reset failure for
> gt0
> > + * and returns success if uevent is sent by driver and listened by
> listener.
> > + */
> > +
> > +#include "igt.h"
> ------------ ^
> > +#include "xe_drm.h"
> ------------ ^
> Keep includes sorted alphabetically, also please put system includes first,
> then write one newline, then igt includes.
> > +#include "xe/xe_query.h"
>
> Newline here to separate system and igt includes.
>
> > +#include <string.h>
> ------------ ^
> > +#include <libudev.h>
> ------------ ^
> > +#include <sys/stat.h>
> ------------ ^
> Same here, sort them and swap (first system headers, then igt).
Will address them.
>
> > +
> > +static bool xe_force_gt_fake_reset(int fd, int gt) {
> > + char reset_string[128];
> > + struct stat st;
> > +
> > + igt_assert_eq(fstat(fd, &st), 0);
> > + snprintf(reset_string, sizeof(reset_string),
> > + "cat /sys/kernel/debug/dri/%d/gt%d/fake_reset_failure",
> > +minor(st.st_rdev), gt);
> ----------------- ^
> Just open and write to this file, search for igt helper in lib.
Will look into it.
>
> > +
> > + if (system(reset_string))
> ----------- ^
> Direct write is better for error handling.
>
> > + return false;
> > +
> > + return true;
> > +}
> > +
> > +static bool listen_reset_fail_uevent(struct udev_device *device,
> > +const char *source, int gt_id) {
> > + struct udev_list_entry *list_entry;
> > + bool reset_failed = false;
> > + bool reset_unit_is_gt = false;
> > + bool gt_id_matches = false;
> > + const char *name, *val;
> > +
> > + udev_list_entry_foreach(list_entry,
> udev_device_get_properties_list_entry(device))
> > + {
> > + name = udev_list_entry_get_name(list_entry);
> > + val = udev_list_entry_get_value(list_entry);
> > +
> > + if (!strcmp(name, "RESET_FAILED") && !strcmp(val, "1")) {
> > + printf("%s = %s\n", name, val);
> ----------------------- ^
> Do not use printf, use igt_debug or igt_info. Use of igt_debug is preferable as
> to not clobber output.
Sure. Will address in next patch.
>
> > + reset_failed = true;
> > + continue;
> > + }
> > +
> > + if (!strcmp(name, "RESET_UNIT") && !strcmp(val, "gt")) {
> > + printf("%s = %s\n", name, val);
> ----------------------- ^
> Same here.
Will address in next patch.
>
> > + reset_unit_is_gt = true;
> > + continue;
> > + }
> > +
> > + if (!strcmp(name, "RESET_ID") && (atoi(val) == gt_id)) {
> > + printf("%s = %s\n", name, val);
> ----------------------- ^
> Same here.
Will address in next patch.
>
> > + gt_id_matches = true;
> > + continue;
> > + }
> > + }
> > +
> > + return (reset_failed && reset_unit_is_gt && gt_id_matches); }
> > +
> > +static void fake_reset_uevent_listener(int fd, int gt_id) {
> > + struct udev *udev;
> > + struct udev_device *dev;
> > + struct udev_monitor *mon;
> > + bool event_recieved = false;
> > + bool event_sent = false;
> > +
> > + /* create udev object */
> > + udev = udev_new();
> > + if (!udev) {
> > + fprintf(stderr, "Can't create udev\n");
> --------------- ^
> use igt_debug
Sure.
>
> > + goto no_udev;
> --------------- ^
> Instead of goto please use igt_assert_f(false, "describe fail here\n");
Ok.
>
> > + }
> > +
> > + mon = udev_monitor_new_from_netlink(udev, "kernel");
> > + udev_monitor_filter_add_match_subsystem_devtype(mon, "drm",
> "drm_minor");
> > + udev_monitor_enable_receiving(mon);
> > + igt_until_timeout(5) {
> ------------------------- ^
> Write it as define or constant.
Sure.
>
> > + dev = udev_monitor_receive_device(mon);
> > +
> > + if (dev) {
> > + event_recieved =
> listen_reset_fail_uevent(dev, "kernel", gt_id);
> > + udev_device_unref(dev);
> > + }
> > +
> > + if (event_sent)
> > + goto free_udev;
> ------------------------------- ^
> Just
> break;
>
> is imho better.
Makes sense.
>
> > +
> > + event_sent = xe_force_gt_fake_reset(fd, gt_id);
> > + usleep(50 * 1000);
> ---------------------- ^
> Write it as define or constant also comment why you want to sleep here.
Sure.
>
> > + }
> > +
> > +free_udev:
> > + udev_unref(udev);
> > +no_udev:
> > + igt_assert(event_recieved);
> > +}
> > +
> > +igt_main
> > +{
> > + int fd;
> > +
> > + igt_fixture {
> > + fd = drm_open_driver(DRIVER_XE);
> > + xe_device_get(fd);
> > + }
> > +
> > + usleep(50 * 1000);
> ------- ^
> Do not make such sleeps here, why do you need it here?
drm_open_driver, loads driver incase it is not loaded already. Which will cause lot of uevents
and listener might interpret wrong events and cause failure. Above delay is to ensure uevents triggered during driver load are
settled down before initiating listener for fake reset failure.
Will add the reason for delay in comment section.
>
> Regards,
> Kamil
>
> > +
> > + igt_subtest("fake_reset_uevent_listener")
> > + fake_reset_uevent_listener(fd, 0);
> > +
> > + igt_fixture {
> > + xe_device_put(fd);
> > + close(fd);
> > + }
> > +}
> > --
> > 2.25.1
> >
More information about the igt-dev
mailing list