[PATCH 06/13] tests/intel/xe_oa: Rewrite the polling small buf test

Tue Feb 25 04:26:56 UTC 2025

On Mon, 24 Feb 2025 14:56:05 -0800, Umesh Nerlige Ramappa wrote:
>
> On Mon, Feb 24, 2025 at 12:11:37PM -0800, Dixit, Ashutosh wrote:
> > On Tue, 18 Feb 2025 12:28:04 -0800, Umesh Nerlige Ramappa wrote:
> >>
> >
> > Hi Umesh,
> >
> >> Use mmio reads as a side-channel to determine if reports are available
> >> and ensure that poll will return with POLLIN set. Then provide a small
> >> buffer to force ENOSPC error. Then poll with a timeout of 0 to check if
> >> POLLIN is still set.
> >
> > Will need a reason for doing this here. But see below.
> >
> >>
> >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
> >> ---
> >>  tests/intel/xe_oa.c | 64 +++++++++++++++++++++++++--------------------
> >>  1 file changed, 35 insertions(+), 29 deletions(-)
> >>
> >> diff --git a/tests/intel/xe_oa.c b/tests/intel/xe_oa.c
> >> index aaf92308a..5792ffec2 100644
> >> --- a/tests/intel/xe_oa.c
> >> +++ b/tests/intel/xe_oa.c
> >> @@ -2216,7 +2216,6 @@ static void test_polling(uint64_t requested_oa_period,
> >>   */
> >>  static void test_polling_small_buf(void)
> >>  {
> >> -	int oa_exponent = max_oa_exponent_for_period_lte(40 * 1000); /* 40us */
> >>	uint64_t properties[] = {
> >>		DRM_XE_OA_PROPERTY_OA_UNIT_ID, 0,
> >>
> >> @@ -2226,50 +2225,57 @@ static void test_polling_small_buf(void)
> >>		/* OA unit configuration */
> >>		DRM_XE_OA_PROPERTY_OA_METRIC_SET, default_test_set->perf_oa_metrics_set,
> >>		DRM_XE_OA_PROPERTY_OA_FORMAT, __ff(default_test_set->perf_oa_format),
> >> -		DRM_XE_OA_PROPERTY_OA_PERIOD_EXPONENT, oa_exponent,
> >> +		DRM_XE_OA_PROPERTY_OA_PERIOD_EXPONENT, oa_exponent_default,
> >>		DRM_XE_OA_PROPERTY_OA_DISABLED, true,
> >>	};
> >>	struct intel_xe_oa_open_prop param = {
> >>		.num_properties = ARRAY_SIZE(properties) / 2,
> >>		.properties_ptr = to_user_pointer(properties),
> >>	};
> >> -	uint32_t test_duration = 80 * 1000 * 1000;
> >> -	int sample_size = get_oa_format(default_test_set->perf_oa_format).size;
> >> -	int n_expected_reports = test_duration / oa_exponent_to_ns(oa_exponent);
> >> -	int n_expect_read_bytes = n_expected_reports * sample_size;
> >> -	struct timespec ts = {};
> >> -	int n_bytes_read = 0;
> >> -	uint32_t n_polls = 0;
> >> +	int report_size = get_oa_format(default_test_set->perf_oa_format).size;
> >> +	u32 oa_tail, prev_tail;
> >> +	struct pollfd pollfd;
> >> +	uint8_t buf[10];
> >> +	int ret, i = 0;
> >> +
> >> +	intel_register_access_init(&mmio_data,
> >> +				   igt_device_get_pci_device(drm_fd), 0);
> >>
> >>	stream_fd = __perf_open(drm_fd, &param, true /* prevent_pm */);
> >>	set_fd_flags(stream_fd, O_CLOEXEC | O_NONBLOCK);
> >> -	do_ioctl(stream_fd, DRM_XE_OBSERVATION_IOCTL_ENABLE, 0);
> >> -
> >> -	while (igt_nsec_elapsed(&ts) < test_duration) {
> >> -		struct pollfd pollfd = { .fd = stream_fd, .events = POLLIN };
> >>
> >> -		ppoll(&pollfd, 1, NULL, NULL);
> >> -		if (pollfd.revents & POLLIN) {
> >> -			uint8_t buf[1024];
> >> -			int ret;
> >> +#define OAG_OATAILPTR	(0xdb04)
> >> +	/* Save the current tail */
> >> +	prev_tail = oa_tail = intel_register_read(&mmio_data, OAG_OATAILPTR);
> >>
> >> -			ret = read(stream_fd, buf, sizeof(buf));
> >> -			if (ret >= 0)
> >> -				n_bytes_read += ret;
> >> -		}
> >> +	/* Kickstart the capture */
> >> +	do_ioctl(stream_fd, DRM_XE_OBSERVATION_IOCTL_ENABLE, 0);
> >>
> >> -		n_polls++;
> >> +	/* Wait for 5 reports */
> >
> > Wait for 5 reports or 10 ms ?
> >
> >
> >> +	while ((oa_tail - prev_tail) < (5 * report_size)) {
> >> +		usleep(1000);
> >> +		oa_tail = intel_register_read(&mmio_data, OAG_OATAILPTR);
> >> +		if (i++ > 10)
> >
> > So on slow platforms we might not get any reports in 10 ms? The idea here
> > should be to not have any timing dependence? So if we want to wait for 5
> > reports, just wait for 5 reports?
>
> Oh, I think the loop was stuck while debugging something, so had added a
> counter to bail out in 10 iterations. I will remove that. We only need to
> wait for 5 reports.

Ok.

>
> >
> > We tried doing this for the mmap OA buffer: see
> > mmap_wait_for_periodic_reports(), the function waits indefinitely.
>
> You mean this:
>
> while (num_periodic_reports < n) {
>	usleep(4 * n * period_us);
>	num_periodic_reports = 0;
>	for (reports = (uint32_t *)oa_vaddr;
>	     reports[0] && oa_timestamp(reports, fmt) && oa_report_is_periodic(reports);
>	     reports += get_oa_format(fmt).size) {
>		num_periodic_reports++;
>	}
> }
>
> Well.. if your reports start coming in fast enough, then you would just
> spin in the inner for loop. Maybe break the inner for loop when
> num_periodic_reports >= n;

Ok, but I was just pointing out that this sort of code is beginning to get
in. Sai Teja did this.

>
> >
> > So if this is done I am not sure if the intel_register_read() approach is
> > needed (but I didn't think of doing that :). But I guess we can use it to
> > see when there are N reports available.
> >
> > Longer term it would be nice to have a centralized function
> > wait_for_n_reports(int n) or something like that which different tests can
> > use.
>
> Agree, except that some tests will read the actual reports, while others
> just want to take a peek at how many reports are available without reading
> them. Since mmap is also a feature under test, I took the easier
> approach. We can always refine it if we find something better.

Can have similar peek_for_n_reports(n), which would have to use your
register read method.

>
> >
> >> +			break;
> >>	}
> >>
> >> -	igt_info("Read %d expected %d (%.2f%% of the expected number), polls=%u\n",
> >> -		 n_bytes_read, n_expect_read_bytes,
> >> -		 n_bytes_read * 100.0f / n_expect_read_bytes,
> >> -		 n_polls);
> >> +	intel_register_access_fini(&mmio_data);
> >>
> >> -	__perf_close(stream_fd);
> >> +	/* Just read one report and expect ENOSPC */
> >> +	pollfd.fd = stream_fd;
> >> +	pollfd.events = POLLIN;
> >> +	poll(&pollfd, 1, 1000);
> >> +	igt_assert(pollfd.revents & POLLIN);
> >
> > Is the assumption here that the kernel timer is firing every 5 ms (so if
> > we've waited for 10 ms POLLIN must be set since the timer is firing every 5
> > ms)? I am not sure if that 5 ms is uapi. Or is it? Actually I was thinking
> > of changing that 5 ms time or changing the timer to a delayed work.
>
> But here I am waiting 1000ms in the poll above. That should be sufficient
> for POLLIN  to be set. If not, we could set the timeout to a large value (a
> few seconds).

Oh ok, sorry I missed the 1000 ms.

>
> >
> >> +	errno = 0;
> >> +	ret = read(stream_fd, buf, sizeof(buf));
> >> +	igt_assert_eq(ret, -1);
> >> +	igt_assert_eq(errno, ENOSPC);
> >
> > This part looks ok, it's uapi.
> >
>
> Note:
> ENOSPC is returned only if the buffer is small enough that not even one
> report will fit in. Initially I had a 600 byte buffer, but I did not get
> ENOSPC. Instead I got 576 in ret which I think is the correct behavior.

OK.

> >>
> >> -	igt_assert(abs(n_expect_read_bytes - n_bytes_read) <
> >> -		   0.20 * n_expect_read_bytes);
> >> +	/* Poll with 0 timeout and expect POLLIN flag to be set */
> >> +	poll(&pollfd, 1, 0);
> >> +	igt_assert(pollfd.revents & POLLIN);
> >> +
> >> +	__perf_close(stream_fd);
> >
> > How about just reading N reports using a small buffer for this test,
> > however long it takes? N can 5 or 10.
>
> Not sure I understand. You mean at this stage of the test, read 5/10
> reports? OR just alter the entire test somehow to do something different?
>
> I thought the test was specifically testing that POLLIN is still set after
> an ENOSPC error, so I have written it for that case alone. The 0 timeout
> will bypass the wait in the poll so that we only get the state of POLLIN.

The original test, as I see it, seems to be just doing a lot of regular
reads with a small buffer (1K instead of 16 MB say).

But what you are doing here is also probably ok/better than the
original. Let me think some more about it.

Thanks.
--
Ashutosh