[PATCH i-g-t 1/1] tests/intel/xe_eu_stall: Do not check for presence of data on simulation

Dixit, Ashutosh ashutosh.dixit at intel.com
Wed May 21 21:30:13 UTC 2025


On Mon, 19 May 2025 21:09:06 -0700, Harish Chegondi wrote:
>

Hi Harish,

> On Tue, May 13, 2025 at 11:03:53PM -0700, Dixit, Ashutosh wrote:
> > On Tue, 13 May 2025 15:16:14 -0700, Dixit, Ashutosh wrote:
> > >
> > > On Tue, 13 May 2025 13:57:41 -0700, Harish Chegondi wrote:
> > > >
> > > > On Tue, May 13, 2025 at 09:43:32AM -0700, Dixit, Ashutosh wrote:
> > > > > On Mon, 12 May 2025 20:07:38 -0700, Harish Chegondi wrote:
> > > > > >
> > > > > > Some simulation models may not have full EU stall sampling support.
> > > > > >
> > > > > > Signed-off-by: Harish Chegondi <harish.chegondi at intel.com>
> > > > > > ---
> > > > > >  tests/intel/xe_eu_stall.c | 4 +++-
> > > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/tests/intel/xe_eu_stall.c b/tests/intel/xe_eu_stall.c
> > > > > > index 411c30871..bdfa0fc4b 100644
> > > > > > --- a/tests/intel/xe_eu_stall.c
> > > > > > +++ b/tests/intel/xe_eu_stall.c
> > > > > > @@ -586,7 +586,6 @@ enable:
> > > > > >
> > > > > >	ret = wait_child(&work_load);
> > > > > >	igt_assert_f(ret == 0, "waitpid() - ret: %d, errno: %d\n", ret, errno);
> > > > > > -	igt_assert_f(num_samples, "No EU stalls detected during the workload\n");
> > > > > >
> > > > > >	do_ioctl(stream_fd, DRM_XE_OBSERVATION_IOCTL_DISABLE, 0);
> > > > > >	if (--iter)
> > > > > > @@ -594,6 +593,9 @@ enable:
> > > > > >
> > > > > >	close(stream_fd);
> > > > > >	free(buf);
> > > > > > +
> > > > > > +	if (!igt_run_in_simulation())
> > > > > > +		igt_assert_f(num_samples, "No EU stalls detected during the workload\n");
> > > > >
> > > > > Do we really want to move this here? Wasn't the earlier location better
> > > > > since it checked num_samples for every iteration, whereas now we'd check it
> > > > > only for the last iteration?
> > > > Hi Ashutosh,
> > > >
> > > > Initially I didn't move it. When testing I noticed that if there is no
> > > > data, the assert triggers and the following close() and free() are not
> > > > called. When the next sub-test gets executed, it returns EBUSY as the
> > > > stream is not closed in the previous test. So, I moved this check here.
> > > > Anyhow the data from the first iteration is checked in the blocking-read
> > > > and non-blocking-read subtests where there is only one iteration.
> > >
> > > Hmm, the problem is, it's making the code look weird now. Also, if the
> > > process dies in an assert, the fd should get closed when the process
> > > dies.
> >
> > Actually, not sure about this, because it is not a fd created by the
> > process but an anon fd returned by the open ioctl, so not sure if it gets
> > closed. I have seen similar EBUSY's happening in OA tests too when things
> > fail. Anyway, take a quick look and see what's happening, if release() is
> > getting called or not.
> >
> > If EBUSY's are ok, we can add the sim check in the previous place.
> Hi Ashutosh,
>
> I have put prints in the driver release() function and ran several
> tests. I see that if the test doesn't explicitly close the fd, the fd
> gets closed at the end of the test. If no subtest is specified to run,
> all sub-tests are run one after another. If one subtest opens an fd but
> doesn't close it (due to an assert before close()), it will be closed
> only after all subtests have run. But if only one subtest is run and the
> fd isn't closed in the test due to an assert, the fd gets closed and
> release() called at the end of the subtest.
>
> Since I moved the assert() after close(), if the assert gets triggered,
> it doesn't affect the next subtest as the close() was called. If I don't
> move the assert() after close(), an assert will leave the fd open, and
> when the next subtest calls open to get another fd, the driver returns
> -EBUSY as an earlier session was not closed.
> >
> > >
> > > Or is there a delay between the process dying and fd getting closed? And
> > > the next process is trying to open the fd before the previous process
> > > closed the fd? Where did you see the EBUSY issue, is it happening in CI? Or
> > > what are you executing to reproduce the EBUSY issue?
> If a subtest doesn't call close() due to an assert() and the next
> subtest calls open(), the driver while initializing notices that there
> is an open session and returns -EBUSY.
> > >
> > > Can you please add a print in the eu stall release fops and see if
> > > release() is not getting called when the process dies (put an artificial
> > > assert in igt if needed).
> Please see above. I can confirm that if the test doesn't close the fd,
> the fd is getting closed at the end of the test.
> > >
> > > Better to investigate this a little bit more I think.

Thanks for the experiments. At this time our suggestion is to use a global
stream_fd and unconditionally close it before opening it in the next
subtest, see __perf_open() in xe_oa.c. And let's see if it resolves this
issue.
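
A rough sketch of the idea, assuming a file-scope stream_fd initialised to
-1. The helper names stream_close_stale() and stream_open() are made up for
illustration, and /dev/null stands in for the real stream open ioctl. This
is not the code from xe_oa.c, only the pattern described above:

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical global stream fd; -1 means no stream is currently open. */
static int stream_fd = -1;

/* Close a stream left open by an earlier subtest, if there is one. */
static void stream_close_stale(void)
{
	if (stream_fd >= 0) {
		close(stream_fd);
		stream_fd = -1;
	}
}

/*
 * Hypothetical open helper: unconditionally drop any stale stream,
 * then open a new one. /dev/null is only a placeholder for the
 * stream open ioctl, so the sketch runs without the driver.
 */
static int stream_open(void)
{
	stream_close_stale();
	assert(stream_fd == -1);

	stream_fd = open("/dev/null", O_RDONLY);
	return stream_fd;
}
```

With this shape, a subtest that asserts before its close() leaves the fd in
stream_fd, and the next subtest's open releases it first, so the assert on
num_samples could stay at its original per-iteration location.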

Thanks.
--
Ashutosh

