[igt-dev] [RFC PATCH v3 0/8] Add multi-process subtests for multi-GPUs

Kamil Konieczny kamil.konieczny at linux.intel.com
Tue Nov 29 11:04:23 UTC 2022


Hi Petri,

On 2022-11-28 at 13:26:24 +0200, Petri Latvala wrote:
> On Fri, Nov 25, 2022 at 01:01:40PM +0100, Kamil Konieczny wrote:
> > Add one simple macro igt_fork_dyn() and two new helpers in
> > igt_core to enable running dynamic tests on two or more GPUs in
> > parallel.
> > To test this idea I added two subtests gem_basic@create-close
> > and gem_exec_gttfill@multigpu-basic.
> > It is open-coded for ease of debug but can be converted
> > into macro if this idea will get acceptance.
> > 
> > v3: added log for opened device extension from Mauro with
> >   some modifications
> >   added tests for fork_dyn so it works as igt_fork
> >   added prefix log to help debug problems
> >   rework gttfill multigpu-basic subtest
> 
> In its current form there's multiple hurdles still.
> 
> The socket communications strictly cannot cope with dynamic subtest
> outputs being interleaved. The parsing is done with one pass through
> the comms dump, tracking whether the output is inside a dynamic
> subtest and which one. Having another dynamic subtest start while
> another is being tracked will make it confused.

I agree, and I will change the commit description.
I will also rename igt_fork_dyn to igt_multi_fork.
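
To make the interleaving problem concrete, here is a toy model in C
of a one-pass scan that tracks a single "current" dynamic subtest.
It is only a sketch, not the runner's parser: scan_state, scan_line()
and scan_log() are hypothetical names, and the two line formats are
assumptions modeled on the text output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: remembers at most one dynamic subtest at a time. */
struct scan_state {
	char current[64]; /* tracked dynamic subtest, "" when none */
	int confused;     /* lines that broke the one-at-a-time assumption */
};

static void scan_line(struct scan_state *st, const char *line)
{
	/* Assumed line formats, chosen for this illustration. */
	static const char start[] = "Starting dynamic subtest: ";
	static const char end[] = "Dynamic subtest ";

	assert(st && line);

	if (!strncmp(line, start, strlen(start))) {
		if (st->current[0])
			st->confused++; /* new start before the previous one ended */
		snprintf(st->current, sizeof(st->current), "%s",
			 line + strlen(start));
	} else if (!strncmp(line, end, strlen(end))) {
		const char *name = line + strlen(end);
		size_t len = strcspn(name, ":");

		if (strlen(st->current) != len ||
		    strncmp(st->current, name, len))
			st->confused++; /* result for a subtest we do not track */
		st->current[0] = '\0';
	}
	/* Any other line would be attributed to st->current. */
}

/* Scan all lines in one pass and return how often the scan was confused. */
static int scan_log(const char *const *lines, size_t n)
{
	struct scan_state st = { .current = "", .confused = 0 };

	for (size_t i = 0; i < n; i++)
		scan_line(&st, lines[i]);

	return st.confused;
}
```

With start/result pairs that arrive one after another, scan_log()
reports no confusion. When two children emit their start lines
before either result line, every later line is mismatched, which is
the situation a single-state scan cannot recover from.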

> 
> I don't know how the text-based parsing handles that.
> 
> But that's not an important point before another question is answered:
> What is the point of having dynamic subtests per gpu index?

I agree, that was the wrong idea (which is why this is an RFC, to ask
for opinions).
 
> The results won't be comparable across systems.

They will be if we test on machines with two or more identical
discrete cards. As Mauro noted, we can also benefit from
checking concurrent access to memory.
The basic assumption is that a test should pass on a single-GPU
machine; only then can we conclude anything from a multi-card run.

> 
> Compare to for example the kernel selftests, a selftest "hugepages"
> might not be there for a platform, or for a different kernel version,
> but if it is, it's comparable to another "hugepages" result. You can
> say "it works on this platform, but doesn't on this platform" and
> point to a kernel bug.
> 
> What can you say if gpu-0 fails on host "fi-multi-dg2" but passes on
> "fi-multi-dg1"? What can you even say if gpu-0 fails and gpu-1 passes?
> Is that the interesting data? In my opinion
> "create-closemultigpu@gpu-0" is not the interesting part, it's just
> "create-close-multigpu".

I agree that the test should not be split into @gpuNUMBER parts but
treated as one instance, create-close-multigpu or
multigpu-create-close. I already did that for gem_exec_gttfill with
multigpu-basic, and I will change gem_basic in the same way.
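
As a rough illustration of "one instance" covering several GPUs,
the following plain-POSIX sketch forks one child per GPU and folds
all exit codes into a single result. It is not the igt_multi_fork
implementation: run_on_all_gpus(), gpu_work_fn and the two demo
workloads are hypothetical names, and only the "<g:N>" log prefix is
taken from the logs in the cover letter.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-GPU workload; returns 0 on success. */
typedef int (*gpu_work_fn)(int gpu_idx);

/* Demo workload: always succeeds, logs with a per-GPU prefix. */
static int demo_work_ok(int gpu_idx)
{
	printf("<g:%d> done\n", gpu_idx);
	return 0;
}

/* Demo workload: fails only on GPU index 1. */
static int demo_work_fail_gpu1(int gpu_idx)
{
	printf("<g:%d> %s\n", gpu_idx, gpu_idx == 1 ? "failed" : "done");
	return gpu_idx == 1;
}

/*
 * Run work() once per GPU in parallel child processes.
 * Returns the number of children that failed (0 means the single
 * combined subtest passed), or -1 if memory allocation failed.
 */
static int run_on_all_gpus(int gpu_count, gpu_work_fn work)
{
	pid_t *pids;
	int failed = 0;

	assert(gpu_count > 0 && work);

	pids = calloc(gpu_count, sizeof(*pids));
	if (!pids)
		return -1;

	for (int i = 0; i < gpu_count; i++) {
		fflush(NULL); /* do not duplicate buffered output in the child */
		pids[i] = fork();
		if (pids[i] == 0) {
			int ret = work(i);

			fflush(NULL);
			_exit(ret ? 1 : 0);
		}
		if (pids[i] < 0)
			failed++; /* fork itself failed */
	}

	/* Wait for every child; any abnormal or non-zero exit is a failure. */
	for (int i = 0; i < gpu_count; i++) {
		int status;

		if (pids[i] < 0)
			continue;
		if (waitpid(pids[i], &status, 0) < 0 ||
		    !WIFEXITED(status) || WEXITSTATUS(status) != 0)
			failed++;
	}

	free(pids);
	return failed;
}
```

A caller would report one result for the whole subtest, for example
treating run_on_all_gpus(2, demo_work_ok) == 0 as a pass, instead
of one result per GPU index.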

> 
> (That's not even getting to the implied requirements on the host system
> for this test setup of kinda maybe requiring those devices to be
> identical.)

That would be the starting point: one or two machines, each with two
identical discrete GPU cards.

Regards,
Kamil

> 
> 
> -- 
> Petri Latvala
> 
> 
> 
> 
> > 
> > See some logs below.
> > 
> > Cc: Anna Karas <anna.karas at intel.com>
> > Cc: Zbigniew Kempczyński <zbigniew.kempczynski at intel.com>
> > Cc: Mauro Carvalho Chehab <mauro.chehab at linux.intel.com>
> > Cc: Petri Latvala <petri.latvala at intel.com>
> > 
> > Starting subtest: multigpu-basic
> > <g:0> Setup 1025 batches in 3398.88ms
> > <g:1> Setup 1025 batches in 3392.46ms
> > [..skipped..]
> > <g:0> Total: 33 cycles
> > <g:1> Total: 33 cycles
> > Subtest multigpu-basic: SUCCESS (36.248s)
> > 
> > Kamil Konieczny (7):
> >   lib/igt_core: add fork for dynamic tests
> >   lib/igt_core: add prefix to logging
> >   lib/tests/igt_fork: add tests for igt_fork_dyn
> >   lib/tests/igt_fork: change comments into prints
> >   tests/i915/gem_basic: add multi-gpu tests
> >   tests/i915/gem_exec_gttfill: add new subtest multigpu-basic
> >   HAX test few multi-gpu subtests
> > 
> > Mauro Carvalho Chehab (1):
> >   lib/igt_core: store GPU string or opened device name
> > 
> >  lib/drmtest.c                         |   4 +-
> >  lib/igt_core.c                        | 248 ++++++++++++++++++++++++--
> >  lib/igt_core.h                        |  27 +++
> >  lib/tests/igt_fork.c                  |  93 +++++++---
> >  tests/i915/gem_basic.c                |  28 ++-
> >  tests/i915/gem_exec_gttfill.c         |  31 +++-
> >  tests/intel-ci/fast-feedback.testlist | 176 +-----------------
> >  7 files changed, 386 insertions(+), 221 deletions(-)
> > 
> > -- 
> > 2.34.1
> > 

