[Intel-gfx] [RFC i-g-t] igt/media-bench.pl: Media workload analyzer

Tvrtko Ursulin tursulin at ursulin.net
Fri May 12 12:12:31 UTC 2017


From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

The high level goal of this script is to programmatically analyze
the simulated media workloads (gem_wsim) by finding an optimal
load balancing strategy, and also to detect any possible
shortcomings of that strategy.

When run without command line arguments the script will run
through both of its phases.

In the first phase it runs all the known balancers against all
the known workloads, and for each combination looks for the point
where aggregated system throughput cannot be increased by running
more parallel workload instances.

At that point each balancer gets a score proportional to the
throughput achieved, which is added to the running total for the
complete phase.
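
The saturation search can be sketched as below. This is a minimal,
self-contained illustration modelled on find_saturation_point() in
the patch; fake_wps() and its throughput curve are hypothetical
stand-ins for actually running gem_wsim with a given client count.

```perl
use strict;
use warnings;

# Hypothetical stand-in for running gem_wsim with $c clients; the real
# script parses the "workloads/s" figure from gem_wsim output instead.
sub fake_wps
{
	my ($c) = @_;
	my @curve = (100, 190, 260, 300, 301, 299);

	return $curve[$c - 1] // $curve[-1];
}

# Add clients until throughput drops, or changes by no more than the
# tolerance, then report the previous client count and its throughput.
sub find_saturation_point
{
	my ($measure, $tolerance) = @_;
	my $last_wps;
	my $c;

	for ($c = 1; ; $c++) {
		my $wps = $measure->($c);

		if ($c > 1) {
			my $gain = abs($wps - $last_wps) / $last_wps;
			last if $wps < $last_wps or $gain <= $tolerance;
		}

		$last_wps = $wps;
	}

	return ($c - 1, $last_wps);
}

my ($clients, $wps) = find_saturation_point(\&fake_wps, 0.01);
print "Saturation at $clients clients ($wps wps).\n";
```

With the made-up curve above the fifth client adds less than 1%, so
the search stops and reports the four client result.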

Several different scoreboards are kept - total throughput, per
client throughput and combined (total + per client). Weighted
scoreboards are also kept, where scores are weighted based on the
total variance detected for a single workload. This means scores
for workloads which respond well to being balanced will be worth
more than scores for workloads which do not balance well in any
of the configurations.
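
As a worked illustration of the scoring, the sketch below follows
the arithmetic of add_points() in the patch: the plain score is the
throughput relative to the best balancer for that workload, and the
weight is the relative spread (max - min) / max. The
score_workload() helper is hypothetical, and the input is only a
subset of the media_load_balance_fhd26u7.wsim figures from the
example run.

```perl
use strict;
use warnings;

# Throughput (wps) per balancer for a single workload.
my %wps = (
	'-b rand'     => 46.532,
	'-b rtavg'    => 61.232,
	'-b rtavg -R' => 61.793,
);

# Return plain and weighted scores for one workload. The weight is the
# relative spread, so a workload where balancing makes little
# difference contributes little to the weighted scoreboard.
sub score_workload
{
	my ($wps) = @_;
	my @sorted = sort { $b <=> $a } values %{$wps};
	my ($max, $min) = ($sorted[0], $sorted[-1]);
	my $spread = $max - $min;
	my (%score, %wscore);

	foreach my $bal (keys %{$wps}) {
		$score{$bal} = $wps->{$bal} / $max;
		$wscore{$bal} = $score{$bal} * $spread / $max;
	}

	return (\%score, \%wscore);
}

my ($score, $wscore) = score_workload(\%wps);

foreach my $bal (sort keys %wps) {
	printf "%-12s score %.3f weighted %.3f\n",
	       $bal, $score->{$bal}, $wscore->{$bal};
}
```

In the script itself these per workload values are accumulated into
running totals across all workloads before the ranking is printed.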

After the first phase a "best" balancing strategy is selected
based on the combined weighted scoreboard.

The second phase then profiles all the selected workloads with
this balancer and looks for potential problems with GPU engines
not being completely saturated.

If none of the active engines is saturated the workload will be
flagged, as it will if the only saturated engine is one of the
ones which can be balanced, but the other one in the same class
is under-utilized.
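
The flagging rule can be sketched as follows. This is a simplified
illustration of the check in the second phase, not the patch code
itself: classify() is a hypothetical helper, engine ids 2 and 3 are
taken to be the two balanceable engines as in the patch, and the
idle percentages are the ones from the example run.

```perl
use strict;
use warnings;

# Classify one workload from its per-engine idle percentages.
# An engine counts as saturated when it is idle for less than the
# given tolerance.
sub classify
{
	my ($engines, $idle_tolerance_pct) = @_;
	my @saturated = grep { $engines->{$_} < $idle_tolerance_pct }
			keys %{$engines};

	# Nothing saturated: the GPU is not being kept busy at all.
	return 'FAIL' if @saturated == 0;

	# Only one of the balanceable engines is saturated, so its
	# sibling is left under-utilized.
	return 'WARN' if @saturated == 1 and
			 ($saturated[0] eq '2' or $saturated[0] eq '3');

	return 'Pass';
}

my %hd12    = ( 0 => 0.57, 2 => 22.59, 3 => 23.30 );
my %fhd26u7 = ( 0 => 7.77, 2 => 0.66,  3 => 28.70 );

print "hd12: ", classify(\%hd12, 2.0), "\n";
print "fhd26u7: ", classify(\%fhd26u7, 2.0), "\n";
```

With the default 2% idleness tolerance the first workload passes
because engine 0 is saturated, while the second is flagged because
only engine 2 is.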

Flagged workloads then need to be analyzed, which can be done by
looking at the HTML engine timelines generated during this phase.
(These files are all put in the current working directory.)

It is quite possible that something flagged by the script as
suspect is completely fine and just a consequence of the
workload in question being fundamentally unbalanced.

It is possible to skip directly to the second phase of the
evaluation by using the -b command line option. Its argument is
passed verbatim to gem_wsim, so it must be a complete balancer
selection including gem_wsim's own -b switch.
For example '-b "-b rtavg -R"'.

Apart from being run with no arguments, the script also supports
a selection of command line switches to enable fine tuning.

For example, with the complete output from the script included
for illustration:

-8<---
+ scripts/media-bench.pl -n 642317 -r 2 -B rand,rtavg -W media_load_balance_hd12.wsim,media_load_balance_fhd26u7.wsim
Workloads:
  media_load_balance_hd12.wsim
  media_load_balance_fhd26u7.wsim
Balancers: rand,rtavg,
Target workload duration is 2s.
Calibration tolerance is 0.01.
Nop calibration is 642317.

Evaluating 'media_load_balance_hd12.wsim'... 2s is 990 workloads. (error=0.00750000000000006)
  Finding saturation points for 'media_load_balance_hd12.wsim'...
    rand balancer ('-b rand'): 6 clients (1412.576 wps, 235.429333333333 wps/client).
    rand balancer ('-b rand -R'): 6 clients (1419.639 wps, 236.6065 wps/client).
    rtavg balancer ('-b rtavg'): 5 clients (1430.143 wps, 286.0286 wps/client).
    rtavg balancer ('-b rtavg -H'): 5 clients (1339.775 wps, 267.955 wps/client).
    rtavg balancer ('-b rtavg -R'): 5 clients (1386.384 wps, 277.2768 wps/client).
    rtavg balancer ('-b rtavg -R -H'): 6 clients (1365.943 wps, 227.657166666667 wps/client).
  Best balancer is '-b rtavg'.

Evaluating 'media_load_balance_fhd26u7.wsim'... 2s is 52 workloads. (error=0.002)
  Finding saturation points for 'media_load_balance_fhd26u7.wsim'...
    rand balancer ('-b rand'): 3 clients (46.532 wps, 15.5106666666667 wps/client).
    rand balancer ('-b rand -R'): 3 clients (46.242 wps, 15.414 wps/client).
    rtavg balancer ('-b rtavg'): 6 clients (61.232 wps, 10.2053333333333 wps/client).
    rtavg balancer ('-b rtavg -H'): 4 clients (57.608 wps, 14.402 wps/client).
    rtavg balancer ('-b rtavg -R'): 6 clients (61.793 wps, 10.2988333333333 wps/client).
    rtavg balancer ('-b rtavg -R -H'): 7 clients (60.697 wps, 8.671 wps/client).
  Best balancer is '-b rtavg -R'.

Total wps rank:
===============
  1: '-b rtavg' (1)
  2: '-b rtavg -R' (0.989191465637926)
  3: '-b rtavg -R -H' (0.973103630772601)
  4: '-b rtavg -H' (0.938804458876241)
  5: '-b rand -R' (0.874465740398305)
  6: '-b rand' (0.874342391093453)

Total weighted wps rank:
========================
  1: '-b rtavg -R' (1)
  2: '-b rtavg' (0.998877134022041)
  3: '-b rtavg -R -H' (0.982849160383224)
  4: '-b rtavg -H' (0.938950446314292)
  5: '-b rand' (0.80507369080098)
  6: '-b rand -R' (0.80229656623594)

Per client wps rank:
====================
  1: '-b rtavg -H' (1)
  2: '-b rand' (0.977356849770376)
  3: '-b rand -R' (0.976222085591368)
  4: '-b rtavg' (0.888825068013012)
  5: '-b rtavg -R' (0.875653417817828)
  6: '-b rtavg -R -H' (0.726389466714194)

Per client weighted wps rank:
=============================
  1: '-b rand' (1)
  2: '-b rand -R' (0.996866139192282)
  3: '-b rtavg -H' (0.986348733324348)
  4: '-b rtavg' (0.811593544774355)
  5: '-b rtavg -R' (0.805704548552663)
  6: '-b rtavg -R -H' (0.671567075453688)

Combined wps rank:
==================
  1: '-b rtavg' (1)
  2: '-b rtavg -R' (0.989191465637926)
  3: '-b rtavg -H' (0.972251783752137)
  4: '-b rtavg -R -H' (0.949708930404222)
  5: '-b rand' (0.914594701126905)
  6: '-b rand -R' (0.914312395840401)

Combined weighted wps rank:
===========================
  1: '-b rtavg' (1)
  2: '-b rtavg -R' (0.995945739226824)
  3: '-b rtavg -H' (0.984347862855008)
  4: '-b rtavg -R -H' (0.956920992185625)
  5: '-b rand' (0.899001713089319)
  6: '-b rand -R' (0.896984246540919)

Balancer is '-b rtavg'.
Idleness tolerance is 2%.

Profiling 'media_load_balance_hd12.wsim'...
      2s is 992 workloads. (error=0.00150000000000006)
      Saturation at 6 clients (1434.207 workloads/s).
    Pass [ 0: 0.57%, 2: 22.59%, 3: 23.30%, ]

Profiling 'media_load_balance_fhd26u7.wsim'...
      2s is 52 workloads. (error=0.001)
      Saturation at 6 clients (61.823 workloads/s).
    WARN [ 0: 7.77%, 2: 0.66%, 3: 28.70%, ]

Problematic workloads were:
   media_load_balance_fhd26u7.wsim -c 6 -r 52 [ 0: 7.77%, 2: 0.66%, 3: 28.70%, ]

-8<---

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
---
 scripts/Makefile.am    |   2 +-
 scripts/media-bench.pl | 460 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 461 insertions(+), 1 deletion(-)
 create mode 100755 scripts/media-bench.pl

diff --git a/scripts/Makefile.am b/scripts/Makefile.am
index 641715294936..e26a39e2f072 100644
--- a/scripts/Makefile.am
+++ b/scripts/Makefile.am
@@ -1,2 +1,2 @@
-dist_noinst_SCRIPTS = intel-gfx-trybot who.sh run-tests.sh trace.pl
+dist_noinst_SCRIPTS = intel-gfx-trybot who.sh run-tests.sh trace.pl media-bench.pl
 noinst_PYTHON = throttle.py
diff --git a/scripts/media-bench.pl b/scripts/media-bench.pl
new file mode 100755
index 000000000000..e048b039e0fa
--- /dev/null
+++ b/scripts/media-bench.pl
@@ -0,0 +1,460 @@
+#! /usr/bin/perl
+#
+# Copyright © 2017 Intel Corporation
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice (including the next
+# paragraph) shall be included in all copies or substantial portions of the
+# Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+# IN THE SOFTWARE.
+#
+
+use strict;
+use warnings;
+use 5.010;
+
+use Getopt::Std;
+
+chomp(my $igt_root = `pwd -P`);
+my $wsim = "$igt_root/benchmarks/gem_wsim";
+my $wrk_root = "$igt_root/benchmarks/wsim";
+my $tracepl = "$igt_root/scripts/trace.pl";
+my $tolerance = 0.01;
+my $client_target_s = 10;
+my $idle_tolerance_pct = 2.0;
+my $show_cmds = 0;
+my $balancer;
+my $nop;
+my %opts;
+
+my @balancers = ( 'rr', 'rand', 'qd', 'qdr', 'qdavg', 'rt', 'rtr', 'rtavg' );
+my %bal_skip_H = ( 'rr' => 1, 'rand' => 1 );
+
+my @workloads = (
+	'media_load_balance_17i7.wsim',
+	'media_load_balance_19.wsim',
+	'media_load_balance_4k12u7.wsim',
+	'media_load_balance_fhd26u7.wsim',
+	'media_load_balance_hd01.wsim',
+	'media_load_balance_hd06mp2.wsim',
+	'media_load_balance_hd12.wsim',
+	'media_load_balance_hd17i4.wsim',
+	'media_1n2_480p.wsim',
+	'media_1n3_480p.wsim',
+	'media_1n4_480p.wsim',
+	'media_1n5_480p.wsim',
+	'media_mfe2_480p.wsim',
+	'media_mfe3_480p.wsim',
+	'media_mfe4_480p.wsim',
+	'media_nn_1080p.wsim',
+	'media_nn_480p.wsim',
+    );
+
+sub show_cmd
+{
+	my ($cmd) = @_;
+
+	say "\n+++ $cmd" if $show_cmds;
+}
+
+sub calibrate_nop
+{
+	my ($delay, $nop);
+	my $cmd = "$wsim";
+
+	show_cmd($cmd);
+	open WSIM, "$cmd |" or die;
+	while (<WSIM>) {
+		chomp;
+		if (/Nop calibration for (\d+)us delay is (\d+)./) {
+			$delay = $1;
+			$nop = $2;
+		}
+
+	}
+	close WSIM;
+
+	die unless $nop;
+
+	return $nop;
+}
+
+sub can_balance_workload
+{
+	my ($wrk) = @_;
+	my $res = 0;
+
+	open WRK, "$wrk_root/$wrk" or die;
+	while (<WRK>) {
+		chomp;
+		if (/\.VCS\./) {
+			$res = 1;
+			last;
+		}
+	}
+	close WRK;
+
+	return $res;
+}
+
+sub run_workload
+{
+	my (@args) = @_;
+	my ($time, $wps, $cmd);
+
+	unshift @args, "$wsim";
+	$cmd = join ' ', @args;
+	show_cmd($cmd);
+
+	open WSIM, "$cmd |" or die;
+	while (<WSIM>) {
+		chomp;
+		if (/^(\d+\.\d+)s elapsed \((\d+\.?\d+) workloads\/s\)$/) {
+			$time = $1;
+			$wps = $2;
+		}
+	}
+	close WSIM;
+
+	return ($time, $wps);
+}
+
+sub trace_workload
+{
+	my ($wrk, $b, $r, $c) = @_;
+	my @args = ( "-n $nop", "-w $wrk_root/$wrk", $balancer, "-r $r", "-c $c");
+	my $min_batches = 16 + $r * $c / 2;
+	my @skip_engine;
+	my %engines;
+	my $cmd;
+
+	unshift @args, '-q';
+	unshift @args, "$tracepl --trace $wsim";
+	$cmd = join ' ', @args;
+	show_cmd($cmd);
+	system($cmd);
+
+	$cmd = "perf script | $tracepl";
+	show_cmd($cmd);
+	open CMD, "$cmd |" or die;
+	while (<CMD>) {
+		chomp;
+		if (/Ring(\d+): (\d+) batches.*?(\d+\.?\d+)% idle,/) {
+			if ($2 >= $min_batches) {
+				$engines{$1} = $3;
+			} else {
+				push @skip_engine, $1;
+			}
+		}
+	}
+	close CMD;
+
+	$cmd = "perf script | $tracepl --html -x ctxsave -s --squash-ctx-id ";
+	$cmd .= join ' ', map("-i $_", @skip_engine);
+	$wrk =~ s/ /_/g;
+	$b =~ s/ /_/g;
+	$cmd .= " > ${wrk}_${b}_-r${r}_-c${c}.html";
+	show_cmd($cmd);
+	system($cmd);
+
+	return \%engines;
+}
+
+sub calibrate_workload
+{
+	my ($wrk) = @_;
+	my $tol = $tolerance;
+	my $loops = 0;
+	my $error;
+	my $r;
+
+	$r = 23;
+	for (;;) {
+		my @args = ( "-n $nop", "-w $wrk_root/$wrk", "-r $r");
+		my ($time, $wps);
+
+		($time, $wps) = run_workload(@args);
+
+		$error = abs($time - $client_target_s) / $client_target_s;
+
+		last if $error <= $tol;
+
+		$r = int($wps * $client_target_s);
+		$loops = $loops + 1;
+		if ($loops >= 4) {
+			$tol = $tol * (1.0 + ($tol));
+			$loops = 0;
+		}
+		last if $tol > 0.2;
+	}
+
+	return ($r, $error);
+}
+
+sub find_saturation_point
+{
+	my (@args) = @_;
+	my $last_wps;
+	my $c;
+
+	for ($c = 1; ; $c = $c + 1) {
+		my ($time, $wps);
+
+		($time, $wps) = run_workload((@args, ("-c $c")));
+
+		if ($c > 1) {
+			my $error = abs($wps - $last_wps) / $last_wps;
+			last if $wps < $last_wps or $error <= $tolerance;
+		}
+
+		$last_wps = $wps;
+	}
+
+	return ($c - 1, $last_wps);
+}
+
+getopts('hxn:b:W:B:r:t:i:', \%opts);
+
+if (defined $opts{'h'}) {
+	print <<ENDHELP;
+Supported options:
+
+  -h          Help text.
+  -x          Show external commands.
+  -n num      Nop calibration.
+  -b str      Balancer to pre-select.
+              Skips balancer auto-selection.
+              Passed straight to gem_wsim so use like -b "-b qd -R"
+  -W a,b,c    Override the default list of workloads.
+  -B a,b,c    Override the default list of balancers.
+  -r sec      Target workload duration.
+  -t pct      Calibration tolerance.
+  -i pct      Engine idleness tolerance.
+ENDHELP
+	exit 0;
+}
+
+$show_cmds = $opts{'x'} if defined $opts{'x'};
+$balancer = $opts{'b'} if defined $opts{'b'};
+if (defined $opts{'B'}) {
+	@balancers = split /,/, $opts{'B'};
+} else {
+	unshift @balancers, '';
+}
+@workloads = split /,/, $opts{'W'} if defined $opts{'W'};
+$client_target_s = $opts{'r'} if defined $opts{'r'};
+$tolerance = $opts{'t'} / 100.0 if defined $opts{'t'};
+$idle_tolerance_pct = $opts{'i'} if defined $opts{'i'};
+
+say "Workloads:";
+print map { "  $_\n" } @workloads;
+print "Balancers: ";
+say map { "$_," } @balancers;
+say "Target workload duration is ${client_target_s}s.";
+say "Calibration tolerance is $tolerance.";
+$nop = $opts{'n'};
+$nop = calibrate_nop() unless $nop;
+say "Nop calibration is $nop.";
+
+goto VERIFY if defined $balancer;
+
+my %scores;
+my %wscores;
+my %cscores;
+my %cwscores;
+my %mscores;
+my %mwscores;
+
+sub add_points
+{
+	my ($wps, $scores, $wscores) = @_;
+	my ($min, $max, $spread);
+	my @sorted;
+
+	@sorted = sort { $b <=> $a } values %{$wps};
+	$max = $sorted[0];
+	$min = $sorted[-1];
+	$spread = $max - $min;
+	die if $spread < 0;
+
+	foreach my $w (keys %{$wps}) {
+		my ($score, $wscore);
+
+		unless (exists $scores->{$w}) {
+			$scores->{$w} = 0;
+			$wscores->{$w} = 0;
+		}
+
+		$score = $wps->{$w} / $max;
+		$scores->{$w} = $scores->{$w} + $score;
+		$wscore = $score * $spread / $max;
+		$wscores->{$w} = $wscores->{$w} + $wscore;
+	}
+}
+
+foreach my $wrk (@workloads) {
+	my ($r, $error, $should_b);
+	my (%wps, %cwps, %mwps);
+	my $best;
+	my @args;
+
+	$should_b = can_balance_workload($wrk);
+
+	print "\nEvaluating '$wrk'...";
+
+	($r, $error) = calibrate_workload($wrk);
+	say " ${client_target_s}s is $r workloads. (error=$error)";
+	@args = ( "-n $nop", "-w $wrk_root/$wrk", "-r $r");
+
+	say "  Finding saturation points for '$wrk'...";
+
+	BAL: foreach my $bal (@balancers) {
+		RBAL: foreach my $R ('', '-R') {
+			foreach my $H ('', '-H') {
+				my @xargs;
+				my ($wps, $c);
+				my $bid;
+
+				if ($bal ne '') {
+					push @xargs, "-b $bal";
+					push @xargs, '-R' if $R ne '';
+					push @xargs, '-H' if $H ne '';
+					$bid = join ' ', @xargs;
+					print "    $bal balancer ('$bid'): ";
+				} else {
+					$bid = '<none>';
+					print "    No balancing: ";
+				}
+
+				($c, $wps) = find_saturation_point((@args,
+								    @xargs));
+
+				$wps{$bid} = $wps;
+				$cwps{$bid} = $wps / $c;
+				$mwps{$bid} = $wps{$bid} + $cwps{$bid};
+
+				say "$c clients ($wps wps, $cwps{$bid} wps/client).";
+
+				last BAL unless $should_b;
+				next BAL if $bal eq '';
+				next RBAL if exists $bal_skip_H{$bal};
+			}
+		}
+	}
+
+	$best = (sort { $mwps{$b} <=> $mwps{$a} } keys %mwps)[0];
+	say "  Best balancer is '$best'.";
+
+	add_points(\%wps, \%scores, \%wscores);
+	add_points(\%cwps, \%cscores, \%cwscores);
+	add_points(\%mwps, \%mscores, \%mwscores);
+}
+
+sub dump_scoreboard
+{
+	my ($n, $h) = @_;
+	my ($max, $i, $str, $balancer);
+
+	$str = "$n rank:";
+	say "\n$str";
+	say '=' x length($str);
+	$i = 1;
+	foreach my $w (sort { $h->{$b} <=> $h->{$a} } keys %{$h}) {
+		my $score = $h->{$w};
+
+		if ($i == 1) {
+			$max = $score;
+			$balancer = $w;
+		}
+
+		$score = $score / $max;
+
+		say "  $i: '$w' ($score)";
+
+		$i = $i + 1;
+	}
+
+	return $balancer;
+}
+
+dump_scoreboard('Total wps', \%scores);
+dump_scoreboard('Total weighted wps', \%wscores);
+dump_scoreboard('Per client wps', \%cscores);
+dump_scoreboard('Per client weighted wps', \%cwscores);
+dump_scoreboard('Combined wps', \%mscores);
+$balancer = dump_scoreboard('Combined weighted wps', \%mwscores);
+
+VERIFY:
+
+my %problem_wrk;
+
+die unless defined $balancer;
+
+say "\nBalancer is '$balancer'.";
+say "Idleness tolerance is $idle_tolerance_pct%.";
+
+foreach my $wrk (@workloads) {
+	my @args = ( "-n $nop", "-w $wrk_root/$wrk", $balancer);
+	my ($r, $error, $c, $wps);
+	my $saturated = 0;
+	my $result = 'Pass';
+	my %problem;
+	my $engines;
+
+	next unless can_balance_workload($wrk);
+
+	say "  \nProfiling '$wrk'...";
+
+	($r, $error) = calibrate_workload($wrk);
+	say "      ${client_target_s}s is $r workloads. (error=$error)";
+	push @args, "-r $r";
+
+	($c, $wps) = find_saturation_point(@args);
+	say "      Saturation at $c clients ($wps workloads/s).";
+	push @args, "-c $c";
+
+	$engines = trace_workload($wrk, $balancer, $r, $c);
+
+	foreach my $key (keys %{$engines}) {
+		$saturated = $saturated + 1
+			     if $engines->{$key} < $idle_tolerance_pct;
+	}
+
+	if ($saturated == 0 or
+	    ($saturated == 1 and
+	     ($engines->{'2'} < $idle_tolerance_pct or
+	      $engines->{'3'} < $idle_tolerance_pct))) {
+		$result = $saturated == 0 ? 'FAIL' : 'WARN';
+		$problem{'c'} = $c;
+		$problem{'r'} = $r;
+		$problem{'stats'} = $engines;
+		$problem_wrk{$wrk} = \%problem;
+	}
+
+	print "    $result [";
+	print map " $_: $engines->{$_}%,", sort keys %{$engines};
+	say " ]";
+}
+
+say "\nProblematic workloads were:" if scalar(keys %problem_wrk) > 0;
+foreach my $wrk (sort keys %problem_wrk) {
+	my $problem = $problem_wrk{$wrk};
+
+	print "   $wrk -c $problem->{'c'} -r $problem->{'r'} [";
+	print map " $_: $problem->{'stats'}->{$_}%,",
+	      sort keys %{$problem->{'stats'}};
+	say " ]";
+}
-- 
2.9.3


