[Bug 103314] Strange slow-fast performance latching between gfxbench3 (and 4) test runs

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Tue Oct 17 17:53:05 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=103314

            Bug ID: 103314
           Summary: Strange slow-fast performance latching between
                    gfxbench3 (and 4) test runs
           Product: Mesa
           Version: git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Drivers/DRI/i965
          Assignee: intel-3d-bugs at lists.freedesktop.org
          Reporter: tvrtko.ursulin at linux.intel.com
        QA Contact: intel-3d-bugs at lists.freedesktop.org

I have observed a strange phenomena where benchmarks scores can vary, by a
differing degree, between benchmark runs. By looking at what is happening with
the system I have noticed the fast case only ever uses the RCS engine, while
the slow case uses BCS at 100% and RCS at a lower percentage (depends on the
benchmark exactly how much).

I have already chatted with Kenneth (and some other people) about this and
apparently this is somewhat known and caused by different BO upload paths. It
was supposed to be alleviated in the master branch but for me the results are
exactly the same (bad) as what I originally discovered.

The most obvious example is the gl_driver test:


Slow run, 17.5% RCS busy, 100% BCS busy:
========================================

root at e31:~/benchmarks/gfxbench3_desktop# INTEL_DEBUG=perf ~/bin/run-mesa ~/mesa
./gfxbench-driver.sh
Running following GfxBench 3.x test-cases:
- gl_driver

In following resolutions:
- 1920x1080

Fullscreened:
- 1

On/offscreen:
- on

COMMAND: build/linux/gfxbench_Release/mainapp/mainapp -w 1920 -ow 1920 -h 1080
-oh 1080 -t gl_driver -fullscreen 1
ATTENTION: default value of option vblank_mode overridden by environment.
Scanning index buffer to compute index buffer bounds.  Use
glDrawRangeElements() to avoid this.
CPU mapping a busy "statebuffer" BO stalled and took 0.017 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.035 ms.
CPU mapping a busy "streamed data" BO stalled and took 0.012 ms.
CPU mapping a busy "statebuffer" BO stalled and took 0.010 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.022 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.013 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.025 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.038 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.039 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.018 ms.
#Name, FPS, Score, Unit, Width, Height, GL version:
GLB30_gl_driver, 34.2, 1026.0, frames, 1920, 1080, 3.0 Mesa 17.3.0-devel
(git-31fb7bbe0b)

Results are in:
- gfxbench-result-fullscreen-1.csv

Full output logs are in:
- gfxbench-result-fullscreen-1.txt

Fast run, 75% RCS busy, 0% BCS busy:
=====================================

root at e31:~/benchmarks/gfxbench3_desktop# INTEL_DEBUG=perf ~/bin/run-mesa ~/mesa
./gfxbench-driver.sh
Running following GfxBench 3.x test-cases:
- gl_driver

In following resolutions:
- 1920x1080

Fullscreened:
- 1

On/offscreen:
- on

COMMAND: build/linux/gfxbench_Release/mainapp/mainapp -w 1920 -ow 1920 -h 1080
-oh 1080 -t gl_driver -fullscreen 1
ATTENTION: default value of option vblank_mode overridden by environment.
Scanning index buffer to compute index buffer bounds.  Use
glDrawRangeElements() to avoid this.
CPU mapping a busy "batchbuffer" BO stalled and took 0.010 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.020 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.021 ms.
CPU mapping a busy "streamed data" BO stalled and took 0.019 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.015 ms.
CPU mapping a busy "statebuffer" BO stalled and took 0.017 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.037 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.036 ms.
CPU mapping a busy "statebuffer" BO stalled and took 0.021 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.029 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.019 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.017 ms.
CPU mapping a busy "streamed data" BO stalled and took 0.019 ms.
CPU mapping a busy "statebuffer" BO stalled and took 0.016 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.023 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.011 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.014 ms.
CPU mapping a busy "statebuffer" BO stalled and took 0.015 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.011 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.013 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.015 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.027 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.010 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.015 ms.
CPU mapping a busy "statebuffer" BO stalled and took 0.011 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.031 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.038 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.023 ms.
CPU mapping a busy "batchbuffer" BO stalled and took 0.026 ms.
#Name, FPS, Score, Unit, Width, Height, GL version:
GLB30_gl_driver, 101.4, 3042.0, frames, 1920, 1080, 3.0 Mesa 17.3.0-devel
(git-31fb7bbe0b)

Results are in:
- gfxbench-result-fullscreen-1.csv

Full output logs are in:
- gfxbench-result-fullscreen-1.txt



Other tests show a smaller margin between slow and fast modes, but the crux of
the problem is still there.

I think this is quite a problem for benchmarking, and unless people will say
something along the lines of "Don't use gfxbench3/4 ever because A B C", I
think it would be good to get to the bottom of this and having something which
behaves somewhat more predictably.

I also wonder if it is possible that something in the kernel (i915) is causing
Mesa to behave like this? Because I am pretty sure I used these benchmarks
before, but I don't remember noticing this issue before. 3x difference in
gl_driver certainly looks like something one would have thought people would
have noticed before if it was old behaviour?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-3d-bugs/attachments/20171017/2c7ad483/attachment.html>


More information about the intel-3d-bugs mailing list