[Bug 111090] 3% perf drop in GfxBench Manhattan 3.0, 3.1 and CarChase test-cases

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Thu Jul 11 15:55:13 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=111090

--- Comment #10 from Eero Tamminen <eero.t.tamminen at intel.com> ---
(In reply to Chris Wilson from comment #9)
> e8f06c34fa: median of 30 manhattan runs, 1327.5289306640625 (score)
> 8991a80f85: median of 30 manhattan runs, 1327.635986328125

I can also reproduce the 2-3% Manhattan 3.0 perf drop manually when using
kernel builds with the indicated commits (with yesterday's Mesa & X git
versions).
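
For scale, the two medians quoted above are far closer together than the
2-3% drop I see. A minimal Python sketch of that arithmetic (the function
name is made up for illustration; only the two scores come from the quote):

```python
# Medians quoted above (GfxBench Manhattan score, 30 runs each).
median_e8f06c34fa = 1327.5289306640625
median_8991a80f85 = 1327.635986328125


def relative_difference_pct(a, b):
    """Absolute difference between a and b, as a percentage of a."""
    return abs(b - a) / a * 100.0


diff_pct = relative_difference_pct(median_e8f06c34fa, median_8991a80f85)
print(f"difference between medians: {diff_pct:.4f}%")
```

The result is well under 0.01%, i.e. no measurable difference on that setup.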


> using ezbench, gfxbench3,

v3?  (Manhattan should work the same in v3, v4 & v5, but it is a difference
between our setups.)


> bxt J3455, 1920x1080 fullscreen, bare Xorg

That's a 12EU BXT with the same TDP as the 18EU J4205 I'm running.  Its
GPU/CPU/memory perf balance is different and it's much less likely to be TDP
limited (in case that matters for this).

Do you have an 18EU BXT you could test?


> perf doesn't suggest any contention in the kernel, and does not appear to be
> ratelimited by i915.ko submission overhead, i.e. if it was the locking
> changes I expected that to be reflected in the perf profile of i915.ko. Hmm.

Looking at the BXT ftrace data from successive Manhattan runs, there's
something odd.

Manhattan has 3 threads.  The third does nothing, and the main thread just does
some messaging during benchmarking:
     0.13%  [kernel.kallsyms]   [k] __fget_light
     0.04%  [kernel.kallsyms]   [k] _copy_from_user
     0.03%  [kernel.kallsyms]   [k] fpregs_assert_state_consistent
     0.01%  libpthread-2.29.so  [.] recvmsg
     0.01%  [kernel.kallsyms]   [k] update_rq_clock
     ...

Except for the first frame (which comes from the main thread), all other frames
come from the second thread:
     1.14%  i965_dri.so              [.] brw_upload_render_state
     1.11%  i965_dri.so              [.] update_stage_texture_surfaces
     1.04%  i965_dri.so              [.] hash_table_search
     1.02%  i965_dri.so              [.] brw_draw_prims
     0.96%  i965_dri.so              [.] isl_gen9_surf_fill_state_s
     0.89%  [kernel.kallsyms]        [k] i915_gem_madvise_ioctl
     0.85%  testfw_app               [.] 0x00000000003d4b28
     0.80%  i965_dri.so              [.] brw_predraw_resolve_inputs
     0.79%  [kernel.kallsyms]        [k] __entry_text_start
     ...
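
To see how the second thread's overhead splits between the Mesa driver, the
kernel and the benchmark binary, the listed entries can be summed per object.
This is only a sketch: the listing above is truncated, so the totals cover
just the entries shown, and the function name is made up for illustration.

```python
from collections import defaultdict

# Entries copied from the second-thread profile above (truncated listing).
PROFILE = """\
     1.14%  i965_dri.so              [.] brw_upload_render_state
     1.11%  i965_dri.so              [.] update_stage_texture_surfaces
     1.04%  i965_dri.so              [.] hash_table_search
     1.02%  i965_dri.so              [.] brw_draw_prims
     0.96%  i965_dri.so              [.] isl_gen9_surf_fill_state_s
     0.89%  [kernel.kallsyms]        [k] i915_gem_madvise_ioctl
     0.85%  testfw_app               [.] 0x00000000003d4b28
     0.80%  i965_dri.so              [.] brw_predraw_resolve_inputs
     0.79%  [kernel.kallsyms]        [k] __entry_text_start
"""


def overhead_by_object(text):
    """Sum the overhead percentages per shared object / binary."""
    totals = defaultdict(float)
    for line in text.splitlines():
        fields = line.split()
        # Expect: "<pct>%  <object>  [k|.]  <symbol>"; skip anything else.
        if len(fields) < 4 or not fields[0].endswith("%"):
            continue
        totals[fields[1]] += float(fields[0].rstrip("%"))
    return dict(totals)


totals = overhead_by_object(PROFILE)
for name, pct in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{pct:5.2f}%  {name}")
```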

The odd thing is that:
* According to the kernel power::cpu_frequency events, one of the four cores is
running at a much higher frequency than the others.
* The thread doing the buffer swaps (I assume it's the one doing the rendering)
usually isn't running on that core when it does the buffer swap.
* Because the rendering thread uses by far the most CPU (50%), I assume it has
been on the high-speed core (why else would that core run at a high frequency?).

=> Why would the kernel move the rendering thread to a low CPU freq core before
it does the buffer swap?  Transform feedback?

(E.g. the Unigine Heaven demo has more threads, but it renders from the main
thread, and the kernel has scheduled that thread on the fastest core when it
does the buffer swap.)
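
The kind of check described above can be sketched in Python. Everything here
is an assumption for illustration: the line layout is the usual ftrace text
format ("comm-pid [cpu] flags timestamp: event: details"), the sample lines
and the PID are synthetic, and sched_switch is used only as a proxy for
"where the thread runs" instead of the actual buffer swap event.

```python
import re

# Assumed ftrace text layout: "comm-pid [cpu] [flags] timestamp: event: details"
EVENT_RE = re.compile(
    r"^\s*(?P<comm>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+(?:\S+\s+)?"
    r"(?P<ts>\d+\.\d+):\s+(?P<event>\w+):\s+(?P<rest>.*)$"
)
FREQ_RE = re.compile(r"state=(?P<khz>\d+)\s+cpu_id=(?P<cpu>\d+)")
NEXT_PID_RE = re.compile(r"next_pid=(\d+)")


def switch_in_frequencies(lines, pid):
    """Return (timestamp, cpu, cpu_khz, fastest_khz) each time pid is scheduled in."""
    freq = {}  # cpu -> last frequency (kHz) seen in a cpu_frequency event
    result = []
    for line in lines:
        m = EVENT_RE.match(line)
        if not m:
            continue
        if m["event"] == "cpu_frequency":
            f = FREQ_RE.search(m["rest"])
            if f:
                freq[int(f["cpu"])] = int(f["khz"])
        elif m["event"] == "sched_switch":
            s = NEXT_PID_RE.search(m["rest"])
            if s and int(s.group(1)) == pid:
                cpu = int(m["cpu"])
                fastest = max(freq.values(), default=None)
                result.append((float(m["ts"]), cpu, freq.get(cpu), fastest))
    return result


# Synthetic sample lines (not taken from a real trace).
SAMPLE = [
    "          <idle>-0     [002] d..2   100.000100: cpu_frequency: state=2300000 cpu_id=2",
    "          <idle>-0     [001] d..2   100.000200: cpu_frequency: state=800000 cpu_id=1",
    "          <idle>-0     [001] d..3   100.000300: sched_switch: prev_comm=swapper/1 "
    "prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=testfw_app next_pid=4242 next_prio=120",
]

for ts, cpu, khz, fastest in switch_in_frequencies(SAMPLE, pid=4242):
    print(f"{ts:.6f}: cpu {cpu} at {khz} kHz (fastest core: {fastest} kHz)")
```

A thread that is repeatedly scheduled onto a core whose frequency is well
below the fastest core's would show up as a gap between the last two columns.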


Anyway, this doesn't explain the regression, because the behavior is the same
before and after it.  But it may partly explain why the behavior of GfxBench
tests differs from other benchmarks on BXT.

(I don't get CPU freq ftrace events from the Core device, so I can't check
whether the same also happens on KBL.)


PS. Regarding the mutex changes during the commit period: could those be
related to the more serious i915 deadlock bug 110848?  The two bugs appeared
close together time-wise.
