[Bug 92744] [g965 Regression bisected] Performance regression and piglit assertions due to liveness analysis
bugzilla-daemon at freedesktop.org
Fri Oct 30 12:20:06 PDT 2015
https://bugs.freedesktop.org/show_bug.cgi?id=92744
Bug ID: 92744
Summary: [g965 Regression bisected] Performance regression and
piglit assertions due to liveness analysis
Product: Mesa
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: blocker
Priority: medium
Component: Drivers/DRI/i965
Assignee: idr at freedesktop.org
Reporter: mark.a.janes at intel.com
QA Contact: intel-3d-bugs at lists.freedesktop.org
The following commit triggers assertion failures in many piglit tests on g965:
Author: Connor Abbott <cwabbott0 at gmail.com>
AuthorDate: Tue Jun 9 10:26:53 2015 -0700
Commit: Connor Abbott <cwabbott0 at gmail.com>
CommitDate: Fri Oct 30 02:19:43 2015 -0400
i965/sched: use liveness analysis for computing register pressure
Previously, we were using some heuristics to try and detect when a write
was about to begin a live range, or when a read was about to end a live
range. We never used the liveness analysis information used by the
register allocator, though, which meant that the scheduler's and the
allocator's ideas of when a live range began and ended were different.
Not only did this make our estimate of the register pressure benefit of
scheduling an instruction wrong in some cases, but it was preventing us
from knowing the actual register pressure when scheduling each
instruction, which we want to have in order to switch to register
pressure scheduling only when the register pressure is too high.
This commit rewrites the register pressure tracking code to use the same
model as our register allocator currently uses. We use the results of
liveness analysis, as well as the compute_payload_ranges() function that
we split out in the last commit. This means that we compute live ranges
twice on each round through the register allocator, although we could
speed it up by only recomputing the ranges and not the live in/live out
sets after scheduling, since we only shuffle around instructions within
a single basic block when we schedule.
Shader-db results on bdw:
total instructions in shared programs: 7130187 -> 7129880 (-0.00%)
instructions in affected programs: 1744 -> 1437 (-17.60%)
helped: 1
HURT: 1
total cycles in shared programs: 172535126 -> 172473226 (-0.04%)
cycles in affected programs: 11338636 -> 11276736 (-0.55%)
helped: 876
HURT: 873
LOST: 8
GAINED: 0
v2: use regs_read() in more places.
Reviewed-by: Jason Ekstrand <jason.ekstrand at intel.com>
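The commit message above describes estimating register pressure from live ranges rather than from heuristics. As a rough illustration only, and not the code from the commit, the idea can be sketched as follows. The LiveRange struct and the register_pressure() function are hypothetical names invented for this example; the real scheduler works on the results of Mesa's own liveness analysis and on compute_payload_ranges().

```cpp
#include <cassert>
#include <vector>

// Hypothetical live range for one virtual register: it is live from
// instruction index `start` through `end` (inclusive) and occupies
// `size` hardware registers. This is an assumption for illustration,
// not a Mesa data structure.
struct LiveRange {
    int start;
    int end;
    int size;
};

// Returns, for each instruction index, how many registers are live there.
std::vector<int> register_pressure(const std::vector<LiveRange>& ranges,
                                   int num_instructions)
{
    assert(num_instructions >= 0);

    // Difference array: add `size` where a range begins and subtract it
    // just past the point where the range ends.
    std::vector<int> delta(num_instructions + 1, 0);
    for (const LiveRange& r : ranges) {
        // Skip ranges that are empty or fall outside the program.
        if (r.start < 0 || r.end >= num_instructions || r.start > r.end)
            continue;
        delta[r.start] += r.size;
        delta[r.end + 1] -= r.size;
    }

    // A running sum of the deltas gives the pressure at each instruction.
    std::vector<int> pressure(num_instructions, 0);
    int live = 0;
    for (int ip = 0; ip < num_instructions; ip++) {
        live += delta[ip];
        pressure[ip] = live;
    }
    return pressure;
}
```

For example, with ranges {0, 2, 1}, {1, 3, 2} and {3, 4, 1} over five instructions, the sketch reports a pressure of 1, 3, 3, 3, 1. A scheduler could compare such per-instruction values against a threshold to decide when to switch to register-pressure scheduling, which is the motivation the commit message gives.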
Regressed tests:
shaders.zero-tex-coord bias
shaders.zero-tex-coord texture2d
spec.!opengl 2_0.gl-2.0-active-sampler-conflict
spec.arb_framebuffer_object.fbo-drawbuffers-none glblitframebuffer
spec.arb_sampler_objects.sampler-incomplete
spec.arb_shader_texture_lod.compiler.tex_grad-texture2d-2d-vec2.frag
spec.arb_shader_texture_lod.compiler.tex_grad-texture2dproj-2d-vec4.frag
spec.arb_shader_texture_lod.compiler.tex_lod-texture1d-1d-float.frag
spec.arb_shader_texture_lod.compiler.tex_lod-texture1dproj-1d-vec2.frag
spec.arb_shader_texture_lod.compiler.tex_lod-texture1dproj-1d-vec4.frag
spec.arb_shader_texture_lod.compiler.tex_lod-texture2d-2d-vec2.frag
spec.arb_shader_texture_lod.compiler.tex_lod-texture3d-3d-vec3.frag
spec.ext_texture_array.maxlayers
spec.ext_texture_swizzle.depth_texture_mode_and_swizzle
spec.glsl-1_10.compiler.constant-expressions.sampler-array-index-01.frag
spec.glsl-es-1_00.compiler.structure-and-array-operations.sampler-array-index.frag
Sample output:
arb_sampler_objects-sampler-incomplete:
/mnt/space/jenkins/jobs/Leeroy/workspace/repos/mesa/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp:112:
void brw::fs_live_variables::setup_one_write(brw::block_data*, fs_inst*, int,
const fs_reg&): Assertion `var < num_vars' failed.
glslparsertest:
/mnt/space/jenkins/jobs/Leeroy/workspace@3/repos/mesa/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp:60:
void brw::fs_live_variables::setup_one_read(brw::block_data*, fs_inst*, int,
const fs_reg&): Assertion `var < num_vars' failed.
Additionally, this commit causes a large performance regression on g965.
Even when glslparsertest runs on a non-g965 machine with the device ID
overridden (INTEL_DEVID_OVERRIDE=0x29A2), CPU tests take significantly longer.