<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - [g965 Regression bisected] Performance regression and piglit assertions due to liveness analysis"
href="https://bugs.freedesktop.org/show_bug.cgi?id=92744">92744</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>[g965 Regression bisected] Performance regression and piglit assertions due to liveness analysis
</td>
</tr>
<tr>
<th>Product</th>
<td>Mesa
</td>
</tr>
<tr>
<th>Version</th>
<td>git
</td>
</tr>
<tr>
<th>Hardware</th>
<td>x86-64 (AMD64)
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux (All)
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>blocker
</td>
</tr>
<tr>
<th>Priority</th>
<td>medium
</td>
</tr>
<tr>
<th>Component</th>
<td>Drivers/DRI/i965
</td>
</tr>
<tr>
<th>Assignee</th>
<td>idr@freedesktop.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>mark.a.janes@intel.com
</td>
</tr>
<tr>
<th>QA Contact</th>
<td>intel-3d-bugs@lists.freedesktop.org
</td>
</tr></table>
<p>
<div>
<pre>The following commit triggers many assertion failures on g965:
Author:     Connor Abbott &lt;<a href="mailto:cwabbott0@gmail.com">cwabbott0@gmail.com</a>&gt;
AuthorDate: Tue Jun 9 10:26:53 2015 -0700
Commit:     Connor Abbott &lt;<a href="mailto:cwabbott0@gmail.com">cwabbott0@gmail.com</a>&gt;
CommitDate: Fri Oct 30 02:19:43 2015 -0400

i965/sched: use liveness analysis for computing register pressure

Previously, we were using some heuristics to try and detect when a write
was about to begin a live range, or when a read was about to end a live
range. We never used the liveness analysis information used by the
register allocator, though, which meant that the scheduler's and the
allocator's ideas of when a live range began and ended were different.
Not only did this make our estimate of the register pressure benefit of
scheduling an instruction wrong in some cases, but it was preventing us
from knowing the actual register pressure when scheduling each
instruction, which we want to have in order to switch to register
pressure scheduling only when the register pressure is too high.

This commit rewrites the register pressure tracking code to use the same
model as our register allocator currently uses. We use the results of
liveness analysis, as well as the compute_payload_ranges() function that
we split out in the last commit. This means that we compute live ranges
twice on each round through the register allocator, although we could
speed it up by only recomputing the ranges and not the live in/live out
sets after scheduling, since we only shuffle around instructions within
a single basic block when we schedule.

Shader-db results on bdw:

total instructions in shared programs: 7130187 -> 7129880 (-0.00%)
instructions in affected programs:     1744 -> 1437 (-17.60%)
helped:                                1
HURT:                                  1

total cycles in shared programs: 172535126 -> 172473226 (-0.04%)
cycles in affected programs:     11338636 -> 11276736 (-0.55%)
helped:                          876
HURT:                            873

LOST:   8
GAINED: 0

v2: use regs_read() in more places.

Reviewed-by: Jason Ekstrand &lt;<a href="mailto:jason.ekstrand@intel.com">jason.ekstrand@intel.com</a>&gt;
The test regressions:
shaders.zero-tex-coord bias
shaders.zero-tex-coord texture2d
spec.!opengl 2_0.gl-2.0-active-sampler-conflict
spec.arb_framebuffer_object.fbo-drawbuffers-none glblitframebuffer
spec.arb_sampler_objects.sampler-incomplete
spec.arb_shader_texture_lod.compiler.tex_grad-texture2d-2d-vec2.frag
spec.arb_shader_texture_lod.compiler.tex_grad-texture2dproj-2d-vec4.frag
spec.arb_shader_texture_lod.compiler.tex_lod-texture1d-1d-float.frag
spec.arb_shader_texture_lod.compiler.tex_lod-texture1dproj-1d-vec2.frag
spec.arb_shader_texture_lod.compiler.tex_lod-texture1dproj-1d-vec4.frag
spec.arb_shader_texture_lod.compiler.tex_lod-texture2d-2d-vec2.frag
spec.arb_shader_texture_lod.compiler.tex_lod-texture3d-3d-vec3.frag
spec.ext_texture_array.maxlayers
spec.ext_texture_swizzle.depth_texture_mode_and_swizzle
spec.glsl-1_10.compiler.constant-expressions.sampler-array-index-01.frag
spec.glsl-es-1_00.compiler.structure-and-array-operations.sampler-array-index.frag
Sample output:
arb_sampler_objects-sampler-incomplete:
/mnt/space/jenkins/jobs/Leeroy/workspace/repos/mesa/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp:112:
void brw::fs_live_variables::setup_one_write(brw::block_data*, fs_inst*, int,
const fs_reg&): Assertion `var < num_vars' failed.
glslparsertest:
/mnt/space/jenkins/jobs/Leeroy/workspace@3/repos/mesa/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp:60:
void brw::fs_live_variables::setup_one_read(brw::block_data*, fs_inst*, int,
const fs_reg&): Assertion `var < num_vars' failed.
Additionally, this commit causes a large performance regression on g965.
Even when glslparsertest is run on a non-g965 machine
(INTEL_DEVID_OVERRIDE=0x29A2), CPU tests take significantly longer.</pre>
</div>
</p>
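<p>The quoted commit message describes deriving register pressure from liveness analysis rather than from heuristics. A minimal sketch of that idea, assuming each virtual register has a single inclusive live range and a size in hardware registers; the function name and data layout are hypothetical and are not taken from the i965 backend:</p>
<pre>
```python
def register_pressure(live_ranges, num_instructions):
    """Return the number of registers live at each instruction index.

    live_ranges is a list of (start, end, size) tuples: inclusive
    instruction indices plus the register count of the variable.
    This layout is an assumption made for illustration only.
    """
    pressure = [0] * num_instructions
    for start, end, size in live_ranges:
        # A variable contributes its full size at every instruction
        # inside its live range, endpoints included.
        for ip in range(start, end + 1):
            pressure[ip] += size
    return pressure


# Three hypothetical virtual registers in a five-instruction block.
ranges = [(0, 2, 1), (1, 4, 2), (2, 3, 1)]
pressure = register_pressure(ranges, 5)
peak = max(pressure)
print(pressure, peak)
```
</pre>
<p>In this example the peak pressure is reached at instruction 2, where all three live ranges overlap. A scheduler working from the same ranges as the allocator could compare such a per-instruction value against a threshold, which is the kind of decision the commit message says it wants to enable.</p>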
</body>
</html>