[Mesa-dev] [PATCH] i965/vs: fix unit mismatch in base_offset.
Kenneth Graunke
kenneth at whitecape.org
Sun Sep 30 22:08:39 PDT 2012
On 09/20/2012 07:06 PM, Paul Berry wrote:
> Callers of emit_scratch_{read,write}() pass a base_offset that is
> measured in bytes, but emit_scratch_{read,write}() was assuming it was
> measured in multiples of a complete register. As a result, a shader
> using a scratch read/write offset greater than zero (in practice, a
> shader containing more than one variable in scratch) would use too
> large an offset, frequently exceeding the available scratch space.
>
> This patch corrects the mismatch by dividing base_offset by REG_SIZE
> at the beginning of emit_scratch_{read,write}().
>
> NOTE: This is a candidate for stable release branches.
> ---
>
> No test cases were failing because of this bug, but it was getting in
> the way of Ken's efforts to implement register spilling for vertex
> shaders.
>
> Ken: I've done a full piglit run on master to verify that it doesn't
> regress anything, but I didn't check whether there are any remaining
> regressions when I apply it to your "spilling" branch. All I checked
> was that it fixed the 3 regressions you told me about earlier today.
>
> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index 682837f..60fab54 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -2443,12 +2443,19 @@ vec4_visitor::get_pull_constant_offset(vec4_instruction *inst,
> /**
> * Emits an instruction before @inst to load the value named by @orig_src
> * from scratch space at @base_offset to @temp.
> + *
> + * @base_offset is measured in bytes.
> */
> void
> vec4_visitor::emit_scratch_read(vec4_instruction *inst,
> dst_reg temp, src_reg orig_src,
> int base_offset)
> {
> + /* Convert base_offset from bytes to a register number, since that's what
> + * get_scratch_offset() expects.
> + */
> + base_offset /= REG_SIZE;
> +
> int reg_offset = base_offset + orig_src.reg_offset;
> src_reg index = get_scratch_offset(inst, orig_src.reladdr, reg_offset);
>
> @@ -2458,12 +2465,19 @@ vec4_visitor::emit_scratch_read(vec4_instruction *inst,
> /**
> * Emits an instruction after @inst to store the value to be written
> * to @orig_dst to scratch space at @base_offset, from @temp.
> + *
> + * @base_offset is measured in bytes.
> */
> void
> vec4_visitor::emit_scratch_write(vec4_instruction *inst,
> src_reg temp, dst_reg orig_dst,
> int base_offset)
> {
> + /* Convert base_offset from bytes to a register number, since that's what
> + * get_scratch_offset() expects.
> + */
> + base_offset /= REG_SIZE;
> +
> int reg_offset = base_offset + orig_dst.reg_offset;
> src_reg index = get_scratch_offset(inst, orig_dst.reladdr, reg_offset);
Okay. I've looked at this a bit more, and I think your patch is right,
but there might be a bug pre-Gen6.
Here's my thinking. In move_grf_array_access_to_scratch(), I see:
c->last_scratch += this->virtual_grf_sizes[src->reg] * 8 * 4;
This is clearly in measured in bytes: 8 * 4 is the size of a vec8 (or
simd4x2) register. This is passed to emit_scratch_read/write.
So the base_offset argument to emit_scratch_read/write is measured in
bytes. Your patch then causes it to be divided by REG_SIZE (8*4)
(ironically undoing the earlier scaling), at which point base_offset is
again the number of simd4x2 registers, or OWords. We add reg_offset,
which is also in OWords.
We then pass that to get_scratch_offset. Everything's dandy for Gen6+,
we want OWords. But for earlier hardware, we hit:
/* Pre-gen6, the message header uses byte offsets instead of vec4
* (16-byte) offset units.
*/
if (intel->gen < 6)
message_header_scale *= 16;
At which point I say...16 bytes??? We're converting from units of
OWords/SIMD4x2 registers, not units of vec4s. So I believe this should
be 32 bytes (REG_SIZE aka 8*4).
Thoughts?
More information about the mesa-dev
mailing list