[Mesa-dev] i965: Invalid memory accesses after resizing brw_codegen's store table

Kristian Høgsberg krh at bitplanet.net
Wed Oct 28 10:58:09 PDT 2015


On Wed, Oct 28, 2015 at 10:01:40AM +0100, Samuel Iglesias Gonsálvez wrote:
> There is no opinions about this issue or reviews of the proposed patch
> after one week.
> 
> This is just a reminder in case you have missed it :-)

Thanks for the reminder! How about something like this instead?

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index ebd811f..cd5c726 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -2511,12 +2511,20 @@ brw_send_indirect_message(struct brw_codegen *p,
                           struct brw_reg desc)
 {
    const struct brw_device_info *devinfo = p->devinfo;
-   struct brw_inst *send, *setup;
+   struct brw_inst *send;
+   int setup;
 
    assert(desc.type == BRW_REGISTER_TYPE_UD);
 
+   /* We hold on to the setup instruction (the SEND in the direct case, the OR
+    * in the indirect case) by its index in the instruction store.  The
+    * pointer returned by next_insn() may become invalid if emitting the SEND
+    * in the indirect case reallocs the store.
+    */
+
    if (desc.file == BRW_IMMEDIATE_VALUE) {
-      setup = send = next_insn(p, BRW_OPCODE_SEND);
+      setup = p->nr_insn;
+      send = next_insn(p, BRW_OPCODE_SEND);
       brw_set_src1(p, send, desc);
 
    } else {
@@ -2531,7 +2539,8 @@ brw_send_indirect_message(struct brw_codegen *p,
        * caller can specify additional descriptor bits with the usual
        * brw_set_*_message() helper functions.
        */
-      setup = brw_OR(p, addr, desc, brw_imm_ud(0));
+      setup = p->nr_insn;
+      brw_OR(p, addr, desc, brw_imm_ud(0));
 
       brw_pop_insn_state(p);
 
@@ -2543,7 +2552,7 @@ brw_send_indirect_message(struct brw_codegen *p,
    brw_set_src0(p, send, retype(payload, BRW_REGISTER_TYPE_UD));
    brw_inst_set_sfid(devinfo, send, sfid);
 
-   return setup;
+   return &p->store[setup];
 }
 
 static struct brw_inst *


> Sam
> 
> On 21/10/15 12:23, Iago Toral wrote:
> > Hi,
> > 
> > The problem is with code like this (see brw_send_indirect_message):
> > 
> > setup = brw_OR(p, addr, desc, brw_imm_ud(0));
> > send = next_insn(p, BRW_OPCODE_SEND);
> > ...
> > return setup;
> > 
> > If next_insn triggers a realloc of the instruction store, then the setup
> > instruction pointer is no longer valid. Notice that this can happen
> > anywhere where we keep pointers to previous instructions before creating
> > new ones (!)
> > 
> > The patch from Samuel fixes this by special-casing this for SEND
> > instructions only (since we know that the indirect versions can hit
> > this, maybe there are more situations though). It does so by trying to
> > make sure that we never realloc the store with a SEND instruction. For
> > this, we realloc before we reach the end of the current store (32
> > instructions before the limit) as long as the instruction is not a SEND
> > (so that if it is a SEND we still have up to 32 opportunities to do the
> > realloc without a different instruction before running out of space in
> > the store).
> > 
> > Iago
> > 
> > On Wed, 2015-10-21 at 09:02 +0200, Samuel Iglesias Gonsalvez wrote:
> >> Hello,
> >>
> >> I have found several invalid memory accesses when running
> >> dEQP-GLES31.functional.ssbo.* tests on i965 driver (and gen7+). That
> >> invalid memory accesses were unluckily happening when generating the
> >> assembly instructions for SSBO stores for different compute shaders.
> >>
> >> However it looks like this problem could happen to other shaders and
> >> situations. Because of that, I am going to explain the problem here:
> >>
> >> When generating a untyped surface write/read, i965 driver will end up
> >> calling brw_send_indirect_message() through
> >> brw_send_indirect_surface_message(). At brw_send_indirect_message()'s
> >> 'else' branch, the code generates a load of the indirect descriptor to
> >> an address register using an OR instruction and it also generates a new
> >> SEND instruction; if this case happens, the OR instruction is returned.
> >> brw_send_indirect_surface_message() uses that OR instruction to set mlen
> >> and rlen's descriptor bits later.
> >>
> >> Just to give more context, when generating instructions in fs/vec4
> >> generators, i965 driver uses pointers to elements in the 'store' table
> >> inside struct brw_codegen. That table has an initial size of 1024 but,
> >> when it's full, it is resized (doubling its size each time,
> >> see brw_next_insn()). This resize operation ends up calling
> >> realloc(). However the returned pointer by realloc() could be different
> >> and the old allocated memory would be free'd as part of the process.
> >>
> >> Back to the issue, if the p->store's resize happens when we get the
> >> pointer to the SEND instruction at brw_send_indirect_message()'s 'else'
> >> branch, we could have the following problem:
> >>
> >> The realloc() returns a new pointer and *free'd* the old allocation, so
> >> the pointer we previously saved for the OR instruction at
> >> brw_send_indirect_surface_message() becomes invalid (because it is a
> >> pointer of the old allocation). Then, we access to that invalid pointer
> >> when setting up rlen/mlen bits at the end of
> >> brw_send_indirect_surface_message() and we would have undefined results.
> >>
> >> This issue is quite unlikely to happen but it is reproducible on
> >> ~120 dEQP-GLES31.functional.ssbo.* tests, basically because they have
> >> the same shaders except the buffer variable's data type. Those tests
> >> were failing intermittently at different rates but valgrind helped to
> >> find what was happening.
> >>
> >> I would like to expose publicly the problem and analyse possible
> >> solutions for it along with the community. For the time being, a patch
> >> is proposed to mitigate this issue, which is sent as a reply to this
> >> one.
> >>
> >> What this work-around patch does is: book enough room for one or more
> >> consecutive SEND* instructions at the end of the p->store table in order
> >> to avoid reallocating p->store in the aforementioned case. The 32 value
> >> was chosen arbitrary because it has low impact in p->store
> >> (its initial size is 1024) and makes this issue much less likely to
> >> happen. We could tune this number to a less conservative value if
> >> needed. If you want to test it, that patch should be applied on top of
> >> this Curro's patch [0] as it fixes a lot of compute shader compilation
> >> errors (~700 dEQP-GLES31.functional.ssbo.* tests). I have setup a branch
> >> with both patches in [1].
> >>
> >> Feel free to comment about other solutions, ideas, opinions, etc.
> >>
> >> Thanks,
> >>
> >> Sam
> >>
> >> [0] See attachment at
> >> http://lists.freedesktop.org/archives/mesa-dev/2015-October/097183.html
> >> [1] $ git clone -b dEQP-functional-ssbo-fixes-v1 \
> >> https://github.com/Igalia/mesa.git
> >>
> >> Samuel Iglesias Gonsalvez (1):
> >>   i965: book space at the end of p->store for SEND opcodes to avoid
> >>     invalid memory access
> >>
> >>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 12 +++++++++++-
> >>  1 file changed, 11 insertions(+), 1 deletion(-)
> >>
> > 
> > 
> > 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list