[Mesa-dev] [PATCH 05/20] i965/cfg: Add a foreach_inst_in_block_safe macro.

Wed Aug 6 11:16:20 PDT 2014

On Wed, Aug 6, 2014 at 5:22 AM, Pohjolainen, Topi
<topi.pohjolainen at intel.com> wrote:
> On Tue, Aug 05, 2014 at 09:14:55PM +0300, Pohjolainen, Topi wrote:
>> On Thu, Jul 24, 2014 at 07:54:12PM -0700, Matt Turner wrote:
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_cfg.h | 8 ++++++++
>> >  1 file changed, 8 insertions(+)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h b/src/mesa/drivers/dri/i965/brw_cfg.h
>> > index a5d2df5..913a1ed 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_cfg.h
>> > +++ b/src/mesa/drivers/dri/i965/brw_cfg.h
>> > @@ -120,6 +120,14 @@ struct cfg_t {
>> >          __inst != __block->end->next;                          \
>> >          __inst = (__type *)__inst->next)
>> >
>> > +#define foreach_inst_in_block_safe(__type, __inst, __block)    \
>> > +   for (__type *__inst = (__type *)__block->start,             \
>> > +               *__next = (__type *)__inst->next,               \
>> > +               *__end = (__type *)__block->end->next->next;    \
>>
>> Patches 4 and 7 make sense but the double ->next->next here is not obvious
>> to me.

Right, yep. exec_list uses head and tail sentinels, so the double-next
handles that. Explained below:

> I tried handwriting instructions into blocks (this is purely arbitrary):
>
> ip    opcode
> ------------------
> 0  :  BRW_OPCODE_?
> ..
> k  :  BRW_OPCODE_IF
> k+1:  BRW_OPCODE_?
> ..
> n  :  BRW_OPCODE_ELSE
> n+1:  BRW_OPCODE_?
> ..
> m  :  BRW_OPCODE_ENDIF
> m+1:  BRW_OPCODE_?
> ..
> t  :  BRW_OPCODE_?
>
>
> Following the logic in the constructor of cfg_t, I would deduce this:
>
> block 0:
>    start_ip = 0
>    num = 0
>    start = inst_0
>    end = inst_k      (if)
>
> block 1:
>    start_ip = k+1
>    num = 1
>    start = inst_k+1
>    end = inst_n      (else)
>
> block 2:
>    start_ip = n+1
>    num = 2
>    start = inst_n+1
>    end = inst_m-1
>
> block 3:
>    start_ip = m
>    num = 3
>    start = inst_m    (endif)
>    end = inst_t
>
>
> And as instructions are inherited from exec_node, for block 3 end->next
> should be NULL, right?

Since exec_list uses head and tail sentinels, so block[3]->end->next
will actually be the tail sentinel (and block[2]->end->next will be
the first instruction of block[3]).

The __end variable prevents us from dereferencing NULL if we remove
the last instruction in a block (and therefore remove the block). Note
that the continuing condition is (__next != __end). For each block, we
want to iterate through the instructions until we hit
block->end->next->next because if the block

 - isn't the last block, end->next->next will be two nodes (I say
node, rather than instruction because of the tail sentinel) after the
end

 - is the last block, end->next->next will be NULL.

In both cases we want to compare with __next, which after the
iteration is one past the node after block->end.

Does that make sense? There are really two things to remember: (1)
head and tail sentinels, and (2) this macro is _safe, so we're
comparing with __next (i.e., one past the end).