Dict parsing question

Fri Dec 15 15:28:28 UTC 2023

On Thu, 14 Dec 2023 at 18:34:04 -0500, Robert Middleton wrote:
> Here's the
> relevant part of the packet that it crashes on:
> 
> 0040   05 61 7b 73 76 7d 00 00 c8 02 00 00 00 00 00 00   .a{sv}..........
> 0050   08 00 00 00 50 61 69 72 77 69 73 65 00 02 61 73   ....Pairwise..as
> 0060   00 00 00 00 41 00 00 00 08 00 00 00 63 63 6d 70   ....A.......ccmp
> 0070   2d 32 35 36 00 00 00 00 08 00 00 00 67 63 6d 70   -256........gcmp
> 0080   2d 32 35 36 00 00 00 00 04 00 00 00 63 63 6d 70   -256........ccmp
> 0090   00 00 00 00 04 00 00 00 67 63 6d 70 00 00 00 00   ........gcmp....
> 00a0   04 00 00 00 74 6b 69 70 00 00 00 00 00 00 00 00   ....tkip........

To parse D-Bus messages by hand, it's often useful to write them 8 or
16 bytes per line (like the hex dumps in test/data/valid-messages/)
as you have done here, and then start breaking each line into fragments.
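
If you only have the raw bytes, a few lines of Python will produce that
layout. This is a minimal sketch, not part of dbus: the function name
hexdump and its parameters (width, base) are made up for illustration.

```python
def hexdump(data, width=16, base=0):
    """Return one text line per `width` bytes: offset, hex bytes, ASCII."""
    pad = width * 3 - 1          # width of a full hex column, for short last lines
    lines = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as "."
        text = "".join(chr(b) if 0x20 <= b < 0x7f else "." for b in chunk)
        lines.append(f"{base + start:04x}   {hex_part:<{pad}}   {text}")
    return lines


if __name__ == "__main__":
    sample = bytes.fromhex("05 61 7b 73 76 7d 00 00 c8 02 00 00 00 00 00 00")
    print("\n".join(hexdump(sample, base=0x40)))
```

Passing width=8 gives the narrower layout, which makes 8-byte alignment
boundaries easier to spot.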

The D-Bus wire protocol is not a fully self-describing format: if we
start at an arbitrary offset, we can't know for sure what the bytes mean,
because their interpretation depends on the signature and endianness,
which you didn't quote here. However, from context, I'm guessing that
what you quoted is meant to be parsed as beginning with a variant 'v' in
little-endian. Based on that, and parsing your message fragment by hand:

05                                                # v signature is 5 bytes + \0
   61 7b 73 76 7d 00                              # "a{sv}" + \0
                     00                           # padding [1]
                        c8 02 00 00               # array is 0x2c8 = 712 bytes
                                    00 00 00 00   # padding [2]
                                                  # first {sv}
08 00 00 00                                       # string is 8 bytes + \0
            50 61 69 72 77 69 73 65 00            # "Pairwise" + \0
                                       02         # v signature is 2 bytes + \0
                                          61 73   # "as"...
00                                                # ... + \0
   00 00 00                                       # padding [1]
                                                  # BEGIN ARRAY "as"
            41 00 00 00                           # array is 0x41 = 65 bytes
                        08 00 00 00               # string is 8 bytes + \0
                                    63 63 6d 70   # "ccmp"...
2d 32 35 36 00                                    # "-256" + \0
               00 00 00                           # padding [3]
                        08 00 00 00               # string is 8 bytes + \0
                                    67 63 6d 70   # "gcmp"...
2d 32 35 36 00                                    # "-256" + \0
               00 00 00                           # padding [3]
                        04 00 00 00               # string is 4 bytes + \0
                                    63 63 6d 70   # "ccmp"...
00                                                # ... + \0
   00 00 00                                       # padding [3]
            04 00 00 00                           # string is 4 bytes + \0
                        67 63 6d 70 00            # "gcmp" + \0
                                       00 00 00   # padding [3]
04 00 00 00                                       # string is 4 bytes + \0
            74 6b 69 70 00                        # "tkip" + \0
                                                  # END OF ARRAY "as"
                           00 00 00 00 00 00 00   # padding [2]

[1] padding to the 4-byte alignment of the array length
[2] padding to the 8-byte alignment of a dict entry
[3] padding to the 4-byte alignment of the string length
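
The same walk can be written as a short Python sketch. It makes the same
assumptions as the hand-parse above (little-endian, starting with a
variant), it only understands the types that appear in this fragment, and
every name in it (FRAGMENT, BASE, read_u32, parse_fragment and so on) is
made up for illustration rather than taken from any D-Bus library.

```python
import struct

BASE = 0x40  # offset of the first quoted byte; a multiple of 8, so
             # alignment relative to the fragment matches the message
FRAGMENT = bytes.fromhex("""
    05 61 7b 73 76 7d 00 00 c8 02 00 00 00 00 00 00
    08 00 00 00 50 61 69 72 77 69 73 65 00 02 61 73
    00 00 00 00 41 00 00 00 08 00 00 00 63 63 6d 70
    2d 32 35 36 00 00 00 00 08 00 00 00 67 63 6d 70
    2d 32 35 36 00 00 00 00 04 00 00 00 63 63 6d 70
    00 00 00 00 04 00 00 00 67 63 6d 70 00 00 00 00
    04 00 00 00 74 6b 69 70 00 00 00 00 00 00 00 00
""")


def align(pos, boundary):
    """Round pos up to the next multiple of boundary (a power of two)."""
    return (pos + boundary - 1) & ~(boundary - 1)


def read_u32(buf, pos):
    pos = align(pos, 4)                       # padding [1] / [3]
    (value,) = struct.unpack_from("<I", buf, pos)
    return value, pos + 4


def read_string(buf, pos):
    length, pos = read_u32(buf, pos)
    text = buf[pos:pos + length].decode("utf-8")
    return text, pos + length + 1             # skip the trailing \0


def read_signature(buf, pos):
    length = buf[pos]                         # one-byte length, no alignment
    sig = buf[pos + 1:pos + 1 + length].decode("ascii")
    return sig, pos + 1 + length + 1          # skip the trailing \0


def read_string_array(buf, pos):
    nbytes, pos = read_u32(buf, pos)          # byte count, not element count
    end = pos + nbytes
    items = []
    while pos < end:                          # still inside the array
        text, pos = read_string(buf, pos)
        items.append(text)
    return items, pos


def parse_fragment(buf):
    outer_sig, pos = read_signature(buf, 0)   # variant signature "a{sv}"
    dict_len, pos = read_u32(buf, pos)        # 0x2c8
    pos = align(pos, 8)                       # padding [2]
    key, pos = read_string(buf, pos)          # "Pairwise"
    value_sig, pos = read_signature(buf, pos) # "as"
    values, pos = read_string_array(buf, pos)
    next_entry = BASE + align(pos, 8)         # padding [2] again
    return outer_sig, dict_len, key, value_sig, values, next_entry


if __name__ == "__main__":
    print(parse_fragment(FRAGMENT))
```

The last element of the result is the message offset at which the second
dict entry would begin, which is just past the bytes that were quoted.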

You might find it useful to pass your messages to a known-good parser
that can print them as a string, like the one in GLib:

~/src/dbus$ python3
>>> blob = open("test/data/valid-messages/byteswap-fd-index.message-raw", 'rb').read()
>>> from gi.repository import Gio
>>> message = Gio.DBusMessage.new_from_blob(blob, Gio.DBusCapabilityFlags.NONE)
>>> print(message.get_body().print_(True))
(handle -131595768, <@av []>)

> Specifically, it is crashing on parsing the 'gcmp' entry in the dict.

I don't see a "gcmp" entry in a dict here. What I see is this (in GLib's
GVariant text syntax, since D-Bus doesn't have its own text format):

<                               # variant
    @a{sv} {                    # dict { string: variant }
        "Pairwise": <           # first entry: variant...
            @as [               # ... containing array of strings ...
                "ccmp-256",
                "gcmp-256",
                "ccmp",
                "gcmp",
                "tkip"
            ]
        >,
        ... more dict entries that you did not quote ...
    }
>

> I was under the assumption that the next dict entry should start on
> the 8-byte boundary

It does, but that's right at the end of the part you quoted.

> but it appears to be starting on a 4-byte
> boundary with the length of the string

That's because at the time you see "gcmp", you're still inside the array
of strings ("as") that is part of the value (variant) of the first
dict-entry in the dict. You haven't reached the end of the dict-entry yet,
so the padding before the next dict-entry also hasn't happened yet.

You can tell this because the array started with a byte count,
0x41 = 65 bytes. Everything until you have consumed that many bytes is
part of the array, not part of some larger data structure.
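
To check that arithmetic, here is a small sketch that recomputes the byte
count from the five strings and works out where the array ends. The
offsets 0x68 and 0x94 are read off the hex dump you quoted; the helper
names are made up for illustration.

```python
def align(pos, boundary):
    """Round pos up to the next multiple of boundary (a power of two)."""
    return (pos + boundary - 1) & ~(boundary - 1)


def as_array_length(strings):
    """Byte count that an 'as' array holding these strings declares."""
    pos = 0  # relative to the first element, which is itself 4-byte aligned
    for s in strings:
        pos = align(pos, 4)                    # padding before each string length
        pos += 4 + len(s.encode("utf-8")) + 1  # uint32 length, bytes, trailing \0
    return pos                                 # no padding after the last element


STRINGS = ["ccmp-256", "gcmp-256", "ccmp", "gcmp", "tkip"]
ARRAY_START = 0x68         # first byte after the 0x41 length field
GCMP_LENGTH_OFFSET = 0x94  # where the "04 00 00 00" before "gcmp" sits

array_end = ARRAY_START + as_array_length(STRINGS)
next_dict_entry = align(array_end, 8)

if __name__ == "__main__":
    print(as_array_length(STRINGS), hex(array_end), hex(next_dict_entry))
    print(ARRAY_START <= GCMP_LENGTH_OFFSET < array_end)
```

The length field before "gcmp" falls inside the range covered by the
array, so only the 4-byte string alignment applies there. The 8-byte
dict-entry alignment only comes into play once the array has ended.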

    smcv