Dict parsing question
Simon McVittie
smcv at collabora.com
Fri Dec 15 15:28:28 UTC 2023
On Thu, 14 Dec 2023 at 18:34:04 -0500, Robert Middleton wrote:
> Here's the
> relevant part of the packet that it crashes on:
>
> 0040 05 61 7b 73 76 7d 00 00 c8 02 00 00 00 00 00 00 .a{sv}..........
> 0050 08 00 00 00 50 61 69 72 77 69 73 65 00 02 61 73 ....Pairwise..as
> 0060 00 00 00 00 41 00 00 00 08 00 00 00 63 63 6d 70 ....A.......ccmp
> 0070 2d 32 35 36 00 00 00 00 08 00 00 00 67 63 6d 70 -256........gcmp
> 0080 2d 32 35 36 00 00 00 00 04 00 00 00 63 63 6d 70 -256........ccmp
> 0090 00 00 00 00 04 00 00 00 67 63 6d 70 00 00 00 00 ........gcmp....
> 00a0 04 00 00 00 74 6b 69 70 00 00 00 00 00 00 00 00 ....tkip........
To parse D-Bus messages by hand, it's often useful to write them 8 or
16 bytes per line (like the hex dumps in test/data/valid-messages/)
as you have done here, and then start breaking each line into fragments.
The D-Bus wire protocol is not a fully self-describing format: if we
start at an arbitrary offset, we can't know for sure what the bytes mean,
because their interpretation depends on the signature and endianness,
which you didn't quote here. However, from context, I'm guessing that
what you quoted is meant to be parsed as beginning with a variant 'v' in
little-endian. Based on that, and parsing your message fragment by hand:
05                          # v signature is 5 bytes + \0
61 7b 73 76 7d 00           # "a{sv}" + \0
00                          # padding [1]
c8 02 00 00                 # array is 0x2c8 = 712 bytes
00 00 00 00                 # padding [2]
# first {sv}
08 00 00 00                 # string is 8 bytes + \0
50 61 69 72 77 69 73 65 00  # "Pairwise" + \0
02                          # v signature is 2 bytes + \0
61 73                       # "as"...
00                          # ... + \0
00 00 00                    # padding [1]
# BEGIN ARRAY "as"
41 00 00 00                 # array is 0x41 = 65 bytes
08 00 00 00                 # string is 8 bytes + \0
63 63 6d 70                 # "ccmp"...
2d 32 35 36 00              # "-256" + \0
00 00 00                    # padding [3]
08 00 00 00                 # string is 8 bytes + \0
67 63 6d 70                 # "gcmp"...
2d 32 35 36 00              # "-256" + \0
00 00 00                    # padding [3]
04 00 00 00                 # string is 4 bytes + \0
63 63 6d 70                 # "ccmp"...
00                          # ... + \0
00 00 00                    # padding [3]
04 00 00 00                 # string is 4 bytes + \0
67 63 6d 70 00              # "gcmp" + \0
00 00 00                    # padding [3]
04 00 00 00                 # string is 4 bytes + \0
74 6b 69 70 00              # "tkip" + \0
# END OF ARRAY "as"
00 00 00 00 00 00 00        # padding [2]
[1] padding to the 4-byte alignment of the array length
[2] padding to the 8-byte alignment of a dict entry
[3] padding to the 4-byte alignment of the string length
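If it helps to cross-check the hand-parse mechanically, here is a rough sketch in plain Python that walks the same quoted bytes. It is not a general D-Bus parser: it assumes little-endian, assumes the fragment starts with a variant, and assumes the offsets in the hex dump are the ones alignment is measured from. The names BASE, FRAGMENT, align(), read_u32(), read_string(), read_signature() and walk_fragment() are made up for this example.

```python
import struct

BASE = 0x40  # dump offset of the first quoted byte (assumed to be message-relative)
FRAGMENT = bytes.fromhex(
    "05 61 7b 73 76 7d 00 00 c8 02 00 00 00 00 00 00 "
    "08 00 00 00 50 61 69 72 77 69 73 65 00 02 61 73 "
    "00 00 00 00 41 00 00 00 08 00 00 00 63 63 6d 70 "
    "2d 32 35 36 00 00 00 00 08 00 00 00 67 63 6d 70 "
    "2d 32 35 36 00 00 00 00 04 00 00 00 63 63 6d 70 "
    "00 00 00 00 04 00 00 00 67 63 6d 70 00 00 00 00 "
    "04 00 00 00 74 6b 69 70 00 00 00 00 00 00 00 00"
)


def align(pos, boundary):
    """Round pos up to the next multiple of boundary."""
    return (pos + boundary - 1) // boundary * boundary


def read_u32(pos):
    """Skip padding to a 4-byte boundary, then read a little-endian uint32."""
    pos = align(pos, 4)
    (value,) = struct.unpack_from("<I", FRAGMENT, pos - BASE)
    return value, pos + 4


def read_string(pos):
    """Read a D-Bus string: uint32 length, the bytes, then a terminating \\0."""
    length, pos = read_u32(pos)
    start = pos - BASE
    text = FRAGMENT[start:start + length].decode("utf-8")
    return text, pos + length + 1


def read_signature(pos):
    """Read a D-Bus signature: 1-byte length, the bytes, then a terminating \\0."""
    length = FRAGMENT[pos - BASE]
    start = pos - BASE + 1
    text = FRAGMENT[start:start + length].decode("ascii")
    return text, pos + 1 + length + 1


def walk_fragment():
    pos = BASE
    outer_sig, pos = read_signature(pos)   # signature of the outer variant
    dict_len, pos = read_u32(pos)          # byte count of the a{sv} array
    pos = align(pos, 8)                    # dict entries are 8-byte aligned
    key, pos = read_string(pos)            # key of the first dict entry
    inner_sig, pos = read_signature(pos)   # signature of the value variant
    array_len, pos = read_u32(pos)         # byte count of the "as" array
    array_end = pos + array_len
    items = []
    while pos < array_end:                 # still inside the array of strings
        item, pos = read_string(pos)
        items.append(item)
    return {
        "outer_sig": outer_sig,
        "dict_len": dict_len,
        "key": key,
        "inner_sig": inner_sig,
        "array_len": array_len,
        "items": items,
        "array_end": array_end,
        "next_entry": align(pos, 8),       # where the next dict entry would start
    }


result = walk_fragment()
print(result)
```

The loop stops when the position reaches the end given by the array's byte count, and only then is the position rounded up to 8 for the next dict entry. That is the same order of events as in the annotated dump.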
You might find it useful to pass your messages to a known-good parser
that can print them as a string, like the one in GLib:
~/src/dbus$ python3
>>> blob = open("test/data/valid-messages/byteswap-fd-index.message-raw", 'rb').read()
>>> from gi.repository import Gio
>>> message = Gio.DBusMessage.new_from_blob(blob, Gio.DBusCapabilityFlags.NONE)
>>> print(message.get_body().print_(True))
(handle -131595768, <@av []>)
> Specifically, it is crashing on parsing the 'gcmp' entry in the dict.
I don't see a "gcmp" entry in a dict here. What I see is this (in GLib's
GVariant text syntax, since D-Bus doesn't have its own text format):
< # variant
@a{sv} { # dict { string: variant }
"Pairwise": < # first entry: variant...
@as [ # ... containing array of strings ...
"ccmp-256",
"gcmp-256",
"ccmp",
"gcmp",
"tkip"
]
>,
... more dict entries that you did not quote ...
}
>
> I was under the assumption that the next dict entry should start on
> the 8-byte boundary
It does, but that's right at the end of the part you quoted.
> but it appears to be starting on a 4-byte
> boundary with the length of the string
That's because at the time you see "gcmp", you're still inside the array
of strings ("as") that is part of the value (variant) of the first
dict-entry in the dict. You haven't reached the end of the dict-entry yet,
so the padding before the next dict-entry also hasn't happened yet.
You can tell this because the array started with a byte count,
0x41 = 65 bytes. Everything up to that byte count is part of the
array, not part of some larger data structure.
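As a small worked check of that arithmetic, here is a sketch using the offsets from the quoted hex dump. It assumes those dump offsets are the ones alignment is measured from, and the variable names are only for illustration.

```python
def align(offset, boundary):
    """Round offset up to the next multiple of boundary."""
    return (offset + boundary - 1) // boundary * boundary


array_len_offset = 0x64      # where the length 41 00 00 00 sits in the dump
array_len = 0x41             # 65 bytes of array contents

array_start = array_len_offset + 4          # first element follows the length
array_end = array_start + array_len         # first byte after the array
gcmp_length_offset = 0x94                   # the 04 00 00 00 before "gcmp"

# The length of "gcmp" lies inside the array's extent, so it is an element
# of the "as" array and only needs 4-byte alignment.
inside_array = array_start <= gcmp_length_offset < array_end

# The next dict entry starts at the next 8-byte boundary after the array.
next_dict_entry = align(array_end, 8)

print(hex(array_start), hex(array_end), inside_array, hex(next_dict_entry))
```

This gives an array extent of 0x68 up to (not including) 0xa9, with the next dict entry at 0xb0, which is just past the end of the quoted part.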
smcv