Dict parsing question

Simon McVittie smcv at collabora.com
Tue Jan 2 11:47:30 UTC 2024


On Mon, 18 Dec 2023 at 20:05:58 +1300, Lawrence D'Oliveiro wrote:
> > The array is *4*-byte aligned, the same as any other array (because it
> > starts with a 4-byte length). After the length, the dict entries
> > inside it are 8-byte aligned, the same as structs.
> 
> I had to look at that twice, three times before I could be sure I
> understood it right. Is that really the case, that elements of a dict
> structure have a _greater_ alignment requirement than the structure
> itself? How is that supposed to work, exactly?

This is not just dicts: the elements of an array of (u)int64 or double
have the same property that the 8-byte alignment of each element is
greater than the 4-byte alignment of the first byte of the array
length. This just means that you might need padding between the
array length and the actual content (the first element). The D-Bus
Specification[0] explains this in detail.

[0] https://dbus.freedesktop.org/doc/dbus-specification.html#id-1.4.5

This means that the array will sometimes pack more efficiently than you
might have expected. For example, a message body with signature 'uat'
and value "uint32 0xaaaaaaaa, [uint64 0xbbbbbbbbcccccccc]" (shown below
in big-endian) fits neatly into the minimal 16 bytes required, with no
padding, even though you might have imagined that padding to an 8-byte
boundary would be required between the 0xaaaaaaaa and the array length
0x00000008:

    aa aa aa aa                # uint32 0xaaaaaaaa                   } u
                               # no padding needed for array length
                00 00 00 08    # array length = 0x8 bytes            }
                               # no padding needed for type t        } at
    bb bb bb bb cc cc cc cc    # first element = 0xbbbbbbbbcccccccc  }
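
As a rough illustration of the layout above, here is a minimal Python
sketch that marshals a 'uat' body in big-endian. The helper names
(pad_to, marshal_u_at) are made up for this example and are not part of
any D-Bus library; it only handles this one signature.

```python
import struct


def pad_to(buf, alignment):
    """Append zero bytes until len(buf) is a multiple of alignment."""
    buf.extend(b"\x00" * (-len(buf) % alignment))


def marshal_u_at(u, elements):
    """Marshal a big-endian body with signature 'uat' (hypothetical helper)."""
    buf = bytearray()
    pad_to(buf, 4)                          # uint32 is 4-byte aligned
    buf += struct.pack(">I", u)
    pad_to(buf, 4)                          # array length is 4-byte aligned
    buf += struct.pack(">I", 8 * len(elements))
    pad_to(buf, 8)                          # uint64 elements are 8-byte aligned
    for element in elements:
        buf += struct.pack(">Q", element)
    return bytes(buf)


body = marshal_u_at(0xAAAAAAAA, [0xBBBBBBBBCCCCCCCC])
print(body.hex(" ", 4))  # aaaaaaaa 00000008 bbbbbbbb cccccccc
```

Both pad_to() calls around the array length turn out to add nothing
here, which is why the whole body fits in 16 bytes.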

If we want array elements to be aligned "naturally", which was one of
the axioms for how the D-Bus wire protocol was designed, then it was
always going to be the case that padding is sometimes required between
the array length and the elements. For example, consider a message body
with signature 'at' and value "[uint64 0xbbbbbbbbcccccccc]". The array
happens to be 8-byte-aligned (because in this example it starts at offset
0), but if we want the 0xbbbbbbbbcccccccc to be naturally-aligned,
then we have no choice but to add padding anyway:

    00 00 00 08                # array length = 0x8 bytes
                00 00 00 00    # padding to 8-byte alignment of type t
    bb bb bb bb cc cc cc cc    # first element = 0xbbbbbbbbcccccccc
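
To make the dependence on the starting offset concrete, the following
Python sketch computes how much padding goes before and after the array
length. The function names are invented for illustration; only the
4-byte alignment of the length and the alignment of the element type
come from the protocol.

```python
def padding(offset, alignment):
    """Number of zero bytes needed to round offset up to alignment."""
    return -offset % alignment


def array_padding(start, element_alignment):
    """Return (padding before the length, padding after the length).

    start is the offset at which the previous field ended;
    element_alignment is 8 for uint64, int64, double, structs and
    dict entries.
    """
    before = padding(start, 4)              # the length is 4-byte aligned
    after_length = start + before + 4       # offset just past the length
    after = padding(after_length, element_alignment)
    return before, after


print(array_padding(0, 8))  # (0, 4): the 'at' example at offset 0
print(array_padding(4, 8))  # (0, 0): the 'uat' example, array at offset 4
```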

As I said in another reply, if we were designing D-Bus for the first
time today, it would be possible to argue that "all elements are
naturally-aligned" is not a useful axiom to have; but it was treated as an
axiom at the time the protocol was designed, and in 2024 we are about 20
years too late to be making design decisions that affect interoperability.
So the protocol is what it is, and there will not be a "v2" protocol
unless the benefits are sufficiently compelling to outweigh the cost of
implementing and maintaining version-negotiation and the "v2" protocol
in multiple places (libdbus, GDBus, sd-bus, others).

To write or read an array of 8-byte-aligned elements:

- start at some arbitrary offset into the message (wherever you happen
  to have finished writing or reading the previous field)
- write or skip 0-3 bytes of zeroes to reach a 4-byte boundary
- write or read 4 bytes of array length, *n* >= 0
- write or skip 0-7 bytes of zeroes to reach an 8-byte boundary
- the next *n* bytes are the elements
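
The steps above can be sketched for the reading direction roughly as
follows. This is only an illustration for a big-endian array of uint64
('at'); read_array_of_uint64 is a made-up name, and a real
implementation would also check that the skipped padding is all zeroes
and enforce the protocol's size limits.

```python
import struct


def align(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return offset + (-offset % alignment)


def read_array_of_uint64(data, offset):
    """Read a big-endian 'at' array; return (elements, offset after it)."""
    offset = align(offset, 4)               # skip 0-3 bytes of padding
    (n,) = struct.unpack_from(">I", data, offset)
    offset += 4
    offset = align(offset, 8)               # skip 0-7 bytes of padding
    end = offset + n                        # n counts element bytes only
    if n % 8 != 0 or end > len(data):
        raise ValueError("malformed array")
    elements = [struct.unpack_from(">Q", data, pos)[0]
                for pos in range(offset, end, 8)]
    return elements, end


uat_body = bytes.fromhex("aaaaaaaa00000008bbbbbbbbcccccccc")
at_body = bytes.fromhex("0000000800000000bbbbbbbbcccccccc")
print(read_array_of_uint64(uat_body, 4))    # array starts after the uint32
print(read_array_of_uint64(at_body, 0))
```

Both calls yield the same single element and end at offset 16, even
though only the second body contains padding.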

Variants have a similar setup: the variant starts with a signature,
which has a 1-byte alignment (it can start at any offset), but there is
usually padding required after the signature to reach the alignment of
the contained type.
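
As a sketch of the variant case, here is how a variant containing a
uint64 might be marshalled in Python. It assumes the signature encoding
described in the D-Bus Specification (a 1-byte length, the signature
bytes, then a terminating NUL); marshal_variant_uint64 is a hypothetical
helper, not an API of any D-Bus library.

```python
import struct


def marshal_variant_uint64(prefix, value):
    """Append a big-endian variant holding a uint64 to prefix."""
    buf = bytearray(prefix)
    signature = b"t"
    # Signature: 1-byte length, the signature itself, NUL terminator.
    # It has 1-byte alignment, so no padding is needed before it.
    buf += bytes([len(signature)]) + signature + b"\x00"
    # Pad to the 8-byte alignment of the contained uint64.
    buf.extend(b"\x00" * (-len(buf) % 8))
    buf += struct.pack(">Q", value)
    return bytes(buf)


at_offset_0 = marshal_variant_uint64(b"", 0xBBBBBBBBCCCCCCCC)
print(at_offset_0.hex(" "))
# 01 74 00 00 00 00 00 00 bb bb bb bb cc cc cc cc

at_offset_5 = marshal_variant_uint64(b"\x00" * 5, 0xBBBBBBBBCCCCCCCC)
print(len(at_offset_5))  # 16: the signature ends at offset 8, no padding
```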

    smcv

