Arbitrary string to DBus path
simon.mcvittie at collabora.co.uk
Mon Nov 18 08:23:46 PST 2013
On 15/11/13 17:45, Lennart Poettering wrote:
> On Fri, 15.11.13 11:21, Ted Gould (ted at gould.cx) wrote:
>> One problem we've had is taking arbitrary strings (usually application
>> names) and putting them into object paths on DBus. libnih has a small
>> function that does this by taking any character that isn't an ASCII
>> character or digit and putting its value as ASCII digits after an
> We use a similar algorithm in systemd too:
This is either a copy of Telepathy's tp_escape_as_identifier(), or a
remarkably similar reinvention.
This algorithm is optimal for strings that are already very close to
being valid object path components (or equivalently, C identifiers,
hence the name we used). It's highly inefficient (3 times as long as
UTF-8) for non-ASCII text: picking a random example, 日本 (Japan in
Japanese, according to Wikipedia) comes out as _e6_97_a5_e6_9c_ac, which
is neither short nor human-readable.
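
As a rough illustration, escaping of this kind might be sketched in Python
as below. This is not the Telepathy, libnih or systemd code: the function
name is made up for the example, and the handling of the empty string and
of a leading digit are assumptions chosen so that the result is also a
valid C identifier.

```python
def escape_as_identifier(text: str) -> str:
    """Escape text so it only contains [A-Za-z0-9_] (illustrative sketch)."""
    data = text.encode("utf-8")
    if not data:
        # Assumption: map the empty string to a lone underscore so the
        # result is never an empty path component.
        return "_"
    out = []
    for i, byte in enumerate(data):
        ch = chr(byte)
        is_alpha = "a" <= ch <= "z" or "A" <= ch <= "Z"
        is_digit = "0" <= ch <= "9"
        # Assumption: a leading digit is escaped too, as a C identifier
        # would require.
        if is_alpha or (is_digit and i > 0):
            out.append(ch)
        else:
            # Every other byte, including '_' itself, becomes an underscore
            # followed by two lower-case hex digits.
            out.append("_%02x" % byte)
    return "".join(out)


print(escape_as_identifier("日本"))  # _e6_97_a5_e6_9c_ac
```

Escaping the underscore itself is what keeps the mapping unambiguous: a
literal "_" in the input can never be confused with the start of an escape.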
If you expect the "arbitrary string" to be basically ASCII, great, use
that algorithm; if you expect it to be arbitrary text, Base32 or even
raw hex might be better (D-Bus object path components only have 63
characters available, [A-Za-z0-9_], so Base64 is out).
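
To make the size comparison concrete, here is a small Python sketch of the
two alternatives. The helper names are hypothetical, and stripping the "="
padding from Base32 is my assumption, made because "=" is not a valid
character in an object path component.

```python
import base64


def hex_component(text: str) -> str:
    """Raw hex of the UTF-8 bytes: always 2 characters per byte."""
    return text.encode("utf-8").hex()


def base32_component(text: str) -> str:
    """Base32 of the UTF-8 bytes: roughly 1.6 characters per byte."""
    encoded = base64.b32encode(text.encode("utf-8")).decode("ascii")
    # Assumption: drop the '=' padding, which D-Bus would reject.
    return encoded.rstrip("=")


sample = "日本"  # 6 bytes in UTF-8
print(len(hex_component(sample)))     # 12
print(len(base32_component(sample)))  # 10
```

Both are shorter than the 18 characters the underscore escaping produces
for the same input, at the cost of making plain ASCII names unreadable. An
empty input yields an empty string in both cases, which is not a valid path
component, so a real implementation would need to handle that separately.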
Many Telepathy components use that algorithm internally, but we've
avoided making it part of our D-Bus API. It's reversible, but we only
use that fact to ensure uniqueness, and we never parse strings in that form.
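
To illustrate why reversibility gives uniqueness: if a decoder exists that
recovers the original from every escaped string, two different inputs can
never escape to the same output. A minimal Python sketch of such a decoder
follows, purely to demonstrate the property. The function name is
hypothetical, it assumes the input really was produced by the "_xx"
escaping described above, and treating a lone "_" as the empty string is
an assumption.

```python
import re


def unescape_identifier(escaped: str) -> str:
    """Invert the underscore-plus-two-hex-digits escaping (sketch only)."""
    if escaped == "_":
        # Assumption: a lone underscore stands for the empty string.
        return ""
    raw = re.sub(
        rb"_([0-9a-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        escaped.encode("ascii"),
    )
    return raw.decode("utf-8")


print(unescape_identifier("_e6_97_a5_e6_9c_ac"))  # 日本
```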
Anything that does expect to parse strings in that form, and expects
other things to interoperate with its D-Bus API, needs to document the
algorithm as part of its own D-Bus API specification (which is one
reason why it might be useful to have in the D-Bus spec - so other
specifications can just point to it). However, I'd be inclined to keep
it as a uniqueness-generating implementation detail and never parse it,
as Telepathy does.