[PATCH] No longer need to escape '+' in a D-Bus address
Michael Witten
mfwitten at MIT.EDU
Tue Sep 9 09:25:11 PDT 2008
Hello again. I'm back to keep this thread alive.
On 29 Aug 2008, at 8:17 AM, Havoc Pennington wrote:
> Hi,
>
> On Fri, Aug 29, 2008 at 3:14 AM, Michael Witten <mfwitten at mit.edu>
> wrote:
>> The alternative is to loosen the specification (for at least file
>> hierarchy "addresses"), which seems acceptable to me.
>
> If you loosened it to never require escaping, or proved that "+" was
> the only possible escape-requiring char in this context, then it would
> be slightly easier to fix this OS X bug. But every time someone has a
> filename with a char that requires escaping, loosening the list of
> escape chars is not going to be the fix every time.
Peter O'Gorman on the darwin-userlevel()lists.apple.com mailing list
pointed me to:
http://launchd.macosforge.org/trac/browser/trunk/launchd/src/launchd_core_logic.c
which shows where $TMPDIR is getting set:
    char tmpdirpath[PATH_MAX];
    ...
    r = confstr(_CS_DARWIN_USER_TEMP_DIR, tmpdirpath, sizeof(tmpdirpath));
    if (likely(r > 0 && r < sizeof(tmpdirpath))) {
        setenv("TMPDIR", tmpdirpath, 0);
    }
The real guts of the code are in Mac OS X's libc's confstr implementation
(and the related helpers). I found this by downloading:
http://www.opensource.apple.com/darwinsource/tarballs/apsl/Libc-498.1.1.tar.gz
Inside gen/confstr.c, we have:
    docopy:
        if (len != 0 && buf != NULL)
            strlcpy(buf, p, len);
        return (strlen(p) + 1);
    ...
    case _CS_DARWIN_USER_DIR:
        if ((p = alloca(PATH_MAX)) == NULL) {
            errno = ENOMEM;
            return (CONFSTR_ERR_RET);
        }
        if (_dirhelper(DIRHELPER_USER_LOCAL, p, PATH_MAX) == NULL)
            return (CONFSTR_ERR_RET);
        goto docopy;
The function _dirhelper() is defined in darwin/_dirhelper.c; it creates
the necessary directory if it doesn't already exist and then yields the
respective path.
In particular, _dirhelper() calls __user_local_dirname() (in
_dirhelper.c), which produces a uuid based on the user's uid and then
encodes this into a path-worthy string using encode_uuid_uid() (in
_dirhelper.c), which performs a possibly-non-standard encoding
algorithm that maps to the following set of characters:
"+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
Hence, '+' is the only escape-requiring character in this context.
Of course, this could change at any time, but I doubt anyone would
introduce paths that could make shells unhappy (as per the discussion
later).
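As a rough sketch of the check described above (this is not code from
dbus or Libc), the following C walks an alphabet and collects the
bytes that would need %XX escaping in a D-Bus address. It assumes the
specification's optionally-escaped set is '-', digits, ASCII letters,
'_', '/', '.', and '\'; the function names is_optionally_escaped() and
bytes_needing_escape() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the optionally-escaped set is '-', '0'-'9', 'A'-'Z',
 * 'a'-'z', '_', '/', '.', and '\'. Range tests assume ASCII. */
static int is_optionally_escaped(unsigned char c)
{
    if (c >= '0' && c <= '9') return 1;
    if (c >= 'A' && c <= 'Z') return 1;
    if (c >= 'a' && c <= 'z') return 1;
    /* strchr() would also match the terminating NUL, so exclude it. */
    return c != '\0' && strchr("-_/.\\", c) != NULL;
}

/* Copies into `out` every byte of `alphabet` that is outside the
 * optionally-escaped set and returns how many were found.
 * `out` must have room for strlen(alphabet) + 1 bytes. */
static size_t bytes_needing_escape(const char *alphabet, char *out)
{
    size_t n = 0;

    assert(alphabet != NULL && out != NULL);
    for (const char *p = alphabet; *p != '\0'; p++) {
        if (!is_optionally_escaped((unsigned char)*p))
            out[n++] = *p;
    }
    out[n] = '\0';
    return n;
}
```

Passing the 64-character set quoted above to bytes_needing_escape()
should leave only "+" in the output buffer, under the stated
assumption about the specification's set.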
> It doesn't follow then, from my perspective, that whenever someone has
> an escape-requiring char we stop requiring its escaping. Changing the
> escaped-char list needs a little more justification.
I agree. However, perhaps file-hierarchy 'addresses' could follow a
more forgiving specification.
>> It's really just a matter of deciding who has to do the grunt work---
>
> The grunt work in this case, assuming "+" is the only escape-requiring
> char, is to add 1 line "sed -e" that replaces "+" with "%2B", so I
> vote for the OS X port maintainer.
It doesn't really matter in the end, except that now there are all of
these pesky port eccentricities.
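For comparison with the one-line sed fix, here is a minimal C sketch
of the more general escaping that it special-cases: every byte outside
the optionally-escaped set becomes %XX. The set used here ('-',
digits, ASCII letters, '_', '/', '.', '\') is an assumption about the
specification, and escape_address_value() is a hypothetical name, not
a libdbus function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the optionally-escaped set is '-', '0'-'9', 'A'-'Z',
 * 'a'-'z', '_', '/', '.', and '\'. Range tests assume ASCII. */
static int is_optionally_escaped(unsigned char c)
{
    if (c >= '0' && c <= '9') return 1;
    if (c >= 'A' && c <= 'Z') return 1;
    if (c >= 'a' && c <= 'z') return 1;
    return c != '\0' && strchr("-_/.\\", c) != NULL;
}

/* Writes the escaped form of `in` into `out`, which has room for
 * `cap` bytes including the terminating NUL.
 * Returns 0 on success, -1 if `out` is too small. */
static int escape_address_value(const char *in, char *out, size_t cap)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    assert(in != NULL && out != NULL);
    for (const unsigned char *p = (const unsigned char *)in; *p != '\0'; p++) {
        if (is_optionally_escaped(*p)) {
            if (n + 1 >= cap) return -1;   /* 1 byte plus NUL */
            out[n++] = (char)*p;
        } else {
            if (n + 3 >= cap) return -1;   /* 3 bytes plus NUL */
            out[n++] = '%';
            out[n++] = hex[*p >> 4];
            out[n++] = hex[*p & 0x0F];
        }
    }
    if (n >= cap) return -1;               /* no room for NUL */
    out[n] = '\0';
    return 0;
}
```

With a made-up path such as "/tmp/a+b", this yields "/tmp/a%2Bb",
which is what the sed replacement would produce for that input.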
>> Frankly, enforcing (arbitrarily) strict input is never a good idea,
>> and I suggest making dbus more forgiving on a wide range of
>> characters (for instance, all printable characters).
>
> Then people would have to do grunt work in quite a few cases to
> *escape* rather than unescape dbus addresses, to prevent them from
> causing a problem in shell command lines.
>
> The point of requiring escaping is to allow dbus addresses to be used
> without shell-escaping them in contexts such as setting
> DBUS_SESSION_BUS_ADDRESS on the command line.
It seems like this is mainly a problem for tools that produce shell
code on the fly (such as dbus-launch --sh-syntax); wouldn't it be
more useful to have dbus-launch do the necessary escaping?
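To illustrate what "the necessary escaping" could look like, here is
a small C sketch of POSIX-shell single-quoting: the value is wrapped
in single quotes and each embedded single quote is rewritten as '\''.
This is not how dbus-launch works today; shell_single_quote() is a
hypothetical helper shown only to make the idea concrete.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes `in` wrapped in single quotes into `out`, which has room
 * for `cap` bytes including the terminating NUL. Each embedded
 * single quote becomes the four bytes '\'' (close quote, escaped
 * quote, reopen quote).
 * Returns 0 on success, -1 if `out` is too small. */
static int shell_single_quote(const char *in, char *out, size_t cap)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    /* Minimum size: the input, two quotes, and the NUL. */
    if (strlen(in) + 3 > cap) return -1;

    out[n++] = '\'';
    for (; *in != '\0'; in++) {
        if (*in == '\'') {
            if (n + 4 >= cap) return -1;
            out[n++] = '\'';
            out[n++] = '\\';
            out[n++] = '\'';
            out[n++] = '\'';
        } else {
            if (n + 1 >= cap) return -1;
            out[n++] = *in;
        }
    }
    if (n + 1 >= cap) return -1;   /* closing quote plus NUL */
    out[n++] = '\'';
    out[n] = '\0';
    return 0;
}
```

A tool emitting shell code could then print the quoted value after
"DBUS_SESSION_BUS_ADDRESS=" without the address format itself having
to avoid shell metacharacters.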
In fact, having the special case of the shell influence the spec
so intimately seems even grosser than now making allowances for at
least file-hierarchy paths.
If the specification has been written so as to make shell
scripting easy, then why not write the specification so
as to make file-hierarchy paths easy?
I suppose I am proposing to *extend* the specification,
in the sense that it would still be backward-compatible.
> The point of not requiring *all* chars to be escaped is that it makes
> the address mostly human-readable and much shorter.
Currently, though, perfectly good file-hierarchy paths are unnecessarily
morphed into annoyances that entail discussing design decisions,
sifting through abstruse code, and introducing special port-specific
code.
Sincerely,
Michael Witten