[Xcb] [ANNOUNCE] xcb-util 0.3.9

Thu Jun 7 05:22:41 PDT 2012

On Thu, Jun 07, 2012 at 01:22:47AM -0700, Jeremy Huddleston wrote:
> On Jun 6, 2012, at 15:13, Josh Triplett <josh at joshtriplett.org> wrote:
> > On Wed, Jun 06, 2012 at 02:10:47PM -0700, Jeremy Huddleston wrote:
> >> On Jun 6, 2012, at 4:04 AM, Josh Triplett <josh at joshtriplett.org> wrote:
> >>> On Tue, Jun 05, 2012 at 10:03:44PM -0700, Jeremy Huddleston wrote:
> >>>> On Jun 5, 2012, at 6:35 PM, Josh Triplett <josh at joshtriplett.org> wrote:
> >>>>> I agree with your statement that from a functional standpoint this holds
> >>>>> true: the linker doesn't seem to enforce the minor version rule, so you
> >>>>> can build against a newer library and run with an older one, or vice
> >>>>> versa, as long as the major version matches.  The linker will complain
> >>>>> if you use a symbol that it can't resolve, though.
> >>>> 
> >>>> As it should (well unless the symbol is weak and can be checked for at
> >>>> runtime).
> >>> 
> >>> True, though for most symbols that doesn't make sense, since you'd have
> >>> to write things like "if (!xcb_useful_function) aieee();". :)
> >> 
> >> Well, if aieee() is really needed, then you wouldn't check for it, you'd
> >> just use it and bail out with the dynamic linker complaining that it
> >> couldn't resolve.
> > 
> > You don't necessarily have that option with a weak symbol.  Unless you
> > mean that the program can, on a symbol-by-symbol basis, choose whether
> > to use the symbol as weak or non-weak?  That seems feasible, but not
> > compatible with how most library header files normally define function
> > prototypes.
> 
> Well, it's how we do it on OS X.  For example:
> 
> ssize_t getdelim(char ** __restrict, size_t * __restrict, int, FILE * __restrict)
> __OSX_AVAILABLE_STARTING(__MAC_10_7, __IPHONE_4_3);
> 
> getdelim will be weak if the developer sets OS X 10.6 as a deployment target.
> If 10.7 or later is used as a deployment target, the symbol will not be weak.
> __OSX_AVAILABLE_STARTING is essentially a macro that evaluates to nothing or
> __attribute__((weak_import)).

I see.  So if you build with the headers and libraries of 10.7, and you
want to require 10.7, then you get a non-weak symbol and can use it
unconditionally, while if you build with 10.7 but say you want
compatibility with 10.6 then you get a weak symbol?
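
To check my understanding, here is a rough sketch of how such a
conditionally-weak declaration could be built.  Everything named mylib or
MYLIB_* is hypothetical, this is not Apple's actual header, and it assumes
GCC or Clang attribute syntax on a platform where an unresolved weak symbol
has a NULL address:

```c
/* Hypothetical version numbers for an imaginary library "mylib". */
#define MYLIB_1_0 100
#define MYLIB_1_1 110

/* Oldest library version the program must run against; a build system
 * would normally set this with -DMYLIB_MIN_REQUIRED=... */
#ifndef MYLIB_MIN_REQUIRED
#define MYLIB_MIN_REQUIRED MYLIB_1_0
#endif

/* One helper macro per release: empty if every supported version has the
 * symbol, a weak attribute otherwise. */
#define MYLIB_AVAILABILITY_MYLIB_1_0
#if MYLIB_MIN_REQUIRED >= MYLIB_1_1
#define MYLIB_AVAILABILITY_MYLIB_1_1
#else
#define MYLIB_AVAILABILITY_MYLIB_1_1 __attribute__((weak))
#endif

/* Token pasting keeps the version name unexpanded, so it selects one of
 * the helper macros above. */
#define MYLIB_AVAILABLE_STARTING(version) MYLIB_AVAILABILITY_##version

/* As it would appear in the (hypothetical) header: added in mylib 1.1. */
int mylib_new_call(int value) MYLIB_AVAILABLE_STARTING(MYLIB_1_1);

/* With the 1.0 target above, the symbol is weak, so its address can be
 * tested before use. */
int call_or_fallback(int value)
{
    if (mylib_new_call)
        return mylib_new_call(value);
    return -1; /* older library: feature absent */
}
```

Building with -DMYLIB_MIN_REQUIRED=110 would drop the attribute, and a
missing symbol would then become an ordinary link or load failure.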

That seems like an eminently sensible approach.  The alternative, which
I've seen in the occasional Linux interface header that implements such
versioning, amounts to not defining the symbol at all if you say you
want compatibility with versions that don't have it.

Several of my responses below will amount to "this makes much more sense
now that I understand the standard idiom you had in mind". :)

> >> If you can avoid using it, you would do something like:
> >> 
> >> if (strlcpy) {
> >>   strlcpy(...);
> >> } else {
> >>   strncpy(...);
> >>   ...;
> >> }
> > 
> > In that case, I'd suggest just using strncpy unconditionally, or writing
> > your own version of strlcpy with a compatible interface and linking it
> > in if libc doesn't have one.
> 
> strlcpy is a contrived example, and yes, in that case you probably want to
> just use strncpy instead.
> 
> if (optimized_new_codepath) {
>    optimized_new_codepath(...);
> } else {
>    less_optimal_codepath(...);
> }

Sure, that makes sense.  I'd prefer to wrap it up in a function
codepath(), though, so I don't have to open-code it.  And ideally I'd
prefer to have codepath() predefined for me rather than having to write
it even the once.
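
Something along these lines, as a sketch only: the function names come from
the example above, the bodies are made up for illustration, and the weak
attribute assumes GCC or Clang on a platform where an unresolved weak symbol
has a NULL address:

```c
/* Hypothetical library function. It is declared weak, so its address is
 * NULL when the installed library does not provide it. */
int optimized_new_codepath(int n) __attribute__((weak));

/* Fallback that works everywhere; summing 1..n stands in for real work. */
static int less_optimal_codepath(int n)
{
    int sum = 0;
    for (int i = 1; i <= n; i++)
        sum += i;
    return sum;
}

/* The one place that knows about the runtime check. */
int codepath(int n)
{
    if (optimized_new_codepath)
        return optimized_new_codepath(n);
    return less_optimal_codepath(n);
}
```

Callers then just write codepath(n) and never see the NULL test.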

> >  I tend to subscribe to the Linux kernel's
> > style of never including #ifdef in .c files, and I consider code like
> > the above gross for similar reasons; it strongly suggests the need for
> > an abstraction layer to not have to deal with that at each call site.
> 
> Well then your abstraction layer will need to do that exact same logic.

True, but I don't have to write out that logic in every place I want to
call a function. :)
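
For the strlcpy case, the compile-time flavor of that abstraction might look
like the sketch below.  The names compat_strlcpy and portable_strlcpy are
made up for illustration; HAVE_STRLCPY is the macro that an autoconf
AC_CHECK_FUNCS([strlcpy]) check would define:

```c
#include <stddef.h>
#include <string.h>

/* Fallback with strlcpy's interface: copy at most size-1 bytes, always
 * NUL-terminate when size > 0, and return strlen(src) so the caller can
 * detect truncation. */
size_t compat_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size != 0) {
        size_t n = len < size - 1 ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/* The only conditional lives here, not at the call sites. */
#ifdef HAVE_STRLCPY
#define portable_strlcpy strlcpy
#else
#define portable_strlcpy compat_strlcpy
#endif
```

Every call site then uses portable_strlcpy(dst, src, sizeof dst) with no
#ifdef in sight.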

> >>> Better to use symbol versioning or similar to let the dynamic linker
> >>> tell you at the start of your program that a symbol doesn't exist.  
> >> 
> >> Why?  If you can do something in the case that it doesn't exist, that
> >> should be an option.
> > 
> > That falls under the case I mentioned below ("Or, if you really have
> > written your program so that you can cope with the absence of some
> > functionality,").  That doesn't represent the common case, though.
> 
> It's a very common case for OS X developers that want to support new
> technologies on newer OS versions but still want to run on older OS
> versions.  For example, when Snow Leopard first came out, developers
> were able to use this functionality to use their new dispatch codepath
> if GCD was available and pthreads if it was not.

Sorry, I should have said that it doesn't represent the common case *on
Linux*.  I do agree that it potentially represents the common case on
platforms with predominantly binary-only software.  I'd also say,
though, that the solution you described above with conditionally-weak
symbols makes much more sense than the notion of always-weak symbols
that I thought you had suggested.

> >>> Or,
> >>> if you really have written your program so that you can cope with the
> >>> absence of some functionality, consider either using dlopen/dlsym to
> >>> make that explicit or otherwise having a way to easily verify the
> >>> functionality you need without having to test every symbol for NULLness.
> >> 
> >> No, that's a horrible solution.  The code snippet above (for strlcpy)
> >> makes it easily accessible for developers.  Doing something with dlsym
> >> is ugly in comparison (and IMO would just cause developers to NOT use
> >> the new functionality):
> >> 
> >> size_t (*strlcpy_maybe)(char *, const char *, size_t);
> >> strlcpy_maybe = dlsym(RTLD_DEFAULT, "strlcpy");
> >> if (strlcpy_maybe) {
> >>   strlcpy_maybe(...);
> >> } else {
> >>   strncpy(...);
> >>   ...;
> >> }
> > 
> > People expect that dlsym might fail.  For the most part, people *don't*
> > expect that a function defined in a header file might point to NULL;
> > they'll just call it, and segfault when it points to NULL.  Plus, if you
> > use dlopen/dlsym, you can cope with the complete absence of a library on
> > the system.
> 
> Well then if they want to use a function unconditionally that is first
> provided in libA.1.2, then they can't run on libA.1.1.  If they want to use
> that functionality *AND* be able to run on libA.1.1, they need to do
> something, and developers seem to have been fine with that case for the past
> decade or so...

And now that I understand the standard idiom for this on OS X, it makes
a lot more sense. :)

> > If you define an entirely new interface, you can define it using weak
> > symbols, but programmers will still trip over it unless you provide a
> > convenient way to say "no, really, I don't want to do runtime
> > detection, I just want to refuse to run if the functionality I expect
> > doesn't exist".
> 
> Yes, that is why we use a macro to determine if it's weak or not.  If the
> developer says, no I want to use this and can't run without it, they set
> the deployment target to a newer OS version.  If they want to support OS
> versions without the functionality, they set it appropriately, and the
> symbols are weak.

Makes perfect sense.

> > Among other things, I'd rather have an interface like the syscall
> > interface, where calling a non-existent syscall *works* and produces
> > ENOSYS.  Then, code that always says "die if the syscall fails" will
> > die, and code that uses the syscall for optional functionality will
> > gracefully fall back.  More importantly, I then don't need a conditional
> > at every callsite.
> 
> Well you should be checking the return value and acting appropriately.  In
> your case, you still need to do:
> 
> if (syscall(...) == ENOSYS) {
>     do_fallback();
> }

Except that I can fold that in with all the other errno handling, which
will tend to make it a bit more natural.  I frequently won't need to
distinguish between ENOSYS and any other error.  (Also, presumably you
meant if (syscall(...) < 0 && errno == ENOSYS) { ... }.)
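
As a sketch of what I mean by folding it in: fancy_new_call below is a
hypothetical stand-in that fails the way a non-existent syscall would
(returns -1 and sets errno to ENOSYS), and plain_old_call is an equally
hypothetical fallback:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a call the running system does not implement:
 * like an unknown syscall, it returns -1 and sets errno to ENOSYS. */
static long fancy_new_call(int arg)
{
    (void)arg;
    errno = ENOSYS;
    return -1;
}

/* Hypothetical older interface that is always available. */
static long plain_old_call(int arg)
{
    return arg * 2L;
}

/* Optional use of the new call: any failure, ENOSYS included, takes the
 * same fallback path, so no separate "does it exist" test is needed. */
long compute(int arg)
{
    long result = fancy_new_call(arg);

    if (result < 0) {
        fprintf(stderr, "fancy_new_call: %s; falling back\n",
                strerror(errno));
        result = plain_old_call(arg);
    }
    return result;
}
```

Code that cannot live without the new call would exit in that error branch
instead of falling back, with the call site otherwise unchanged.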

> >>>>> In particular, the minor version serves as a hint to the programmer that
> >>>>> if they link against libABC.so.1.1, they might or might not successfully
> >>>>> run against libABC.so.1.0, depending on what symbols they used.
> >>>> 
> >>>> IMO, that should be annotated in header files in a way that allows those
> >>>> symbols to be weak linked and checked for at runtime (and thus go down an
> >>>> alternative codepath if unavailable).
> >>> 
> >>> Not unless that gets wrapped in some kind of interface that avoids the
> >>> need to check all used symbols against NULL before using them; I'd
> >>> prefer to make that the dynamic linker's job.
> >> 
> >> Yes, the dynamic linker will bail on you if you try to actually *use* the
> >> symbol, but you still need to check it.  I'm not sure what kind of interface
> >> you want.  This seems rather straightforward to me:
> >> 
> >> #if I_WANT_TO_SUPPORT_OLD_LIBRARY_VERSIONS_WITHOUT_STRLCPY
> >> if (strlcpy) {
> >>   strlcpy(...);
> >> } else
> >> #endif
> >> {
> >>   strncpy(...);
> >>   ...;
> >> }
> > 
> > No, I'd rather the interface looked like this:
> > 
> > strlcpy(...);
> > 
> > Or, if I don't want to count on that, and I don't want to provide a
> > compatible strlcpy replacement via autoconf or similar:
> > 
> > strncpy(...);
> > extra_pile_of_ugly(...);
> > 
> > That applies to the case where I need something with strlcpy's
> > functionality unconditionally, and only the implementation varies.  In
> > the case of something like XInput 2 support, a library that wants to use
> > XInput 2 iff available could use dlsym to use it conditionally (in which
> > case they work even with the library unavailable).
> 
> dlsym is a horrible interface for clients of a library.  They should be
> able to just use the real symbols, and the dlsym-cruft should be "done" by
> the dynamic linker.

Given the idiom you outlined above, I agree entirely.

> > Such a library could
> > also use weak symbols, though that seems both more error-prone and more
> > difficult to specify in a header file (you'd need either separate
> > headers for weak and non-weak usage or some kind of #define WEAK_XI2
> > before including the header file).
> 
> You don't need two header files, you just need something like the OS X
> availability macros which handle it based on the version the symbol
> entered in, and the version that we want to support running on.

*nod*.  Seems no more complex than the existing pile of macros in glibc
to determine available functionality based on things like _BSD_SOURCE
and _GNU_SOURCE.

> > Personally, I'd rather have a wrapper interface that looks like
> > unconditional function calls with error handling, rather than function
> > calls conditional on the function pointer itself.  Any approach that
> > makes every program and library author write their *own* wrappers seems
> > like a problem; why force everyone to write duplicate code for the
> > common case?
> 
> I would argue that it's not the common case.  The common case is probably
> supporting the most recent version of the library and not caring about
> running on older versions.  Also, the error-case is not always trivial,
> and I prefer to not have "code" in header files.

Agreed.

> > In any case, this seems like a far tangent from the issue of removing
> > symbols from a library. :)
> 
> Yeah, but I often get on tangents.  You often get on tangents... I guess
> we're rather obtuse individuals, although Nancy sometimes calls me
> acute...
> 
> Ok, stopping now...

I find myself wondering what a secant in a conversation would look like.
Attempting to inject a tenuously related notion into a conversation in
an attempt to reach a more clearly related distant point via a more
direct route? :)

> >> Symbol versioning is very useful for dealing with the flat namespace
> >> problem.  For example, consider an application that links against libA
> >> and libB.  libA links against libC.1 and libB links against libC.2. 
> >> Both libC.1 and libC.2 provide different versions of funC().  In a flat
> >> namespace without versioning, this situation would not work. funC@LIBC
> >> would collide.  ELF solves this by versioning the symbols in the global
> >> symbol list.  On OS X, we use a 2-level namespace, so versioning isn't
> >> necessary.
> > 
> > Interesting.  So, libA will reference "funC in libC.1" and libB will
> > reference "funC in libC.2", using a namespacing mechanism orthogonal to
> > variants?
> 
> Yes.  We support flat namespace, but it's not default, and it's not
> recommended.
> 
> When resolving a symbol in libB, it will only see symbols from its
> dependencies.  In fact, the Mach-O load commands essentially specify the
> full path to the dylib that provides the symbol, so even if I have
> /usr/lib/libB.1.dylib and /usr/local/lib/libB.1.dylib, they can both be
> used at the same time by different libraries.  There can be problems if
> clients assume that they can pass objects from one into the other (eg, a
> client of libB passing an objBPtr managed by /usr/lib/libB.1.dylib to
> another client of libB which was using /usr/local/lib/libB.1.dylib), but
> that isn't something hit too much in practice.

It does happen sometimes in practice on Linux; symbol versioning allows
two versions of a library to coexist in the same process without major
insanity, but various minor insanities still tend to crop up...

- Josh Triplett