[Xcb] [PATCH] Handle EAGAIN errno from poll(2) or select(2)

Josh Triplett josh at joshtriplett.org
Sat Aug 22 10:30:09 PDT 2015


On Sat, Aug 22, 2015 at 02:33:46AM -0700, Jeremy Huddleston Sequoia wrote:
> 
> > On Aug 20, 2015, at 09:21, Josh Triplett <josh at joshtriplett.org> wrote:
> > 
> > On Thu, Aug 20, 2015 at 12:18:41AM -0700, Jeremy Sequoia wrote:
> >> Yeah, I thought about sleeping before retrying in the EAGAIN case to
> >> avoid a possible busy loop.  I can do that if you prefer.
> >> 
> >> As I indicated in the commit message, there is no known fallout from
> >> the lack of EAGAIN handling.  There is no behavioral problem.  Indeed
> >> the only time someone should ever get back EAGAIN from poll or select
> >> on Darwin is under resource pressure, and it's likely the user would
> >> have bigger concerns than this at that point.
> >> 
> >> I just happened to notice this while tracing code to figure out why
> >> someone on stackoverflow was seeing recv() of the DISPLAY socket
> >> erring out with EAGAIN and then hanging.
> > 
> > If Darwin/OSX returns EAGAIN to a blocking call under *any*
> > circumstances, including "resource pressure", that's a serious bug.
> > Don't work around it in XCB or any other library, *especially* because
> > no other platform should behave the same way.  EAGAIN means "The socket
> > is marked nonblocking and the receive operation would block, or a
> > receive timeout had been set and the timeout expired before data was
> > received."  
> 
> No, that is not what EAGAIN means.  From SUSv4 at http://pubs.opengroup.org/onlinepubs/9699919799/functions/poll.html
> 
> """
> The poll() function shall fail if:
> 
> [EAGAIN]
> The allocation of internal data structures failed but a subsequent request may succeed.
> ...
> """

Ah, I see; I'd forgotten that the spec actually allows EAGAIN and
EWOULDBLOCK to be different.  EWOULDBLOCK definitely has the semantics I
had in mind and that the Linux manpage documents; from
http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_03

> Operation would block. An operation on a socket marked as non-blocking has encountered a situation such as no data available that otherwise would have caused the function to suspend execution.

But sure enough, for EAGAIN it says "Resource temporarily unavailable.
This is a temporary condition and later calls to the same routine may
complete normally."  So if an implementation ignores the spec language
saying "A conforming implementation may assign the same values for
[EWOULDBLOCK] and [EAGAIN]." and makes them separate, EAGAIN can indeed
mean the kernel is making its internal problems the application's
problems and requiring the application to try again.  Sigh.

> > A blocking call with no timeout should never return EAGAIN;
> > it should either block or return some fatal error.
> 
> Not according to UNIX.

s/EAGAIN/EWOULDBLOCK/ and the statement holds.

> > Libraries should *definitely* not have to include "wait a bit and try
> > again" logic; that's the kernel's job.

I stand by this statement, but evidently the spec allows this particular
bit of ridiculosity.  Personally, I'd argue that if the kernel has a
resource allocation failure, it should be returning -ENOMEM.

Could I talk you into adding a "EAGAIN != EWOULDBLOCK && " before
checking for EAGAIN?  That way, the "retry immediately on EAGAIN" logic
will only run on platforms where EAGAIN *doesn't* have the same meaning
as EWOULDBLOCK's "this is non-blocking and would block".  On platforms
that define those two identically, the extra logic will constant-fold
away.
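
To make the suggestion concrete, here is a rough sketch of the guarded
check. It is not the actual XCB code or the patch under review;
should_retry_poll() and poll_retrying() are hypothetical names used only
for illustration, and the loop assumes the "retry immediately" behavior
discussed above (no sleep between attempts).

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical helper (not from XCB): returns nonzero when a failed
 * poll() should simply be retried. */
static int should_retry_poll(int err)
{
    if (err == EINTR)
        return 1;
    /* Treat EAGAIN as a transient kernel allocation failure only where
     * it is distinct from EWOULDBLOCK.  Where the two share a value,
     * the left-hand comparison is a constant false and the compiler can
     * drop the whole branch. */
    if (EAGAIN != EWOULDBLOCK && err == EAGAIN)
        return 1;
    return 0;
}

/* Hypothetical wrapper: calls poll() again for as long as it fails with
 * an errno that should_retry_poll() accepts. */
static int poll_retrying(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    int ret;
    do {
        ret = poll(fds, nfds, timeout_ms);
    } while (ret < 0 && should_retry_poll(errno));
    return ret;
}
```

On a platform that defines the two constants identically (Linux does),
should_retry_poll(EAGAIN) returns 0 and the error is passed back to the
caller unchanged.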

(I also wonder whether every other application and library includes this
logic on Darwin, or if other applications and libraries end up just
exiting with an error in this case.)

- Josh Triplett

