[Xcb] [PATCH] Handle EAGAIN errno from poll(2) or select(2)

Josh Triplett josh at joshtriplett.org
Sat Aug 22 18:43:27 PDT 2015


On Sat, Aug 22, 2015 at 10:52:17AM -0700, Jeremy Huddleston Sequoia wrote:
> 
> > On Aug 22, 2015, at 10:30, Josh Triplett <josh at joshtriplett.org> wrote:
> > 
> > On Sat, Aug 22, 2015 at 02:33:46AM -0700, Jeremy Huddleston Sequoia wrote:
> >> 
> >>> On Aug 20, 2015, at 09:21, Josh Triplett <josh at joshtriplett.org> wrote:
> >>> 
> >>> On Thu, Aug 20, 2015 at 12:18:41AM -0700, Jeremy Sequoia wrote:
> >>>> Yeah, I thought about sleeping before retrying in the EAGAIN case to
> >>>> avoid a possible busy loop.  I can do that if you prefer.
> >>>> 
> >>>> As I indicated in the commit message, there is no known fallout from
> >>>> the lack of EAGAIN handling.  There is no behavioral problem.  Indeed
> >>>> the only time someone should ever get back EAGAIN from poll or select
> >>>> on darwin is under resource pressure, and it's likely the user would
> >>>> have bigger concerns than this at that point.
> >>>> 
> >>>> I just happened to notice this while tracing code to figure out why
> >>>> someone on stackoverflow was seeing recv() of the DISPLAY socket
> >>>> erring out with EAGAIN and then hanging.
> >>> 
> >>> If Darwin/OSX returns EAGAIN to a blocking call under *any*
> >>> circumstances, including "resource pressure", that's a serious bug.
> >>> Don't work around it in XCB or any other library, *especially* because
> >>> no other platform should behave the same way.  EAGAIN means "The socket
> >>> is marked nonblocking and the receive operation would block, or a
> >>> receive timeout had been set and the timeout expired before data was
> >>> received."  
> >> 
> >> No, that is not what EAGAIN means.  From SUSv4 at http://pubs.opengroup.org/onlinepubs/9699919799/functions/poll.html
> >> 
> >> """
> >> The poll() function shall fail if:
> >> 
> >> [EAGAIN]
> >> The allocation of internal data structures failed but a subsequent request may succeed.
> >> ...
> >> """
> > 
> > Ah, I see; I'd forgotten that the spec actually allows EAGAIN and
> > EWOULDBLOCK to be different.  EWOULDBLOCK definitely has the semantics I
> > had in mind and that the Linux manpage documents; from
> > http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_03
> > 
> >> Operation would block. An operation on a socket marked as non-blocking has encountered a situation such as no data available that otherwise would have caused the function to suspend execution.
> > 
> > But sure enough, for EAGAIN it says "Resource temporarily unavailable.
> > This is a temporary condition and later calls to the same routine may
> > complete normally."  So if an implementation ignores the spec language
> > saying "A conforming implementation may assign the same values for
> > [EWOULDBLOCK] and [EAGAIN]." and makes them separate, EAGAIN can indeed
> > mean the kernel is making its internal problems the application's
> > problems and requiring the application to try again.  Sigh.
> > 
> >>> A blocking call with no timeout should never return EAGAIN;
> >>> it should either block or return some fatal error.
> >> 
> >> Not according to UNIX.
> > 
> > s/EAGAIN/EWOULDBLOCK/ and the statement holds.
> 
> Yep!
> 
> >>> Libraries should *definitely* not have to include "wait a bit and try
> >>> again" logic; that's the kernel's job.
> > 
> > I stand by this statement, but evidently the spec allows this particular
> > bit of ridiculosity.  Personally, I'd argue that if the kernel has a
> > resource allocation failure, it should be returning -ENOMEM.
> 
> I agree, but sadly nobody consulted either you or me when writing the SUS.
> 
> > Could I talk you into adding a "EAGAIN != EWOULDBLOCK && " before
> > checking for EAGAIN?  That way, the "retry immediately on EAGAIN" logic
> > will only run on platforms where EAGAIN *doesn't* have the same meaning
> > as EWOULDBLOCK's "this is non-blocking and would block".  On platforms
> > that define those two identically, the extra logic will constant-fold
> > away.
> 
> They won't constant fold because we're not checking for EWOULDBLOCK
> because it doesn't really make sense in this case.  I don't think any
> implementation of poll(2) or select(2) would return EWOULDBLOCK
> because it doesn't really make sense to have non-blocking
> implementations of those syscalls.  The whole point of those syscalls
> is to block until data is available.

That's not what I mean.  "EAGAIN != EWOULDBLOCK" constant-folds into 0
on a system where those two are equal.  So, something like "if (EAGAIN
!= EWOULDBLOCK && errno == EAGAIN) { loop and try again }" will fold
away to nothing except on a system that has EAGAIN as a separate error
from EWOULDBLOCK, which conveniently matches those systems where
retrying on EAGAIN makes sense.
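
To make that concrete, here is a minimal sketch of the guard around a
poll() retry loop.  The helper names (should_retry, poll_retrying) are
hypothetical and are not taken from the patch or from XCB itself; the
sketch only illustrates how the constant comparison gates the retry,
and it deliberately ignores other errno values such as EINTR.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical helper: nonzero if a failed poll() should be retried.
 * Where EAGAIN == EWOULDBLOCK, the left operand is a constant 0, so the
 * whole condition is false at compile time and the retry disappears. */
static int should_retry(int err)
{
    return EAGAIN != EWOULDBLOCK && err == EAGAIN;
}

/* Hypothetical wrapper: call poll() and retry immediately only on
 * platforms where EAGAIN is a distinct "try again later" error. */
static int poll_retrying(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    int ret;

    assert(fds != NULL || nfds == 0);
    do {
        ret = poll(fds, nfds, timeout_ms);
    } while (ret < 0 && should_retry(errno));
    return ret;
}
```

On a platform that defines the two values identically, should_retry()
always returns 0 and the loop body runs exactly once; elsewhere it
retries for as long as poll() keeps failing with EAGAIN.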

- Josh Triplett

