[Xcb] [PATCH] Handle EAGAIN errno from poll(2) or select(2)
Josh Triplett
josh at joshtriplett.org
Sat Aug 22 18:43:27 PDT 2015
On Sat, Aug 22, 2015 at 10:52:17AM -0700, Jeremy Huddleston Sequoia wrote:
>
> > On Aug 22, 2015, at 10:30, Josh Triplett <josh at joshtriplett.org> wrote:
> >
> > On Sat, Aug 22, 2015 at 02:33:46AM -0700, Jeremy Huddleston Sequoia wrote:
> >>
> >>> On Aug 20, 2015, at 09:21, Josh Triplett <josh at joshtriplett.org> wrote:
> >>>
> >>> On Thu, Aug 20, 2015 at 12:18:41AM -0700, Jeremy Sequoia wrote:
> >>>> Yeah, I thought about sleeping before retrying in the EAGAIN case to
> >>>> avoid a possible busy loop. I can do that if you prefer.
> >>>>
> >>>> As I indicated in the commit message, there is no known fallout from
> >>>> the lack of EAGAIN handling. There is no behavioral problem. Indeed
> >>>> the only time someone should ever get back EAGAIN from poll or select
> >>>> on darwin is under resource pressure, and it's likely the user would
> >>>> have bigger concerns than this at that point.
> >>>>
> >>>> I just happened to notice this while tracing code to figure out why
> >>>> someone on stackoverflow was seeing recv() of the DISPLAY socket
> >>>> erring out with EAGAIN and then hanging.
> >>>
> >>> If Darwin/OSX returns EAGAIN to a blocking call under *any*
> >>> circumstances, including "resource pressure", that's a serious bug.
> >>> Don't work around it in XCB or any other library, *especially* because
> >>> no other platform should behave the same way. EAGAIN means "The socket
> >>> is marked nonblocking and the receive operation would block, or a
> >>> receive timeout had been set and the timeout expired before data was
> >>> received."
> >>
> >> No, that is not what EAGAIN means. From SUSv4 at http://pubs.opengroup.org/onlinepubs/9699919799/functions/poll.html
> >>
> >> """
> >> The poll() function shall fail if:
> >>
> >> [EAGAIN]
> >> The allocation of internal data structures failed but a subsequent request may succeed.
> >> ...
> >> """
> >
> > Ah, I see; I'd forgotten that the spec actually allows EAGAIN and
> > EWOULDBLOCK to be different. EWOULDBLOCK definitely has the semantics I
> > had in mind and that the Linux manpage documents; from
> > http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_03
> >
> >> Operation would block. An operation on a socket marked as non-blocking has encountered a situation such as no data available that otherwise would have caused the function to suspend execution.
> >
> > But sure enough, for EAGAIN it says "Resource temporarily unavailable.
> > This is a temporary condition and later calls to the same routine may
> > complete normally." So if an implementation ignores the spec language
> > saying "A conforming implementation may assign the same values for
> > [EWOULDBLOCK] and [EAGAIN]." and makes them separate, EAGAIN can indeed
> > mean the kernel is making its internal problems the application's
> > problems and requiring the application to try again. Sigh.
> >
> >>> A blocking call with no timeout should never return EAGAIN;
> >>> it should either block or return some fatal error.
> >>
> >> Not according to UNIX.
> >
> > s/EAGAIN/EWOULDBLOCK/ and the statement holds.
>
> Yep!
>
> >>> Libraries should *definitely* not have to include "wait a bit and try
> >>> again" logic; that's the kernel's job.
> >
> > I stand by this statement, but evidently the spec allows this particular
> > bit of ridiculosity. Personally, I'd argue that if the kernel has a
> > resource allocation failure, it should be returning -ENOMEM.
>
> I agree, but sadly nobody consulted either you or me when writing the SUS.
>
> > Could I talk you into adding a "EAGAIN != EWOULDBLOCK && " before
> > checking for EAGAIN? That way, the "retry immediately on EAGAIN" logic
> > will only run on platforms where EAGAIN *doesn't* have the same meaning
> > as EWOULDBLOCK's "this is non-blocking and would block". On platforms
> > that define those two identically, the extra logic will constant-fold
> > away.
>
> They won't constant fold because we're not checking for EWOULDBLOCK
> because it doesn't really make sense in this case. I don't think any
> implementation of poll(2) or select(2) would return EWOULDBLOCK
> because it doesn't really make sense to have non-blocking
> implementations of those syscalls. The whole point of those syscalls
> is to block until data is available.

That's not what I mean. "EAGAIN != EWOULDBLOCK" constant-folds into 0
on a system where those two are equal. So, something like "if (EAGAIN
!= EWOULDBLOCK && errno == EAGAIN) { loop and try again }" will fold
away to nothing except on a system that has EAGAIN as a separate error
from EWOULDBLOCK, which conveniently matches those systems where
retrying on EAGAIN makes sense.
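
For concreteness, here is a minimal sketch of that check in C. The names
should_retry_poll() and poll_retrying() are hypothetical and are not taken
from the patch or from XCB; the sketch retries immediately, with no sleep,
and assumes the caller only wants the retry on platforms where EAGAIN is a
distinct error code.

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical helper (not from the patch): nonzero if a failed poll()
 * should be retried. Where EAGAIN == EWOULDBLOCK, the first operand is a
 * compile-time 0, so the compiler can fold the whole test away. */
static int should_retry_poll(int err)
{
    return EAGAIN != EWOULDBLOCK && err == EAGAIN;
}

/* Hypothetical wrapper: call poll() again for as long as it fails with a
 * retryable EAGAIN. All other results are returned to the caller as-is. */
static int poll_retrying(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    int ret;

    do {
        ret = poll(fds, nfds, timeout_ms);
    } while (ret < 0 && should_retry_poll(errno));

    return ret;
}
```

On a system that defines the two constants identically, poll_retrying()
should reduce to a single poll() call.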
- Josh Triplett