[Spice-devel] [PATCH spice-server] test-display-base: Avoid spurious errors due to listen failures

Thu Nov 8 19:51:20 UTC 2018

On Thu, 2018-11-08 at 11:57 -0500, Frediano Ziglio wrote:
> > 
> > On Mon, 2018-10-29 at 11:58 +0000, Frediano Ziglio wrote:
> > > To set up a listening socket usually you call in sequence:
> > > - socket;
> > > - bind;
> > > - listen.
> > > Usually if you try to listen to a post already in listening state
> > > the bind will fail.
> > 
> > I find this sentence confusing... If you try to *listen*, the
> > *bind*
> > will fail? Is that what you meant?
> > 
> > Also: "listen to a *post*"? What do you mean by post?
> > 
> 
> ehm... typo, was port, which make more sense.
> If you want to listen to a specific port you need to create the
> socket, bind to the port you want and call "listen" function.
> 
> Any advice on rephrasing (beside fixing the typo) ?

Aha, I see. The piece of information that I was missing was that you
were trying to listen to the same port from two different sockets. That
wasn't immediately clear to me.

Maybe:

"If you try to bind to a port when another socket is already listening
on that port, the bind will fail."

> 
> > > However is possible that the listen will fail,
> > > like in this sequence:
> > > - socket 1;
> > > - bind 1;
> > > - socket 2;
> > > - bind 2;
> > > - listen 1;
> > > - listen 2 <-- failure.
> > 
> > I don't understand why this sequence should fail. Can you explain a
> > bit
> > more?
> > 
> 
> Because the port is already used (bind 1 and bind 2 binded the same
> port
> on 2 sockets).

OK, as I mentioned above, it was not clear that they were both
listening to the same port. Including that information would be helpful
for understanding.

So maybe the introduction to the above section should be something
like:

"However, it is possible that the bind() may succeed and the listen()
will fail, as demonstrated in the following sequence:"

> 
> > > Our tests may attempt to reuse the port however we handle this
> > > problem
> > > trying to detect bind errors. Catch also listen errors to avoid
> > > the
> > > sequence above to trigger the problem. This happened some commit
> > > ago
> > > in our CI.
> > 
> > I have to admit that I still don't fully understand the point of
> > this
> > change. It seems that you're implying that if there is an error
> > where
> > we fail to listen, that's OK and the test shouldn't fail? Why is
> > this
> > error expected/OK? I think the commit log needs far more clarity
> > here.
> > What's the expected behavior? What's the actual behavior? How is it
> > reproducible? etc.
> > 
> 
> The code is expected to fix race conditions during testing so if it
> happens that you try to listen to a port already in listening state
> you just try another port. This can happen easily running multiple
> tests in parallel.

OK, this is another crucial bit of information for understanding the
patch. It explains *why* it happens. So maybe something like:

"When running tests (especially multiple tests running in parallel), it
may sometimes happen that there are other tests already listening on
the port that we are trying to use. In this case, we want to ignore
this error and simply try to listen on a different port. We already
attempted to handle this scenario, but we were only ignoring bind()
errors and not listen() errors. So in the scenario mentioned above, the
listen() error was causing the entire test to fail instead of allowing
us to try to listen on another port."

> 
> > > 
> > > Signed-off-by: Frediano Ziglio <fziglio at redhat.com>
> > > ---
> > >  server/tests/test-display-base.c | 13 +++++++------
> > >  1 file changed, 7 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/server/tests/test-display-base.c
> > > b/server/tests/test-
> > > display-base.c
> > > index ea3a23ba..f58f76d3 100644
> > > --- a/server/tests/test-display-base.c
> > > +++ b/server/tests/test-display-base.c
> > > @@ -894,10 +894,10 @@ void test_set_command_list(Test *test,
> > > Command
> > > *commands, int num_commands)
> > >      test->num_commands = num_commands;
> > >  }
> > >  
> > > -static gboolean ignore_bind_failures(const gchar *log_domain,
> > > -                                     GLogLevelFlags log_level,
> > > -                                     const gchar *message,
> > > -                                     gpointer user_data)
> > > +static gboolean ignore_in_use_failures(const gchar *log_domain,
> > > +                                       GLogLevelFlags log_level,
> > > +                                       const gchar *message,
> > > +                                       gpointer user_data)
> > >  {
> > >      if (!g_str_equal (log_domain, G_LOG_DOMAIN)) {
> > >          return true;
> > > @@ -905,7 +905,8 @@ static gboolean ignore_bind_failures(const
> > > gchar
> > > *log_domain,
> > >      if ((log_level & G_LOG_LEVEL_WARNING) == 0)  {
> > >          return true;
> > >      }
> > > -    if (strstr(message, "reds_init_socket: binding socket to ")
> > > ==
> > > NULL) {
> > > +    if (strstr(message, "reds_init_socket: binding socket to ")
> > > ==
> > > NULL || // bind failure
> > > +        strstr(message, "reds_init_socket: listen: ") == NULL) {
> > > //
> > > listen failure
> > >          g_print("XXX [%s]\n", message);
> > >          return true;
> > >      }
> > 
> > I have to say that I was not aware of this code before. mMatching a
> > substring from a logged warning and ignoring that warning when it
> > matches seems awfully hacky and fragile...
> > 
> 
> Well, any better suggestion? Keep in mind that's testing code, not
> going to run by customer (so no production level). A more clean
> (maybe)
> mock of system function could be harder to maintain.

Well, ideally the spice_server_init() function would do proper error
handling instead of unconditionally logging a warning. For example, it
could take a GError** parameter which would return an error object to
indicate what went wrong. Then the caller could check the error code
and decide how to handle it (e.g. ignore it, or log a warning). 

But since this is public API, there's probably not a lot that we can do
at the moment. And since this is only a test, I guess I won't complain
too loudly.

I'll ACK with commit log changes mentioned above.

Jonathon

> 
> > 
> > > @@ -929,7 +930,7 @@ Test* test_new(SpiceCoreInterface* core)
> > >      // some common initialization for all display tests
> > >      port = BASE_PORT;
> > >  
> > > -    g_test_log_set_fatal_handler(ignore_bind_failures, NULL);
> > > +    g_test_log_set_fatal_handler(ignore_in_use_failures, NULL);
> > >      for (port = BASE_PORT; port < BASE_PORT + 10; port++) {
> > >          SpiceServer* server = spice_server_new();
> > >          spice_server_set_noauth(server);
> > 
> > 
> 
> Frediano