[libnice] Pseudotcp performance (with focus on Windows)

Radosław Kołodziejczyk radek.kolodziejczyk at gmail.com
Wed Aug 20 06:33:13 PDT 2014


Hello,

Our test programs are built from our own libraries wrapped around libnice,
so it would not be easy to send them to you for testing. We did, however,
discover something new that should make the problem easier to investigate.
While preparing a test that we could send you, we noticed that when the
program consists only of a client and a server, and the client's only job
is to send data that the server receives and discards, the transfer is
actually really good. It is even comparable with TCP (on some machines;
others still have a speed issue, but not a large one). However, when the
server behaves as an echo machine (not even that: it suffices for the
server to respond with small 4 B packets), the transfer rate drops
dramatically. These are our observations on our most problematic machine:

Program specs:
CLIENT - sends 32 kB in a loop
SERVER - receives the message and ignores it

Average transfer: ~1 MB/s

Program specs:
CLIENT - sends 32 kB in a loop
       - waits for a 4 B response
SERVER - receives the 32 kB message
       - sends a 4 B response

Average transfer: 10-70 KB/s

All of this is done on localhost, so even the first transfer rate is not
that spectacular, but it is much, much better than what we get with
two-way sending. Maybe that is a clue.
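
For scale, the rates above can be turned into an implied time per 32 kB
message. This is only a back-of-the-envelope sketch, not part of our test
programs. It assumes the sender has exactly one message in flight at a time
(so rate = message size / time per message), treats kB and KB as the same
unit, and takes 1 MB as 1024 KB. The function names are made up for
illustration and are not libnice API.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: seconds spent per message when messages of
 * message_kb are sent one at a time and the observed average rate is
 * rate_kb_per_s. The rate must be positive. */
double seconds_per_message(double message_kb, double rate_kb_per_s)
{
    assert(rate_kb_per_s > 0.0);
    return message_kb / rate_kb_per_s;
}

/* Print the implied per-message time for the rates reported above. */
void print_implied_times(void)
{
    const double message_kb = 32.0;

    printf("one-way, ~1 MB/s  : %.3f s per message\n",
           seconds_per_message(message_kb, 1024.0));
    printf("replies, 70 KB/s  : %.3f s per message\n",
           seconds_per_message(message_kb, 70.0));
    printf("replies, 10 KB/s  : %.3f s per message\n",
           seconds_per_message(message_kb, 10.0));
}
```

Calling print_implied_times() from a main() gives roughly 0.03 s per message
for the one-way case, and between about 0.46 s and 3.2 s per message once
the 4 B reply is added. Under these assumptions each exchange stalls for
something on the order of seconds, which is in the same range as the 4 s
gaps discussed earlier in this thread, though that may be a coincidence.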

Speaking of your latest code from git master: we have had some issues
connecting the agents on systems where version 0.1.7 connected without
any problems. On Windows XP the problem affected both reliable and
unreliable agents. On Windows 7 only reliable agents had problems
connecting. My colleague prepared a package with logs from a successful
connection on 0.1.7 and an unsuccessful one on the code from git. I know
it is not yet an official release, but we thought you might find it
interesting or that it might help you in development. Here it is:

https://dl.dropboxusercontent.com/u/94203634/2014-08-01_nice_logs.tar.gz

Kind regards,
Radosław Kołodziejczyk




2014-08-13 1:31 GMT+02:00 Philip Withnall <philip at tecnocode.co.uk>:

> Hi,
>
> Sorry for the delay in looking at this. Without the test programs, it’s
> hard to reproduce and debug the issue. Can you make the test client and
> server code available please?
>
> That said, there have been several improvements to the pseudo-TCP code
> in git master recently. Have you tried your tests again with the latest
> git master of libnice?
>
> Philip
>
> On Tue, 2014-07-15 at 15:14 +0200, Radosław Kołodziejczyk wrote:
> > Hello again,
> >
> >
> > I've made the link available for longer. I'll keep it up until it's
> > no longer needed. Please let me know if you need anything more.
> > I'll be looking into this next week, but it's always good to have
> > insight, especially from the authors themselves.
> >
> >
> > Kind regards,
> > Radosław Kołodziejczyk.
> >
> >
> > 2014-07-07 11:48 GMT+02:00 Radosław Kołodziejczyk
> > <radek.kolodziejczyk at gmail.com>:
> >         Hello,
> >
> >
> >         I've redone the tests with Wireshark monitoring the process
> >         and the environment set the way you wanted. The 4s pause is
> >         clearly visible in the Wireshark dump. It's a regular thing:
> >         a couple of kB are sent, then there is a 4s pause, and the
> >         transfer resumes at full speed after a 66B packet is received
> >         by the client. Could that be a "missing" ACK? Anyway, it seems
> >         that this packet triggers further transfer and the agents are
> >         waiting for its arrival.
> >
> >
> >         Our test is a simple echo client-server pair that sends random
> >         data and expects the same data in return from the server. I
> >         attach the Wireshark dump as well as nice logs from the client
> >         and server test programs. I'll be very pleased to help debug /
> >         solve this issue, so if you need anything more, I'm here to
> >         help. :)
> >
> >
> >         Btw I've cancelled my previous response, because the
> >         attachment was too big. Instead I've uploaded it to my public
> >         dropbox space and you can get it from here:
> >
> >
> >         https://dl.dropboxusercontent.com/u/94203634/niceDebug.tar.gz
> >
> >
> >
> >         The link will be valid for a week.
> >
> >
> >         Cheers!
> >
> >
> >         2014-07-06 14:49 GMT+02:00 Philip Withnall
> >         <philip at tecnocode.co.uk>:
> >
> >                 Hi,
> >
> >
> >                 On Fri, 2014-07-04 at 10:36 +0200, Radosław
> >                 Kołodziejczyk wrote:
> >
> >
> >                 > We moved forward a bit and dug in to find where
> >                 > the hiccups occur. It seems that the data
> >                 > transfer is actually quite fast and fluent, BUT
> >                 > every so often (by that I mean too often) there
> >                 > is a huge gap between recv callbacks from the
> >                 > agent. Exactly 4s gaps. This totally kills the
> >                 > transfer rate as you might imagine. Exactly 4s
> >                 > seemed to be "too perfect" to be accidental, so
> >                 > after a quick grep I found this:
> >
> >
> >                 Thanks for looking at this in depth. Do you have
> >                 any Wireshark logs of the timeout occurring,
> >                 coupled with libnice debug logs? Run your tests
> >                 with the NICE_DEBUG=all G_MESSAGES_DEBUG=all
> >                 environment variables set and attach the
> >                 resulting debug logs (making sure there is no
> >                 private data in them).
> >
> >                 > // If there are no pending clocks, wake up every 4 seconds
> >                 > #define DEFAULT_TIMEOUT 4000
> >                 >
> >                 > And sure - tinkering with this value provided
> >                 > different transfer speeds (sometimes much, much
> >                 > higher - almost on par with tcp). Of course
> >                 > changing this default timeout should not be a
> >                 > solution. So my question is - what may cause a
> >                 > scenario in which this timeout is reached really
> >                 > frequently, and do you have any idea how to
> >                 > avoid it?
> >
> >
> >                 As Olivier said, this is most likely due to packet
> >                 loss. But perhaps it is hitting a buggy case where
> >                 the pseudotcp code doesn’t recover well from a
> >                 lost packet.
> >
> >                 Philip
> >
> >
> >
> >                 _______________________________________________
> >                 nice mailing list
> >                 nice at lists.freedesktop.org
> >                 http://lists.freedesktop.org/mailman/listinfo/nice
> >
> >
> >
> >
> >
> >
>
>