[Nice] Bug: Gather candidates never completes

Youness Alaoui youness.alaoui at collabora.co.uk
Mon Aug 23 15:40:51 PDT 2010


Hello again!

On 08/20/2010 02:47 PM, Tom Kaminski wrote:
> I've come across a scenario where the gather candidates done event never
> gets triggered.
> 
> This happens if you disconnect one of your local network interfaces
> after the nice agent has added it to its local_addresses list.
>  The next time you try to gather candidates, it will issue the warning
> in nice_agent_gather_candidates()
> 
> if (!host_candidate) {
>         g_warning ("No host candidate??");
>         return FALSE;
> }
> 
> I changed it to
> 
> if (!host_candidate) {
>         g_warning ("No host candidate??");
>         continue;
> }
> 
> To fix this problem.

Actually, that's not quite right: it's not a bug where gather-candidates never
completes. It returns FALSE, which means it has already completed (with an error
state). The candidate-gathering-done signal is only sent if it returns TRUE.

I see your problem though: if an address given to add_local_address is wrong
somehow, the agent will never be able to gather candidates, and there is no way
to remove a previously added address. In my opinion it shouldn't really happen
if you call add_local_address and immediately follow it with gather_candidates.
However, I've now fixed it so that it no longer returns FALSE, but just ignores
the address and continues gathering candidates.
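
A minimal sketch of that skip-and-continue behaviour, in plain C rather than the
actual libnice code. The function names and the callback type are hypothetical
and only stand in for the real host candidate discovery:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback standing in for host candidate discovery:
 * returns true if a host candidate could be created for the address. */
typedef bool (*add_host_candidate_fn) (const char *address);

/* Walks the local address list and counts the addresses that produced a
 * host candidate. An unusable address is skipped, not treated as fatal. */
size_t
gather_host_candidates (const char *const *addresses, size_t n_addresses,
    add_host_candidate_fn add_host_candidate)
{
  size_t gathered = 0;

  assert (add_host_candidate != NULL);

  for (size_t i = 0; i < n_addresses; i++) {
    if (!add_host_candidate (addresses[i])) {
      fprintf (stderr, "No host candidate for %s, skipping\n", addresses[i]);
      continue;  /* the old behaviour was to return FALSE here */
    }
    gathered++;
  }

  return gathered;
}

/* Example callback (an assumption for illustration): pretend that only
 * the loopback address is still usable. */
bool
only_loopback_usable (const char *address)
{
  return strcmp (address, "127.0.0.1") == 0;
}
```

With the list { "192.0.2.10", "127.0.0.1" } and only_loopback_usable as the
callback, the first address is skipped with a warning and the second one is
still gathered, instead of the whole call failing on the first address.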


> 
> 
> Better yet, I think libnice should also repopulate its
> agent->local_addresses list every time nice_agent_gather_candidates()
> is called (in the case where the local addresses haven't been manually
> set).

That's the thing: what do you mean by "every time it's called"? It should only
be called once, that's it! Never call it twice (I don't even know what would
happen if you do that).

> 
> On a different note, is it safe to decrease the STUN_END_TIMEOUT value
> in timer.c?  It is painfully long (causing people to complain).  We
> had a previous discussion on this mailing list about this timeout and the
> possibility of only performing STUN discovery on the interfaces of
> interest (i.e. the interfaces that are known not to time out).  My idea
> was to find out which interface is being used to connect to my public
> server via TCP, but apparently the OS doesn't really provide
> information on which interface is being used by a given socket.  Any
> thoughts on this timeout issue would be appreciated.

Well, it's safe, but it means that on a slow network the STUN requests and
connectivity checks will sometimes time out when they are actually just late.
I do believe the timeout is too big, though. Actually, the RFC doesn't say
4800ms, it says:
RFC 3489: "At 9500ms, the client considers the transaction to have failed if no
response has been received."
RFC 5389: "If the client has not received a response after 39500 ms, the client
will consider the transaction to have timed out."
So yeah, 9.5 seconds for the old RFC and 39.5 seconds for the new RFC are just
ridiculous values!

Ah crap, actually, the current code also has a timeout of 9 seconds! It first
retransmits after 600ms, then doubles that interval until it reaches 4800ms (for
the retransmit timeout), so it will send a packet at 0ms, 600ms, 1800ms, 4200ms
and 9000ms.
So I definitely wouldn't mind some better values, hoping that the timeout won't
be so long, and that it won't affect performance!
I'm thinking of breaking the ABI in the next release, and I'll take this
opportunity to make the stun_usage_timer configurable and to make STUN
retransmissions time out after 2 seconds for STUN discovery (not for
connectivity checks).
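
To make the arithmetic concrete, here is a small sketch that reproduces the
send times from the doubling rule described above. It is not the libnice timer
code; the function name and its parameters are made up for illustration, and it
only models "first retransmit after the initial interval, double until the
interval has reached the maximum":

```c
#include <assert.h>
#include <stddef.h>

/* Fills send_times_ms with the times (in ms, relative to the first request)
 * at which a packet is sent, and returns how many packets that is.
 * The interval starts at initial_rto_ms and doubles after every send; the
 * last send is the one that uses an interval of at most max_rto_ms. */
size_t
stun_send_schedule (unsigned initial_rto_ms, unsigned max_rto_ms,
    unsigned *send_times_ms, size_t capacity)
{
  size_t count = 0;
  unsigned now_ms = 0;
  unsigned rto_ms = initial_rto_ms;

  assert (initial_rto_ms > 0);

  if (capacity == 0)
    return 0;

  send_times_ms[count++] = now_ms;  /* initial request at 0ms */

  while (rto_ms <= max_rto_ms && count < capacity) {
    now_ms += rto_ms;               /* wait one interval, then resend */
    send_times_ms[count++] = now_ms;
    rto_ms *= 2;
  }

  return count;
}
```

Calling stun_send_schedule (600, 4800, times, 8) yields five sends at 0, 600,
1800, 4200 and 9000 ms, which is where the 9 seconds come from. Lowering either
value shortens the tail quickly, since the last interval dominates the total.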

On a side note, you can find out which interface a given socket is on. Well,
not the interface, but the IP, which is all you need for
nice_agent_add_local_address. Just use the POSIX getsockname() function. I
think you work on Windows, so:
http://msdn.microsoft.com/en-us/library/ms738543(VS.85).aspx
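
A rough sketch of the getsockname() approach, using POSIX headers (on Windows
the Winsock headers would be needed instead). The helper name is hypothetical;
only getsockname() and inet_ntop() are real calls here:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Writes the local IP address of socket fd into buf as text.
 * Returns 0 on success, -1 on failure. */
int
local_ip_of_socket (int fd, char *buf, socklen_t buflen)
{
  struct sockaddr_storage ss;
  socklen_t len = sizeof ss;

  assert (buf != NULL);
  memset (&ss, 0, sizeof ss);

  /* Ask the OS which local address the socket is bound to */
  if (getsockname (fd, (struct sockaddr *) &ss, &len) < 0)
    return -1;

  if (ss.ss_family == AF_INET) {
    const struct sockaddr_in *in4 = (const struct sockaddr_in *) &ss;
    return inet_ntop (AF_INET, &in4->sin_addr, buf, buflen) ? 0 : -1;
  }
  if (ss.ss_family == AF_INET6) {
    const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *) &ss;
    return inet_ntop (AF_INET6, &in6->sin6_addr, buf, buflen) ? 0 : -1;
  }

  return -1;  /* not an IP socket */
}
```

The idea is to call this on the TCP socket after connect() to your public
server has succeeded, since that is when the OS has picked the local address
for that route. The resulting string is the address you would then hand to
nice_agent_add_local_address. On a socket that is neither bound nor connected
you will typically just get the wildcard address back.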



> 
> I was also wondering if anyone has advice/recommendations on the
> best way to use libnice with a router that has port forwarding enabled
> for a range of UDP ports.  My understanding is that setting up port
> forwarding will increase the likelihood that connection establishment
> will succeed (especially for routers that do not work with the
> existing NAT punchthrough method used by libnice).

Right now, you can't force libnice to use a specific port for gathering local
candidates, but I've seen a few people interested in that. It would be nice to
have a solution for it. Last time we spoke, you suggested a new API for adding
local addresses in which you specify whether you want STUN/TURN discovery on
those addresses/interfaces. It would be nice to have this new method available
and have it also allow users to specify a port (or a port range) to use for
binding the local socket.
Like I said last time, I have no time to work on libnice at the moment, but I'm
accepting patches.


> 
> Thanks,
> Tom

