[libnice] Inquiry re: commit 1ab9d7c104, "conncheck: Separate valid and succeeded states"

Lorenzo Miniero lminiero at gmail.com
Mon Apr 10 16:42:18 UTC 2017


2017-02-27 22:30 GMT+01:00 Chad Phillips <chad at apartmentlines.com>:

> I’m using this WebRTC gateway: https://janus.conf.meetecho.com/
>
> I cannot get latest libnice master to work with this project: ICE hangs
> in the 'connecting' state
>
> Some git bisect work got me to the problem commit:
>
> 1ab9d7c104978ea1904aaaad708c1c8c23c77592 is the first bad commit
> commit 1ab9d7c104978ea1904aaaad708c1c8c23c77592
> Author: Olivier Crête olivier.crete at collabora.com
> Date: Thu May 26 16:05:36 2016 -0400
>
> conncheck: Separate valid and succeded states
>
> RFC 5245 specifies that when a mapped-address differs from the address
> from the request was sent, the mapped-address is used to select the
> valid pair, but the source address of the check is used to select the
> pair that succeeded, so they are not the same.
>
> If I roll back this commit, then Janus will work with libnice master, ICE connections complete fine.
>
> The Janus lead dev looked at that commit, and couldn’t figure out exactly what it does, or why it would break ICE connectivity when used with the Janus libnice support code.
>
> I was hoping you could shed some light on:
>
>  - What the added ‘valid’ flag does
>
>  - Where adjustments might be necessary in libnice support code as a result of this change
>
> Thanks,
>
> Chad
>
>

Hi all,

chiming in as I only now had some time to look into this.


Apparently what Chad found is indeed partly true: Janus sometimes
doesn't seem able to get valid connectivity out of libnice. I say partly
because we ran a few different tests, and while connectivity did
indeed NOT work when talking to a remote Janus, everything worked
correctly with Janus on our LAN or when testing on a local machine.

One of my colleagues tried tracing what happens with the newly released
libnice 0.1.14 when it breaks. Apparently, although there are valid
candidates and we do see some "marking pair...as nominated" lines in the
debug output, the nominated count always remains 0. Not sure if that's the
cause, but the result is that ICE never leaves the "connecting" state,
and so never gets to "connected".

He then tried a patch that another user referenced on our issue page, and
with it connectivity worked: https://phabricator.freedesktop.org/D735
That patch only modifies a single line in priv_add_peer_reflexive_pair,
basically setting the nominated property of the new prflx candidate to the
parent's rather than FALSE, so I guess it does indeed fix a problem with
prflx candidates, which in our particular case we do need (we're behind a
symmetric NAT and prflx always works for us, so there's no need for TURN if
Janus is publicly reachable). Not sure whether Chad has a different NAT
configuration and yet the same problem: if srflx should work for him, for
instance, but it doesn't and the prflx issue is breaking it for him, it may
be an indication that we're doing something wrong in our management of
remote candidates (but then why did it work so far?).
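
To make the one-line change easier to picture, here is a minimal,
self-contained sketch of the behaviour described above. It is not libnice
code: SketchPair and both sketch_* functions are hypothetical stand-ins
invented for illustration, and the only thing taken from the patch
description is the idea that the new prflx entry inherits the parent's
nominated flag instead of starting out as FALSE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a candidate pair.
 * This is NOT libnice's real pair structure. */
typedef struct {
  bool nominated;       /* pair has been nominated */
  bool peer_reflexive;  /* pair was created from a prflx discovery */
} SketchPair;

/* Behaviour as described for the unpatched code: the new prflx
 * entry always starts with nominated == false. */
static SketchPair sketch_add_prflx_pair_unpatched(const SketchPair *parent)
{
  assert(parent != NULL);
  SketchPair pair = { .nominated = false, .peer_reflexive = true };
  return pair;
}

/* Behaviour as described for the D735 patch: the new prflx entry
 * copies the nominated flag from its parent. */
static SketchPair sketch_add_prflx_pair_patched(const SketchPair *parent)
{
  assert(parent != NULL);
  SketchPair pair = { .nominated = parent->nominated, .peer_reflexive = true };
  return pair;
}
```

With a parent whose nominated flag is set, the unpatched variant returns an
un-nominated pair, while the patched variant keeps the nomination.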

Not sure if the fact we trickle has any relevance here. In fact, we don't
wait for the connection to be ready, but only for a valid selected pair to
consider connectivity to be available. More precisely, in Janus we wait for
the "new-selected-pair-full" callback to fire to decide we can start the
DTLS exchange and then do media: with that commit, though, ICE actually
never "connects" for us, and so that callback is never fired.
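
The gating described above can be sketched as a toy model. Everything in
it is hypothetical (SketchComponent, sketch_on_new_selected_pair and
sketch_agent_step are made-up names, not Janus or libnice code), and it
rests on an assumption that is only suspected in this thread: that a pair
gets selected only once at least one valid pair is also nominated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical application-side state for one ICE component. */
typedef struct {
  bool dtls_started;              /* DTLS handshake has been kicked off */
  unsigned selected_pair_events;  /* times the callback has fired */
} SketchComponent;

/* Stand-in for the application's "new-selected-pair-full" handler:
 * DTLS is started the first time a selected pair is reported. */
static void sketch_on_new_selected_pair(SketchComponent *c)
{
  assert(c != NULL);
  c->selected_pair_events++;
  if (!c->dtls_started) {
    c->dtls_started = true;  /* real code would begin the DTLS exchange here */
  }
}

/* Toy agent step. ASSUMPTION: the callback only fires when there is
 * at least one valid pair AND at least one nominated pair. */
static void sketch_agent_step(SketchComponent *c,
                              unsigned valid_pairs,
                              unsigned nominated_pairs)
{
  assert(c != NULL);
  if (valid_pairs > 0 && nominated_pairs > 0) {
    sketch_on_new_selected_pair(c);
  }
}
```

Under that assumption, a step with valid pairs but a nominated count of 0
never reaches the handler, so DTLS never starts, which matches the
symptom we see.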

As Chad said, I couldn't figure out why any of the changes in the commit he
identified via git bisect should cause any trouble:
https://github.com/libnice/libnice/commit/1ab9d7c104978ea1904aaaad708c1c8c23c77592
I guess that in general the separation between "valid" and "succeeded" can
lead to some checks being done differently, especially for the prflx case
mentioned above, and so never leading to a success in the Janus case. That
said, I'm far from an expert when it comes to libnice internals, and so I'm
just speculating here.

Do you have any idea what the issue may be? Any suggestion on where I
should look to figure out whether it can be solved without resorting to the
patch above (which seems to have been abandoned in the meantime)?


Thanks in advance!
Lorenzo



