[Nice] New API and refactor

Wed Jul 30 10:59:18 PDT 2008

Rémi Denis-Courmont wrote:
> On Monday 21 July 2008 23:54:34 ext Youness Alaoui, you wrote:
>   
>>> No. Binding discovery is part of STUN. So are the timers. You could run
>>> ICE with any other binding discovery scheme (TURN, Teredo server, UPnP
>>> IGD, NAT-PMP, SOCKS...). This is part of STUN. Both in the spec, and in
>>> the architecture. The timers are part of STUN. They simply are.
>>>       
>> Yes, that's true, but those timers only apply for UDP, not for TCP or TLS.
>> I wouldn't mind having those timers implementation as utility functions
>> for the library.
>>     
>
> Sure. The timer API is already separated - the timer.h and timer.c only depend 
> on libc, not on anything else from the STUN library. Similarly the 
> transaction code is independent from the STUN message code only depends on 
> libc and the timer code. If I recall correctly, stun-msg.h was included to 
> get the buffer size.
>   
yes, timer is god, I didn't modify it. but the transport code is not 
since it does the validation of the stun message (in 
stun_trans_preprocess) and it parses the message for ALTERNATE-SERVER... 
it also depends on the timer code, which I didn't think was necessary.
> I would not think that 400 lines of code grant for a separate shared object 
> with "clean opaque" structure types.
>   
You're right. After refactoring that code, I found that it was best to 
keep it all in one library, but the usage stuff is now named 
stun_usage_* and the includes are "stun/usages/bind.h" and 
"stun/usages/ice.h"
>   
>> I would like to separate them as two libraries.. stun-format and
>> stun-usage, or something like that. Anyways, we're not implementing the
>> whole STUN specs as far as I know (I don't see the TCP or TLS
>> implementation, or the DNS discovery). Maybe it will be better once
>> refactored, but the problem now is that it's not very extensible and
>> does too much stuff that it shouldn't be doing, so if we want to keep
>> those timers working even for TLS sockets, it needs to be refactored.
>>     
>
> The timer code already supports TCP/TLS. The transaction layer only 
> partially - because ICE-TCP is in too much of a limbo to know how to deal 
> with it.
>   
The timer code supports reliable transport, yep, the transport layer, 
not sure about that.. it should in theory since the user passes the 
socket/socket_family, but it's not tested of course.
>   
>> Yes, adding timers and error cases, etc.. will add code, but the purpose
>> of the library is to make it easy to do STUN, and not "everything in one
>> function". It would be nice if glib had a one function "g_do_voip"... no
>> it doesn't, but it gives a lot of small functions that makes it easier
>> to build something to do that. Or a more realistic example, farsight
>> doesn't have a 'farsight_start_audio(ip, port)' function... That's just
>> the way I see it.
>>     
>
> That's pointless a discussion. libc has the getaddrinfo() function, but you 
> can also use the lower-level DNS querying functions. Or you can use the DNS 
> formatting function and sockets manually. Or you can even reimplement 
> everything from scratch. So what?
>   
I'm not saying we should implement DNS, I'm just saying we're not 
implementing everything anyway.
> The real question is whether we actually NEED to extend this piece of code, 
> for some actual use case that does not fit the current model. Maybe we do, 
> but I have not seen it so far.
>   
And yes, we need to make the usages completely separated from the stun 
agent and the transport, simply because of TURN and google relay.. the 
google relay uses Send requests and Data indication messages to 
send/receive data, with the stun message being inside the data. (so you 
get a stun message wrapped inside a stun message).
>   
>> As I said, I understand your point of view, and I agree with most of
>> what you said. The best solution would probably be somewhere in the
>> middle. I will refactor the bind usage into something less 'obscure',
>> and I'll make the timers as a utility function, so anyone can use them
>> for any usage they want (it's not bind usage specific, it's 'sending
>> over UDP' specific).
>>     
>
> As I said, that's already done at the source code level. You can do it at the 
> Makefile level if you so wish.
>
> (...)
>   
>> Right now, the bind usage has its own agent in there, it does some
>> mallocs, creates sockets, sends data, etc.. We need to keep it simple,
>> no malloc, no socket mess, no agent created by the usage itself (I have
>> one agent in ice, and another one in the bind usage, and they don't
>> share the same list of sent IDs, and one can't validate a response to a
>> request sent from the other, so it's confusing).
>>     
>
> It only creates socket if the application does not bother doing so. In the ICE 
> case, that code is (as far as I remember) not used.
>   
Well, that's the point, it doesn't "need" to create or use the socket.. 
as I said before, google relay for example (TURN as well, I'm sure) 
isn't just about a socket, it's about wrapping the whole message into 
Send requests, so it just should not take care of that.
> As for mallocs - I don't know. You've lost me. I thought you wanted to layer 
> everything über nicely. If you remove mallocs from the STUN you will have to 
> expose structures more to the outside.
>   
the malloc is for the stun_bind_t context structure, which is not 
necessary IMHO. With the new API, you don't need all that, there's no 
context for you to keep.
>   
>> Is that all ok with you ?
>>     
>
> I would rather not: add mallocs on the fast path, remove memory error 
> handling, and/or add more memory copying on the RX/TX path. Other than that, 
> I don't really care.
>
>   

Anyways, I'm done refactoring, so here is a summary of what I've done 
and the API changes (mail copied from another mailing list) :
[quote]
BEFORE :
the bind and ice usage needed a stun_bind_t context to work, which 
itself took care of the transport and timers, etc.. and did everything 
for you. The stun_bind_t context had its own StunAgent and would take 
care of validating the stun message for you. You would do 
stun_bind_start() or stun_conncheck_start() to start sending a bind or 
ice STUN message, and use stun_bind_elapse() to make it update the timer 
and resend if necessary. It also took care of all the message parsing 
(with stun_bind_process) and its own transport layer was the one 
validating the stun message and would automatically handle any 
ALTERNATE-SERVER attribute.

AFTER :
The usages are now completely independant of the transport (so we can 
add TURN and gtalk relay support) and separated from the timer. You now 
have stun_usage_bind_create() to create a stun bind message, and 
stun_usage_ice_create() to create an ICE message. You can then use the 
stun_timer_* usage API to add the timer support. Once you receive a stun 
message, you take care yourself of validating it then pass the 
StunMessage to stun_usage_bind_process or stun_usage_ice_process which 
will take care of processing it for you.
We now have one StunAgent for the whole libnice library, instead of 
having one StunAgent in libnice then one StunAgent per candidate in each 
stun_bind_t context.

I also worked on designing the transport layer for the google relay 
support :
I will remove the nice_udp_socket_factory layer and replace it with a 
similar one that will be used as a transport layer that will 
transparently implement either UDP or the google relay stuff. When 
libnice times out, the transport layer will replace the UDP transport by 
the google relay transport and will reset its state machine and restart. 
The google relay transport will take care of building the Stun messages, 
doing the Send requests wrapping the data, and unpack the Data 
indication messages for libnice.
The nice_udp_socket_factory will be removed from the API and will be 
transparently handled by libnice, this way, the user doesn't need to 
provide the library with the udp socket factory anymore.

[/quote]

To extend this a bit, now what a user needs to do is something like this :
1 - create a stun_timer_t,
2 - create a StunAgent and StunMessage,
3 - call stun_usage_bind_create(agent, msg);
4 - start the timer
5 - have a timeout depending on the remainder of the stun_timer
6 - use stun_timer_refresh when necessary and handle whether he needs to 
retransmit or timeout
7 - receive the response and call stun_usage_bind_process(msg, &addr, 
&alternate_server)
8 - if the return value is STUN_USAGE_BIND_RETURN_ALTERNATE_SERVER, 
handle the alternate server
9 - if the return value is STUN_USAGE_BIND_RETURN_SUCCESS, he's done...

It IS more work, yes, but it is also the only way to provide a more 
useful API that can be used for something other than just doing a simple 
UDP bind.
For convenience, I left the stun_usage_bind_run() function, so if anyone 
just needs a simple UDP bind request, he can just use that synchronous 
API. You can also look at it to see how the new API should be used and 
what was removed from the stun_bind_start (the transport usage is not 
used by libnice though).

Thanks!
Youness