[libnice] Sending packets uses a static, global lock?

Lorenzo Miniero lminiero at gmail.com
Thu Oct 13 19:27:20 UTC 2016


2016-10-13 16:44 GMT+02:00 Olivier Crête <olivier.crete at collabora.com>:

> Hi,
>
> The lock should just be per agent. The global lock was a workaround for
> the crash that happens when the timeout GSource fires off while the object
> is being destroyed and then has an invalid agent pointer in the callback.
> Possibly we just need to use a GWeakRef for the agent in all of the
> callbacks, possibly by doing a wrapper using agent_timeout_add_with_context().
> And then we can also remove the "if (g_source_is_destroyed
> (g_main_current_source ())) {" everywhere and centralize it all in the
> wrapper...
>
>

Hi Olivier,

thanks for the clarification. From what I understood of how sources work in
glib, this sounds like a reasonable plan. Do you want me to open an issue
somewhere to keep track of these activities? I noticed there seems to be
already one that looks related, but I'm not sure what its current state it:

https://phabricator.freedesktop.org/T93

If there's anything I can do to help, just let me know. I'll be abroad next
week for a conference, but when I come back I'm willing to work on the code
to help fix this.

Thanks,
Lorenzo



> Olivier
>
> On Thu, 2016-10-13 at 11:10 +0200, Lorenzo Miniero wrote:
>
> 2016-10-13 6:46 GMT+02:00 Anand Sivaram <aspnair at gmail.com>:
>
> What about having a global flag for the lock type?
> Set it with g_object_set during the library initialization.
> It should have 3 values - Off, Per Agent, Global.
> Modify agent_lock() and agent_unlock() accordingly.
>
> That way if the applications using libnice implemented their own locking
> mechanism, that could be used seamlessly.
>
>
>
>
> That looks like a smart approach, especially as it would be flexible: the
> off part is particularly interesting, as I have locks all over the place
> myself! :-). I'm still concerned about whether or not this can result in
> issues with the glib2 stuff libnice uses internally, though, and that
> applications don't have access to? There may be private methods invoked
> that need that locking, something the application from outside may not be
> able to protect.
>
> Lorenzo
>
>
>
>
> anand
>
>
>
>
> On 12 October 2016 at 21:35, Lorenzo Miniero <lminiero at gmail.com> wrote:
>
> Thanks for the context! (and sorry for the delay in this response...)
>
> I'm not familiar enough with glib2, unfortunately: considering the patch
> dates back to about 7 years ago, is the problem the global muxed fixed
> still an issue with current versions? We did a quick and dirty test by
> turning those global agent locks to per-agent locks, where possible (that
> is, in methods that actually provided a pointer to an agent instance) and
> we did notice improvements in the performance (about twice as many sessions
> before it started to break down), but of course this means little if it
> risks causing those race conditions and crashes again. Is this road worth
> exploring? I'm willing to help and contribute to such work, of course,
> although I admit I'm not an expert on the libnice internals (but it's never
> too late to learn! ;-) ).
>
> Thanks,
> Lorenzo
>
>
> 2016-10-05 13:09 GMT+02:00 Miguel París Díaz <mparisdiaz at gmail.com>:
>
> Sorry for the copy-paste error.
>
>
>
> 2016-10-05 13:06 GMT+02:00 Miguel París Díaz <mparisdiaz at gmail.com>:
>
> Good point Lorenzo!!!
>
> This change was added in this commit:
>
> commit 2faa76a4840100bebd9532bd47a2755874c5b3a0
> Author: Youness Alaoui <youness.alaoui at collabora.co.uk>
> Date:   Tue Jun 16 20:40:33 2009 -0400
>
>     Use a global mutex for all nice agents and use g_source_is_destroyed
> to avoid the race condition of a source being destroyed while we wait for
> the mutex
>
> And then the RecMutex was changed by a Mutex in:
>
>
> commit 1deee69325284c726c3a8380a7d5839a51e20c48
> Author: Olivier Crête <olivier.crete at collabora.com>
> Date:   Mon Feb 24 18:51:31 2014 -0500
>
>     agent: Replace recursive mutex with non-recursive
>
>     This should prevent us from re-adding re-entrancy in the future
>
>
>
>
> 2016-10-05 11:45 GMT+02:00 Lorenzo Miniero <lminiero at gmail.com>:
>
> Hi all,
>
> we've been working on some stress tests of our Janus WebRTC server, which
> uses libnice for all the ICE related stuff. While it works great in
> general, we reach a wall much sooner than we expect, and performance start
> to degrade quite a bit. By doing some digging and diagnostic in the code we
> wrote and the one we use, we found out something we thought was weird.
>
> Specifically, we noticed that the internal method that is used to send
> packets, nice_agent_send_messages_nonblocking (which nice_agent_send
> itself uses internally) has a lot of code wrapped in an
> agent_lock()/agent_unlock(), which locks/unlocks a global mutex, and this
> includes the process of physiclly sending the packet. This means that the
> whole library is locked until that code is called, which seems overkill. I
> guess the same happens in other parts of the code as well. I'd understand
> locking the specific agent if the purpose is thread-safety, but why lock
> ALL agents at the same time for each packet you send? I tried searching the
> archives for some info on that, but couldn't find any discussion related to
> this choice.
>
> Since I'm already handling thread-safety for most stuff in my own code, is
> there any way for preventing this from happening? I guess that one possible
> way could be to retrieve the file descriptor from the agent object and send
> the packet manually myself, but while this may work in simple
> (host/prflx/srflx) connectivity, this would definitely not be the case
> when, for instance, TURN is involved, as that's one of the things the
> library would take care of for me.
>
> Looking forward to your thoughts, thanks!
> Lorenzo
>
> _______________________________________________
> nice mailing list
> nice at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nice
>
>
>
>
>
> --
> Miguel París Díaz
> ------------------------------------------------------------------------
> Computer/Software engineer.
> Researcher and architect in http://www.kurento.org
> http://twitter.com/mparisdiaz
> ------------------------------------------------------------------------
>
>
>
>
> _______________________________________________
> nice mailing listnice at lists.freedesktop.orghttps://lists.freedesktop.org/mailman/listinfo/nice
>
>
>
>
> --
>
> Olivier Crête olivier.crete at collabora.com
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/nice/attachments/20161013/e4448f00/attachment-0001.html>


More information about the nice mailing list