<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Thanks Nicolas, that's very helpful :)</p>
<p>It seems to be working now - any tips for getting the best results, specifically for detecting a person speaking over background noise? Or, to make it even harder, background chatter?</p>
<p>I see that webrtc-audio-processing is up to version 1.1 in the source repo... have there been any significant changes or improvements since the 0.3.1 release that Ubuntu has packaged?</p>
<p>Rob</p>
<div class="moz-cite-prefix">On 30/03/2022 18:18, Nicolas Dufresne via gstreamer-devel wrote:
</div>
<blockquote type="cite" cite="mid:da6e4cff9c56935cc8c6be694fccdc062686c967.camel@ndufresne.ca">
<pre class="moz-quote-pre" wrap="">On Wednesday, 30 March 2022 at 16:37 +0100, Rob Agar via gstreamer-devel wrote:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">Hi all

I'm looking at the webrtcdsp plugin with a view to using it to detect
people speaking, but it's not terribly clear how to use it
programmatically. Is there any example code for voice activity detection?
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
I must admit, the documentation could be improved. The voice activity is
delivered using an element message, so you can handle it the same way as
other bus messages (EOS, ERROR, etc.). The type is GST_MESSAGE_ELEMENT, and
it contains a GstStructure named "voice-activity". So something like:

  if (GST_MESSAGE_TYPE (msg) == GST_MESSAGE_ELEMENT) {
    /* Element messages carry their payload in a GstStructure */
    const GstStructure *s = gst_message_get_structure (msg);

    if (s != NULL && gst_structure_has_name (s, "voice-activity")) {
      gboolean has_voice = FALSE;

      /* TRUE: stream now has voice, FALSE: stream no longer has voice */
      gst_structure_get_boolean (s, "stream-has-voice", &has_voice);
      . . .
    }
  }

For reference, here's the code that emits the message:

  s = gst_structure_new ("voice-activity",
      "stream-time", G_TYPE_UINT64, stream_time,
      "stream-has-voice", G_TYPE_BOOLEAN, stream_has_voice, NULL);

  GST_LOG_OBJECT (self, "Posting voice activity message, stream %s voice",
      stream_has_voice ? "now has" : "no longer has");

  gst_element_post_message (GST_ELEMENT (self),
      gst_message_new_element (GST_OBJECT (self), s));

If you need finer-grained information, the voice activity state is also
available inside GstAudioLevelMeta, which is attached to every buffer.

Nicolas
</pre>
</blockquote>
</body>
</html>