webrtcdsp voice detection

Dejan Cotra Dejan.Cotra at nttdata.com
Fri Apr 15 12:47:19 UTC 2022


Hi Nicolas,

Thank you, that was very helpful.

I have a similar question. I am also experimenting with the facedetect element. I know that I can retrieve information about detected faces from the bus messages emitted by facedetect.

Is there a way to retrieve face information from metadata attached to the video frames, similar to voice_activity in GstAudioLevelMeta?

Br,
Dejan

-----Original Message-----
From: Nicolas Dufresne <nicolas at ndufresne.ca> 
Sent: Friday, April 8, 2022 3:31 PM
To: Discussion of the development of and with GStreamer <gstreamer-devel at lists.freedesktop.org>
Cc: Dejan Cotra <Dejan.Cotra at nttdata.com>
Subject: Re: webrtcdsp voice detection

On Friday, 8 April 2022 at 11:10 +0000, Dejan Cotra via gstreamer-devel wrote:

[...]
>  
> I know that I can retrieve information from webrtcdsp voice detection
> via bus messages. I receive a GST_MESSAGE_ELEMENT message from the
> webrtcdsp element with a payload like this:
>  
> voice-activity, stream-time=(guint64)2640000000, stream-has-voice=(boolean)false;
>  
> The question is whether I can retrieve voice detection information in
> some other way, for example as metadata on each sample that I pull
> from the appsink element, or something similar.

It also sets the voice_activity boolean in GstAudioLevelMeta (along with the audio amplitude). This is per buffer, not per sample, so you get feedback roughly every 10 ms.
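
A minimal Python sketch of reading that meta might look like the following. It assumes the PyGObject bindings expose the meta with level and voice_activity fields, and that level follows the RFC 6464 convention documented for GstAudioLevelMeta (0 to 127, expressed in -dBov, with 127 meaning silence). The helper name summarize_level_meta and the fake_meta stand-in are hypothetical; the stand-in only lets the snippet run without GStreamer installed.

```python
from types import SimpleNamespace


def summarize_level_meta(meta):
    """Summarize an audio level meta, or return None if the buffer has none.

    `meta` is assumed to expose `level` (0..127, in -dBov) and
    `voice_activity`, like GstAudioLevelMeta does.
    """
    if meta is None:
        return None
    level_dbov = -int(meta.level)  # 0 is full scale, -127 is silence
    # Convert the dB value to a linear amplitude ratio relative to overload.
    amplitude = 10.0 ** (level_dbov / 20.0)
    return {
        "voice": bool(meta.voice_activity),
        "level_dbov": level_dbov,
        "amplitude": amplitude,
    }


# Stand-in for the meta attached to one buffer (hypothetical values).
fake_meta = SimpleNamespace(level=20, voice_activity=True)
summary = summarize_level_meta(fake_meta)
print(summary)

# Possible use in an appsink "new-sample" handler (assumes PyGObject with
# the GstAudio 1.0 typelib from GStreamer 1.20 or later; not executed here):
#   sample = appsink.emit("pull-sample")
#   meta = GstAudio.buffer_get_audio_level_meta(sample.get_buffer())
#   print(summarize_level_meta(meta))
```

Since the meta is attached per buffer, a handler like this would run once per buffer rather than once per audio sample.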

Nicolas

