Encoding speech utterances in flac (discontinuous chunks problem)

Alex K anxioz at yahoo.com
Sun Feb 26 13:09:15 PST 2012


Hello, 

I am working on extracting speech from a live microphone stream. The speech must be in FLAC format and stored in memory for further processing. 

Currently I am using pocketsphinx's vader plugin for voice activity detection, and a fakesink to keep the result in memory without writing it to a file. 

The pipeline that I currently have looks like this:
"gconfaudiosrc ! audioconvert ! audioresample ! vader auto-threshold=true ! flacenc ! fakesink"

The vader plugin provides two signals to indicate the start and end of a speech utterance:
1) vader-start
2) vader-stop

I use the fakesink's handoff signal to buffer the incremental results, and I connect to vader's "vader-start" and "vader-stop" signals to flush the buffer and process it further. For now I am just dumping each utterance to its own file so that I can play it back and examine it. 
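
To make the buffering logic concrete, here is a minimal sketch in plain Python, independent of GStreamer. The class name UtteranceBuffer and its method names are hypothetical, made up for illustration only. In the real application they would be called from the fakesink handoff callback and from the vader-start / vader-stop callbacks. It is not my exact code, just the idea.

```python
class UtteranceBuffer:
    """Hypothetical helper: collects chunks and cuts them into utterances."""

    def __init__(self):
        self._chunks = []      # chunks received since the last flush
        self.utterances = []   # one bytes object per finished utterance

    def on_handoff(self, data):
        # Would be called for every buffer that reaches the sink.
        self._chunks.append(bytes(data))

    def _flush(self):
        # Join everything buffered so far and reset the buffer.
        joined = b"".join(self._chunks)
        self._chunks = []
        return joined

    def on_start(self):
        # Start of speech: discard anything left over from before.
        self._flush()

    def on_stop(self):
        # End of speech: keep what was buffered as one utterance.
        utterance = self._flush()
        self.utterances.append(utterance)
        return utterance
```

With raw audio this kind of buffering gives one clean blob per utterance, because every chunk reaches the sink before the stop signal is handled.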

The problem is with flacenc. If I leave flacenc out and just dump the raw audio, the speech utterances are cleanly separated. However, if I add flacenc to the pipeline, the final second of the previous utterance ends up at the start of the next utterance and corrupts the result.

Another problem is that the audio data passed on by the vader plugin arrives in chunks that are discontinuous in terms of timestamps. One speech segment might start at 1s and end at 5s, and the next might start at 15s and end at 18s. flacenc doesn't seem to handle that well, and I'm not sure how to reset the clock at the end of each speech utterance. I tried using audiorate, but that inserted silence at the beginning to compensate for the gap in timestamps. 
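
What I mean by "resetting the clock" is roughly the arithmetic below, sketched in plain Python. The function name rebase_timestamps is hypothetical, and the (timestamp, duration) tuples are just a stand-in for the buffers. I am assuming integer timestamps (seconds here for readability). How to apply this inside the pipeline is exactly the part I don't know, so this only shows the intended result.

```python
def rebase_timestamps(chunks):
    """Return chunks with gaps removed, starting at timestamp 0.

    chunks: list of (timestamp, duration) tuples in arrival order.
    Hypothetical helper for illustration, not a GStreamer API.
    """
    rebased = []
    next_ts = 0
    for _old_ts, duration in chunks:
        # Each chunk starts exactly where the previous one ended.
        rebased.append((next_ts, duration))
        next_ts += duration
    return rebased


# Chunks at 1s and 2s, then a jump to 15s after a silent gap.
chunks = [(1, 1), (2, 1), (15, 2)]
rebased = rebase_timestamps(chunks)
```

Unlike audiorate, this drops the gap instead of filling it with silence, which is what I would want for separate utterances.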

Can anyone help me find a reasonable solution to my problems? 

Thank you in advance,
Alex. 