Encoding speech utterances in flac (discontinuous chunks problem)
ensonic at hora-obscura.de
Tue Feb 28 07:32:23 PST 2012
On 02/26/2012 10:09 PM, Alex K wrote:
> I am working on extracting speech out of a live microphone stream. The
> speech must be in flac format and stored in memory for further processing.
> Currently I am using pocketsphinx's vader plugin to do voice activity
> detection. And a fakesink in order to store the result in memory
> without writing it to file.
> The pipeline that I currently have looks like this:
> "gconfaudiosrc ! audioconvert ! audioresample ! vader
> auto-threshold=true ! flacenc ! fakesink"
> The vader plugin provides two signals to indicate the start and end of
> a speech utterance:
> 1) vader-start
> 2) vader-stop
> I use the fakesink's handoff signal in order to buffer the incremental
> results, and finally I hook up to vader's "vader-stop" and
> "vader-start" signals to flush the buffer and further process it.
What exactly are you doing in the vader-start/stop signal handlers?
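For reference, the buffering scheme described above (collect in handoff, flush on vader-stop) could be modelled roughly as below. This is only a plain-Python sketch, not GStreamer code: the class UtteranceCollector and its method names are hypothetical, and it assumes the handlers are simply called with the raw bytes of each buffer.

```python
class UtteranceCollector:
    """Collects data chunks between a start and a stop notification.

    Hypothetical helper: the method names mirror the signals mentioned
    in the thread, but nothing here is a GStreamer API.
    """

    def __init__(self):
        self._chunks = []      # chunks of the utterance in progress
        self._active = False   # True between start and stop
        self.utterances = []   # finished utterances, one bytes blob each

    def on_vader_start(self):
        # Start from an empty buffer so nothing older leaks in.
        self._chunks = []
        self._active = True

    def on_handoff(self, data):
        # Called once per buffer reaching the sink; keep it only
        # while an utterance is in progress.
        if self._active:
            self._chunks.append(bytes(data))

    def on_vader_stop(self):
        # Join the chunks into one blob and hand it over.
        self._active = False
        blob = b"".join(self._chunks)
        self._chunks = []
        self.utterances.append(blob)
        return blob
```

Note that this sketch assumes every buffer of an utterance has reached the sink before the stop notification arrives. It does not account for any data that an element between vader and the sink may still be holding at that point.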
> Currently I am just dumping the results to different files (each file
> is a different utterance) to play it back to examine it.
> The problem is with flacenc. If I don't use flacenc but rather just
> dump the raw audio, the speech utterances are clearly marked. However
> if I add flacenc to the pipeline, the final 1 second of the previous
> utterance gets put into the start of the next utterance and messes up
> the result.
You might need to mark the first buffer of each new utterance with a
> Another problem is that the audio data passed by the vader plugin is
> in discontinuous (in terms of timestamps) chunks. A speech might start
> at 1s and end at 5s. Then another speech segment might start at 15s
> and end at 18s. The problem is that the flacenc plugin doesn't like
> that and I'm not sure how to reset the clock at the end of each speech
> utterance. I tried using audiorate but that inserted X amount of
> silence at the beginning to compensate for the different timestamps.
Use a smaller buffer size on the capture side or write your own chunking
element. There is also a "removesilence" element and a "cutter" element
which you might want to check.
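To illustrate what such a chunking step would have to do, here is a plain-Python sketch (not a GStreamer element). It assumes buffers are available as (timestamp, duration, data) tuples in integer milliseconds. The function name split_utterances and the max_gap parameter are made up for this example. It starts a new utterance whenever there is a timestamp gap and shifts the timestamps so that each utterance begins at 0.

```python
def split_utterances(buffers, max_gap):
    """Group (timestamp, duration, data) tuples into utterances.

    A new utterance begins whenever a buffer starts more than max_gap
    after the previous buffer ended. Timestamps inside each utterance
    are shifted so that the utterance starts at 0.
    """
    utterances = []
    current = []
    base = 0
    prev_end = None
    for ts, dur, data in buffers:
        if prev_end is None or ts - prev_end > max_gap:
            # Gap detected (or very first buffer): close the previous
            # utterance and rebase timestamps on this buffer.
            if current:
                utterances.append(current)
            current = []
            base = ts
        current.append((ts - base, dur, data))
        prev_end = ts + dur
    if current:
        utterances.append(current)
    return utterances
```

With the timings from the example above (speech from 1s to 5s, then from 15s to 18s), this would give two utterances, each with timestamps starting at 0 and no silence inserted between them.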
> Can anyone help me find a reasonable solution to my problems?
> Thank you in advance,
> gstreamer-devel mailing list
> gstreamer-devel at lists.freedesktop.org