[gst-devel] TTS over GStreamer framework?

Mon May 19 05:57:02 CEST 2003

On Wed, 14 May 2003 16:10:02 +0900
Kang Jeong-Hee <Keizi at mail.co.kr> wrote:

> Hi.
> 
> I've been interested in Text To Speech for a long time,
> though I know almost nothing about the theory behind it :P
> 
> Almost every TTS system I have seen builds sound fonts
> for each vowel and consonant, and plays them back for the corresponding characters.

I would personally be really interested in seeing this, since I've thought a lot about a TTS plugin (though I'm not very interested in the details of TTS itself). One use could be to read subtitles aloud. Another could be to sing the lyrics of a MIDI file, if the element also accepted one of the MIDI channels as its pitch input.

> I suppose generating the voice source at runtime would be much simpler,
> with the related parts of the body, such as the throat, mouth, lips and nose,
> each made into a module that simulates the effect of the real body part.
> 
> Then a [ voicesrc sex=female ! throat style=husky ! tongue vibration ! lipsink ]
> pipeline would produce a sound at a given Hz, each module would change the waveform,
> and finally lipsink would output the finished voice.

I do not think you should make any source or sink elements; it would be better if other elements took care of input and output. Like this:

  textsrc ! voice ! throat ! tongue ! lips ! audiosink
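
To make the chaining idea concrete, here is a minimal sketch in plain Python (not GStreamer code, and not real speech synthesis). The stage names follow the pipeline above, but their bodies, the sample rate and every parameter are assumptions of mine for illustration only. The textsrc step is left out, so the pitch is passed in directly:

```python
RATE = 8000  # samples per second (assumed for this sketch)

def voice(pitch_hz, n_samples):
    """Hypothetical source stage: a raw sawtooth buzz at the given pitch."""
    period = RATE / pitch_hz  # samples per cycle
    return [2.0 * ((i % period) / period) - 1.0 for i in range(n_samples)]

def throat(samples, smoothing=0.9):
    """Hypothetical filter stage: a one-pole low-pass that dulls the buzz."""
    out, prev = [], 0.0
    for s in samples:
        prev = smoothing * prev + (1.0 - smoothing) * s
        out.append(prev)
    return out

def lips(samples, gain=0.5):
    """Hypothetical final stage: a plain volume change."""
    return [gain * s for s in samples]

def run_pipeline(data, *stages):
    """Feed the data through each stage in order, like 'a ! b ! c'."""
    for stage in stages:
        data = stage(data)
    return data

# A tenth of a second of a 200 Hz buzz, shaped by two stages.
samples = run_pipeline(voice(200.0, RATE // 10), throat, lips)
```

Each stage only sees samples in and samples out, which is why the file reading and sound card output can be left to other elements.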

> What do you think? Is this cool, and worth attempting?
> If so, what is the best way to write those modules
> when I know almost nothing about the internals of the GStreamer plugin system?

Honestly, I don't see the need for four separate elements, but I guess that's because I don't know the details of TTS. Some alternative solutions that come to mind:

  ... ! tts sex=female throat=husky ! ...

or:

  ... ! female_voice ! husky_throat ! tongue ! lips ! ...
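
The difference between the two alternatives is mostly about where the configuration lives. A rough sketch in plain Python (again not GStreamer code): `scale` is just a stand-in for real processing, and the preset tables and their values are made up for illustration:

```python
from functools import partial, reduce

def scale(samples, gain):
    """Stand-in for a processing stage: multiply every sample by gain."""
    return [gain * s for s in samples]

# Illustrative presets only; the values are invented for this sketch.
VOICE_GAIN = {"female": 1.0, "male": 0.8}
THROAT_GAIN = {"husky": 0.5, "clear": 1.0}

def chain(samples, *stages):
    """Apply the stages left to right, like 'a ! b ! c'."""
    return reduce(lambda data, stage: stage(data), stages, samples)

# Alternative 1: one element, configured through properties.
def tts(samples, sex="female", throat="husky"):
    return chain(samples,
                 partial(scale, gain=VOICE_GAIN[sex]),
                 partial(scale, gain=THROAT_GAIN[throat]))

# Alternative 2: one preset element per choice, chained by the user.
female_voice = partial(scale, gain=VOICE_GAIN["female"])
husky_throat = partial(scale, gain=THROAT_GAIN["husky"])

data = [0.0, 0.5, -1.0]
one_element = tts(data, sex="female", throat="husky")
many_elements = chain(data, female_voice, husky_throat)
```

Both give the same result here. The first keeps the pipeline short, while the second lets the user reorder or replace individual stages.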