22.5 azspeech synthesize

20220314

The synthesize command generates spoken audio, in a human-sounding voice, from the supplied text and plays it on the system’s default audio output device. With -o or --output, a wav file can be specified so that the audio is saved to that file rather than played through the speakers.

$ ml synthesize azspeech [sentence]
     -i <file.txt> --input=<file.txt>   Text to be spoken.
     -l <lang>     --lang=<lang>        Target language.
     -o <file.wav> --output=<file.wav>  Save synthesized audio to file.
     -v <voice>    --voice=<voice>      Voice to speak with.

The simplest usage is to synthesise the sentence provided on the command line:

ml synthesize azspeech Welcome my son, welcome to the machine.

The spoken language can be chosen, though in this example the English words will be pronounced as if they were French:

ml synthesize azspeech --lang=fr-FR "It's alright, we know where you've been."

A specific voice, here one with an Australian accent, can be selected with --voice:

ml synthesize azspeech --voice=en-AU-NatashaNeural You brought a guitar to punish your ma.
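
To compare voices side by side, the same sentence can be saved once per voice. The following is a minimal sketch, assuming that --voice and --output can be combined with a sentence given on the command line, which the usage summary suggests but this section does not demonstrate. The function name compare_voices is hypothetical and not part of MLHub:

```shell
# Save the same sentence once per voice for side-by-side comparison.
# Hypothetical helper, not part of MLHub.
# Usage: compare_voices "sentence" voice1 [voice2 ...]
compare_voices() {
    sentence=$1
    shift
    for voice in "$@"; do
        # One wav file per voice, named after the voice itself.
        ml synthesize azspeech --voice="$voice" --output="$voice.wav" "$sentence"
    done
}
```

For example, compare_voices "You brought a guitar to punish your ma." en-AU-NatashaNeural fr-FR-DeniseNeural would be expected to write en-AU-NatashaNeural.wav and fr-FR-DeniseNeural.wav to the current directory.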

The command can be part of a pipeline:

echo "It's alright, we told you what to dream" | ml synthesize azspeech

The text can be sourced from a file:

ml synthesize azspeech --input=short.txt
ml synthesize azspeech --lang=de-DE --input=short.txt
ml synthesize azspeech --voice=fr-FR-DeniseNeural --input=short.txt
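
The --input and --output options can plausibly be combined to convert a set of text files to audio files without playing them. The following is a sketch under that assumption; the function name texts_to_wav is hypothetical, and the choice to name each wav file after its text file is only one possible convention:

```shell
# Convert every .txt file in a directory to a .wav file of the same name.
# Hypothetical helper, not part of MLHub.
# Usage: texts_to_wav [directory]   (defaults to the current directory)
texts_to_wav() {
    dir=${1:-.}
    for txt in "$dir"/*.txt; do
        # Skip the unexpanded pattern when the directory has no .txt files.
        [ -e "$txt" ] || continue
        # ${txt%.txt} strips the .txt suffix so short.txt becomes short.wav.
        ml synthesize azspeech --input="$txt" --output="${txt%.txt}.wav"
    done
}
```

Running texts_to_wav in a directory containing short.txt would then be expected to produce short.wav alongside it. A --lang or --voice option could be added to the command inside the loop in the same way as in the examples above.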

The supported languages and their locale codes (BCP-47) are listed in the Azure documentation.

Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0