Linux, known for its robustness and flexibility, also offers various tools for text-to-speech (TTS) conversion that can be utilized right from the command line. This functionality is not only essential for accessibility but also for developers who want to integrate speech capabilities into their applications or for users who prefer auditory feedback.
The Linux ecosystem provides several tools for text-to-speech conversion, ranging from lightweight synthesizers to more configurable speech synthesis systems. Whether you are a user who needs auditory feedback or a developer integrating speech into an application, there is a command-line tool for the job.
How to Get Text-to-Speech Output Using the Command Line
Text-to-speech technology on Linux is a testament to the platform’s versatility and commitment to accessibility. With continuous developments and an active community, users can expect even more advanced features and tools in the future.
To get text-to-speech output from the command line on Linux, you can use tools such as espeak, festival, flite, and mbrola. These tools convert text into audible speech, which is particularly useful for people with visual impairments or for those who prefer auditory learning.
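Before choosing a method, it can help to see which of these engines are already installed. The following is a minimal sketch: the helper name list_tts_engines is made up for illustration, and the binary names are assumed to match the tools covered in this guide.

```shell
#!/bin/sh
# Hypothetical helper: report which TTS binaries from this guide are on PATH.
list_tts_engines() {
    for tool in espeak festival flite mbrola; do
        # command -v succeeds only if the shell can resolve the name
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: installed"
        else
            echo "$tool: not found"
        fi
    done
}

list_tts_engines
```

The script only inspects PATH; it does not install or run any of the engines.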
Method 1: Using eSpeak
eSpeak is a compact command-line text-to-speech (TTS) synthesizer that is useful for accessibility, language learning, or simply as a productivity aid. It is available for Linux and uses a “formant synthesis” method, which allows many languages to be provided in a small size.
Installation of the eSpeak Utility
Installing eSpeak is straightforward. Use the command that matches your distribution:
sudo apt install espeak       # Debian/Ubuntu
sudo dnf install espeak       # Fedora and other Red Hat-based systems
sudo pacman -S espeak         # Arch Linux
Some distributions package the maintained fork instead, under the name espeak-ng; try that name if the espeak package is not found.
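If you are not sure which package manager your system uses, a small script can work it out. This is a sketch under assumptions: the function name install_cmd_for is hypothetical, only the three package managers mentioned in this guide are handled, and the script prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: print the install command for a given package manager.
install_cmd_for() {
    manager=$1
    package=$2
    case "$manager" in
        apt)    echo "sudo apt install $package" ;;
        dnf)    echo "sudo dnf install $package" ;;
        pacman) echo "sudo pacman -S $package" ;;
        *)      echo "unsupported package manager: $manager" >&2; return 1 ;;
    esac
}

# Report the command for the first package manager found on PATH.
# Nothing is installed here; the command is only printed.
for manager in apt dnf pacman; do
    if command -v "$manager" >/dev/null 2>&1; then
        install_cmd_for "$manager" espeak
        break
    fi
done
```

Printing the command first lets you review it before running anything with sudo.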
Usage of the eSpeak Utility
To use eSpeak, you simply need to enter espeak in the terminal. It will wait for your input, and you can start typing the text you wish to convert to speech. Pressing enter will cause eSpeak to read the text aloud. You can continue adding text line by line to hear it spoken.
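You can also pass the text directly as an argument, for example espeak "Your text here". To write the speech to a WAV file instead of playing it, add the -w flag followed by the desired filename, for example espeak -w output.wav "Your text here". Other options adjust how the text is spoken:
espeak -s 200 "Faster speech"                  # speed in words per minute (default 175)
espeak -v en-us "Speak in American English"    # select a voice
espeak -p 80 "Adjust pitch"                    # pitch from 0 to 99 (default 50)
espeak -a 150 "Change volume"                  # amplitude from 0 to 200 (default 100)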
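These options can be combined in a small wrapper. The sketch below is one possible arrangement, not part of eSpeak itself: the function names run and say, the DRY_RUN variable, and the chosen voice and speed are all assumptions made for illustration. Only the -v, -s, and -w flags come from eSpeak.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN is set
# (DRY_RUN is a convention of this sketch, not an eSpeak feature).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# say TEXT [WAVFILE]: speak TEXT aloud, or write it to WAVFILE when one is given.
say() {
    text=$1
    wav=${2:-}
    if [ -n "$wav" ]; then
        run espeak -v en-us -s 160 -w "$wav" "$text"
    else
        run espeak -v en-us -s 160 "$text"
    fi
}

# Preview the command without producing any sound:
DRY_RUN=1 say "Build finished" build.wav
```

Without DRY_RUN set, say "Build finished" speaks the text, and say "Build finished" build.wav writes it to build.wav instead.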
Note: eSpeak is a testament to the power of open-source software, providing a valuable tool that can be used in a multitude of ways.
Method 2: Using festival
The Festival Speech Synthesis System is a versatile software tool for text-to-speech synthesis on Linux, allowing users to convert text into audible speech. Festival also allows users to change the default voice. To do this, one must enter the Festival shell using the festival command and then list all available voices with (voice.list).
Installation of the Festival Utility
To begin using Festival, first make sure it is installed on your Linux distribution. Use the command that matches your system:
sudo apt install festival     # Debian/Ubuntu
sudo dnf install festival     # Fedora
sudo pacman -S festival       # Arch
Usage of the Festival Utility
For a quick start, launch the interactive Festival shell and enter a SayText expression at the festival> prompt:
festival
(SayText "Hello LinuxWorld")
To have Festival read a text file aloud, the following command can be utilized:
festival --tts [file path]
For those who wish to convert text into an audio file, the text2wave command comes in handy. This command reads a file and writes the output to an audio file, which can be achieved with:
text2wave -o [output file] -eval '(voice_name)' [input file]
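To convert several text files in one go, text2wave can be called in a loop. This is a minimal sketch: the function names wav_name_for and convert_dir and the ./notes directory are hypothetical, and the default Festival voice is assumed.

```shell
#!/bin/sh
# Hypothetical helper: map notes/intro.txt to notes/intro.wav.
wav_name_for() {
    printf '%s\n' "${1%.txt}.wav"
}

# Convert every .txt file in the given directory with text2wave.
convert_dir() {
    dir=$1
    for txt in "$dir"/*.txt; do
        # Skip the literal pattern when no .txt file matches
        [ -e "$txt" ] || continue
        text2wave -o "$(wav_name_for "$txt")" "$txt"
    done
}

# Example (requires Festival's text2wave on PATH):
# convert_dir ./notes
```

Each output file is written next to its source file, with the .txt extension replaced by .wav.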
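Customize Output with Options:
Festival has no dedicated command-line flags for voice or speed. These are set with Scheme expressions, which festival evaluates when they are passed as arguments in batch mode (-b):
festival -b '(voice_kal_diphone)' '(SayText "This is a text-to-speech example")'    # select a voice (assumes kal_diphone is installed)
festival -b "(Parameter.set 'Duration_Stretch 0.8)" '(SayText "This is a text-to-speech example")'    # faster speech; values above 1 slow it down
text2wave -scale 2 -o [output file] [input file]    # change volume when writing an audio file
Pitch is controlled through voice-specific intonation parameters in Scheme rather than through a single option.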
Note: In Festival's Scheme expressions, the text must be enclosed in double quotes and the surrounding parentheses are required.
Method 3: Using flite
Flite, or Festival Lite, is a small, fast, and portable program for text-to-speech synthesis on Linux systems. Let’s install and use the flite utility:
Installation of the flite Utility
Before using Flite, you need to install it. Use the command that matches your distribution:
sudo apt install flite        # Debian/Ubuntu
sudo dnf install flite        # Fedora
sudo pacman -S flite          # Arch
Usage of the flite Utility
To convert text to speech, you can use the -t option followed by the text you want to speak:
flite -t "This is a text-to-speech example"
Customize Output with Options:
flite -voice slt --setf duration_stretch=0.8 -t "This is a text-to-speech example"    # slt voice, faster speech (values below 1 speed it up)
Run flite -lv to list the voices available in your build.
Saving to an Audio File
For saving the speech output as an audio file, utilize the -o option:
flite -t "Your text here" -o output.wav
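Because flite starts quickly, it works well for spoken notifications, for example announcing when a long-running command finishes. The sketch below is one way to do that: the function name announce and the SPEAK override are assumptions made for illustration, and only flite -t comes from the tool itself.

```shell
#!/bin/sh
# announce CMD...: run a command, then speak whether it succeeded.
# SPEAK is a sketch-only override; it defaults to "flite -t".
announce() {
    if "$@"; then
        status=0
        message="Command succeeded"
    else
        status=$?
        message="Command failed with status $status"
    fi
    # Left unquoted on purpose so "flite -t" splits into command and flag
    ${SPEAK:-flite -t} "$message"
    # Pass the original command's exit status through to the caller
    return "$status"
}

# Example (requires flite on PATH):
# announce make
```

Setting SPEAK=echo prints the message instead of speaking it, which is handy for trying the function on a machine without audio.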
Method 4: Using mbrola
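Linux offers a variety of tools for Text-to-Speech (TTS) synthesis, one of which is MBROLA. It is a command-line diphone synthesizer. Unlike the tools above, it does not read plain text: it turns a list of phonemes into audio, so it is normally paired with a text front end such as eSpeak. The combination can produce a more natural-sounding voice than eSpeak alone.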
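Installation of the mbrola Utility
First, you need to install the MBROLA engine and at least one voice database. Package names and availability vary by distribution, and the packages may live in a non-free or community repository:
sudo apt install mbrola mbrola-us1    # Debian/Ubuntu (engine plus the us1 American English voice)
sudo dnf install mbrola               # Fedora
sudo pacman -S mbrola                 # Arch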
Usage of the mbrola Utility
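MBROLA is a powerful tool for TTS on Linux, and when combined with eSpeak, it provides a high-quality speech synthesis system. To see the options the engine accepts, run:
mbrola -h
Customize output with options:
With an MBROLA voice installed, the simplest route is to let eSpeak drive it by selecting an mb- voice. When running mbrola directly on a phoneme (.pho) file, the -t and -f options scale duration and pitch:
espeak -v mb-us1 "This is a text-to-speech example"           # speak using the MBROLA us1 voice
espeak -v mb-us1 -s 200 "This is a text-to-speech example"    # faster speech
mbrola -t 0.8 -f 1.2 [voice database] [input .pho file] [output file]    # time ratio 0.8 (faster), frequency ratio 1.2 (higher pitch)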
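MBROLA itself reads phoneme data rather than plain text, so writing a WAV file usually means piping eSpeak's phoneme output into it. The following is a sketch under stated assumptions: the function name mbrola_wav and the MBROLA_VOICE_DB variable are hypothetical, and the default path /usr/share/mbrola/us1/us1 is a guess at a Debian-style location that may differ on your system. It relies on eSpeak's -q (no audio) and --pho (print MBROLA phoneme data) options and on mbrola accepting - for standard input.

```shell
#!/bin/sh
# mbrola_wav TEXT OUTFILE: synthesize TEXT to OUTFILE with eSpeak + MBROLA.
# MBROLA_VOICE_DB is a sketch-only override; the default path is an assumption.
mbrola_wav() {
    text=$1
    out=$2
    voice_db=${MBROLA_VOICE_DB:-/usr/share/mbrola/us1/us1}

    # Fail early with a clear message if the voice database is missing
    if [ ! -r "$voice_db" ]; then
        echo "MBROLA voice database not found: $voice_db" >&2
        return 1
    fi

    # eSpeak turns text into phonemes; mbrola (-e: ignore unknown diphones)
    # reads them from stdin and writes the audio file
    espeak -v mb-us1 -q --pho "$text" | mbrola -e "$voice_db" - "$out"
}

# Example (requires espeak, mbrola, and the us1 voice):
# mbrola_wav "This is a text-to-speech example" example.wav
```

Checking for the voice database first avoids a confusing error from mbrola when the path is wrong.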
That is all from the guide.
Conclusion
One of the most popular TTS tools on Linux is eSpeak. It is a compact, open-source speech synthesizer that supports numerous languages and comes with a variety of features such as pitch adjustment, word gap control, and amplitude control.
Whether you’re a developer looking to integrate speech capabilities into your application, a user in need of assistive technology, or a language enthusiast aiming to perfect your pronunciation, the other methods covered here (festival, flite, and mbrola) offer alternatives that are both effective and accessible.