CoCalc -- Real-time-voice-translator.ipynb

GitHub Repository: axsatech/Python-App
Path: blob/main/Real-time-voice-translator/Real-time-voice-translator.ipynb

Kernel: Python 3

Real Time Voice Translator

A real-time voice translator that takes spoken input, translates it, and speaks the translation back. It is built with the googletrans library, which wraps Google Translate, and Python's speech_recognition library. The recognized text is translated from one language to another, converted to speech with gTTS, and saved as an mp3 file. The playsound module then plays the generated mp3 file.

Ref: https://www.geeksforgeeks.org/create-a-real-time-voice-translator-using-python/

Modules needed

playsound: Plays sound files in Python. pip3 install playsound

SpeechRecognition: A library that lets Python recognize spoken input. pip3 install SpeechRecognition

googletrans: A free Python library that implements the Google Translate API. pip3 install googletrans

gTTS: Google Text-to-Speech. It supports several languages, including English, Hindi, Tamil, French and German. pip3 install gTTS, then pip3 install gTTS-token
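Before running the cells, it can help to confirm that the four packages are importable. The sketch below uses importlib.util.find_spec from the standard library. The helper name missing_modules is hypothetical, and the import names are the ones used in the import cell of this notebook (they differ from some of the pip package names).

```python
from importlib.util import find_spec

# Import names used by this notebook (not always the same as the pip names).
REQUIRED_MODULES = ["playsound", "speech_recognition", "googletrans", "gtts"]


def missing_modules(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

An empty result only means the modules can be located; it does not check versions or whether a microphone backend is available.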

Step 1: Importing Necessary Modules

In [9]:

# Importing necessary modules required
from playsound import playsound
import speech_recognition as sr
from googletrans import Translator, LANGUAGES
from gtts import gTTS

Step 2: All language codes mapped to their language names

In [8]:

from googletrans import LANGUAGES

LANGUAGES

Out[8]:

{'af': 'afrikaans',
 'sq': 'albanian',
 'am': 'amharic',
 'ar': 'arabic',
 'hy': 'armenian',
 'az': 'azerbaijani',
 'eu': 'basque',
 'be': 'belarusian',
 'bn': 'bengali',
 'bs': 'bosnian',
 'bg': 'bulgarian',
 'ca': 'catalan',
 'ceb': 'cebuano',
 'ny': 'chichewa',
 'zh-cn': 'chinese (simplified)',
 'zh-tw': 'chinese (traditional)',
 'co': 'corsican',
 'hr': 'croatian',
 'cs': 'czech',
 'da': 'danish',
 'nl': 'dutch',
 'en': 'english',
 'eo': 'esperanto',
 'et': 'estonian',
 'tl': 'filipino',
 'fi': 'finnish',
 'fr': 'french',
 'fy': 'frisian',
 'gl': 'galician',
 'ka': 'georgian',
 'de': 'german',
 'el': 'greek',
 'gu': 'gujarati',
 'ht': 'haitian creole',
 'ha': 'hausa',
 'haw': 'hawaiian',
 'iw': 'hebrew',
 'he': 'hebrew',
 'hi': 'hindi',
 'hmn': 'hmong',
 'hu': 'hungarian',
 'is': 'icelandic',
 'ig': 'igbo',
 'id': 'indonesian',
 'ga': 'irish',
 'it': 'italian',
 'ja': 'japanese',
 'jw': 'javanese',
 'kn': 'kannada',
 'kk': 'kazakh',
 'km': 'khmer',
 'ko': 'korean',
 'ku': 'kurdish (kurmanji)',
 'ky': 'kyrgyz',
 'lo': 'lao',
 'la': 'latin',
 'lv': 'latvian',
 'lt': 'lithuanian',
 'lb': 'luxembourgish',
 'mk': 'macedonian',
 'mg': 'malagasy',
 'ms': 'malay',
 'ml': 'malayalam',
 'mt': 'maltese',
 'mi': 'maori',
 'mr': 'marathi',
 'mn': 'mongolian',
 'my': 'myanmar (burmese)',
 'ne': 'nepali',
 'no': 'norwegian',
 'or': 'odia',
 'ps': 'pashto',
 'fa': 'persian',
 'pl': 'polish',
 'pt': 'portuguese',
 'pa': 'punjabi',
 'ro': 'romanian',
 'ru': 'russian',
 'sm': 'samoan',
 'gd': 'scots gaelic',
 'sr': 'serbian',
 'st': 'sesotho',
 'sn': 'shona',
 'sd': 'sindhi',
 'si': 'sinhala',
 'sk': 'slovak',
 'sl': 'slovenian',
 'so': 'somali',
 'es': 'spanish',
 'su': 'sundanese',
 'sw': 'swahili',
 'sv': 'swedish',
 'tg': 'tajik',
 'ta': 'tamil',
 'te': 'telugu',
 'th': 'thai',
 'tr': 'turkish',
 'uk': 'ukrainian',
 'ur': 'urdu',
 'ug': 'uyghur',
 'uz': 'uzbek',
 'vi': 'vietnamese',
 'cy': 'welsh',
 'xh': 'xhosa',
 'yi': 'yiddish',
 'yo': 'yoruba',
 'zu': 'zulu'}

Step 3: Taking voice commands from the user

Guide to speech recognition: https://realpython.com/python-speech-recognition/

Start the JACK server: $ qjackctl

In [ ]:

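import speech_recognition as sr

r = sr.Recognizer()
mic = sr.Microphone()
# Record a single phrase from the default microphone
with mic as source:
    audio = r.listen(source)

# Send the recording to Google's recognizer and show the recognized text
print(r.recognize_google(audio))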

In [ ]:

import speech_recognition as sr

# Capture Voice through microphone
def takecommand():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("listening.....")
        r.pause_threshold = 1  # seconds of silence that end a phrase
        audio = r.listen(source)
        
    try:
        print("Recognizing.....")
        query = r.recognize_google(audio, language='en-in')
        print(f"user said {query}\n")
    except Exception:  # speech not understood, or the service was unreachable
        print("say that again please.....")
        return "None"
    return query

takecommand()

listening.....

Step 4: Taking voice input from the user

In [6]:

# Taking voice input from the user
query = takecommand()
while query == "None":
    query = takecommand()

Out[6]:

listening.....
Recognizing.....
say that again please.....
listening.....
Recognizing.....
user said hello
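The loop above keeps retrying for as long as recognition fails. One possible variation, sketched here, caps the number of attempts. The helper listen_with_retries and its max_attempts parameter are hypothetical; listen_fn stands in for takecommand, and the sentinel is the string "None" that takecommand returns on failure.

```python
def listen_with_retries(listen_fn, max_attempts=3, sentinel="None"):
    """Call listen_fn until it returns something other than sentinel.

    Returns the recognized text, or None (the object, not the string)
    if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        result = listen_fn()
        if result != sentinel:
            return result
        print(f"attempt {attempt} of {max_attempts} failed")
    return None


if __name__ == "__main__":
    # Stand-in for takecommand(): fails once, then succeeds.
    replies = iter(["None", "hello"])
    print(listen_with_retries(lambda: next(replies)))
```

In the notebook this would be called as listen_with_retries(takecommand), with the caller deciding what to do when None comes back.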

Step 5: Taking the destination language from the user and mapping it to a language code

In [ ]:

def destination_language():
    print("Enter the language in which you want to convert:")
    to_lang = takecommand()
    while to_lang == "None":
        to_lang = takecommand()
    to_lang = to_lang.casefold()
    return to_lang
  
to_lang = destination_language()
  
# Mapping it with the code
langd = {y: x for x, y in LANGUAGES.items()}
to_lang = langd.get(to_lang)
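Two details of this lookup are easy to miss: langd.get returns None when the spoken name is not a key, and inverting LANGUAGES keeps only one code per name (both 'iw' and 'he' map to 'hebrew', and the later entry wins). The sketch below shows both on a small hand-copied subset of the mapping. SAMPLE_LANGUAGES, resolve_language and suggest_languages are hypothetical names, and the near-match suggestions come from difflib in the standard library, not from googletrans.

```python
from difflib import get_close_matches

# A small subset of the LANGUAGES mapping shown in Step 2.
SAMPLE_LANGUAGES = {
    "fr": "french",
    "de": "german",
    "iw": "hebrew",
    "he": "hebrew",
    "hi": "hindi",
}


def resolve_language(spoken_name, languages):
    """Return the code for a language name, or None if it is unknown."""
    # When two codes share a name, the later one in the dict wins.
    name_to_code = {name: code for code, name in languages.items()}
    return name_to_code.get(spoken_name.strip().casefold())


def suggest_languages(spoken_name, languages, limit=3):
    """Return known language names that look similar to spoken_name."""
    names = sorted(set(languages.values()))
    return get_close_matches(spoken_name.strip().casefold(), names, n=limit)


if __name__ == "__main__":
    print(resolve_language("French", SAMPLE_LANGUAGES))   # fr
    print(resolve_language("Hebrew", SAMPLE_LANGUAGES))   # he
    print(resolve_language("klingon", SAMPLE_LANGUAGES))  # None
    print(suggest_languages("frnch", SAMPLE_LANGUAGES))
```

Checking for None before translating avoids passing an empty destination to the translator, and the suggestions give the user something to retry with when the recognizer mishears a language name.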

Step 6: Translating from src to dest

In [10]:

translator = Translator()
# Translating from src to dest
text_to_translate = translator.translate(query, dest=to_lang)
text = text_to_translate.text

Step 7: Saving the translated speech and deleting the file after playing

In [ ]:

import os

# Use Google Text-to-Speech (gTTS) to speak the translated text in the
# destination language stored in to_lang. slow=False selects the normal
# speaking speed.
speak = gTTS(text=text, lang=to_lang, slow=False)

# Save the translated speech to captured_voice.mp3, play it,
# then delete the file
speak.save("captured_voice.mp3")
playsound('captured_voice.mp3')
os.remove('captured_voice.mp3')
print(text)
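Writing to a fixed file name in the working directory can leave a stale mp3 behind if playback raises an error. A possible alternative, using only the standard library, is to write to a temporary file and delete it in a finally block. The helper play_and_cleanup is hypothetical; in this notebook save_fn would be speak.save and play_fn would be playsound.

```python
import os
import tempfile


def play_and_cleanup(save_fn, play_fn, suffix=".mp3"):
    """Save audio to a temporary file, play it, and always delete the file.

    save_fn and play_fn each take a single file path argument.
    Returns the (now deleted) path, mainly for logging.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # only the path is needed; save_fn opens the file itself
    try:
        save_fn(path)
        play_fn(path)
    finally:
        # Runs even if saving or playback raised an exception.
        if os.path.exists(path):
            os.remove(path)
    return path


if __name__ == "__main__":
    # Stand-ins for speak.save and playsound so the sketch runs anywhere.
    def fake_save(path):
        with open(path, "wb") as handle:
            handle.write(b"not real audio")

    def fake_play(path):
        print("playing", os.path.getsize(path), "bytes")

    play_and_cleanup(fake_save, fake_play)
```

With the objects from the cell above, the call would be play_and_cleanup(speak.save, playsound).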

Implementation Code:

In [ ]:

# Importing necessary modules required
import os

from playsound import playsound
import speech_recognition as sr
from googletrans import Translator, LANGUAGES
from gtts import gTTS

# Capture Voice through microphone
def takecommand():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("listening.....")
        r.pause_threshold = 1  # seconds of silence that end a phrase
        audio = r.listen(source)
        
    try:
        print("Recognizing.....")
        query = r.recognize_google(audio, language='en-in')
        print(f"user said {query}\n")
    except Exception:
        print("say that again please.....")
        return "None"
    return query

# Taking voice input from the user
query = takecommand()
while query == "None":
    query = takecommand()
    
def destination_language():
    print("Enter the language in which you want to convert:")
    to_lang = takecommand()
    while to_lang == "None":
        to_lang = takecommand()
    to_lang = to_lang.casefold()
    return to_lang
to_lang = destination_language()
  
# Mapping it with the code
langd = {y: x for x, y in LANGUAGES.items()}
to_lang = langd.get(to_lang)

# Translating from src to dest
translator = Translator()
text_to_translate = translator.translate(query, dest=to_lang)
text = text_to_translate.text

# Speak the translation, then delete the saved file
speak = gTTS(text=text, lang=to_lang, slow=False)
speak.save("captured_voice.mp3")
playsound('captured_voice.mp3')
os.remove('captured_voice.mp3')
print(text)

listening.....
