Lesson 15: Speech Recognition

Lesson Description

How speech recognition works and using AI blocks in PictoBlox to convert speech into text. Making a virtual assistant in PictoBlox that can recognize a command and play the requested song.

Concepts: Artificial Intelligence
Time: 80
Number of Activities: 1
Difficulty Level: Beginner


Speech Recognition

Speech recognition is the ability of a machine to identify words and phrases in spoken language and convert them to a machine-readable format.

How Do Humans Learn a Language?

From the time we are born, we hear words and sounds around us. Even before we can speak, we start responding to some of the words we hear, like Mama, Dada, Yes, and No.

Our brain tries to find patterns to differentiate various sounds and words and categorize them. It may seem as though humans are pre-programmed to listen and understand, but that is not the case. We have been trained to develop this ability.

Speech recognition technology has been developed along the same lines: computers are trained to find patterns in sounds and words in a similar way.

How Does Alexa Work?

Alexa, Amazon’s virtual assistant, uses natural language processing to break spoken language down into sounds, words, and concepts. Here’s how it works:

Alexa first records your speech. Then, this recording is sent to Amazon’s servers to be analyzed more efficiently.
Amazon breaks down the recording into individual sounds. It then consults a database containing various words’ pronunciations to find which words most closely correspond to the combination of individual sounds.
It then identifies keywords to make sense of the task and carries out the corresponding function. For example, if Alexa notices words like “weather” or “temperature”, it will open the weather app.
Amazon’s servers send the information back to your device. If Alexa needs to say anything back to you, it will go through the same process described above, but in reverse order.
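
The middle two steps above can be sketched in a few lines of Python. This is only a toy illustration and not how Amazon's servers actually work: the sound symbols, the tiny pronunciation table, and the names `PRONUNCIATIONS`, `KEYWORD_ACTIONS`, `sounds_to_words`, and `choose_action` are all made up for this example. It also looks up exact matches only, whereas a real system searches for the closest match.

```python
# Toy pronunciation "database": groups of sounds mapped to words.
# These entries are invented for illustration only.
PRONUNCIATIONS = {
    ("w", "eh", "dh", "er"): "weather",
    ("t", "uh", "d", "ey"): "today",
    ("t", "ay", "m"): "time",
}

# Keywords mapped to the task they should trigger (also invented).
KEYWORD_ACTIONS = {
    "weather": "open weather app",
    "temperature": "open weather app",
    "time": "tell the time",
}


def sounds_to_words(sound_groups):
    """Look up each group of sounds and keep the words that are found."""
    words = []
    for group in sound_groups:
        word = PRONUNCIATIONS.get(tuple(group))
        if word is not None:
            words.append(word)
    return words


def choose_action(words):
    """Return the action for the first keyword found, or None."""
    for word in words:
        if word in KEYWORD_ACTIONS:
            return KEYWORD_ACTIONS[word]
    return None


# A recording that has already been split into groups of sounds.
heard = [["w", "eh", "dh", "er"], ["t", "uh", "d", "ey"]]
words = sounds_to_words(heard)
print(words)                 # ['weather', 'today']
print(choose_action(words))  # open weather app
```

Here “today” is recognized as a word but is not a keyword, so only “weather” decides which task is carried out.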

Activity 1: Make Your Own Alexa!

In this project, we will make our own AI personal assistant like Alexa. We will be making a script that will recognize our voice command and analyze it to play the Mario theme song or the Spider-Man theme song. If the command is not recognized, it will say that it didn’t understand the command.

Coding Steps

Follow the steps below:

Click the following link (https://bit.ly/ALEXASPEECH) and open the PictoBlox file.
Click the ~~Tobi~~ sprite and go to ~~Sprite Settings~~. Click on the ~~Sound~~ tab. You will find two audio files named Spiderman and Mario.
Go back to the editing area. Click the Add extension button and add the ~~Artificial Intelligence~~ extension.
Click the Add extension button and add the ~~Text to Speech~~ extension.
Add a when flag clicked block into the scripting area.
Place a recognize speech for () s in () block from the ~~Artificial Intelligence~~ extension below the when flag clicked block and change the time to 4 seconds. The block records audio for the specified time and sends it to the cloud, where it is converted into text.
Now, place an if () else block below the recognize speech for () s in () block.
In the condition of the if () else block, add a () contains ()? block from the ~~Operators~~ palette. In the first argument, add a speech recognition result block, and in the second write “mario”. So, if the recognized text contains the word “mario”, the blocks in the if arm will run.
Add a speak () block from the ~~Text to Speech~~ extension under the if arm and write the message “Playing Mario Song!”.
Next, snap a play sound () until done block below the speak () block and select Mario.
Duplicate the if () else block and snap the copy under the else arm of the original block.
Change “mario” to “spiderman” in the condition of the duplicated if () else block.
Change the message in the speak block to “Playing Spiderman Song!”.
Change the sound to Spiderman.
Finally, under the else arm of the duplicated block, add a speak () block and write “Sorry, I am unable to understand the command”.
Click the green flag to start the script. The Recognition window will open. Say the command to execute the project.
Save the file as ~~Alexa~~.
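
The decision logic built from the blocks in the steps above can also be written out as text. The Python sketch below is not PictoBlox code and does not record or play anything: it assumes the recognized speech is already available as a string, and the function name `respond_to_command` is made up for this example. It returns the message to speak and the name of the sound to play, using the same messages and sound names as the activity.

```python
def respond_to_command(recognized_text):
    """Return (message_to_speak, sound_to_play) for a recognized command.

    sound_to_play is None when the command is not understood.
    """
    # Lowercase the text so "Mario" and "mario" are treated the same
    # (a choice made for this sketch).
    text = recognized_text.lower()

    if "mario" in text:
        # Outer "if" arm: the text contains "mario".
        return ("Playing Mario Song!", "Mario")
    elif "spiderman" in text:
        # Nested "if" inside the outer "else" arm.
        return ("Playing Spiderman Song!", "Spiderman")
    else:
        # Innermost "else" arm: neither keyword was found.
        return ("Sorry, I am unable to understand the command", None)


print(respond_to_command("Play the Mario song"))
# ('Playing Mario Song!', 'Mario')
print(respond_to_command("What is the weather"))
# ('Sorry, I am unable to understand the command', None)
```

Note that this sketch only matches the exact spelling “spiderman”, so text such as “spider man” would fall through to the last branch.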