Speech to text aws - cristianos

Feb 18, 2021 · 9 min read

Speech to Text using AWS Transcribe, S3 and Lambda

Speech to text using AWS Transcribe, S3, Lambda, and output notifications using SNS and CloudWatch Events.

Speech to text is the process of converting audio to text. Software can do very little with raw audio files: they are hard to visualize, analyze, or mine for data in any meaningful way. The audio therefore has to be converted to text before it can be analyzed.

Although speech to text may seem simple today, it relies on many linguistic models and algorithms to reach high accuracy. Many software providers have built their own models and algorithms and offer speech to text as a service. In this article, I am going to walk through one such service provided by AWS, named AWS Transcribe.

AWS Transcribe

AWS Transcribe is the speech-to-text solution provided by Amazon Web Services, known for being fast and accurate. Under the hood, AWS Transcribe uses a deep learning process named ASR (automatic speech recognition) to convert audio to text. It also includes a separate service named Amazon Transcribe Medical [...x] to be used for medical documentation applications as well.

AWS Transcribe has a well-designed API that lets programs automate transcription jobs that convert audio files to text. Because a job may take a while depending on the file, AWS Transcribe does not return the output in the response to the request that starts the job. We therefore either need to poll continuously to check whether the job has completed, or we need some kind of event trigger that tells us the status of the job. In this article, we are going to explore two such event triggers, which make it possible to automate transcription from start to end. Below is a brief overview of what we are going to accomplish in this article.

Create two S3 buckets as input bucket and output bucket for AWS Transcribe.
Create a Lambda function using Python to trigger AWS Transcribe whenever a new file is uploaded to the input S3 bucket.
Send an email with the transcription job details when the transcription is completed using S3 events.
Send an email with the transcription job details when the transcription is completed using CloudWatch events.

Overall Architecture

Scenario One — Send email using S3 Events
Scenario Two — Send email using CloudWatch Events

Above is the overall architecture of what we are trying to achieve in this article. We can choose either scenario one or scenario two to receive email notifications. Now, without further explanation, let’s dive into the implementation. The following services will be created in AWS, in this order.

IAM role for Lambda to trigger Transcribe
Lambda function
Input and output S3 buckets
SNS topic

Create IAM Role for Lambda function with Transcribe Permissions

Since our Lambda function is going to trigger AWS Transcribe on our behalf, we need to give it permission to call the Transcribe service. For that, let’s create a new IAM role that will be used by our Lambda function.

Go to the IAM dashboard, select Create role, and choose Lambda as the AWS service.

Next, from the list of policies, select the following two policies for our IAM role.

CloudWatchLogsFullAccess
AmazonTranscribeFullAccess

Next, provide a role name and create the new role.
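The same role can also be created from a script instead of the console. The following is a minimal sketch using boto3, assuming credentials that are allowed to manage IAM; the helper names `lambda_trust_policy` and `create_lambda_role` are hypothetical and not part of the original setup.

```python
import json

# AWS-managed policies named in the list above.
MANAGED_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/CloudWatchLogsFullAccess",
    "arn:aws:iam::aws:policy/AmazonTranscribeFullAccess",
]


def lambda_trust_policy():
    """Trust policy that lets the Lambda service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_lambda_role(role_name, iam=None):
    """Create the role, attach both managed policies, and return the role ARN."""
    if iam is None:
        import boto3  # imported lazily so the policy helper works without AWS access

        iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(lambda_trust_policy()),
    )
    for policy_arn in MANAGED_POLICY_ARNS:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return role["Role"]["Arn"]
```

Choosing Lambda as the AWS service in the console produces the equivalent of the trust policy built here. The two full-access policies are convenient for a tutorial; a production role would probably use narrower permissions.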

Create Lambda function to trigger Transcribe Job

Next, let’s create the Lambda function that will trigger AWS Transcribe when we upload a new file to our input S3 bucket (which we will create in the next step).

Go to the Lambda dashboard and click Create function. Select Python as the Runtime, and for the Execution role select the role we created above.

Copy the code given below into the Lambda function. An explanation of the code follows.

The Lambda function performs the following steps when it is triggered.

The function will be triggered by an S3 event, so first we extract the details of the S3 bucket and the file that triggered the event.
Next, we create the job name that the Transcribe API requires. To keep it unique, we append a UUID to the end of the file name.
Next, we call start_transcription_job with the required parameters. For OutputBucketName, we need to specify the output S3 bucket, which we are going to create next, so for now you can leave that field out completely.
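A minimal sketch of these three steps, not necessarily identical to the function used in this article. The helper names, the `OUTPUT_BUCKET` constant, and the `en-US` language code are assumptions made for illustration; adjust them to your own setup.

```python
import re
import uuid
from urllib.parse import unquote_plus

# Assumption: set this to the name of your output bucket once it exists.
# While it is None, the OutputBucketName field is left out of the request.
OUTPUT_BUCKET = None


def extract_s3_object(event):
    """Return (bucket, key) for the first record of an S3 event."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 events (spaces become '+').
    key = unquote_plus(record["object"]["key"])
    return bucket, key


def make_job_name(key):
    """Build a unique job name: sanitized file name plus a UUID suffix."""
    # Transcribe job names allow only letters, digits, '.', '_' and '-'.
    safe = re.sub(r"[^0-9a-zA-Z._-]", "_", key.split("/")[-1])
    return f"{safe[:150]}-{uuid.uuid4()}"


def build_job_request(bucket, key, output_bucket=OUTPUT_BUCKET):
    """Assemble the keyword arguments for start_transcription_job."""
    request = {
        "TranscriptionJobName": make_job_name(key),
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "LanguageCode": "en-US",  # assumption: the audio is US English
    }
    if output_bucket:
        request["OutputBucketName"] = output_bucket
    return request


def lambda_handler(event, context):
    import boto3  # provided by the Lambda Python runtime

    bucket, key = extract_s3_object(event)
    request = build_job_request(bucket, key)
    boto3.client("transcribe").start_transcription_job(**request)
    return {"TranscriptionJobName": request["TranscriptionJobName"]}
```

Keeping the request assembly in a separate function makes it possible to check the job name and media URI locally with a sample event before deploying the handler.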

Creating Input S3 Bucket and S3 Event to trigger Lambda function

Next, let’s create our S3 buckets. We are going to create two buckets. First, let’s create the input bucket where our audio files will be uploaded.

After creating the S3 bucket, go to Properties and then Event notifications. This is where we configure the S3 event that triggers the Lambda function we created whenever a new object is added to this bucket.

Click on Create event notification and provide a name. Under Event types, select All object create events. This makes sure an event is created when a new object is uploaded, when an existing object is renamed, or when an object is copied directly into the bucket.

For the Destination, select the Lambda function we created. From now on, whenever a new file is created in this bucket, our Lambda function will be triggered, which in turn triggers AWS Transcribe.
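For reference, the console settings above correspond roughly to the following boto3 calls. This is a sketch under assumptions: the function names and the `s3-invoke` statement id are hypothetical, and the bucket and Lambda function are expected to exist already.

```python
def lambda_notification_config(function_arn):
    """Notification config that invokes a Lambda on every object-created event."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                # Equivalent of "All object create events" in the console.
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    }


def configure_input_bucket(bucket, function_arn, function_name):
    """Allow S3 to invoke the function, then attach the notification."""
    import boto3  # imported lazily so the config helper works without AWS access

    # S3 needs permission to invoke the function; when the notification is
    # created in the console, this permission is normally added for you.
    boto3.client("lambda").add_permission(
        FunctionName=function_name,
        StatementId="s3-invoke",  # hypothetical statement id
        Action="lambda:InvokeFunction",
        Principal="s3.amazonaws.com",
        SourceArn=f"arn:aws:s3:::{bucket}",
    )
    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=lambda_notification_config(function_arn),
    )
```

Note that `put_bucket_notification_configuration` replaces the bucket's whole notification configuration, so any existing notifications on the bucket would need to be included in the same call.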

Creating Output S3 Bucket and Grant Permission to Write

Now let’s create our output S3 bucket, where the results from AWS Transcribe will be stored. We need to give our Lambda function write access to this bucket so that the output can be written to it. First, create a new S3 bucket.

Then we need to create a new policy granting write access to this S3 bucket. Go to the IAM dashboard and navigate to Policies. Click on Create Policy, select JSON, and add the following policy.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::transcribe-output-tutorial/*"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::transcribe-output-tutorial"
            ],
            "Effect": "Allow"
        }
    ]
}

Next, we need to grant the Lambda role we created earlier permission to write to the S3 bucket. To do this, go back to that role and attach the newly created policy to it.

As the last step, edit the Lambda function we created and set OutputBucketName to the name of the S3 bucket we just created. The value should be the bucket name only, without any prefixes (e.g. transcribe-output-tutorial, the bucket used in the policy above).

We have now configured all the components we need, except for the output notification emails. Let’s first test the configuration to check that the system is working so far. Go to the input S3 bucket and upload an mp4 file. If everything goes well, the Transcribe dashboard should show a new job, triggered by our Lambda function, under Transcription jobs.

After a couple of minutes, the status will change to Completed and we will see the output. Check the output S3 bucket as well for the output text object. If everything is working, let’s move to the next step.
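The same check can be done from a script. The sketch below shows the polling approach mentioned at the start of the article, plus a helper that pulls the plain text out of the JSON document Transcribe writes to the output bucket. The function names, the delay, and the attempt limit are assumptions chosen for illustration.

```python
import json
import time


def wait_for_job(transcribe, job_name, delay=10, max_attempts=60):
    """Poll get_transcription_job until the job is COMPLETED or FAILED."""
    for _ in range(max_attempts):
        job = transcribe.get_transcription_job(TranscriptionJobName=job_name)
        status = job["TranscriptionJob"]["TranscriptionJobStatus"]
        if status in ("COMPLETED", "FAILED"):
            return status
        time.sleep(delay)  # job is still QUEUED or IN_PROGRESS
    raise TimeoutError(f"{job_name} did not finish after {max_attempts} checks")


def transcript_text(output_json):
    """Extract the plain transcript from the output JSON document."""
    document = json.loads(output_json)
    return document["results"]["transcripts"][0]["transcript"]
```

A call would look like `wait_for_job(boto3.client("transcribe"), job_name)`, where `job_name` is the name the Lambda function generated. Polling like this is fine for a quick test, but the event-based notifications described next avoid the repeated API calls.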

Let’s move on to triggering an email notification once the Transcribe job is completed. We are going to cover two ways to achieve this: S3 events and CloudWatch events. For both, we first need to create an SNS topic.

Create SNS topic

Go to the SNS dashboard and create a Topic. As the type select Standard.

Next, if we are going to use S3 events, we need to give the S3 bucket permission to publish to this topic. To do that, under Access policy, add the following statement at the end of the policy’s Statement list.

{
    "Sid": "s3",
    "Effect": "Allow",
    "Principal": {
        "Service": "s3.amazonaws.com"
    },
    "Action": "SNS:Publish",
    "Resource": "{YOUR_SNS_TOPIC_ARN}"
}

Now we can finish creating the topic. Next, we need to create a subscription to this topic that uses the email protocol.

Click on Create subscription and select Email as the protocol.

After entering your email address, make sure to confirm the subscription from the email you receive.
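The topic and its email subscription can also be created with boto3. The following is a sketch with hypothetical function names. The `Condition` block is an optional tightening that is not part of the statement shown earlier: it limits publishing to one specific bucket.

```python
def s3_publish_statement(topic_arn, bucket):
    """Access-policy statement that lets one bucket publish to the topic."""
    return {
        "Sid": "s3",
        "Effect": "Allow",
        "Principal": {"Service": "s3.amazonaws.com"},
        "Action": "SNS:Publish",
        "Resource": topic_arn,
        # Optional addition: only events from this bucket may publish.
        "Condition": {"ArnLike": {"aws:SourceArn": f"arn:aws:s3:::{bucket}"}},
    }


def create_topic_with_email(topic_name, email, sns=None):
    """Create a standard topic, subscribe an email address, return the topic ARN."""
    if sns is None:
        import boto3  # imported lazily so the policy helper works without AWS access

        sns = boto3.client("sns")
    topic_arn = sns.create_topic(Name=topic_name)["TopicArn"]
    # SNS sends a confirmation mail; nothing is delivered until it is confirmed.
    sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=email)
    return topic_arn
```

The statement returned by `s3_publish_statement` would still have to be merged into the topic's access policy, for example in the console as described above.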

Now that our SNS topic is created, let’s move on and implement scenario one, which is to trigger sending an email using S3 events.

AWS S3 Event Trigger

The main idea is to trigger our SNS topic whenever the AWS Transcribe service uploads a new file to our output S3 bucket. To do that, first go to the output S3 bucket. Go to Properties and create an event notification, as we did for the input bucket.

The event type is the same as for the input bucket, but this time the destination is an SNS topic.

Now, when AWS Transcribe writes the result to our output bucket, we will automatically receive an email. You can try this out by renaming the object currently in the input bucket. Once the transcription is completed and the result is uploaded to the output S3 bucket, you will receive an email.
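As a script, the notification on the output bucket might look like the sketch below. The function names are hypothetical, and the `.json` suffix filter is an optional addition that is not part of the console steps above: Transcribe may also write small helper objects to the bucket (such as a write-access check file), and the filter is meant to keep those from triggering an email.

```python
def topic_notification_config(topic_arn, suffix=".json"):
    """Notification config that publishes object-created events to an SNS topic."""
    return {
        "TopicConfigurations": [
            {
                "TopicArn": topic_arn,
                "Events": ["s3:ObjectCreated:*"],
                # Optional addition: only notify for keys ending in `suffix`.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
                },
            }
        ]
    }


def configure_output_bucket(bucket, topic_arn):
    """Attach the SNS notification to the output bucket."""
    import boto3  # imported lazily so the config helper works without AWS access

    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=topic_notification_config(topic_arn),
    )
```

The call only succeeds if the topic's access policy already allows S3 to publish to it, which is what the statement added during topic creation is for.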

Using CloudWatch Events

Scenario two is to trigger the email using CloudWatch events. AWS Transcribe sends events about a transcription job’s state to CloudWatch. The idea is to use these events to create a trigger that fires whenever a transcription finishes. First, let’s go to AWS CloudWatch, then Events, then Rules.

Click on Create rule. Here we define a rule that triggers when our transcription job goes into either the COMPLETED or the FAILED state.

{
  "source": [
    "aws.transcribe"
  ],
  "detail-type": [
    "Transcribe Job State Change"
  ],
  "detail": {
    "TranscriptionJobStatus": [
      "COMPLETED",
      "FAILED"
    ]
  }
}

Under Targets, add our SNS topic. Then give the rule a name and create it.
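The rule can also be created with boto3. The sketch below reuses the event pattern shown above; the function names and the `sns-email` target id are hypothetical. `job_state_matches` is a simplified local check that illustrates which events the rule lets through, not a full implementation of event pattern matching.

```python
import json

EVENT_PATTERN = {
    "source": ["aws.transcribe"],
    "detail-type": ["Transcribe Job State Change"],
    "detail": {"TranscriptionJobStatus": ["COMPLETED", "FAILED"]},
}


def job_state_matches(event, pattern=EVENT_PATTERN):
    """Simplified check of whether an event would pass the rule's pattern."""
    status = event.get("detail", {}).get("TranscriptionJobStatus")
    return (
        event.get("source") in pattern["source"]
        and event.get("detail-type") in pattern["detail-type"]
        and status in pattern["detail"]["TranscriptionJobStatus"]
    )


def create_rule(rule_name, topic_arn, events=None):
    """Create the rule and point it at the SNS topic."""
    if events is None:
        import boto3  # imported lazily so the helpers above work without AWS access

        events = boto3.client("events")
    events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(EVENT_PATTERN),
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "sns-email", "Arn": topic_arn}],  # hypothetical target id
    )
```

When the rule is created from a script rather than the console, the topic's access policy probably also needs a statement allowing `events.amazonaws.com` to publish, similar to the S3 statement used in scenario one.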

Now try to rename or add a new object to our input S3 bucket and at the end, we will receive an email notifying the status of the Transcribe job. (If you already added an S3 event to the same topic make sure to remove it or it will trigger two emails.)

That is all for this article. There are many other ways we can trigger an event after AWS Transcribe completes a job. But I will let you find those solutions for yourself. If you want to know more about AWS Transcribe below is the documentation. Thank you for reading this article and happy coding 😁😁😁

Select a compatible region (in my case eu-central-1)
Create a new role with the AmazonS3FullAccess policy (just for testing, adjust for security) and this trusted entity:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "transcribe.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

Attach the AmazonTranscribeFullAccess and AmazonS3FullAccess policies to your IAM user (just for testing, adjust for security)

Resolved my issue. Had to add “transcribe.amazonaws.com” as a Trusted Entity (under “Trust relationships” tab in console Roles editor)

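import json

import boto3

# Created once per execution environment and reused across invocations.
s3 = boto3.resource('s3')


def lambda_handler(event, context):
    # Bucket and key of the JSON object to read.
    bucket = 'finalyearpro-aws'
    key = 'StudentResults.json'

    # Download the object and parse its body as UTF-8 encoded JSON.
    obj = s3.Object(bucket, key)
    data = obj.get()['Body'].read().decode('utf-8')
    json_data = json.loads(data)

    print(json_data)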