{"id":7851,"date":"2023-04-17T16:11:18","date_gmt":"2023-04-17T21:11:18","guid":{"rendered":"https:\/\/abudinen.com\/blog\/?p=7851"},"modified":"2023-04-20T11:30:47","modified_gmt":"2023-04-20T16:30:47","slug":"transcribing-audio-files-with-amazon-transcribe-lambda-s3-part-2","status":"publish","type":"post","link":"https:\/\/abudinen.com\/blog\/2023\/04\/17\/transcribing-audio-files-with-amazon-transcribe-lambda-s3-part-2\/","title":{"rendered":"Transcribing Audio Files With Amazon Transcribe, Lambda &#038; S3 Part 2"},"content":{"rendered":"\nAmazon Transcribe is one of AWS&#8217;s machine learning services; it converts speech to text. Transcribe combines a deep learning process called&nbsp;Automatic Speech Recognition (ASR)&nbsp;with&nbsp;Natural Language Processing (NLP)&nbsp;to transcribe audio files. Organizations around the world use this technology to automate closed captioning &amp; subtitling of media. Amazon Transcribe also supports transcription in over 30 languages, including Hebrew, Japanese, Arabic, German, and others.\n\n\n\nIn this tutorial, we will use Amazon Transcribe to perform automatic speech recognition.\n\n\n\nArchitecture\n\n\n\nA user or an application uploads an audio file to an S3 bucket. This upload triggers a Lambda function, which instructs Transcribe to begin the speech-to-text process. Once the transcription is done, a CloudWatch event is fired, which in turn triggers another Lambda function that parses the transcription result.\n\n\n\n1. Create an S3 Bucket: First, we need to create an S3 bucket, which will serve as a repository for our audio and transcribed files. Navigate to the S3 panel in the AWS console and create a bucket with a globally unique name, or create one using the CLI, and then upload an audio file. 
Use the command below to create a bucket, then create an input folder in the bucket where the audio files will be stored.\n\n\n\n\n<span class=\"maquina-leer-mas\">[...x]<\/span><div id=\"premium-content-gate\" style=\"display:none;\" class=\"contenido-premium\"><pre class=\"wp-block-preformatted\">#Create an S3 bucket with the command below after configuring the CLI<br>$ <strong>aws s3 mb s3:\/\/<em>bucket-name<\/em><\/strong><\/pre>\n\n\n\n<figure class=\"wp-block-image\"><\/figure>\n\n\n\n<p id=\"c7c3\">2.&nbsp;<strong>Create the first Lambda Function:&nbsp;<\/strong>Next, we are going to create the first Lambda function, which starts the transcription job once an audio file has been uploaded. We will create the function using the Python runtime and call it \u201cAudio_Transcribe\u201d. We need to attach a policy to a role that grants the function access to the S3 bucket, Amazon Transcribe, and CloudWatch.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><figcaption class=\"wp-element-caption\">Creating a Lambda Function<\/figcaption><\/figure>\n\n\n\n<p id=\"3fae\">Next, we add a trigger, which will be S3 in this case, so any object uploaded into the input folder of the S3 bucket will trigger the Lambda function.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><\/figure>\n\n\n\n<p id=\"ea69\">Now let&#8217;s write the Lambda function. First, we need to import the boto3 library, which is the AWS SDK for Python, and create low-level clients for S3 and Transcribe. 
Then we have our standard entry point for Lambda functions.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import boto3<br><br>#Create low level clients for s3 and Transcribe<br>s3 = boto3.client('s3')<br>transcribe = boto3.client('transcribe')<br><br>def lambda_handler(event, context):<\/pre>\n\n\n\n<p id=\"eda6\">Next, we parse the bucket name out of the event record and extract the object key, which is the name of the file that was uploaded into S3. Then we construct the S3 URI of the object, which is needed to start the transcription job.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">    #parse out the bucket &amp; file name from the event record<br>    for record in event['Records']:<br>        file_bucket = record['s3']['bucket']['name']<br>        file_name = record['s3']['object']['key']<br>        #Transcribe expects the media location as an S3 URI<br>        object_url = 's3:\/\/{0}\/{1}'.format(<br>            file_bucket, file_name)<\/pre>\n\n\n\n<p id=\"188c\">Next, we need to start the transcription job using the Transcribe client that was instantiated above. 
To start the job, we need to pass in the&nbsp;<em>job name<\/em>, which in this case is derived from the file name,&nbsp;<em>the media URI, the language code,&nbsp;<\/em>and finally the<em>&nbsp;media format (mp3, mp4, etc.).&nbsp;<\/em>Other parameters, such as job execution settings and the output bucket name, are not required.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">        response = transcribe.start_transcription_job(<br>            TranscriptionJobName=file_name.replace('\/','')[:10],<br>            LanguageCode='es-US',<br>            MediaFormat='mp3',<br>            Media={<br>                'MediaFileUri': object_url<br>            })<\/pre>\n\n\n\n<p id=\"0bc1\">Putting the first function all together:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import boto3<br><br>#Create low level clients for s3 and Transcribe<br>s3 = boto3.client('s3')<br>transcribe = boto3.client('transcribe')<br><br>def lambda_handler(event, context):<br>    <br>    #parse out the bucket &amp; file name from the event record<br>    for record in event['Records']:<br>        file_bucket = record['s3']['bucket']['name']<br>        file_name = record['s3']['object']['key']<br>        #Transcribe expects the media location as an S3 URI<br>        object_url = 's3:\/\/{0}\/{1}'.format(file_bucket, file_name)<br>            <br>        response = transcribe.start_transcription_job(<br>            TranscriptionJobName=file_name.replace('\/','')[:10],<br>            LanguageCode='es-US',<br>            MediaFormat='mp3',<br>            Media={<br>                'MediaFileUri': object_url<br>            })<br>        <br>        print(response)<\/pre>\n\n\n\n<p id=\"5f97\">3.&nbsp;<strong>Create the second Lambda Function:&nbsp;<\/strong>This function will parse the output from the transcription job and upload it to S3. The trigger for this function will be a CloudWatch rule. 
We are going to store the bucket name as an environment variable.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import json<br>import boto3<br>import os<br>import urllib.request<br><br>BUCKET_NAME = os.environ['BUCKET_NAME']<\/pre>\n\n\n\n<p id=\"6043\">Next, we are going to create the S3 resource and the Transcribe client and parse out the name of the transcription job. Then we will use the \u201cget_transcription_job\u201d function to get information about the job by passing in the job name. We will then extract the transcript file URI to access the raw transcription JSON and print it out to CloudWatch for reference.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">s3 = boto3.resource('s3')<br>transcribe = boto3.client('transcribe')<br><br>def lambda_handler(event, context):<br>    <br>    job_name = event['detail']['TranscriptionJobName']<br>    job = transcribe.get_transcription_job(TranscriptionJobName=job_name)<br>    uri = job['TranscriptionJob']['Transcript']['TranscriptFileUri']<br>    print(uri)<\/pre>\n\n\n\n<p id=\"2ce8\">We are going to make an HTTP request to grab the content of the transcription from the URI.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">    content = urllib.request.urlopen(uri).read().decode('UTF-8')<br>    #write content to cloudwatch logs<br>    print(json.dumps(content))<br>    <br>    data = json.loads(content)<br>    transcribed_text = data['results']['transcripts'][0]['transcript']<\/pre>\n\n\n\n<p id=\"53e6\">Then, we create an S3 object, which is a text file, and write the contents of the transcription to it.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">    object = s3.Object(BUCKET_NAME,job_name+\"_Output.txt\")<br>    object.put(Body=transcribed_text)<\/pre>\n\n\n\n<p id=\"8714\">Putting it all together:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import json<br>import boto3<br>import os<br>import urllib.request<br><br>BUCKET_NAME = os.environ['BUCKET_NAME']<br><br>s3 = boto3.resource('s3')<br>transcribe = 
boto3.client('transcribe')<br><br>def lambda_handler(event, context):<br>    <br>    job_name = event['detail']['TranscriptionJobName']<br>    job = transcribe.get_transcription_job(TranscriptionJobName=job_name)<br>    uri = job['TranscriptionJob']['Transcript']['TranscriptFileUri']<br>    print(uri)<br>    <br>    content = urllib.request.urlopen(uri).read().decode('UTF-8')<br>    #write content to cloudwatch logs<br>    print(json.dumps(content))<br>    <br>    data = json.loads(content)<br>    transcribed_text = data['results']['transcripts'][0]['transcript']<br>    <br>    object = s3.Object(BUCKET_NAME,job_name+\"_Output.txt\")<br>    object.put(Body=transcribed_text)<\/pre>\n\n\n\n<p id=\"991e\">4.<strong>&nbsp;Create a CloudWatch Rule to Trigger the Second Lambda Function<\/strong>: Now, we are going to create the CloudWatch rule and set its target to the second Lambda function (parseTranscription).<\/p>\n\n\n\n<figure class=\"wp-block-image\"><\/figure>\n\n\n\n<p id=\"b022\"><strong>TESTING THE APPLICATION<\/strong><\/p>\n\n\n\n<p id=\"802c\">To test the application, we are going to upload a sample audio file downloaded from Wikipedia into S3. You can download the audio file from this link:&nbsp;(Homer_S._Cummings).ogg. Because the first function passes MediaFormat='mp3', either convert the file to mp3 before uploading it or set MediaFormat to 'ogg'.<\/p>\n\n\n\n<p id=\"1ad9\">Now we are going to view the CloudWatch logs for both Lambda functions. Below is the log of the first function while the transcription is in progress.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><\/figure>\n\n\n\n<p id=\"3992\">And here is the CloudWatch log of the second function parsing the resulting JSON from the transcription job and writing it into S3.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><\/figure>\n\n\n\n<p id=\"c353\">Below is our transcription text file in S3:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><\/figure>\n\n\n\n<p id=\"61d7\">\u201c\u2026 the Democratic Party came into power on the fourth day of March 1913. 
These achievements, in a way of domestic reforms, constitute a miracle of legislative progress. Provision was made for an income tax, thereby relieving our law of the reproach of being unjustly burdensome to the poor. The extravagances and inequities of the tariff system \u2026\u201d<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>{\n  \"Version\": \"2012-10-17\",\n  \"Statement\": &#91;\n    {\n      \"Effect\": \"Allow\",\n      \"Principal\": \"*\",\n      \"Action\": &#91;\n        \"s3:GetObject\"\n      ],\n      \"Resource\": \"arn:aws:s3:::YOUR_BUCKET_NAME\/*\"\n    }\n  ]\n}<\/code><\/pre>\n\n\n\n<p>wh2<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import requests\nimport json\nimport time\n\ndef lambda_handler(event, context):\n    # TODO implement\n    url = \"\"\n\n    print(event, context)\n\n    #fall back to a default greeting when no 'text' query parameter is supplied\n    try:\n        text = event[\"queryStringParameters\"]['text']\n    except (KeyError, TypeError):\n        text = (\"Hello there! My name is ' joi ', your English coach. \"\n        \"I'm really happy to start this journey with you. Let's get started by telling me your name and where you're from. \"\n        \"I'd love to learn more about you! And if you ever feel confused or need help, don't hesitate to ask me.\")\n\n    payload = json.dumps({\n      \"voice\": \"en-US-DavisNeural\",\n      \"content\": [\n       #\"Hello there! My name is ' joi ', your English coach.\", \n       #\"I'm really happy to start this journey with you. Let's get started by telling me your name and where you're from.\", \n       #\"I'd love to learn more about you! 
And if you ever feel confused or need help, don't hesitate to ask me.\"\n       text\n      ],\n      \"title\": \"Testing public api conversion\"\n    })\n    #Replace the placeholders with your own credentials; never publish real keys\n    headers = {\n      'Authorization': 'YOUR_API_KEY',\n      'X-User-ID': 'YOUR_USER_ID',\n      'Content-Type': 'application\/json'\n    }\n\n    response = requests.request(\"POST\", url, headers=headers, data=payload)\n    print(response.text)\n    data = json.loads(response.text)\n    print(data['transcriptionId'])\n\n    time.sleep(2)\n    url = ''+data['transcriptionId']\n    x = requests.get(url, headers=headers)\n    data = json.loads(x.text)\n    print(data['audioUrl'])\n\n    return {\n        'statusCode': 200,\n        'body': json.dumps(data['audioUrl'])\n    }<\/pre>\n\n\n\n<p>ChatGPT<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import boto3\nimport base64\nimport json\nimport io\nimport os\n\nimport openai\n#from api_key import CHATGPT_API_KEY\n\n#Read the API key from an environment variable instead of hard-coding it (OPENAI_API_KEY is an assumed variable name)\nopenai.api_key = os.environ[\"OPENAI_API_KEY\"]\n\n# this is the function that will call the API and return the response from JOI+openAI\ndef getting_aresponse_joi_speaking(prompt):\n    prompting_of_theresponse = prompt\n    try:\n        the_interaction_result = openai.Completion.create(\n            model=\"text-davinci-003\",\n            prompt=prompting_of_theresponse,\n            max_tokens=3000,\n            temperature=0.7,\n            )\n        response_lines = the_interaction_result.choices[0].text.strip().split(\"\\n\")\n        formatted_response = \"\\n\".join([line.replace(\"JOI: \", \"\").strip() for line in response_lines])\n        return formatted_response\n    except Exception as e:\n        print(\"API OUT :(\", e)\n        return \"\"\n\n#this is the function that will read the prompt file and return the content of it\ndef prompt_reader_init(path):\n    with open(path, \"r\") as file:\n        past_conversation = file.read()\n    return 
past_conversation\n\n#this is the function that will write the user input and the response from JOI in the prompt file\ndef prompt_writer(file_path, user_input, response_from_joi):\n    with open(file_path, \"a\") as file:\n        file.write(\"\\nUser: \" + user_input + \"\\n\")\n        response_from_joi = response_from_joi.replace(\"JOI:\", \"\").strip()\n        file.write(\"\\nJOI: \" + response_from_joi + \"\\n\")\n\n\n#The main function is the one that will be called by the lambda function and will return the response from JOI as string\nclient = boto3.client('s3')\nres = boto3.resource('s3')\n\n\ndef lambda_handler(event, context):\n    # TODO implement\n    print(event, context)\n    \n    record = event['Records'][0]\n    \n    s3bucket = record['s3']['bucket']['name']\n    s3object = record['s3']['object']['key']\n    \n    #s3Path = \"s3:\/\/\" + s3bucket + \"\/\" + s3object\n    \n    obj = res.Object(s3bucket, s3object)\n    data = obj.get()['Body'].read().decode('utf-8')\n    json_data = json.loads(data)\n    \n    print(json_data)\n    \n    user_input = json_data['results']['transcripts'][0]['transcript']\n\n    try:\n        the_rute_to_get_theprompt = \"the_prompt.txt\"\n        #conversation_prompt = prompt_reader_init(the_rute_to_get_theprompt)\n        file_obj = res.Object(\"b2ds\", the_rute_to_get_theprompt)\n        \n        conversation_prompt = file_obj.get()['Body'].read().decode('utf-8') # fetching the data in\n        prompt_with_user_input = conversation_prompt + \"\\nUser: \" + user_input + \"\\n\"\n        response_from_joi = getting_aresponse_joi_speaking(prompt_with_user_input)\n        \n        conversation_prompt = conversation_prompt + \"\\nUser: \" + user_input + \"\\n\"\n        response_from_joi = response_from_joi.replace(\"JOI:\", \"\").strip()\n        conversation_prompt = conversation_prompt + \"\\nJOI: \" + response_from_joi + \"\\n\"\n\n        new_file = io.BytesIO(conversation_prompt.encode())\n        
res.Object(\"b2ds\", the_rute_to_get_theprompt).delete() # Here you are deleting the old file\n        client.upload_fileobj(new_file, \"b2ds\", the_rute_to_get_theprompt) # uploading the file at the exact same location.\n        #prompt_writer(the_rute_to_get_theprompt, user_input, response_from_joi)\n        return response_from_joi\n    except Exception as e:\n        print(f\"Error: {e}\")\n        return \"You need to call the doctor for JOI :( she's sick \"\n       \n    return {\n        'statusCode': 200,\n        'headers': {\n            'Content-Type': 'application\/json'\n        },\n        'body': json.dumps('Hello from Lambda!')\n    }\n<\/pre>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>Amazon Transcribe is one of AWS&#8217;s numerous machine learning services that is used to convert speech to text. Transcribe combines a deep learning process called&nbsp;Automatic Speech Recognition(ASR)&nbsp;and&nbsp;Natural Language Processing (NLP)&nbsp;to transcribe audio files. Across the globe, several organizations are leveraging this technology to automate media closed captioning &amp; subtitling. 
Also, Amazon Transcribe supports transcription in &#8230; <a title=\"Transcribing Audio Files With Amazon Transcribe, Lambda &#038; S3 Part 2\" class=\"read-more\" href=\"https:\/\/abudinen.com\/blog\/2023\/04\/17\/transcribing-audio-files-with-amazon-transcribe-lambda-s3-part-2\/\" aria-label=\"Read more about Transcribing Audio Files With Amazon Transcribe, Lambda &#038; S3 Part 2\">Read more<\/a><\/p>","protected":false},"author":1,"featured_media":7826,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-7851","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-sin-categoria"],"_links":{"self":[{"href":"https:\/\/abudinen.com\/blog\/wp-json\/wp\/v2\/posts\/7851","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/abudinen.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/abudinen.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/abudinen.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/abudinen.com\/blog\/wp-json\/wp\/v2\/comments?post=7851"}],"version-history":[{"count":6,"href":"https:\/\/abudinen.com\/blog\/wp-json\/wp\/v2\/posts\/7851\/revisions"}],"predecessor-version":[{"id":7867,"href":"https:\/\/abudinen.com\/blog\/wp-json\/wp\/v2\/posts\/7851\/revisions\/7867"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/abudinen.com\/blog\/wp-json\/wp\/v2\/media\/7826"}],"wp:attachment":[{"href":"https:\/\/abudinen.com\/blog\/wp-json\/wp\/v2\/media?parent=7851"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/abudinen.com\/blog\/wp-json\/wp\/v2\/categories?post=7851"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/abudinen.com\/blog\/wp-json\/wp\/v2\/tags?post=7851"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}