What is the Best Audio to Text Transcription Software in 2018?


If you are looking for a way to turn your audio and video content into written text, then you have two options available to you. The first is manual transcription, which involves someone typing out the words they hear from your audio. The second is automated transcription, where software does the hard work for you. Both have their own pros and cons, so let’s take a closer look at your options.

 

Manual Audio to Text Transcription

Manual transcription has been around for decades. It is a trusted method of accurately transcribing audio to text and allows extra information to be added as required. However, manual transcription is time-consuming, expensive and labour-intensive. Where manual transcription truly shines is the human touch: text can be transcribed in a natural and targeted way with as few errors as possible. This is particularly beneficial if grammar and punctuation are important, such as when transcribing a recorded interview.

 

Automated Audio to Text Transcription Software

Automated transcription software can be a powerful tool. The software effectively listens to the audio content and then converts it into written text, which can save hours of work compared to manual transcription. However, the text output will only be as good as the software permits: words may be misheard and misspelt, and the final result will generally still require proper proofreading and editing. Depending on how much audio you have to transcribe, this approach can make life much easier.

 

What Audio to Text Transcription Software is Best?

In this section, we are going to take a look at the best audio to text software available on the market. For fairness, we have run the same audio file through each piece of software to check the results and to make an accurate comparison. Each piece of software will be rated out of 5 for its accuracy.

 

Google Web Speech API – ★☆☆☆☆

To use Google Web Speech API, you first need to install Virtual Audio Cable. Google Web Speech API allows you to automatically transcribe any audio that is playing on your computer in real time. Once you have installed Virtual Audio Cable, you can open the Google Web Speech API page, hit record and watch as it types out the audio that it hears. There is no denying that Google is able to keep up with fast-paced speech. But how much does it actually understand? And is it able to write an understandable sentence?

Original Audio – Now, while it’s true I always have a jar or a tin of cookies or biscuits in my kitchen, at Christmas I wanted to have more or a kind of grotto feel. So, that does mean my Christmas chocolate cookies.

Google Web Speech Output – How watts it’s true I’m always have car or for of 10 cookies of pissed kits in my kitchen that Christmas i wanted to Have more of eight kind of gross a field so that does me my christmas got to get cookie.

Okay, so we can see from the text above that the transcription by Google Web Speech API is disappointing. If you were hoping to save time on content output, this method may actually take you twice as long, because all of the edits and corrections make for a hard task. In fact, over 70% of the text output is wrong. Maybe the audio was too challenging, so let’s try another snippet from Nigella Lawson’s Christmas Cookies recipe.

Original Audio – Now, when you take the biscuits out of the oven you will see that they have a rather cracked finish. But I love that. It makes them look so homemade and comforting.

Google Web Speech Output – How then you takes the briskets shout for the open you will to say that day had a father cracked finish but I love that if naked then look so home maid and conspiring.

So that really didn’t improve matters. While the foundations for a great transcription tool are definitely there, at this point in time you would be better off manually typing out what you hear. Trying to edit the text afterwards will leave you scratching your head wondering what it’s actually supposed to say. Another issue is that the software randomly stops typing and then kicks back in when it feels ready. In the full transcription, it made 212 errors out of 283 words (roughly 75%).

Due to the poor transcription and the tool being stubborn and erratic, it gets 1/5 stars.

 

Trint – ★★★☆☆

Trint is an audio to text transcriber that is available on both computers and phones. When first signing up, you are offered a free trial with 30 minutes of audio. Unlike Google, Trint allows you to upload files, URLs, and videos to transcribe instead of simply listening to what is being played. But how well does it actually perform? For fairness, we will use Nigella again to compare the text output.

Original Audio – Now, while it’s true I always have a jar or a tin of cookies or biscuits in my kitchen, at Christmas I wanted to have more or a kind of grotto feel. So, that does mean my Christmas chocolate cookies.

Trint Output – Now while it’s true I always have a jar or a tin of cookies of biscuits in my kitchen at Christmas I want it to have more of a kind of grotto feel. So that does mean my Christmas chocolate cookies.

Wow! That first bit of audio is almost spot on word for word! The only errors are “want it” instead of “wanted” and “of” instead of “or”. Uploading audio and video files is quick and easy. The interface allows for drag and drop and takes seconds to upload. The software also automatically detects the language and accent, along with starting new paragraphs when more than one person speaks. So let’s see how it goes with a little more of the audio.

Original Audio – Now, when you take the biscuits out of the oven you will see that they have a rather cracked finish. But I love that. It makes them look so homemade and comforting.

Trint Output – Now when you take the biscuits out of the oven you will see that they have a rather cracked finish. But I love that it makes them look so homemade and comforting.

Again, the audio to text output hits the mark. It even puts most periods in the right places, although it ran “But I love that” into the following sentence. No software can punctuate perfectly at this point in time. Trint is feature-rich, allows you to record and transcribe video simultaneously and does what it promises. The free trial lets you test it for yourself, which is always a good thing. In the full transcription, it made 14 errors out of 283 words.

Fairly accurate transcription and fast processing give Trint 3/5 stars.

 

Happy Scribe – ★★★★★

As with Trint, Happy Scribe provides a 30-minute free trial so you can sample the software before subscribing to a plan. Also like Trint, it runs online, so there is nothing to download to your computer. We will use the same audio file again to compare the software and see how well it performs.

Original Audio – Now, while it’s true I always have a jar or a tin of cookies or biscuits in my kitchen, at Christmas I wanted to have more or a kind of grotto feel. So, that does mean my Christmas chocolate cookies.

Happy Scribe Output – Now while it’s true I always have a jar or a tin of cookies of biscuits in my kitchen at Christmas I wanted to have more of a kind of Wrotto feel. So that does mean my Christmas chocolate cookies.

The first thing to note is that the transcription process took approximately twice as long as Trint did. However, it actually transcribed the audio almost perfectly except for one word. So far, so good. Let’s now compare the next part of the transcription as used in Trint.

Original Audio – Now, when you take the biscuits out of the oven you will see that they have a rather cracked finish. But I love that. It makes them look so homemade and comforting.

Happy Scribe Output – Now when you take the biscuits out of the oven you will see that they have a rather cracked finish because I love that it makes them look so homemade and comforting.

Again, there is only one error. All in all, Happy Scribe is possibly the best transcription software we have tested in this post. While Trint performed well, in the rest of the transcribed audio it changed “bowl” to “ball”, “sprinkles” to “pink holes” and “saucepan” to “source pain”. Happy Scribe, on the other hand, got just 2 words wrong out of 283! The accuracy is almost equal to human transcription.

The accuracy is through the roof and it is easy to use, so it receives 5/5 stars.
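If you want to count errors in your own tests in a repeatable way, one common option is a word-level edit distance: the number of words that would have to be substituted, inserted or deleted to turn the software output into the original. The sketch below is only an illustration and not necessarily how the error counts in this post were produced. It assumes punctuation and capitalisation are ignored, and the function names `words` and `word_errors` are made up for this example.

```python
import re


def words(text):
    """Lowercase the text and keep only word tokens (punctuation and case are ignored)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_errors(reference, hypothesis):
    """Word-level edit distance: substitutions + insertions + deletions."""
    ref, hyp = words(reference), words(hypothesis)
    # prev[j] holds the distance between the reference words seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from the output
                            curr[j - 1] + 1,      # extra word in the output
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


# Second sample from this post: original audio vs. Happy Scribe output.
original = ("Now, when you take the biscuits out of the oven you will see that "
            "they have a rather cracked finish. But I love that. It makes them "
            "look so homemade and comforting.")
happy_scribe = ("Now when you take the biscuits out of the oven you will see that "
                "they have a rather cracked finish because I love that it makes "
                "them look so homemade and comforting.")

errors = word_errors(original, happy_scribe)
total = len(words(original))
print(f"{errors} error(s) in {total} words")  # prints "1 error(s) in 32 words"
```

Because this sketch strips punctuation, the missing full stops do not count as errors. If punctuation matters for your use case, you would need to tokenise it separately.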

 

Sonix.ai – ★★★★☆

The final audio transcription software we have checked is Sonix.ai. As with Trint and Happy Scribe, there is a 30-minute free trial to use. The software is also 100% online and requires no separate tools. The UI is clean and easy to follow and files can be dragged and dropped. Once you select the language and country, your file is transcribed and an email is sent when it’s ready. So, let’s test old Nigella’s audio again.

Original Audio – Now, while it’s true I always have a jar or a tin of cookies or biscuits in my kitchen, at Christmas I wanted to have more or a kind of grotto feel. So, that does mean my Christmas chocolate cookies.

Sonix Output – Now while it’s true I always have a jar or a tin of cookies of biscuits in my kitchen at Christmas I wanted to have more of a kind of Wrotto feel. So that does mean my Christmas chocolate cookies.

Again, it changed “grotto” to “Wrotto”, but other than that it is pretty accurate. We love the fact that this software shows its confidence percentages: here it reports being confident about 92% of the text and unsure about the remaining 8%. A play button plays the audio while highlighting the matching transcribed text, which makes proofing easier. You can read and listen as you go and correct errors in real time. But how does it fare against the remaining sample text?

Original Audio – Now, when you take the biscuits out of the oven you will see that they have a rather cracked finish. But I love that. It makes them look so homemade and comforting.

Sonix Output – Now when you take the biscuits out of the oven you will see that they have a rather cracked finish because I love that it makes them look so homemade and comforting.

The same errors were made in the second part of the sample as were seen in Happy Scribe. In fact, weirdly, every error in the samples was exactly the same between the two. Perhaps they use the same transcribing software behind the scenes; who knows? Regardless, Sonix is perhaps a little more feature-rich and easier to work with. However, for the full audio transcription of 2.02 minutes, the error count begins to rise ever so slightly: out of 283 words, it made just 5 errors. Again, a great piece of software that will ultimately save you hours of time.

An almost error-free transcription and a simple UI give it 4/5 stars.

 

Which is the Best?

It is surprising that a web giant such as Google is so far behind when it comes to audio to text transcription. Out of the three tools that actually deserve to be on this list, the one that we found to perform best, with the highest degree of accuracy, is Happy Scribe. Ultimately, it would be a good idea to try all three out for yourself and see how they perform for your audio and video files.
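To put the error counts side by side, here is a small Python sketch that turns the figures reported above (errors out of 283 words) into accuracy percentages. It assumes accuracy is simply one minus errors divided by total words, and the helper name `accuracy` is made up for this example.

```python
TOTAL_WORDS = 283  # length of the full test transcription used in this post

# Error counts as reported in each section above.
reported_errors = {
    "Google Web Speech API": 212,
    "Trint": 14,
    "Happy Scribe": 2,
    "Sonix.ai": 5,
}


def accuracy(errors, total=TOTAL_WORDS):
    """Share of words transcribed correctly, as a percentage."""
    return 100 * (1 - errors / total)


# List the tools from fewest to most errors.
for tool, errs in sorted(reported_errors.items(), key=lambda item: item[1]):
    print(f"{tool:<22} {errs:>3} errors  {accuracy(errs):5.1f}% accurate")
```

On these figures, Happy Scribe comes out at roughly 99.3%, Sonix.ai at 98.2%, Trint at 95.1% and Google Web Speech API at 25.1%.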

Disclaimer: We are in no way affiliated with any of these software providers, nor do we make any revenue from them. Out of 9 different types of transcription software that we tried and tested, these were the only three that actually made life easier rather than creating more work. So, what is your favourite audio to text transcriber? And how do you feel it measures up to our top 3?

 
