AI vs. Human: Why manual transcription is here to stay

It’s 2018.

Alexa has mastered dad jokes and the all-new Barbie has decided to take its role as a child’s chief confidant quite seriously.

Speech recognition technology is no doubt evolving at a dizzying pace and the AI vs. human battle is raging pretty hard. So is it time for professional transcribers to hang up their hats?

Apparently not.

This WIRED article reckons automated transcripts are at least twice as inaccurate as those produced by humans. Even in the best case, the error rate of automated transcription approaches double digits:

“If you put all the systems together, IBM and Google and Microsoft and all the best combined, amazingly the error rate will be around 8 percent.” ... “This is not as good as humans,” says Xuedong Huang, a senior scientist and speech recognition specialist at Microsoft.

So what makes manual transcription better and the preferred choice for doctors, lawyers, and organizational leaders? Let’s find out.

Accuracy rate

Faster – yes. Cheaper, maybe. But better? We doubt it. Speech recognition technology is a double-edged sword when it comes to long-form transcription. What makes manual transcription better is the level of accuracy in the transcripts produced – most experienced and professional transcribers effortlessly deliver 99% accurate transcripts, no questions asked.

Automated transcription, on the other hand, is at least twice as inaccurate as an inexperienced transcriber, with accuracy levels in the mid-80s percent range. Essentially, if you rely on automated transcription alone, expect to need further human intervention before you can share a quality deliverable with your client.

So in the transcription AI vs. human battle, humans definitely win this one by a large margin.

True to the clip: Word omission issues

Automated transcription software can produce transcripts with a 12% error rate even when you feed in a clean audio clip.

Guess what happens when an audio clip is less than perfect: omissions galore. Automated transcription fails to remain true to your audio or video clip and ruthlessly skips words, phrases, and sentences, producing a transcript that is heavily compromised on two significant factors: clarity and quality.
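
To put error figures like these in context, transcript errors are commonly reported as word error rate (WER): the number of substituted, omitted, and inserted words divided by the number of words in a reference transcript. Here is a minimal Python sketch of that calculation. The function name and the sample sentences are our own illustration and are not taken from any vendor's tooling or from the sources quoted in this post.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: a reference word was omitted
                curr[j - 1] + 1,     # insertion: an extra word was added
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: the automated transcript drops one word ("the").
reference = "the witness said she left the office at nine"
automated = "the witness said she left office at nine"
print(f"{word_error_rate(reference, automated):.1%}")  # prints 11.1%
```

A single omitted word in a nine-word sentence already gives an error rate of about 11%, which is why skipped words and phrases in a noisy clip add up so quickly.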

A high-quality audio/video clip is hard to produce unless you’re recording in an exceptionally stable environment. Even professional transcribers need a moment to fish things out from the background noise or overlapping speech in a clip, often relying on context and subject matter expertise to accurately transcribe audio.

Oh, and let’s talk about accents and dialects.

It’s no secret that voice assistants like Alexa, Siri, and Google Assistant falter when it comes to ESL accents and regional dialects, no matter how fluent the speaker is in English. So what do you think the transcript quality will be if you’re a lawyer transcribing the interview of your ESL witness?

Subject matter expertise

Ever had a find-and-replace nightmare in a word processor? Maddening, right? If only machines could place things better in context…

Highly specialized transcription jobs like legal or medical call for proficiency and experience in the respective fields. No matter how much you fine-tune machine learning, the contextual nuances of a particular subject in a clip can only be understood and improved by human transcribers who know their stuff. Unless you’re happy to deal with gross subject matter gaps in your transcripts produced by speech recognition software, you’ve already joined us to root for manual transcription at this point.

Editing and proofreading

According to a study, automated transcription “lacks certain critical capabilities” and is only good for “providing first-pass transcripts, with silences, for further manual editing.” This means that when you opt for automated transcription, your end-to-end transcription requirements are not met.

Essentially, after you receive your (quality-compromised) transcript produced by speech recognition software, you either have to spend valuable time and energy editing and proofreading it yourself or reach out to yet another vendor to polish it.

What makes manual transcription better, in this case, is that reliable transcription providers ensure that your transcript is not only true to the clip but also meticulously edited and proofread to avoid additional investment at your end.


Some automated transcription software claims to transcribe your clip for a fraction of the cost of traditional transcription methods. However, cheaper isn’t always better – especially since transcription software does only a fraction of the work of a professional transcriber. Reliable transcription providers will be happy to share a custom quote for your specific requirements, and you won’t have to worry about additional investments.

The Wrap

Automated transcription is always praised with a catch – that is, it can be an affordable option only with the right type of audio or that it’s a good starting point to speed up the manual transcription process. But is it reliable enough to fly solo? The numbers don’t lie.

Sure, choosing a reliable transcription provider can take a while – you need to research extensively, ask the right questions, or maybe even opt for a trial before you zero in and partner up. However, once you’ve found the right fit, you don’t need to navigate the ifs and buts about overall accuracy that are unceremoniously attached to automated transcription. This is the ultimate win in the AI vs. human battle with respect to long-form transcription.

Have questions about your transcription project? Talk to the awesome humans at iScribed!