M*Modal Overview


What Is Speech Recognition Technology?

Speech recognition technology allows an end user whose computer is equipped with a source of sound input (e.g., a microphone) to convert human speech to text for tasks such as transcription and editing.  Speech recognition implies only that the computer can take dictation, not that it understands what is being said.

Discrete, or conventional, speech recognition technology converts speech to text word for word.  The speaker is required to:

Maintain well-enunciated speech
Separate each word with a short pause
Spend time training the technology to recognize their voice

 

There are two primary types of speech recognition technologies, “front-end” and “back-end” speech recognition. Front-end speech recognition refers to technology that converts speech to text as it is being spoken.  The text is displayed on the computer screen and is corrected by the speaker. Back-end speech recognition refers to technology in which a complete recording is converted into text, usually on a server, rather than on the speaker’s computer.

Typically, when front-end speech recognition is used, the physician self-edits the resulting document.  Back-end speech recognition enhances the medical transcription process by providing a draft document for the transcriptionist to review and edit before it is rendered into a structured clinical document.

 

What Makes M*Modal’s Speech Understanding™ Unique?

M*Modal’s Speech Understanding™ is unique because it combines speech recognition and natural language understanding technologies.  AnyModal CDS understands what the dictator is saying and uses its comprehension of the context to improve the accuracy of the speech recognition, creating a highly accurate document.

AnyModal CDS converts speech by always understanding the dictator, regardless of:

Accent
Dialect
Mode (talking speed)
Verbal Corrections
Non-Verbal Noises (coughing, etc.)

 

Finally, the system creates a unique profile of each individual’s voice and saves the information within the database for further study and recognition.  After an initial period of “watching” transcription, the system builds this speaker profile from the individual’s speaking style, intonation, and other elements.  Then, once the profile is built and AnyModal CDS is generating draft documents, the system watches the corrections made by the medical transcriptionist and learns from them.

 

How Does This Affect the Medical Transcriptionist’s Role?

Speech recognition receives a lot of attention as a viable solution to multiple healthcare documentation issues.  One commonly promoted misconception is that speech recognition will eliminate the need for transcription, and with it the medical transcriptionist.

It is important to note that the availability of front-end speech solutions has not diminished the need for transcription in hospitals, clinics, or private practices.  In fact, the number of lines of transcription continues to increase year after year.