Best Practices
- Ensure you are sending one of the supported audio formats for streaming dictation:
  - Ogg Opus (WebM), 16 kHz
  - Ogg Vorbis, 16 kHz
  - MPEG-4 (mp4), 16 kHz
  - PCM-wav, 16 kHz, 16-bit
- Make sure your WebSocket connection to our dictation servers is established before sending audio.
  - Consider buffering audio locally and sending it once the connection is established.
- Send audio in packets of 300 ms or longer.
- Shut down gracefully on error conditions to keep your application in a good state.
- The WebSocket will time out if no audio is received within 30 seconds, or if nothing but dictation silence is received for three minutes (180 seconds).
- Send audioDone to get stable text returned faster at the end of the dictation; otherwise the engine waits a few seconds for more context before finalizing the text.
  - You must open a new WebSocket after you send audioDone and receive the text.
  - If you do not want to end the dictation, send a BOUNDARYREQUEST instead. It prompts the dictation server to send stable text a little sooner without closing the WebSocket.
- Use hypothesis text to give users an interactive indication that recording is in progress and transcription is happening.
  - Also use colors, graphics, a gain indicator, a depressed-button state, etc. to indicate that recording has started.
- If you want your users to be able to edit the text while they’re dictating, place stable text in your application, not hypothesis text.
- Make sure you’ve done your due diligence in testing and diagnosing issues in your application before reaching out to our Support team. For example:
  - Are your nVoq credentials correct?
  - What system/environment are you using (test, healthcare, Canada)?
  - Is the issue affecting all your user accounts, or just one user / location?
- Include detailed logging in your application so you can provide accurate and complete details to our Support team in case you need to open a ticket.
- If you want to support spoken command and control during a dictation (on a single button press), use Word Markers to identify and remove the spoken command-and-control phrases from the transcript before rendering the text to the screen.
- For best transcription results, use medically relevant dictation content. Medical dictation samples for different topics can be found at www.mtsamples.com.
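
The buffering and packet-size recommendations above can be sketched as follows. This is only an illustration for the PCM case, assuming mono audio; `packets`, `BufferedAudioSender`, and the `send` callback are hypothetical names standing in for whatever your WebSocket client provides, not part of the nVoq API. At 16 kHz, 16-bit mono, 300 ms of audio is 9,600 bytes.

```python
from collections import deque
from typing import Callable, Deque, List

SAMPLE_RATE_HZ = 16_000  # from the supported-format list
BYTES_PER_SAMPLE = 2     # 16-bit PCM
CHANNELS = 1             # assumption: mono audio
PACKET_MS = 300          # minimum recommended packet duration

# 16000 samples/s * 2 bytes * 1 channel * 0.3 s = 9600 bytes
PACKET_BYTES = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * PACKET_MS // 1000


def packets(pcm: bytes, size: int = PACKET_BYTES) -> List[bytes]:
    """Split raw PCM into packets of at least `size` bytes.

    A short tail is merged into the previous packet so that no packet
    (other than a lone short recording) falls below the minimum duration.
    """
    chunks = [pcm[i:i + size] for i in range(0, len(pcm), size)]
    if len(chunks) > 1 and len(chunks[-1]) < size:
        tail = chunks.pop()
        chunks[-1] += tail
    return chunks


class BufferedAudioSender:
    """Queues packets until the connection is open, then sends them in order."""

    def __init__(self, send: Callable[[bytes], None]) -> None:
        # `send` is a hypothetical callback, e.g. your client's binary send method.
        self._send = send
        self._pending: Deque[bytes] = deque()
        self._connected = False

    def push(self, packet: bytes) -> None:
        """Send immediately if connected; otherwise hold the packet locally."""
        if self._connected:
            self._send(packet)
        else:
            self._pending.append(packet)

    def on_connected(self) -> None:
        """Call once the WebSocket is established; flushes the backlog in order."""
        self._connected = True
        while self._pending:
            self._send(self._pending.popleft())
```

In use, the recorder calls `push` for every packet as soon as capture starts, and the WebSocket's open handler calls `on_connected`, so no audio captured during connection setup is lost.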
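
For the 30-second no-audio timeout described above, a client can track how long it has been since it last sent audio and act before the server closes the connection. The sketch below is one possible approach, not an nVoq-provided mechanism; `AudioIdleWatch` and the 5-second safety margin are assumptions made for illustration.

```python
import time
from typing import Callable

NO_AUDIO_TIMEOUT_S = 30.0  # server-side limit stated in the list above


class AudioIdleWatch:
    """Tracks the time since audio was last sent on the connection."""

    def __init__(
        self,
        clock: Callable[[], float] = time.monotonic,
        limit_s: float = NO_AUDIO_TIMEOUT_S,
        margin_s: float = 5.0,  # assumption: warn 5 s before the limit
    ) -> None:
        self._clock = clock
        self._warn_after = limit_s - margin_s
        self._last_sent = clock()

    def audio_sent(self) -> None:
        """Call each time an audio packet is written to the WebSocket."""
        self._last_sent = self._clock()

    def seconds_idle(self) -> float:
        """Seconds elapsed since the last packet was sent."""
        return self._clock() - self._last_sent

    def near_timeout(self) -> bool:
        """True once the idle time is within the margin of the server limit."""
        return self.seconds_idle() >= self._warn_after
```

When `near_timeout()` returns true, the application can end the dictation cleanly (for example by sending audioDone) or tell the user that recording has stalled, rather than waiting for the connection to drop.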