API Developer

Best Practices

Ensure you are sending the correct audio format for streaming dictation:

Ogg Opus (WebM), 16kHz
Ogg Vorbis, 16kHz
MPEG-4 (mp4), 16kHz
PCM-wav, 16kHz, 16-bit
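
For the PCM-wav case, a quick pre-flight check can catch a wrong sample rate or bit depth before any audio is streamed. The following is a minimal sketch using Python's standard `wave` module; the function names are illustrative and not part of the API, and the helper that builds a mono test file exists only for the demonstration.

```python
import io
import wave

def is_streaming_ready_wav(source) -> bool:
    """Return True if the WAV is 16 kHz, 16-bit PCM.

    `source` is a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        return wav.getframerate() == 16000 and wav.getsampwidth() == 2

def make_test_wav(rate: int, sample_width: int) -> io.BytesIO:
    """Build a tiny in-memory mono WAV of silence (demonstration only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * sample_width * 160)
    buf.seek(0)
    return buf

print(is_streaming_ready_wav(make_test_wav(16000, 2)))  # 16 kHz, 16-bit
print(is_streaming_ready_wav(make_test_wav(8000, 2)))   # wrong sample rate
```

The compressed formats in the list (Opus, Vorbis, MPEG-4) need a different inspection tool; the `wave` module only reads PCM WAV data.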

Make sure your WebSocket connection to our dictation servers is established before sending audio.
- Consider buffering audio locally, then sending it once the connection is established.
- Send audio in packets of 300 ms or longer.
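
The buffering and packet-size advice can be sketched as follows for 16 kHz, 16-bit PCM. This is an illustration under stated assumptions, not a reference implementation: mono audio is assumed, the `AudioBuffer` class is hypothetical, and the actual WebSocket send call is left to your client library.

```python
SAMPLE_RATE_HZ = 16000   # from the supported-format list
BYTES_PER_SAMPLE = 2     # 16-bit PCM
CHANNELS = 1             # assumption: mono audio
PACKET_MS = 300          # minimum recommended packet duration

# 16000 samples/s * 2 bytes * 1 channel * 0.3 s = 9600 bytes per packet
PACKET_BYTES = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * PACKET_MS // 1000

class AudioBuffer:
    """Holds captured audio until the connection is ready, then releases full packets."""

    def __init__(self):
        self._pending = bytearray()
        self.connected = False   # set to True once the WebSocket is open

    def add(self, pcm: bytes) -> None:
        """Store captured audio; safe to call before the connection exists."""
        self._pending.extend(pcm)

    def take_packets(self, flush: bool = False) -> list:
        """Return packets ready to send; nothing is released before connect.

        With flush=True the final partial packet is released as well,
        which is useful at the end of a dictation.
        """
        if not self.connected:
            return []
        packets = []
        while len(self._pending) >= PACKET_BYTES:
            packets.append(bytes(self._pending[:PACKET_BYTES]))
            del self._pending[:PACKET_BYTES]
        if flush and self._pending:
            packets.append(bytes(self._pending))
            self._pending.clear()
        return packets
```

Audio captured while the connection is still being opened is kept rather than dropped, so the first words of a dictation are not lost.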

Shut down gracefully on error conditions to keep your application in a good state.
- The WebSocket will time out if no audio is received within 30 seconds, or if nothing but silence is received for three minutes (180 seconds).
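
One way to shut down gracefully is to track the two timeout windows on the client and close the session yourself before the server does. The sketch below is a client-side estimate only; the server remains the authority. The `TimeoutWatch` class is hypothetical, and how your application decides that a packet contains only silence is left as an assumption (the caller passes `is_silence`).

```python
import time

NO_AUDIO_TIMEOUT_S = 30    # no audio received at all
SILENCE_TIMEOUT_S = 180    # audio received, but only silence

class TimeoutWatch:
    """Client-side estimate of when the dictation server will time out."""

    def __init__(self, now=None):
        start = time.monotonic() if now is None else now
        self.last_audio = start    # last time any audio was sent
        self.last_speech = start   # last time non-silent audio was sent

    def audio_sent(self, is_silence: bool, now=None) -> None:
        """Record that a packet was sent; call once per packet."""
        t = time.monotonic() if now is None else now
        self.last_audio = t
        if not is_silence:
            self.last_speech = t

    def expired(self, now=None):
        """Return "no_audio", "silence", or None if neither window has elapsed."""
        t = time.monotonic() if now is None else now
        if t - self.last_audio >= NO_AUDIO_TIMEOUT_S:
            return "no_audio"
        if t - self.last_speech >= SILENCE_TIMEOUT_S:
            return "silence"
        return None
```

Checking `expired()` periodically lets the application stop recording, tell the user why, and release the connection cleanly instead of reacting to an unexpected close.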

Send audioDone at the end of the dictation to get stable text returned faster; without it, the engine waits a few seconds for more context before finalizing the text.
- You will have to open a new WebSocket after you send audioDone and receive the text.
- If you do not want to close the dictation, send a BOUNDARYREQUEST instead, which prompts the dictation server to send stable text a little sooner without closing the WebSocket.
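
The choice between the two signals can be captured in a small piece of session state. This sketch only models which signal to send and whether the WebSocket remains usable afterwards; the `DictationSession` class is hypothetical, and the wire format of each message is deliberately not shown, so consult the API reference for it.

```python
class DictationSession:
    """Tracks whether the current WebSocket can still accept requests."""

    def __init__(self):
        self.open = True

    def request_stable_text(self, finished: bool) -> str:
        """Return the name of the signal to send to the dictation server.

        finished=True  -> "audioDone": the dictation ends and a new
                          WebSocket is needed for the next one.
        finished=False -> "BOUNDARYREQUEST": stable text arrives sooner
                          and the WebSocket stays open.
        """
        if not self.open:
            raise RuntimeError("audioDone already sent; open a new WebSocket")
        if finished:
            self.open = False
            return "audioDone"
        return "BOUNDARYREQUEST"
```

For example, a "pause" button might call `request_stable_text(finished=False)`, while a "stop" button calls it with `finished=True` and then discards the session once the final text has arrived.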

You can give users an interactive indication that recording is in progress and transcription is happening by displaying hypothesis text.
- Also use colors, graphics, a gain indicator, a depressed button state, etc. to indicate that recording has started.

If you want your users to be able to edit the text while they’re dictating, place stable text in your application, not hypothesis text.
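
A simple way to combine both kinds of text is to keep them in separate fields: stable text is committed to the editable document, while hypothesis text is shown only as a transient preview. The sketch below assumes that each hypothesis replaces the previous one and that stable text arrives in increments to be appended; both are assumptions about your integration, and the `TranscriptView` class is hypothetical.

```python
class TranscriptView:
    """Separates committed (editable) text from the live preview."""

    def __init__(self):
        self.stable = ""       # committed text the user may edit
        self.hypothesis = ""   # transient preview, never committed

    def on_hypothesis(self, text: str) -> None:
        """Replace the preview; the editable text is untouched."""
        self.hypothesis = text

    def on_stable(self, text: str) -> None:
        """Commit finalized text and clear the preview it supersedes."""
        self.stable += text
        self.hypothesis = ""

    def render(self) -> str:
        """Text to display: committed text followed by the preview."""
        return self.stable + self.hypothesis
```

Because user edits only ever touch `stable`, a later hypothesis update cannot overwrite something the user has just corrected.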

Make sure you’ve done your due diligence in testing and diagnosing issues in your application before reaching out to our Support team.
- For example, are your nVoq credentials correct?
- What system/environment are you using (test, healthcare, Canada)?
- Is the issue affecting all your user accounts, or just one user / location?

Include detailed logging in your application so you can provide accurate and complete details to our Support team in case you need to open a ticket.
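
As one possible starting point, the sketch below uses Python's standard `logging` module to stamp every log line with the environment and user, which are the details Support is likely to ask for. The logger name, field names, and example values are illustrative assumptions.

```python
import logging

def make_logger(environment: str, user: str) -> logging.LoggerAdapter:
    """Return a logger that adds the environment and user to every record."""
    logger = logging.getLogger("dictation")
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:  # avoid attaching duplicate handlers
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)s env=%(environment)s "
            "user=%(user)s %(message)s"))
        logger.addHandler(handler)
    # Log the user name only, never the password.
    return logging.LoggerAdapter(logger, {"environment": environment, "user": user})

log = make_logger("test", "user-123")
log.info("websocket connected")
log.error("websocket closed: code=%s reason=%s", 1006, "abnormal closure")
```

Recording connection events, close codes, and timestamps makes it much easier to tell whether a problem affects one user or every account.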

If you want to support spoken command and control during a dictation (on a single-button press), use Word Markers to identify and remove the spoken command and control phrases from the transcript before rendering the text to the screen.
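
The removal step might look like the sketch below. It assumes the transcript is already available as a sequence of individual words; the actual Word Marker format is not described here, so the input shape, the function name, and the example command phrase are all hypothetical.

```python
def strip_commands(words, commands):
    """Remove every occurrence of each command phrase from a list of words.

    Matching is case-insensitive and on whole words; longer phrases are
    tried first so "next field" wins over "next".
    """
    phrases = sorted((c.lower().split() for c in commands), key=len, reverse=True)
    lowered = [w.lower() for w in words]
    kept = []
    i = 0
    while i < len(words):
        for phrase in phrases:
            if phrase and lowered[i:i + len(phrase)] == phrase:
                i += len(phrase)   # skip the spoken command
                break
        else:
            kept.append(words[i])  # ordinary dictated word
            i += 1
    return kept

spoken = "the patient is stable next field no known allergies".split()
print(" ".join(strip_commands(spoken, ["next field"])))
```

In a real integration the matched phrase would also trigger the corresponding action (for example, moving focus to the next field) before the remaining text is rendered.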

For best transcription results, use medically relevant dictation content. Medical dictation samples for different topics can be found at www.mtsamples.com.

2024 nVōq Inc. All rights reserved.