Dictation Use Case Example
1. App is in an idle state
2. User clicks Start Dictation button
   2.1 App UI changes state (Ex. Displays “Starting”, disables button)
   2.2 App tells sound card to start recording audio
   2.3 App creates new WebSocket
   2.4 App connects WebSocket to nVōq (Ex. Hostname “test.nVoq.com”)
   2.5 If connection fails, App displays an error to the user and returns to an idle state
   2.6 App sends “STARTDICTATION” JSON request message to nVōq via WebSocket
      2.6.1 This message includes credentials (Ex. username + password, API Key, or SSO Token) amongst other configurable parameters (see API docs)
   2.7 App receives “STARTDICTATION” JSON response message from nVōq via WebSocket (asynchronous)
      2.7.1 This response message indicates success or failure of server-side verification of the parameters supplied by the client in the STARTDICTATION request (Ex. Authentication, audio format settings, …)
   2.8 App receives short (Ex. ~300 millisecond) PCM audio buffer from sound card (asynchronous)
      2.8.1 App passes PCM audio to OggVorbis encoding library
      2.8.2 App retrieves OggVorbis encoded audio from OggVorbis library
      2.8.3 App sends short (Ex. ~300 millisecond) OggVorbis audio byte buffer to nVōq via WebSocket
   2.9 App receives updated transcript TEXT JSON from nVōq via WebSocket (asynchronous)
      2.9.1 App displays updated text in UI
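
The audio loop in steps 2.8.1 to 2.8.3 can be sketched roughly as below. This is only an illustration under stated assumptions: the 16 kHz, 16-bit mono format is an assumed example, `encode` is a hypothetical stand-in for whatever OggVorbis library the App uses, and `send` is a hypothetical stand-in for the WebSocket's binary send call. A real App receives buffers asynchronously from the sound card; here a prerecorded PCM byte string is sliced into ~300 millisecond pieces instead.

```python
from typing import Callable, Iterator


def pcm_chunks(
    pcm: bytes,
    sample_rate: int = 16000,   # assumed example rate, not taken from the API docs
    bytes_per_sample: int = 2,  # assumed 16-bit samples
    channels: int = 1,          # assumed mono
    chunk_ms: int = 300,        # buffer length from step 2.8
) -> Iterator[bytes]:
    """Yield consecutive PCM slices of roughly chunk_ms milliseconds each."""
    frame_bytes = bytes_per_sample * channels
    chunk_bytes = sample_rate * chunk_ms // 1000 * frame_bytes
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


def stream_audio(
    pcm: bytes,
    encode: Callable[[bytes], bytes],  # hypothetical encoder wrapper (steps 2.8.1, 2.8.2)
    send: Callable[[bytes], None],     # hypothetical WebSocket send (step 2.8.3)
) -> int:
    """Encode and send each short buffer; return the number of buffers sent."""
    sent = 0
    for chunk in pcm_chunks(pcm):
        send(encode(chunk))
        sent += 1
    return sent
```

With these assumed settings, each 300 millisecond buffer holds 16000 * 0.3 * 2 = 9600 bytes of PCM before encoding, so one second of audio is sent as four buffers (the last one shorter).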
3. User clicks Stop Dictation button at their discretion
   3.1 App tells sound card to stop recording audio
   3.2 App sends “AUDIODONE” JSON message to nVōq via WebSocket
   3.3 App might receive updated transcript TEXT JSON from nVōq via WebSocket (asynchronous)
      3.3.1 App displays updated text in UI
   3.4 App receives a “textDone=true” JSON message from nVōq via WebSocket, indicating end of transcript (asynchronous)
      3.4.1 App closes WebSocket
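
Steps 3.3 and 3.4 amount to reading messages until the end-of-transcript marker arrives. The sketch below illustrates that loop only. It assumes the marker appears as a top-level boolean `textDone` field and that transcript updates carry a top-level `text` field; both layouts are guesses for illustration, so check the API docs for the real message shape. `show_text` is a hypothetical UI callback.

```python
import json
from typing import Callable, Iterable


def drain_transcript(
    messages: Iterable[str],
    show_text: Callable[[str], None],  # hypothetical UI update callback (step 3.3.1)
) -> bool:
    """Display transcript updates until a textDone message is seen.

    Returns True if the end-of-transcript marker arrived (the caller can
    then close the WebSocket, step 3.4.1), or False if the messages ran out.
    """
    for raw in messages:
        msg = json.loads(raw)
        if "text" in msg:                 # assumed field name
            show_text(msg["text"])
        if msg.get("textDone") is True:   # assumed field layout
            return True
    return False
```

Returning a flag rather than closing the socket inside the loop keeps the sketch independent of any particular WebSocket library.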
4. App returns to an idle state
   4.1 App UI indicates dictation is complete (Ex. Displays “Finished” or removes status display, re-enables button)
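
Taken together, the use case above describes a small state machine, which could be modeled along these lines. The state and event names are invented for illustration and are not part of the nVōq API. The transition back to idle when the STARTDICTATION response reports failure is also an assumption, since the steps above only spell out the connection-failure case.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()       # steps 1 and 4
    STARTING = auto()   # steps 2.1 to 2.7
    DICTATING = auto()  # steps 2.8 and 2.9
    FINISHING = auto()  # steps 3.1 to 3.4


# (current state, event) -> next state; event names are hypothetical
TRANSITIONS = {
    (State.IDLE, "start_clicked"): State.STARTING,
    (State.STARTING, "connect_failed"): State.IDLE,   # step 2.5
    (State.STARTING, "start_rejected"): State.IDLE,   # assumed handling of a failed 2.7 response
    (State.STARTING, "start_accepted"): State.DICTATING,
    (State.DICTATING, "stop_clicked"): State.FINISHING,
    (State.FINISHING, "text_done"): State.IDLE,       # steps 3.4 and 4
}


def next_state(state: State, event: str) -> State:
    """Return the next state, ignoring events that do not apply to the current state."""
    return TRANSITIONS.get((state, event), state)
```

Ignoring unexpected events (for example a second Start click while already dictating) mirrors the button being disabled in step 2.1.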