Matching is a simpler operation than continuous speech recognition. Rather than streaming audio and receiving text, or
uploading large audio files for batch transcription, matching identifies what a user said from a
limited set of choices. This constrained vocabulary is useful when an application needs to capture a command
from a user and that command is one of a relatively small number of options. For example, if a medical record system
has five navigation tabs, the acceptable commands might be "open tab one", "open tab two", and so on. Because the
set of choices is fixed, the application can program a specific response to each discrete spoken input.
Before You Begin
API User Account
If your organization has not already been in contact with our Sales team, please complete this short form on the Developer Registration Page
and we will reach out to you regarding a user account and development with our APIs.
Once you have an account, you must change your password before the account can be used for API calls.
Audio Format
The nVoq Matching API supports G.711 mu-law (uLaw) audio with the following properties:
Sample Rate: 8000 Hz
Sample Size: 8 bit
Channels: 1 (mono)
Frame Size: 1
Frame Rate: 8000
Big Endian: False (no endianness)
Signed / Unsigned: Signed
HTTP Format Alias: uLaw
HTTP Content-Type: audio/x-wav
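Before uploading, it can help to confirm that a file really has these properties. The following is a minimal bash sketch, not part of the nVoq samples, that reads the header fields of a WAVE file with od. It assumes a canonical RIFF/WAVE layout in which the "fmt " chunk immediately follows the 12-byte RIFF header, and it relies on the standard WAVE format tag 7 for mu-law. The function names read_le and check_wav_format are hypothetical.

```shell
#!/bin/bash

# Print the unsigned little-endian integer stored in file $1
# at byte offset $2, spanning $3 bytes.
read_le() {
  local file=$1 offset=$2 count=$3 value=0 shift_by=0 byte
  for byte in $(od -An -tu1 -j "$offset" -N "$count" "$file"); do
    value=$(( value + (byte << shift_by) ))
    shift_by=$(( shift_by + 8 ))
  done
  echo "$value"
}

# Compare the header of WAVE file $1 with the properties listed above.
# ASSUMPTION: the "fmt " chunk starts at byte 12 (canonical layout).
check_wav_format() {
  local file=$1 format channels rate bits
  format=$(read_le "$file" 20 2)    # format tag, 7 = mu-law
  channels=$(read_le "$file" 22 2)  # 1 = mono
  rate=$(read_le "$file" 24 4)      # samples per second
  bits=$(read_le "$file" 34 2)      # bits per sample
  if [ "$format" -eq 7 ] && [ "$channels" -eq 1 ] \
     && [ "$rate" -eq 8000 ] && [ "$bits" -eq 8 ]; then
    echo "ok"
  else
    echo "unexpected format: tag=$format channels=$channels rate=$rate bits=$bits"
    return 1
  fi
}
```

For example, check_wav_format matchingaudio.wav prints ok for a matching file and otherwise prints the values it found. A file that carries extra chunks before "fmt " would need a real chunk parser rather than fixed offsets.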
Start your IDE
The nVoq API is a RESTful web services and WebSocket API, so it does not constrain you to any specific platform or
programming language. We provide sample code below for shell scripting (bash), C#, Java, and JavaScript.
Follow along and run this code in your environment. If you prefer C++, Go, or some other language,
that's great! Just adapt the code below to your language's web services functionality and you should be good to go.
Let's Go!
Choose your programming language...
Step 1: Create Grammar
To define the matching choices, upload a list of those choices to the server. There are
two ways to do this: upload a plain list of words or, if the set of possible spoken phrases
is more complex, upload an XML-based grammar. In either case, the result
is a grammar location reference that is used when performing matching operations. If you upload a file, it must be in Unix format (each line ends with a line feed only, not a carriage return followed by a line feed).
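Word lists saved on Windows typically end each line with a carriage return and a line feed. A minimal bash sketch for producing a Unix-format copy before upload, assuming the standard tr and wc utilities are available; the function names has_carriage_returns and to_unix_format are hypothetical:

```shell
#!/bin/bash

# Succeed (exit status 0) if file $1 contains at least one carriage return.
has_carriage_returns() {
  local count
  count=$(tr -cd '\r' < "$1" | wc -c)
  [ "$((count))" -gt 0 ]
}

# Write a copy of file $1 to $2 with every carriage return removed,
# leaving the line feed as the only line terminator.
to_unix_format() {
  tr -d '\r' < "$1" > "$2"
}
```

A typical call would be has_carriage_returns matchingwords.txt && to_unix_format matchingwords.txt matchingwords.unix.txt, after which the converted copy is the file to upload.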
#!/bin/bash
#Helper function to collect results from
#the location header in the HTTP response
collect_location()
{
  sed -n -e 's/^.*Location: //p' | tr -d ' \r'
}
#server info
serverInfo="test.nvoq.com"
#credentials
user="yourUserName"
password="yourPassword"
#Content type must be:
# - "text/plain" for word list
# - "text/xml" for speech xml grammar
# Upload the list of words --
# the server creates a grammar from this list and returns its URL
grammar_url=$(curl -v -u "${user}:${password}" -X POST \
  --header "Content-Type: text/plain" --data-binary "@matchingwords.txt" \
  "https://${serverInfo}/scgrammar/NUANCE?recognizerLocale=en-US" \
  2>&1 | collect_location)
####################################################
# matchingwords.txt contains the following lines
#
# red
# green
# blue
# yellow
# orange
#
####################################################
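If you do not yet have a word list, the file described above can be generated from the shell. This is a small sketch that assumes one word per line, as in the listing; write_sample_words is a hypothetical helper name. Because printf emits a line feed only, the result is already in Unix format.

```shell
#!/bin/bash

# Write the five sample words to file $1, one per line, LF line endings.
write_sample_words() {
  printf '%s\n' red green blue yellow orange > "$1"
}
```

Calling write_sample_words matchingwords.txt produces the file that the upload command reads with --data-binary.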
import java.io.*;
import java.net.*;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.Files;
import java.util.*;
public class Program {
    // Path to 8-bit 8-kHz Mono uLaw wave file
    private String myAudioFilePath = "./";
    // Your username
    private String username = "yourUsername";
    // Your password
    private String password = "yourPassword";
    // Server URL
    private String baseUrl = "https://test.nvoq.com";
    // "audio/ogg" when using Ogg format
    // "audio/x-wav" when using WAVE format
    private String audioContentType = "audio/x-wav";
    // text/plain for list of words
    // text/xml for grammar
    private String grammarContentType = "text/plain";

    private String createGrammar(String grammarFileName) {
        // The service creates a Nuance matching server compatible grammar
        String url = baseUrl + "/scgrammar/NUANCE";
        byte[] postData;
        try {
            postData = Files.readAllBytes(Paths.get(grammarFileName));
            URL myurl = new URL(url);
            HttpURLConnection con = (HttpURLConnection) myurl.openConnection();
            con.setDoOutput(true);
            // HTTP Basic authentication: Base64 of "username:password"
            String credentials = username + ":" + password;
            String basicAuth = "Basic " +
                Base64.getEncoder().encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
            con.setRequestProperty("Authorization", basicAuth);
            con.setRequestMethod("POST");
            con.setRequestProperty("Content-Type", grammarContentType);
            // Close the stream after writing so the request body is released
            try (DataOutputStream wr = new DataOutputStream(con.getOutputStream())) {
                wr.write(postData);
                wr.flush();
            }
            // The grammar URL is returned in the Location response header
            String location = con.getHeaderField("Location");
            return location;
        } catch (Exception e) {
            System.out.println(e.toString());
        }
        return "see error message above";
    }
//--------------------------------------------------
// matchingwords.txt contains the following lines
//
// red
// green
// blue
// yellow
// orange
//
//--------------------------------------------------
<!-- ===================================================== -->
<!-- Matching JavaScript How-To. The script below -->
<!-- performs a matching operation over HTTP. -->
<!-- ===================================================== -->
<!-- REMEMBER TO CONSIDER THE IMPACT OF CORS -->
<!-- You must disable it in your browser or -->
<!-- contact your nVoq representative to have -->
<!-- your domain added to the allowed list. -->
<!-- ===================================================== -->
<html>
<meta charset="UTF-8">
<body>
<!-- Simple audio file upload input -->
<p>nVoq API Matching HowTo</p>
<p>Choose the grammar file to upload.</p>
<input type="file" id="grammarFileInput" />
<br/>
<label id="grammarURLLabel">Grammar URL: --</label>
<br/>
<p>Choose the audio file to upload.</p>
<input type="file" id="audioFileInput" />
<br />
<label id="audioURLLabel">Audio URL: --</label>
<br />
<br/>
<input type="button" id="match" value="Perform Match"/>
<br/>
<textarea rows="25" cols="75" id="results">Matching results will appear here</textarea>
<script>
var text = "";
var status = "WORKING";
var connected = false;
var username = "yourUsername";
var password = "yourPassword";
var audioURL = "";
var grammarURL = "";

function readGrammarFile(evt) {
  // Retrieve the first (and only) File from the FileList object
  var f = evt.target.files[0];
  if (f) {
    var r = new FileReader();
    r.onload = function(e) {
      var authString = "Basic " + btoa(username + ":" + password);
      var xhr = new XMLHttpRequest();
      xhr.open('POST', "https://test.nvoq.com/scgrammar/NUANCE", true);
      xhr.setRequestHeader("Content-Type", "text/plain");
      xhr.setRequestHeader("Authorization", authString);
      xhr.onreadystatechange = processRequest;
      xhr.send(r.result);
      function processRequest(message) {
        // This handler fires on every state change; the response headers
        // are only complete once the request has finished
        if (xhr.readyState !== XMLHttpRequest.DONE) {
          return;
        }
        grammarURL = xhr.getResponseHeader("Location");
        document.getElementById('grammarURLLabel').textContent = "Grammar URL: " + grammarURL;
      }
    };
    r.readAsText(f);
  } else {
    alert("Failed to load file");
  }
}
//...
document.getElementById('grammarFileInput').addEventListener('change',
  readGrammarFile, false);
//...
//--------------------------------------------------
// matchingwords.txt contains the following lines
//
// red
// green
// blue
// yellow
// orange
//
//--------------------------------------------------
</script>
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;
using System.Threading;
namespace nVoqHttpApiCSharp
{
    class Program
    {
        /**** Begin configuration settings ****/
        // Path to 8-bit 8-kHz Mono uLaw File
        const string AudioFilePath = @"c:\path\to\your\matchingaudio.wav";
        const string GrammarFilePath = @"c:\path\to\your\matchingwords.txt";
        // Your username
        const string Username = "yourUserName";
        // Your password
        const string Password = "yourPassword";
        // Server URL
        const string BaseUrl = "https://test.nvoq.com";
        const string AudioContentType = "audio/x-wav";
        // text/plain for list of words
        // text/xml for grammar
        const string GrammarContentType = "text/plain";

        private static string createGrammar(string grammarFile)
        {
            // BuildRequest is a helper that is not shown in this excerpt
            HttpWebRequest request = BuildRequest("POST", BaseUrl, "/scgrammar/NUANCE");
            if (!File.Exists(grammarFile))
                throw new Exception("Could not locate grammar file: " + grammarFile);
            byte[] grammarBytes = File.ReadAllBytes(grammarFile);
            request.ContentType = GrammarContentType;
            using (Stream requestStream = request.GetRequestStream())
                requestStream.Write(grammarBytes, 0, grammarBytes.Length);
            // Dispose the response once the Location header has been read
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                if (response.StatusCode != HttpStatusCode.Created)
                    throw new Exception("Unexpected HTTP response: " + response.StatusCode);
                return response.Headers.Get("Location");
            }
        }
//--------------------------------------------------
// matchingwords.txt contains the following lines
//
// red
// green
// blue
// yellow
// orange
//
//--------------------------------------------------
Step 2: Upload Audio
Each implementation below uploads the audio file using the HTTP facilities of its platform.
If you don't have an audio file readily available,
you can download one here.
# Upload the audio --
# the server returns a reference to the audio location
audio_url=$(curl -v -X POST -u "${user}:${password}" \
  --header "Content-Type: audio/x-wav" \
  --data-binary "@matchingaudio.wav" "https://${serverInfo}/SCFileserver/audio" 2>&1 \
  | collect_location)
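Both upload steps rely on finding a Location header in curl's verbose output. Connections that use HTTP/2 report header names in lowercase, which a case-sensitive pattern would miss, and a failed upload produces no Location header at all. The following sketch is an alternative helper under those assumptions; it is not part of the nVoq samples, and the name extract_location is hypothetical. It reads curl -v output on standard input, matches the header name in any letter case, and returns a non-zero status when nothing is found.

```shell
#!/bin/bash

# Read "curl -v" output on stdin and print the value of the first
# Location response header. Returns 1 if no such header is present.
extract_location() {
  local url
  # Response header lines in verbose output start with "< ".
  url=$(sed -n -e 's/^< *[Ll][Oo][Cc][Aa][Tt][Ii][Oo][Nn]:[[:space:]]*//p' \
        | head -n 1 | tr -d '\r')
  if [ -z "$url" ]; then
    return 1
  fi
  printf '%s\n' "$url"
}
```

Used at the end of either curl pipeline in place of the original helper, it lets the calling script test the exit status and stop before attempting a match with an empty URL.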
Below is the full sample code. Copy and paste the entire contents of the code below into your favorite editor and
save it locally on your machine. Modify the URLs and username/password according to your credentials and
system access. Then, run the program and enjoy all the excitement of securely converting audio to text via
the nVoq.API platform.
If you have any questions, please reach out to support@nvoq.com.