JavaScript
Introduction
MIT License
The JavaScript library is made available under the MIT License.
To get started, include the JavaScript library on your pages:
<script src="https://cdn.voysis.io/js/voysis.js"></script>
Versions
The JavaScript library uses the standard MAJOR.MINOR.PATCH semantic versioning scheme (for example, voysis-1.1.4.js). In short:
- MAJOR version changes when there are incompatible API changes with the previous version.
- MINOR version changes when new functionality is added in a backwards-compatible manner.
- PATCH version changes when there are backwards-compatible bug fixes.
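As a rough illustration of these rules, the sketch below extracts a version from a file name such as voysis-1.1.4.js and reports which part changed between two versions. The helper names parseVersion and classifyUpgrade are hypothetical and are not part of the Voysis library.

```javascript
// Hypothetical helper: parse "voysis-1.1.4.js" or "1.1.4" into numeric parts.
function parseVersion(nameOrVersion) {
  const match = /(\d+)\.(\d+)\.(\d+)/.exec(nameOrVersion);
  if (!match) {
    throw new Error('No MAJOR.MINOR.PATCH version found in: ' + nameOrVersion);
  }
  return {
    major: Number(match[1]),
    minor: Number(match[2]),
    patch: Number(match[3])
  };
}

// Hypothetical helper: report the most significant part that differs.
// 'major' signals possible incompatible API changes; 'minor' and 'patch'
// are expected to be backwards compatible.
function classifyUpgrade(from, to) {
  const a = parseVersion(from);
  const b = parseVersion(to);
  if (b.major !== a.major) return 'major';
  if (b.minor !== a.minor) return 'minor';
  if (b.patch !== a.patch) return 'patch';
  return 'none';
}
```

A result of 'major' is the case where you would review your integration before switching versions.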
Minified
A minified version of all JavaScript files will be available with the extension .min.js.
Latest Version
voysis.js and voysis.min.js will always be aliases for the latest library version, so you should use them unless you need a specific version.
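The naming rules above can be sketched as a small URL builder. This is an illustration only: voysisScriptUrl is a hypothetical helper, and it assumes that versioned files live in the same CDN directory as voysis.js and that a versioned minified file is named like voysis-1.1.4.min.js. Neither assumption is confirmed by this page.

```javascript
// Base URL taken from the <script> example earlier on this page.
const CDN_BASE = 'https://cdn.voysis.io/js/';

// Hypothetical helper: build the script URL for the latest alias or for a
// specific version. The versioned naming pattern is an assumption.
function voysisScriptUrl(options) {
  const opts = options || {};
  const suffix = opts.minified ? '.min.js' : '.js';
  // Without a version, fall back to the alias that tracks the latest release.
  const name = opts.version ? 'voysis-' + opts.version : 'voysis';
  return CDN_BASE + name + suffix;
}
```

Calling voysisScriptUrl() with no arguments yields the alias URL used in the script tag above.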
HTTPS Only
Your web page must be loaded using HTTPS, and you must include the JavaScript library using HTTPS. Most modern browsers block access to the microphone unless both the page and the JavaScript library are loaded over HTTPS.
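If you want to detect this situation before offering voice input, a check along the following lines may help. It relies on the standard window.isSecureContext property and on navigator.mediaDevices.getUserMedia being present. The function name microphoneBlocker is hypothetical, and this is not a description of how the Voysis library performs its own checks.

```javascript
// Hypothetical helper: returns a short reason string when microphone capture
// cannot work on this page, or null when the basic prerequisites are met.
// Pass the browser's `window` object (or a stand-in with the same shape).
function microphoneBlocker(win) {
  // Browsers only expose microphone capture in secure (HTTPS) contexts.
  if (!win.isSecureContext) {
    return 'insecure-context';
  }
  const nav = win.navigator;
  if (!nav || !nav.mediaDevices || typeof nav.mediaDevices.getUserMedia !== 'function') {
    return 'no-getusermedia';
  }
  return null;
}
```

In a page you would call microphoneBlocker(window) and hide or disable the voice control when the result is not null.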
Initialising the Voysis Session
Before you start sending queries, you need to create a VoysisSession. This session can be created once and used for the lifetime of your web page. The two pieces of information that you must provide are your Voysis endpoint and the audio profile identifier (see the section in General Concepts for more information on what this value means).
var voysisSession = new VoysisSession({
  host: 'mycompany.voysis.io',
  audioProfileId: 'f8338e44-9d48-11e7-abc4-cec278b6b50a',
  refreshToken: 'xP+nebIpJEXP09fATj5CMqULIBD549eAwSYZUerysUgItv0XRQJh+EEZaNFOplaY'
});
The complete list of properties that can be provided to the VoysisSession constructor is:
Name | Value Type | Description |
---|---|---|
host | String | Your VoysisAI endpoint's host name. |
audioProfileId | String | See the description of audioProfileId here |
refreshToken | String | The refresh token used to obtain application tokens for endpoint authorization. See Authorization. |
userId | String | See the description of userId here |
debugEnabled | Boolean | Enable or disable debug messages. Debug messages will be written to a browser's console. |
streamingAudioDeadline | Integer | The amount of time, in milliseconds, that the client will wait for a valid response to an audio query before assuming it has failed. Defaults to 20000 (20 seconds). |
tokenExpiryMargin | Integer | The margin, in milliseconds, that the client will allow around the current application token's expiry time before automatically renewing it. Defaults to 30000 (30 seconds). This means that if a request is made in the 30 seconds leading up to the application token's expiry time, a new application token will be obtained. |
audioBufferSize | Integer | The size, in bytes, of the audio buffer to use when recording audio. Defaults to 4096. |
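The sketch below shows one way to assemble these properties before passing them to the constructor, and illustrates what the tokenExpiryMargin description means in terms of timestamps. Both buildSessionOptions and isWithinExpiryMargin are hypothetical helpers written for this example. The defaults are copied from the table, and the margin check is only an illustration, not the library's internal logic.

```javascript
// Defaults copied from the property table above.
const SESSION_DEFAULTS = {
  streamingAudioDeadline: 20000, // milliseconds
  tokenExpiryMargin: 30000,      // milliseconds
  audioBufferSize: 4096          // bytes
};

// Hypothetical helper: merge caller overrides over the documented defaults
// and check that the two mandatory properties are present.
function buildSessionOptions(overrides) {
  const options = Object.assign({}, SESSION_DEFAULTS, overrides);
  ['host', 'audioProfileId'].forEach(function (key) {
    if (typeof options[key] !== 'string' || options[key] === '') {
      throw new Error('Missing required VoysisSession option: ' + key);
    }
  });
  return options;
}

// Illustration of the margin rule: true when `nowMs` falls inside the
// margin before the token's expiry time (or after it has already expired).
function isWithinExpiryMargin(nowMs, expiresAtMs, marginMs) {
  return nowMs >= expiresAtMs - marginMs;
}
```

The resulting object can then be passed to the constructor, for example new VoysisSession(buildSessionOptions({ host: 'mycompany.voysis.io', audioProfileId: '...' })).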
Asynchronous Calls and Promises
All of the methods available on VoysisSession are executed asynchronously. Each method returns a JavaScript Promise that resolves when the asynchronous method completes successfully, or rejects if there is a failure.
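Because the methods return standard promises, they can also be consumed with async/await. The following is a minimal sketch using the sendAudioQuery method described in the next section. The wrapper askOnce and the shape of its return value are assumptions made for this example.

```javascript
// Hypothetical wrapper: `session` is expected to be a VoysisSession, but any
// object whose sendAudioQuery(locale) returns a Promise works for this sketch.
async function askOnce(session, locale) {
  try {
    // Resolves with the query result once the server has responded.
    const queryResult = await session.sendAudioQuery(locale);
    return { ok: true, queryResult: queryResult };
  } catch (error) {
    // A rejected promise surfaces here as a thrown error.
    return { ok: false, error: error };
  }
}
```

Returning a result object instead of rethrowing keeps the calling code free of its own try/catch, which may or may not suit your application.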
Executing an Audio Query
The sendAudioQuery method on VoysisSession creates a new audio query and starts streaming audio to the Voysis Query API. Streaming stops when the server indicates that the user has stopped speaking, and the final query response is delivered to your application when it is ready.
The function takes the following parameters:
- locale: (mandatory) A string representing the locale of the query.
- context: (optional) An object representing the current context under which the query is being performed.
- conversationId: (optional) A string value of the identifier of the conversation that this query will be part of.
- audioContext: (optional) An AudioContext instance that will be used to stream audio from.
voysisSession.sendAudioQuery('en-US').then(function (queryResult) {
  console.log('You said: ' + queryResult['textQuery']['text']);
}).catch(function (error) {
  console.log("ERROR: " + JSON.stringify(error));
});
The object passed to the callback is the result of the query:
{
  "id": "6bd98943-caaa-4058-a32b-9c540cc0f8db",
  "queryType": "audio",
  "textQuery": {
    "text": "show me freshly squeezed orange juice"
  },
  "audioQuery": {
    "mimeType": "audio/pcm;bits=16;rate=16000"
  },
  "intent": "newSearch",
  "reply": {
    "text": "Here's what I found"
  },
  "entities": {
    "keywords": [
      "freshly",
      "squeezed",
      "orange",
      "juice"
    ],
    "products": [],
    "queryString": "freshly squeezed orange juice",
    "sortBy": ""
  },
  "_links": {
    "self": {
      "href": "/conversations/bb30b7de-ae01-4868-9b6a-bf16ecd95126/queries/6bd98943-caaa-4058-a32b-9c540cc0f8db"
    },
    "audio": {
      "href": "/conversations/bb30b7de-ae01-4868-9b6a-bf16ecd95126/queries/6bd98943-caaa-4058-a32b-9c540cc0f8db/audio"
    },
    "conversation": {
      "href": "/conversations/bb30b7de-ae01-4868-9b6a-bf16ecd95126"
    }
  },
  "_embedded": {}
}
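To show how the fields in this response might be consumed, here is a small sketch that pulls out the values an application would typically display. The function summariseQueryResult is hypothetical, and it assumes only the field names visible in the example response above. Missing fields fall back to empty values.

```javascript
// Hypothetical helper: reduce a query result to the fields a UI is likely
// to need. Field names are taken from the example response above.
function summariseQueryResult(queryResult) {
  const textQuery = queryResult.textQuery || {};
  const reply = queryResult.reply || {};
  const entities = queryResult.entities || {};
  return {
    heard: textQuery.text || '',   // what the user was understood to say
    intent: queryResult.intent || '',
    reply: reply.text || '',       // text to show or speak back to the user
    keywords: Array.isArray(entities.keywords) ? entities.keywords.slice() : []
  };
}
```

For the example response, the summary would contain the recognised text, the newSearch intent, the reply text, and the four keywords.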
Microphone Permissions
The browser will prompt the user for permission when they try to stream audio from their browser.
Most browsers will only prompt the user the first time they attempt to stream audio from a web page, but some browsers may prompt for permission every time the page is reloaded. Using AJAX page updates is recommended to avoid this.
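Where the browser supports it, the standard Permissions API can tell you in advance whether a prompt is likely. The sketch below is an assumption-laden illustration: microphonePermissionState is a hypothetical helper, not part of the Voysis library, and some browsers do not accept the 'microphone' permission name, in which case it reports 'unknown'.

```javascript
// Hypothetical helper: pass the browser's `navigator` object.
// Resolves to 'granted', 'denied', 'prompt', or 'unknown'.
async function microphonePermissionState(nav) {
  if (!nav || !nav.permissions || typeof nav.permissions.query !== 'function') {
    return 'unknown';
  }
  try {
    const status = await nav.permissions.query({ name: 'microphone' });
    return status.state;
  } catch (error) {
    // Some browsers reject or throw for the 'microphone' permission name.
    return 'unknown';
  }
}
```

A state of 'prompt' suggests the user will be asked when streaming starts, so you may want to explain why the microphone is needed first.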
Browser Support
Not every browser on every operating system supports access to the microphone in order to capture audio.
Below is the list of browsers that we have tested, showing which do and don't allow streaming audio from the microphone, by operating system.
Browser | Mac OSX | Windows | iOS | Android | Linux |
---|---|---|---|---|---|
Chrome | ✅ | ✅ | ❌ | ✅ | ✅ |
Firefox | ✅ | ✅ | - | - | ✅ |
Safari | ✅ | - | ✅ | - | - |
Internet Explorer | - | ❌ | - | - | - |
Microsoft Edge | - | ✅ | - | - | - |
Some additional information can be viewed at caniuse.com.
A - in the above table means the browser was not available or not tested on the platform specified.
Notes
Internet Explorer
Microsoft have stated that web audio streaming won't be added to Internet Explorer. The latest version available at the time of testing was 11.
iOS
Only iOS 11 or later supports web audio streaming - no browser on any prior version of iOS will work.
At the time of testing, only Safari supported web audio streaming. Chrome (61.0) does not support audio streaming.
Safari on iOS will ask for microphone permissions on every page load. If you wish to support multiple audio queries on a single page without the user being asked for permission each time, avoid page reloads and use AJAX-based page updates instead.
JavaScript Method
A method called isStreamingAudioSupported is available on the VoysisSession object to help developers determine whether the current browser supports streaming audio from a microphone:
if (!VoysisSession.isStreamingAudioSupported()) {
  alert('Browser not supported');
}
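Rather than showing an alert, an application might fall back to text input when streaming is unavailable. The following sketch shows that idea. The function chooseInputMode and the 'voice' and 'text' labels are hypothetical, and the only library call used is isStreamingAudioSupported as shown above.

```javascript
// Hypothetical helper: decide which input mode to offer.
// `SessionClass` is expected to be the VoysisSession constructor; passing
// undefined (for example, if the script failed to load) yields 'text'.
function chooseInputMode(SessionClass) {
  const supported = Boolean(
    SessionClass &&
    typeof SessionClass.isStreamingAudioSupported === 'function' &&
    SessionClass.isStreamingAudioSupported()
  );
  return supported ? 'voice' : 'text';
}
```

In a page this could be called as chooseInputMode(typeof VoysisSession !== 'undefined' ? VoysisSession : undefined).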
Low-Level Session Methods
The JavaScript SDK provides methods for lower-level access to the Voysis Query API than the sendAudioQuery convenience method offers. These are documented below. Note that these methods all execute asynchronously, so they should be chained together using their returned promises to function correctly.
Create Audio Query
Before you start streaming audio to Voysis, you need to create an Audio Query entity that represents the query. (See General Concepts for a description of conversations and queries.)
You create a new audio query by calling createAudioQuery on VoysisSession and passing in:
- locale: (mandatory) A string representing the locale of the query.
- context: (optional) An object representing the current context under which the query is being performed.
- conversationId: (optional) A string value of the identifier of the conversation that this query will be part of.
- audioContext: (optional) An AudioContext instance that will be used to stream audio from.
For example:
var locale = 'en-US';
var context = {
  "keywords": [
    "sprinting",
    "sneakers"
  ],
  "attributes": {
    "color": "red"
  }
};
// previousQueryResults is the result of an earlier query in the same conversation.
var conversationId = previousQueryResults.conversationId;
voysisSession.createAudioQuery(locale, context, conversationId).then(function (createdAudioQuery) {
  console.log('Created audio query: ' + createdAudioQuery.id);
}).catch(function (error) {
  console.log('Problem creating audio query: ' + JSON.stringify(error));
});
The object passed to the callback is the created audio query. For example:
{
  "id": "173f19f2-cb86-4bb5-8dc0-fe472d3e9b5e",
  "queryType": "audio",
  "audioQuery": {
    "mimeType": "audio/pcm;bits=16;rate=16000"
  },
  "_links": {
    "self": {
      "href": "/conversations/4827d6f2-bd53-4c95-b094-1561f0776b70/queries/173f19f2-cb86-4bb5-8dc0-fe472d3e9b5e"
    },
    "audio": {
      "href": "/conversations/4827d6f2-bd53-4c95-b094-1561f0776b70/queries/173f19f2-cb86-4bb5-8dc0-fe472d3e9b5e/audio"
    },
    "conversation": {
      "href": "/conversations/4827d6f2-bd53-4c95-b094-1561f0776b70"
    }
  },
  "_embedded": {}
}
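The example responses on this page carry the conversation identifier in two different places: some include a conversationId field, while others only expose it inside the _links.conversation.href path. The sketch below reads whichever is present. The helper conversationIdOf is hypothetical, and treating the last path segment of the conversation link as the conversation identifier is an assumption based on the sample responses, not a documented guarantee.

```javascript
// Hypothetical helper: find the conversation identifier in a query response.
function conversationIdOf(queryResult) {
  if (!queryResult) {
    return undefined;
  }
  // Prefer the explicit field when the response includes it.
  if (typeof queryResult.conversationId === 'string') {
    return queryResult.conversationId;
  }
  // Otherwise fall back to the conversation link (assumed format:
  // "/conversations/<id>").
  const links = queryResult._links || {};
  const href = links.conversation && links.conversation.href;
  const match = typeof href === 'string' ? /\/conversations\/([^\/]+)/.exec(href) : null;
  return match ? match[1] : undefined;
}
```

The returned value could then be passed as the conversationId argument of a follow-up createAudioQuery call.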
Stream Audio
After you have created an Audio Query, you can start streaming data.
Assuming the user has given permissions for the website to access their microphone, the JavaScript library will automatically start streaming audio from the user's microphone to your Voysis endpoint.
The library will automatically detect when the user has stopped speaking, and stop streaming audio.
The streamAudio method takes 2 parameters:
- audioQueryResponse: (optional) The response from createAudioQuery.
- vadStopCallback: (optional) A callback which will be triggered when we have detected that the user has stopped speaking.
When the query has been processed, the query result will be sent to the returned Promise's success handler:
voysisSession.streamAudio().then(function (queryResults) {
  console.log('The results to the query were: ' + JSON.stringify(queryResults));
}).catch(function (error) {
  console.log('Problem streaming audio: ' + JSON.stringify(error));
});
See Query Responses for details on the structure and meaning of the fields in the query response.
Below is an example response:
{
  "id": "8a145526-9044-4218-9b6e-066cfef95e79",
  "locale": "en-US",
  "conversationId": "986e79de-6f4e-49dc-a128-41ff41661f3d",
  "queryType": "audio",
  "textQuery": {
    "text": "show me red men's sneakers"
  },
  "audioQuery": {
    "mimeType": "audio/pcm;bits=16;rate=16000"
  },
  "context": {
    "attributes": {
      "color": [
        "red"
      ],
      "gender": [
        "men"
      ]
    },
    "keywords": [
      "sneakers"
    ],
    "price": {},
    "sortBy": ""
  },
  "intent": "newSearch",
  "reply": {
    "text": "Here's what I found"
  },
  "entities": {
    "products": []
  },
  "_links": {
    "self": {
      "href": "/queries/8a145526-9044-4218-9b6e-066cfef95e79"
    },
    "audio": {
      "href": "/queries/8a145526-9044-4218-9b6e-066cfef95e79/audio"
    }
  },
  "_embedded": {}
}
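Since the low-level methods need to be chained through their returned promises, the two steps described so far can be combined roughly as follows. This is a sketch rather than the library's own implementation: runLowLevelQuery is a hypothetical wrapper, and it uses only createAudioQuery and streamAudio with the parameters listed in the sections above.

```javascript
// Hypothetical wrapper: create an audio query, then stream audio for it.
// `session` is expected to be a VoysisSession. Returns the promise from
// streamAudio, which resolves with the final query result.
function runLowLevelQuery(session, locale, context, conversationId, onVadStop) {
  return session
    .createAudioQuery(locale, context, conversationId)
    .then(function (createdAudioQuery) {
      // Start streaming only after the audio query has been created, and
      // return the promise so that errors propagate to the caller's catch.
      return session.streamAudio(createdAudioQuery, onVadStop);
    });
}
```

A rejection from either step reaches a single catch handler attached to the returned promise.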
Finish Streaming
After calling streamAudio, the response will normally be returned once it is detected that the user has stopped speaking.
You can manually trigger the end of streaming by calling finishStreaming on VoysisSession.
This will cause streamAudio to indicate to the server that the audio is complete, causing it to process all audio received up to that point and deliver the response.
The result will still be sent to the callback from streamAudio.
voysisSession.finishStreaming();
Calling finishStreaming will also stop streaming that was started by the convenience method sendAudioQuery. However, because sendAudioQuery creates the conversation and audio query first, it is not known when streaming has begun, and calling finishStreaming before streaming has actually started has no effect. It is therefore not recommended to use this method to stop sendAudioQuery.
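One possible use of finishStreaming with the low-level methods is to cap how long a user can speak. The sketch below is a hypothetical wrapper, streamWithMaxDuration, that calls finishStreaming after a timeout unless streamAudio has already settled. The timeout value and the wrapper itself are assumptions made for illustration.

```javascript
// Hypothetical wrapper: stream audio for an already created audio query and
// end the stream manually if it is still running after `maxDurationMs`.
function streamWithMaxDuration(session, createdAudioQuery, maxDurationMs) {
  const result = session.streamAudio(createdAudioQuery);

  const timer = setTimeout(function () {
    // Signal that the audio is complete; the result is still delivered
    // through the promise returned by streamAudio.
    session.finishStreaming();
  }, maxDurationMs);

  function cancelTimer() {
    clearTimeout(timer);
  }

  // Stop the timer once streaming has finished, whether it succeeded or not.
  result.then(cancelTimer, cancelTimer);
  return result;
}
```

The timer starts when streamAudio is called, so any time the user spends answering a permission prompt counts towards the limit.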
Download
The library can be downloaded from here.