JavaScript

Introduction

📘

MIT License

The JavaScript library is made available under the MIT License.

To get started, include the JavaScript library on your pages:

<script src="https://cdn.voysis.io/js/voysis.js"></script>

Versions

The versioning of the JavaScript library uses the standard MAJOR.MINOR.PATCH semantic versioning schema (for example, voysis-1.1.4.js). In short:

  • MAJOR version changes when there are incompatible API changes with the previous version.
  • MINOR version changes when there is new functionality added in a backwards-compatible manner.
  • PATCH version changes when there are backwards-compatible bug fixes.

Minified

A minified version of each JavaScript file is available with the extension .min.js.

Latest Version

voysis.js and voysis.min.js will always be an alias for the latest library version, so you should use that unless you need a specific version.
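
For example, you can load the latest minified build, or pin to a specific release by referencing the versioned file directly (the version number below is illustrative and follows the naming scheme described above):

<!-- Always loads the latest release -->
<script src="https://cdn.voysis.io/js/voysis.min.js"></script>

<!-- Pins to a specific release (illustrative version number) -->
<script src="https://cdn.voysis.io/js/voysis-1.1.4.min.js"></script>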

❗️

HTTPS Only

Your web page must be loaded using HTTPS, and you must include the JavaScript library using HTTPS. Most modern browsers restrict access to the microphone unless the page (and JS lib) are loaded over HTTPS.

Initialising the Voysis Session

Before you start sending queries, you need to create a VoysisSession. This session can be created once and used for the lifetime of your webpage. The two pieces of information that you must provide are your Voysis endpoint and the audio profile identifier (see the General Concepts section for more information on what this value means).

var voysisSession = new VoysisSession({
    host: 'mycompany.voysis.io',
    audioProfileId: 'f8338e44-9d48-11e7-abc4-cec278b6b50a',
    refreshToken: 'xP+nebIpJEXP09fATj5CMqULIBD549eAwSYZUerysUgItv0XRQJh+EEZaNFOplaY'
});

The complete list of properties that can be provided to the VoysisSession constructor is:

Name | Value Type | Description
host | String | Your VoysisAI endpoint's host name.
audioProfileId | String | See the description of audioProfileId here.
refreshToken | String | The refresh token used to obtain application tokens for endpoint authorization (see Authorization).
userId | String | See the description of userId here.
debugEnabled | Boolean | Enable or disable debug messages. Debug messages will be written to the browser's console.
streamingAudioDeadline | Integer | The amount of time, in milliseconds, that the client will wait for a valid response to an audio query before assuming it has failed. Defaults to 20000 (20 seconds).
tokenExpiryMargin | Integer | The margin, in milliseconds, that the client will allow around the current application token's expiry time before automatically renewing it. Defaults to 30000 (30 seconds). This means that if a request is made in the 30 seconds leading up to the application token's expiry time, a new application token will be obtained.
audioBufferSize | Integer | The size, in bytes, of the audio buffer to use when recording audio. Defaults to 4096.
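
For example, a session that sets the optional properties explicitly might look like the following sketch (the userId value is illustrative, and the numeric values shown are simply the documented defaults):

var voysisSession = new VoysisSession({
    host: 'mycompany.voysis.io',
    audioProfileId: 'f8338e44-9d48-11e7-abc4-cec278b6b50a',
    refreshToken: 'xP+nebIpJEXP09fATj5CMqULIBD549eAwSYZUerysUgItv0XRQJh+EEZaNFOplaY',
    userId: 'example-user-id',        // illustrative user identifier
    debugEnabled: true,               // write debug messages to the browser's console
    streamingAudioDeadline: 20000,    // wait up to 20 seconds for an audio query response
    tokenExpiryMargin: 30000,         // renew the application token 30 seconds before it expires
    audioBufferSize: 4096             // record audio using 4096-byte buffers
});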

Asynchronous Calls and Promises

All of the methods available on VoysisSession are executed asynchronously. Each method returns a JavaScript Promise object that will resolve when the asynchronous operation completes successfully, or reject if there is a failure.
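
Because standard Promises are returned, you can chain calls with then/catch as shown throughout this page, or use async/await in browsers that support it. The sketch below assumes a voysisSession created as above and uses sendAudioQuery, which is described in the next section:

async function runAudioQuery() {
    try {
        var queryResult = await voysisSession.sendAudioQuery('en-US');
        console.log('You said: ' + queryResult.textQuery.text);
    } catch (error) {
        console.log('ERROR: ' + JSON.stringify(error));
    }
}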

Executing an Audio Query

The sendAudioQuery method on VoysisSession will create a new audio query and start streaming audio to the Voysis Query API. Streaming will stop when the server indicates that the user has stopped speaking, and the final query response will be delivered to your application when it is ready.

The function takes the following parameters:

  1. locale: (mandatory) A string representing the locale of the query
  2. context: (optional) An object representing the current context under which the query is being performed.
  3. conversationId: (optional) A string value of the identifier of the conversation that this query will be part of.
  4. audioContext: (optional) An AudioContext instance that will be used to stream audio from.

For example:
voysisSession.sendAudioQuery('en-US').then(function (queryResult) {
    console.log('You said: ' + queryResult['textQuery']['text']);
}).catch(function (error) {
    console.log("ERROR: " + JSON.stringify(error));
});
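
The optional parameters are passed in the same way. The following sketch continues an existing conversation and supplies a context object (the context values are illustrative; see Create Audio Query below for the context shape, and previousQueryResults is assumed to be the result of an earlier query):

var context = {
    "keywords": ["sneakers"],
    "attributes": {"color": "red"}
};
voysisSession.sendAudioQuery('en-US', context, previousQueryResults.conversationId).then(function (queryResult) {
    console.log('Reply: ' + queryResult.reply.text);
}).catch(function (error) {
    console.log('ERROR: ' + JSON.stringify(error));
});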

The object passed to the callback will be the result of the query:

{
  "id": "6bd98943-caaa-4058-a32b-9c540cc0f8db",
  "queryType": "audio",
  "textQuery": {
    "text": "show me freshly squeezed orange juice"
  },
  "audioQuery": {
    "mimeType": "audio/pcm;bits=16;rate=16000"
  },
  "intent": "newSearch",
  "reply": {
    "text": "Here's what I found"
  },
  "entities": {
    "keywords": [
      "freshly",
      "squeezed",
      "orange",
      "juice"
    ],
    "products": [],
    "queryString": "freshly squeezed orange juice",
    "sortBy": ""
  },
  "_links": {
    "self": {
      "href": "/conversations/bb30b7de-ae01-4868-9b6a-bf16ecd95126/queries/6bd98943-caaa-4058-a32b-9c540cc0f8db"
    },
    "audio": {
      "href": "/conversations/bb30b7de-ae01-4868-9b6a-bf16ecd95126/queries/6bd98943-caaa-4058-a32b-9c540cc0f8db/audio"
    },
    "conversation": {
      "href": "/conversations/bb30b7de-ae01-4868-9b6a-bf16ecd95126"
    }
  },
  "_embedded": {}
}

Microphone Permissions

The browser will prompt the user for permission when they try to stream audio from their browser.

🚧

Most browsers will only prompt the user the first time they attempt to stream audio from a webpage, but some browsers may prompt for permission every time the page is reloaded. Using AJAX page updates is recommended to avoid this.

Browser Support

Not every browser on every operating system supports access to the microphone in order to capture audio.
Below is a list of the browsers we have tested that do and do not allow streaming audio from the microphone, organised by operating system.

Browser | Mac OSX | Windows | iOS | Android | Linux
Chrome | ✅ | ✅ | ❌ | ✅ | ✅
Firefox | ✅ | ✅ | - | - | ✅
Safari | ✅ | - | ✅ | - | -
Internet Explorer | - | ❌ | - | - | -
Microsoft Edge | - | ✅ | - | - | -

Some additional information can be viewed at caniuse.com

📘

A dash (-) in the above table means the browser was not available or was not tested on the platform specified.

Notes

Internet Explorer

Microsoft have stated that web audio streaming won't be added to Internet Explorer. The latest version available at the time of testing was 11.

iOS

Only iOS 11 or later supports web audio streaming - no browser on any prior version of iOS will work.
At the time of testing, only Safari supported web audio streaming. Chrome (61.0) does not support audio streaming.

🚧

Safari on iOS will ask for microphone permissions on every page load. If you wish to support making multiple audio queries on a single page without having the user asked each time for permission, page reloads should be avoided and AJAX-based page refreshes should be used.

JavaScript Method

A method called isStreamingAudioSupported is available on VoysisSession to help developers determine whether the current browser supports streaming audio from a microphone:

if (!VoysisSession.isStreamingAudioSupported()) {
    alert('Browser not supported');
}

Low-Level Session Methods

The JavaScript SDK provides methods that give lower-level access to the Voysis Query API than the sendAudioQuery convenience method. These are documented below. Note that these methods all execute asynchronously and so should be chained together using their returned promises to function correctly.
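
As a rough sketch, createAudioQuery and streamAudio (both described below) can be chained like this:

voysisSession.createAudioQuery('en-US').then(function (createdAudioQuery) {
    // Only start streaming once the audio query entity has been created
    return voysisSession.streamAudio(createdAudioQuery);
}).then(function (queryResult) {
    console.log('You said: ' + queryResult.textQuery.text);
}).catch(function (error) {
    console.log('ERROR: ' + JSON.stringify(error));
});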

Create Audio Query

Before you start streaming audio to Voysis, you need to create an Audio Query entity that is used to represent the query. (See General Concepts for a description of conversations and queries.)

You create a new audio query by calling createAudioQuery on VoysisSession and passing in:

  1. locale: (mandatory) A string representing the locale of the query
  2. context: (optional) An object representing the current context under which the query is being performed.
  3. conversationId: (optional) A string value of the identifier of the conversation that this query will be part of.
  4. audioContext: (optional) An AudioContext instance that will be used to stream audio from.

For example:

var locale = 'en-US';
var context = {
    "keywords": [
        "sprinting",
        "sneakers"
    ],
    "attributes": {
        "color": "red"
    }
};
var conversationId = previousQueryResults.conversationId;
voysisSession.createAudioQuery(locale, context, conversationId).then(function (createdAudioQuery) {
    console.log('Created audio query: ' + createdAudioQuery.id);
}).catch(function (error) {
    console.log('Problem creating audio query: ' + JSON.stringify(error));
});

The object passed to the callback is the created audio query. For example:

{
  "id": "173f19f2-cb86-4bb5-8dc0-fe472d3e9b5e",
  "queryType": "audio",
  "audioQuery": {
    "mimeType": "audio/pcm;bits=16;rate=16000"
  },
  "_links": {
    "self": {
      "href": "/conversations/4827d6f2-bd53-4c95-b094-1561f0776b70/queries/173f19f2-cb86-4bb5-8dc0-fe472d3e9b5e"
    },
    "audio": {
      "href": "/conversations/4827d6f2-bd53-4c95-b094-1561f0776b70/queries/173f19f2-cb86-4bb5-8dc0-fe472d3e9b5e/audio"
    },
    "conversation": {
      "href": "/conversations/4827d6f2-bd53-4c95-b094-1561f0776b70"
    }
  },
  "_embedded": {}
}

Stream Audio

After you have created an Audio Query, you can start streaming data.

Assuming the user has given permissions for the website to access their microphone, the JavaScript library will automatically start streaming audio from the user's microphone to your Voysis endpoint.

The library will automatically detect when the user has stopped speaking, and stop streaming audio.

The streamAudio method takes 2 parameters:

  1. audioQueryResponse: (optional) The response from createAudioQuery
  2. vadStopCallback: (optional) A callback which will be triggered when we have detected that the user has stopped speaking.

When the query has been processed, the query result will be sent to the returned Promise's success handler:

voysisSession.streamAudio().then(function (queryResults) {
    console.log('The results to the query were: ' + JSON.stringify(queryResults));
}).catch(function (error) {
    console.log('Problem streaming audio: ' + JSON.stringify(error));
});
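
Both parameters are optional. A sketch that passes the audio query created by createAudioQuery along with a voice-activity-detection callback (the callback is assumed to take no arguments) might look like this:

voysisSession.streamAudio(createdAudioQuery, function () {
    // Triggered when the library detects that the user has stopped speaking
    console.log('Finished speaking, waiting for the query result...');
}).then(function (queryResults) {
    console.log('The results to the query were: ' + JSON.stringify(queryResults));
}).catch(function (error) {
    console.log('Problem streaming audio: ' + JSON.stringify(error));
});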

See Query Responses for details on the structure and meaning of the fields in the query response.

Below is an example response:

{
  "id": "8a145526-9044-4218-9b6e-066cfef95e79",
  "locale": "en-US",
  "conversationId": "986e79de-6f4e-49dc-a128-41ff41661f3d",
  "queryType": "audio",
  "textQuery": {
    "text": "show me red men's sneakers"
  },
  "audioQuery": {
    "mimeType": "audio/pcm;bits=16;rate=16000"
  },
  "context": {
    "attributes": {
      "color": [
        "red"
      ],
      "gender": [
        "men"
      ]
    },
    "keywords": [
      "sneakers"
    ],
    "price": {},
    "sortBy": ""
  },
  "intent": "newSearch",
  "reply": {
    "text": "Here's what I found"
  },
  "entities": {
    "products": []
  },
  "_links": {
    "self": {
      "href": "/queries/8a145526-9044-4218-9b6e-066cfef95e79"
    },
    "audio": {
      "href": "/queries/8a145526-9044-4218-9b6e-066cfef95e79/audio"
    }
  },
  "_embedded": {}
}

Finish Streaming

After calling streamAudio, the response will normally be returned once it is detected that the user has stopped speaking.

You can manually trigger the end of the streaming by calling finishStreaming on VoysisSession.

This will cause streamAudio to indicate to the server that the audio is complete, causing it to process all received audio to that point in time and deliver the response.

The result will still be sent to the callback from streamAudio.

voysisSession.finishStreaming();

Calling finishStreaming will also stop streaming started by the convenience method sendAudioQuery. However, because sendAudioQuery first handles creating the conversation and audio query, it is not known exactly when streaming has begun, and calling finishStreaming before streaming has actually started will have no effect. It is therefore not recommended to use this method to stop sendAudioQuery.
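
For example, a page could wire finishStreaming to a "stop" button so the user can end a query manually while streamAudio is in flight (the element id below is illustrative):

document.getElementById('stop-button').addEventListener('click', function () {
    // Tells the server that no more audio will be sent; the query result
    // is still delivered to the promise returned by streamAudio.
    voysisSession.finishStreaming();
});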

Download

The library can be downloaded from here.