Audio Guidelines

The format of the audio you send to the Voysis AI APIs is important. Audio that does not match the specifications below may reduce performance or may not be understood at all.

  • Format: Raw PCM or WAV containing Raw PCM.
  • Sample Rate: 16,000 Hz
  • Bits Per Sample: 16
  • Endian: Little-Endian
  • Channels: 1
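As a rough illustration of the list above, the following Python sketch checks a WAV file's header against these values before the audio is sent. The function name `wav_format_problems` and the constants are hypothetical and not part of any Voysis library; the sketch uses only the standard `wave` module, which reads PCM WAV files (WAV PCM data is little-endian by definition, so endianness is not checked separately).

```python
import io
import wave

# Target values taken from the specification list above.
TARGET_RATE_HZ = 16000
TARGET_BITS = 16
TARGET_CHANNELS = 1


def wav_format_problems(wav_file):
    """Return a list of human-readable mismatches (empty if the file conforms).

    `wav_file` may be a path or a binary file-like object.
    Hypothetical helper for illustration only.
    """
    problems = []
    with wave.open(wav_file, "rb") as wav:
        rate = wav.getframerate()
        bits = wav.getsampwidth() * 8  # getsampwidth() returns bytes per sample
        channels = wav.getnchannels()

    if rate != TARGET_RATE_HZ:
        problems.append(f"sample rate is {rate} Hz, expected {TARGET_RATE_HZ} Hz")
    if bits != TARGET_BITS:
        problems.append(f"bits per sample is {bits}, expected {TARGET_BITS}")
    if channels != TARGET_CHANNELS:
        problems.append(f"channel count is {channels}, expected {TARGET_CHANNELS}")
    return problems
```

An empty list means the header matches the required format; otherwise each entry names one field that needs converting.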

The JavaScript Library automatically decimates (downsamples) the audio captured from the microphone into the above format. If you are using the REST or WebSocket API and your audio is not already in this format, you must perform the downsampling yourself.
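For REST or WebSocket clients, one simple way to downsample is sketched below in Python. It is not the method used by the JavaScript Library: it assumes the input is already 16-bit little-endian mono PCM and that the source rate is an integer multiple of 16,000 Hz (for example 48,000 Hz), and it averages each group of samples as a crude low-pass filter. The function name `downsample_pcm16` is hypothetical; for other rates or better quality, a dedicated resampling library would be a more suitable choice.

```python
import array
import sys


def downsample_pcm16(pcm_bytes, source_rate, target_rate=16000):
    """Downsample 16-bit little-endian mono PCM by an integer factor.

    Each output sample is the average of `factor` consecutive input samples.
    Trailing samples that do not fill a whole group are dropped.
    Hypothetical helper for illustration only.
    """
    if source_rate % target_rate != 0:
        raise ValueError("source_rate must be an integer multiple of target_rate")
    factor = source_rate // target_rate

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(pcm_bytes)
    if sys.byteorder == "big":
        samples.byteswap()  # input is little-endian; convert to native order

    usable = len(samples) - (len(samples) % factor)
    out = array.array("h")
    for start in range(0, usable, factor):
        block = samples[start:start + factor]
        out.append(sum(block) // factor)  # average stays within 16-bit range

    if sys.byteorder == "big":
        out.byteswap()  # convert back to little-endian for output
    return out.tobytes()
```

With a 48,000 Hz source the factor is 3, so every three input samples become one output sample.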

MIME Type

To indicate the type of audio that you are sending to the Voysis API, use the audio/pcm MIME type with the following parameters:

  • rate: The sample rate of the audio, in Hz
  • bits: The number of bits per sample in the audio

For example:

audio/pcm;bits=16;rate=16000
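
The string above can also be built from the audio's actual parameters rather than hard-coded. The Python sketch below shows one way to do that; the helper names `pcm_mime_type` and `mime_type_for_wav` are hypothetical, and where the resulting string is placed in a request depends on the API you are calling.

```python
import wave


def pcm_mime_type(rate_hz=16000, bits=16):
    """Build the audio/pcm MIME type string with its bits and rate parameters."""
    return f"audio/pcm;bits={bits};rate={rate_hz}"


def mime_type_for_wav(wav_file):
    """Derive the MIME type string from a PCM WAV file's header.

    `wav_file` may be a path or a binary file-like object.
    """
    with wave.open(wav_file, "rb") as wav:
        return pcm_mime_type(wav.getframerate(), wav.getsampwidth() * 8)
```

Calling `pcm_mime_type()` with its defaults yields the example string shown above.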