Voice Interaction

This page gives an overview of the major voice interaction services available worldwide.

1. Alexa

Alexa is Amazon’s cloud-based voice service, available on tens of millions of devices from Amazon and third-party device manufacturers. With Alexa, you can build natural voice experiences that offer customers a more intuitive way to interact with the technology they use every day. Amazon’s collection of tools, APIs, reference solutions, and documentation makes it easy for anyone to build with Alexa.

What Can You Build with Alexa?

- Add Capabilities to Alexa: Add capabilities, or skills, to Alexa using the Alexa Skills Kit (ASK), a collection of self-service APIs, tools, documentation, and code samples. Skills make Alexa smarter and enable customers to do more with voice. Build natural, voice-first experiences with the toolkit, and help redefine the way your customers interact with technology.
- Integrate Alexa into Your Device: Integrate Alexa directly into your products with the Alexa Voice Service (AVS), bringing the convenience of hands-free voice control to any connected device. Through AVS, you can add a new intelligent interface to your products and offer customers access to a growing number of Alexa features, smart home integrations, and skills.
- Connect Devices to Alexa: Connect Alexa to your devices to deliver delightful and intuitive experiences to your customers. Add Alexa to your smart home devices to enable voice control of your smart cameras, lights, entertainment systems, and more. You can also build your own Alexa Gadgets or create interactive skills that work with Alexa Gadgets such as Echo Buttons.

Developer Resource

- Alexa Voice Service Get Started
- avs-device-sdk

2. Google Assistant

The Google Assistant SDK lets you add hotword detection, voice control, natural language understanding and Google’s smarts to your devices. Your device captures an utterance (a spoken audio request, such as "What's on my calendar?"), sends it to the Google Assistant, and receives a spoken audio response in addition to the raw text of the utterance.
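The capture, send, and receive flow described above can be sketched roughly as follows. This is an illustration only and does not use the Google Assistant SDK: `AssistantReply`, `ask_assistant`, and `fake_send` are hypothetical names, and the "network call" is a local stand-in that returns canned data.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AssistantReply:
    """What the device gets back for one utterance (hypothetical shape)."""
    utterance_text: str    # raw text of what the user said
    audio_response: bytes  # spoken reply as encoded audio


def ask_assistant(
    utterance_audio: bytes,
    send: Callable[[bytes], AssistantReply],
) -> AssistantReply:
    """Send one captured utterance and return the reply."""
    if not utterance_audio:
        raise ValueError("no audio captured")
    return send(utterance_audio)


def fake_send(audio: bytes) -> AssistantReply:
    # Hypothetical stand-in for the real service call; nothing is sent anywhere.
    return AssistantReply(
        utterance_text="What's on my calendar?",
        audio_response=b"\x00" * 320,
    )


reply = ask_assistant(b"\x01\x02\x03", fake_send)
print(reply.utterance_text)
print(len(reply.audio_response), "bytes of audio to play back")
```

In a real integration, `send` would be replaced by the SDK's own client call, and the returned audio would be passed to the device's speaker.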

What can it do?

- Manage tasks: Send a text, set reminders, turn on battery saver and instantly look up emails.
- Plan your day: Check your flight status, make a dinner reservation, check when your movie starts, and find a coffee stop along your route.
- Enjoy entertainment: Control music on Google Play and YouTube Music. You can also pick up where you left off on your favorite podcasts with your Assistant on Google Home.
- Make memories: Your Assistant makes it simple to find your photos and to take them as well.
- Get answers: Get real-time answers including the latest on weather, traffic, finance, or sports. Quickly find translations while you’re traveling.
- Control your home: Use your phone to control your smart home devices. Adjust the temperature, lighting, and more, even when you’re not home.

Developer Resource

3. Bing Speech

Bing Speech converts audio to text, understands intent, and converts text back to speech for natural responsiveness.

Speech Recognition

Convert spoken audio to text. The API can recognize audio coming from the microphone in real time, audio coming from a different real-time audio source, or audio from within a file. In all cases, real-time streaming is available, so partial recognition results are returned while the audio is still being sent to the server.
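The streaming behavior, where partial results come back while audio is still being uploaded, can be sketched as below. This is a simulation under stated assumptions and not the Bing Speech API: `chunk_audio`, `stream_recognize`, and `fake_recognizer` are hypothetical names, and the "audio" is plain ASCII text so the example runs without a microphone or a network connection.

```python
from typing import Callable, Iterable, Iterator


def chunk_audio(audio: bytes, chunk_size: int) -> Iterator[bytes]:
    """Split a buffer into fixed-size chunks, as a streaming client would."""
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


def stream_recognize(
    chunks: Iterable[bytes],
    recognize_so_far: Callable[[bytes], str],
) -> Iterator[str]:
    """Yield one partial transcript after each chunk is 'sent'."""
    received = b""
    for chunk in chunks:
        received += chunk
        yield recognize_so_far(received)


def fake_recognizer(audio: bytes) -> str:
    # Hypothetical stand-in: treats the ASCII bytes as already-recognized text.
    return audio.decode("ascii")


partials = list(
    stream_recognize(chunk_audio(b"turn on the light", 5), fake_recognizer)
)
for text in partials:
    print(repr(text))
```

Each printed line is longer than the previous one, and the last one is the full transcript. A real client would show the intermediate lines to the user as live feedback.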

The Speech to Text API enables you to build smart apps that are voice triggered.

Text to Speech

Convert text to spoken audio. When applications need to “talk” back to their users, this API can be used to convert text that is generated by the app into audio that can be played back to the user.

The Text-To-Speech API enables you to build smart apps that can speak.

Developer Resource

4. Baidu

Baidu Speech includes speech-to-text (STT), text-to-speech (TTS), voice interaction, and offline wake-up.

Developer Resource


Voice Interaction tutorial list

Here is the list of voice interaction tutorials.