How to Build Your Own AI Assistant Using Api.ai
Want a more in-depth understanding of artificial intelligence? Watch our video tutorial "Microsoft Cognitive Services and the Text Analytics API: Implementing AI Sentiment Analysis in Your Bot".
The world of smart assistants changes with each passing day. Siri, Cortana, Alexa, Ok Google, Facebook M, Bixby: every tech giant has its own. However, many developers don't realize that building your own AI assistant is actually quite simple! You can customize it to your own needs, your own IoT devices and your own custom APIs. The possibilities are endless.
Note: This article was updated in 2017 to reflect the latest changes in Api.ai.
Earlier, I wrote a guide on five simple ways to get started with artificial intelligence in 2016, which covers some of the straightforward options for building an AI assistant. In this article, I want to focus on one specific service, Api.ai, which makes building a fully functional AI assistant extremely simple, with very little initial setup.
Key Points
- Api.ai (now part of Google Cloud) enables developers to build AI assistants using natural language processing and voice-to-text capabilities.
- Getting started with Api.ai is easy: just sign in with your Google account, agree to the terms and create your first agent.
- Customize your AI assistant by enabling the Small Talk feature in Api.ai to make your bot sound friendlier and more engaging.
- Connect your web interface to Api.ai using JavaScript and HTML5 to enable text input and voice commands.
- Api.ai provides detailed JSON response and debugging tools to help improve and troubleshoot AI assistant responses.
- For hosting, consider using services such as Glitch.com for free HTTPS-enabled web hosting to ensure secure communication with Api.ai.
Build an AI Assistant with Api.ai
This article is one of a series of articles designed to help you run a simple personal assistant using Api.ai:
- How to build your own AI assistant using Api.ai (this article!)
- Customize your Api.ai Assistant with Intent and Context
- Enhance your Api.ai Assistant with Entities
- How to connect your Api.ai assistant to the Internet of Things
What is Api.ai?
Api.ai is a service that lets developers build speech-to-text, natural language processing, artificially intelligent systems that you can train up with your own custom functionality. It has a range of existing knowledge bases, called "domains", that systems built with Api.ai can automatically understand, and these are the focus of this article. Domains provide a whole knowledge base of encyclopedic knowledge, language translation, weather and more. In future articles, I'll cover some of the more advanced aspects of Api.ai that let you personalize your assistant further.
Getting Started with Api.ai
First, head to the Api.ai website and click either the "Get Started for Free" button or the "Sign Up Free" button in the top right corner.
Since Api.ai was acquired by Google, it has moved over to Google accounts as the only way to sign in. So if you're new to Api.ai, you'll need to sign in with your Google account:
Click "Allow" on the next screen to grant Api.ai permission to access your Google account:
You also need to read and agree to their Terms of Service:
After signing up, you'll be taken straight to the Api.ai interface, where you can create your virtual AI assistant. Each assistant you create and teach specific skills to is called an "agent" in Api.ai. So, to begin, create your first agent by clicking the "Create Agent" button in the top left corner:
You may need to re-authorize Api.ai to have additional permissions for your Google account. This is normal and absolutely fine! Click "Authorize" to continue:
Then allow:
On the next screen, enter the details of your agent, including:
- Name: This is just for your own reference within the api.ai interface, to tell your agents apart. You can name the agent anything you like: either a person's name (I chose Barry) or a name that represents the tasks it helps with (such as light-controller).
- Description: A human-readable description so you can remember what the agent is responsible for. This is optional and may not be needed if your agent's name speaks for itself.
- Language: The language the agent works in. Once you choose a language, you can't change it, so choose carefully! For this tutorial, choose English, as English has access to the most Api.ai domains. You can see which domains are available for each language in the Languages table in the Api.ai docs.
- Time Zone: As you'd expect, this is the time zone for your agent. Chances are it will have already detected your current time zone.
It also automatically sets up a Google Cloud Platform project for your agent, so you don't need to do anything in that regard; it's all automated. It's good to be aware of it, though: if you do a lot of testing and create many agents, know that you're creating many Google Cloud Platform projects that you may want to clean up someday.
Once you've entered your agent's settings, choose "Save" next to the agent name to save everything:
Test Console
Once the agent has been created, you can test it using the test console on the right. You can enter queries at the top, and it will send them to your agent, showing you what it would return after hearing those statements. Enter a question like "How are you?" and see what it returns. Your results should appear below it:
If you scroll down on the right-hand side of the results, you'll see more detail on how Api.ai interpreted your request (as shown in the screenshot above). Below that is a button called "Show JSON". Click it to see how the API would return this sort of response to your app:
Api.ai will open the JSON viewer and show you a JSON response similar to this:
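The response for an unrecognized query looks something like this (the id and timestamp will differ for your agent; the fields to watch are result.action and result.fulfillment.speech):

```json
{
  "id": "...",
  "timestamp": "...",
  "result": {
    "source": "agent",
    "resolvedQuery": "How are you?",
    "action": "input.unknown",
    "actionIncomplete": false,
    "parameters": {},
    "contexts": [],
    "metadata": {},
    "fulfillment": {
      "speech": "Sorry, can you say it again?"
    },
    "score": 1.0
  },
  "status": {
    "code": 200,
    "errorType": "success"
  }
}
```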
As you can see... your agent doesn't know how to respond! Right now, it isn't exactly "smart" AI: it still needs the smart parts added. The input.unknown value in the action field tells you that it isn't sure how to proceed. Above, it returned the message "Sorry, can you say it again?", which is one of its default fallbacks. Rather than telling the human it doesn't understand, it just asks them to say it again, over and over... which isn't ideal. I'd rather change it to something that makes it clearer when the bot doesn't understand. If you're picky about this sort of thing too and want to change what it says here, you can find it on the Intents page by clicking the Default Fallback Intent item there.
Those who used Api.ai some time ago (or saw it in action) might expect it to have more capabilities out of the box. Previously, it could answer queries like "Who is Steve Jobs?" by default. That's no longer the case! You now have to add custom integrations with third-party APIs to take actions and source information. Api.ai provides the sentence parsing and interpretation.
Adding Small Talk
There's one default capability you can add that gives your bot a touch of smarts: the "Small Talk" feature. This provides a range of answers to commonly asked questions... including the "How are you?" above. It isn't enabled by default, though. To enable it, go to the "Small Talk" menu item on the left and click "Enable":
With it enabled, if you scroll down you'll see a range of common small-talk phrases. Find the "Greetings/Farewells" section and click it to expand it. Add a few different responses to the "How are you?" question and click "Save" in the top right. Once you've added phrases, you'll see the percentage next to the "Greetings/Farewells" section increase, showing how much of the small talk you've customized.
Then, if you go to the test console and ask it again "How are you?", it should now answer with one of the replies you entered!
If it doesn't respond correctly, check that you actually clicked "Save" before trying! It doesn't save automatically.
Ideally, you'll want to customize your small-talk responses as much as possible: this will give your Api.ai bot a more distinct personality. You can choose the tone and structure of its responses. Is it a grumpy chatbot that hates it when humans talk to it? Is it a chatbot obsessed with cats? Or a chatbot that responds in teenage internet/SMS speak? You get to decide!
Now that you have at least some small-talk elements running, your agent is ready for you to integrate into your own web app interface. To do this, you'll need to get an API key to access your agent remotely.
Find your Api.ai API key
The API key you'll need is on the agent's settings page. To find it, click the gear icon next to your agent's name. On the page that appears, copy and paste the "Client access token" somewhere safe. That's what you'll need to issue queries to the Api.ai service:
Code
If you want to view the working code and try it, it can be found on GitHub. Feel free to use it and extend this idea to create your own AI personal assistant.
If you'd like to try it out, I have Barry running here. Enjoy!
Connect to Api.ai using JavaScript
You currently have a working personal assistant running somewhere in Api.ai's cloud. You now need a way to talk to your personal assistant from your own interface. Api.ai has a range of platform SDKs that work with Android, iOS, web apps, Unity, Cordova, C++ and more. You can even integrate it into a Slack bot or a Facebook Messenger bot! For this example, you'll use HTML and JavaScript to make a simple personal assistant web app. My demo builds upon the concepts shown in Api.ai's HTML + JS gist.
Your application will do the following:
- Accept a written command in the input field and submit it when you press Enter.
- Or use the HTML5 Speech Recognition API (available in Google Chrome 25 and above): if the user clicks "Speak", they can speak their command and have it automatically written into the input field.
- Once a command is received, it submits an AJAX POST request to Api.ai using jQuery. Api.ai returns its knowledge as a JSON object, as you saw above in the test console.
- You'll read that JSON using JavaScript and display the results in your web app.
- If available, your web app will also use the Web Speech API (available in Google Chrome 33 and above) to speak the response back to you.
The whole web app is available on GitHub at the link above. Feel free to refer to it to see how I've styled things and structured the HTML. I won't explain every piece of how it's put together in this post, focusing mainly on the Api.ai SDK side of things. I'll also briefly point out and explain which bits use the HTML5 Speech Recognition API and the Web Speech API.
Your JavaScript contains the following variables:
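A sketch of those declarations (the access token is your own, and the message strings are whatever you'd like your assistant to say):

```javascript
var accessToken = "YOUR_CLIENT_ACCESS_TOKEN", // from your agent's settings page
    baseUrl = "https://api.api.ai/v1/",
    $speechInput,   // the <input> field for typed/transcribed queries
    $recBtn,        // the button that starts/stops voice recording
    recognition,    // holds the webkitSpeechRecognition instance while listening
    messageRecording = "Recording...",
    messageCouldntHear = "I couldn't hear you, could you say that again?",
    messageInternalError = "Oh no, there has been an internal server error",
    messageSorry = "I'm sorry, I don't have the answer to that yet.";
```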
The following is what each variable is for:
- accessToken. This is the API key you copied from the Api.ai interface. It gives you access to the SDK and says which agent you're accessing. I want to access Barry, my personal agent.
- baseUrl. This is the basic URL for all calls to the Api.ai SDK. If the SDK comes out with a new version, you can update it here.
- $speechInput. This stores your <input> element so you can access it from your JavaScript.
- $recBtn. This stores your <button> element, used when the user wants to click and speak to the web app (or stop it listening if it's already listening).
- recognition. You store your webkitSpeechRecognition() instance in this variable. This is for the HTML5 Speech Recognition API.
- messageRecording, messageCouldntHear, messageInternalError, and messageSorry. These are messages displayed when the application is recording the user's voice, unable to hear the user's voice, an internal error occurs, and the agent does not understand. You store them as variables so you can easily change them at the top of the script, and you can also specify which variables you don't want the application to read aloud later.
In these lines of code, you watch for the moment the user presses Enter in the input field. When that happens, you run the send() function to send their data off to Api.ai:
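Something along these lines (the element IDs here follow my demo's markup, so adjust them to your own):

```javascript
$(document).ready(function() {
  $speechInput = $("#speech");
  $recBtn = $("#rec");

  // When the user presses Enter (key code 13) in the input
  // field, send their query off to Api.ai
  $speechInput.keypress(function(event) {
    if (event.which == 13) {
      event.preventDefault();
      send();
    }
  });
});
```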
Next, you watch for the user clicking the record button to ask the app to listen to them (or to pause it if it's already listening). If they click it, you run the switchRecognition() function to switch between recording and not recording:
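That click handler is a one-liner:

```javascript
// Clicking the record button toggles listening on and off
$recBtn.on("click", function(event) {
  switchRecognition();
});
```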
Finally, for your initial jQuery setup, you set up a button in the bottom right of the screen that shows and hides the JSON response. This is just to keep things clean: most of the time you won't want to see the incoming JSON data, but occasionally, if something unexpected happens, you can click this button to toggle the JSON into view:
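A sketch of that toggle (the class names here are from my demo's markup and are illustrative):

```javascript
// Toggle the JSON debug panel when the button in the
// bottom right corner is clicked
$(".debug__btn").on("click", function() {
  $(this).next().toggleClass("is-active");
  return false;
});
```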
Using the HTML5 Speech Recognition API
As mentioned above, you'll use the HTML5 Speech Recognition API to listen to the user and transcribe what they say. At the moment, this only works in Google Chrome.
Our startRecognition() function looks like this:
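A sketch of it, assuming a setInput() helper that fills the input field and calls send() (the helper name is illustrative):

```javascript
function startRecognition() {
  recognition = new webkitSpeechRecognition();

  // Tell the user we're listening and flip the button text to "Stop"
  recognition.onstart = function(event) {
    response(messageRecording);
    updateRec();
  };

  // We got a transcription: put it in the input field and send it off
  recognition.onresult = function(event) {
    recognition.onend = null; // a result arrived, so onend isn't a failure

    var text = "";
    for (var i = event.resultIndex; i < event.results.length; ++i) {
      text += event.results[i][0].transcript;
    }
    setInput(text); // illustrative helper: fills the field, then calls send()
    stopRecognition();
  };

  // If onend fires without a result, we couldn't make out what was said
  recognition.onend = function() {
    response(messageCouldntHear);
    stopRecognition();
  };

  recognition.lang = "en-US";
  recognition.start();
}
```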
This is what gets the HTML5 Speech Recognition API running. It all uses functions on the webkitSpeechRecognition() object. Here's a rundown of what's going on:
- recognition.onstart. Runs when recording from the user's microphone begins. You use a function called response() to display a message telling the user you're listening to them. I'll cover the response() function in more detail shortly. updateRec() switches the text on the recording button from "Speak" to "Stop".
- recognition.onresult. Runs when you get results from speech recognition. You parse the result and set the text field to use that result (this function just adds the text to the input field and then runs your send() function).
- recognition.onend. Runs when the speech recognition ends. You set this to null inside recognition.onresult to prevent it running when you get a successful result. That way, if recognition.onend runs, you know the Speech Recognition API hasn't understood the user. If the function does run, you respond to the user to tell them you didn't hear them correctly.
- recognition.lang. Set the language you are looking for. In the case of the demo, it is looking for American English.
- recognition.start(). Start the whole process!
Your stopRecognition() function is much simpler. It stops your recognition and sets it to null. It then updates the button to show that you are no longer recording:
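A sketch of stopRecognition():

```javascript
function stopRecognition() {
  if (recognition) {
    recognition.stop();
    recognition = null;
  }
  updateRec(); // switch the button text back to "Speak"
}
```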
switchRecognition() toggles between starting and stopping recognition by checking the recognition variable. This lets your button switch the recognition on and off:
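The toggle itself:

```javascript
function switchRecognition() {
  if (recognition) {
    stopRecognition();
  } else {
    startRecognition();
  }
}
```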
Communicate with Api.ai
To send your query to Api.ai, you use the send() function, which looks like this:
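A sketch of send() (the v=20150910 query parameter was the Api.ai protocol version at the time of writing, so check the docs for the current one):

```javascript
function send() {
  var text = $speechInput.val();

  $.ajax({
    type: "POST",
    url: baseUrl + "query?v=20150910", // the versioned Api.ai query endpoint
    contentType: "application/json; charset=utf-8",
    dataType: "json",
    headers: {
      "Authorization": "Bearer " + accessToken,
      "ocp-apim-subscription-key": accessToken
    },
    data: JSON.stringify({ q: text, lang: "en" }),

    success: function(data) {
      prepareResponse(data);
    },
    error: function() {
      response(messageInternalError);
    }
  });
}
```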
This is a typical jQuery AJAX POST request to the Api.ai query endpoint at https://api.api.ai/v1/query. You make sure you're sending JSON data to it, and expect JSON data back from it. You also set two headers, Authorization and ocp-apim-subscription-key, to your Api.ai API key. You send your data to Api.ai in the format {q: text, lang: "en"} and await a response.
When you receive a response, you run prepareResponse(). In this function, you format the JSON string that you'll put into the debug section of the web app, and pull out the result.speech part of Api.ai's response, which gives you your assistant's text response. You display each message via response() and debugRespond():
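A sketch of prepareResponse():

```javascript
function prepareResponse(val) {
  // Pretty-print the raw JSON for the debug panel
  var debugJSON = JSON.stringify(val, undefined, 2),
      // The assistant's text reply lives in result.speech
      spokenResponse = val.result.speech;

  response(spokenResponse);
  debugRespond(debugJSON);
}
```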
Your debugRespond() function puts text into your JSON response field:
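A sketch of it (the element ID is from my demo's markup and is illustrative):

```javascript
function debugRespond(val) {
  $("#response-debug").text(val);
}
```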
Your response() function has a few more steps:
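A sketch of the whole function (element IDs and class names are from my demo's markup and are illustrative):

```javascript
function response(val) {
  // Api.ai gave us nothing to say, so fall back to our own apology
  if (val == "") {
    val = messageSorry;
  }

  if (val !== messageRecording) {
    // Read the message aloud via the Web Speech API
    var msg = new SpeechSynthesisUtterance();
    msg.voiceURI = "native";
    msg.text = val;
    msg.lang = "en-US";
    window.speechSynthesis.speak(msg);
  }

  // Show the message on screen as well
  $("#spokenResponse").addClass("is-active")
    .find(".spoken-response__text").html(val);
}
```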
First, you check whether the response value is empty. If so, you set it to say that it isn't sure of the answer to that question, since Api.ai hasn't returned a valid response:
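That check is the first step in response():

```javascript
if (val == "") {
  val = messageSorry;
}
```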
If you do have a message to output and it isn't the one saying you're recording, then you use the Web Speech API to read the message aloud with the SpeechSynthesisUtterance object. I found that without voiceURI and lang set, my browser's default voice was German! That made the speech rather hard to understand until I changed it. To actually speak the message, you use the window.speechSynthesis.speak(msg) function:
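That speech step looks like this:

```javascript
if (val !== messageRecording) {
  var msg = new SpeechSynthesisUtterance();
  msg.voiceURI = "native"; // without this and lang, my default voice was German!
  msg.text = val;
  msg.lang = "en-US";
  window.speechSynthesis.speak(msg);
}
```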
Note: it's important not to have it speak the "Recording..." text: if it does, the microphone will pick up that speech and add it to the recorded query.
Finally, display your response box and add that text to it so that the user can read it too:
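A sketch of that final step (again, the ID and class names are illustrative):

```javascript
// Reveal the response box and put the text into it
$("#spokenResponse").addClass("is-active")
  .find(".spoken-response__text").html(val);
```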
Host your web interface
Ideally, you'll want to host it on an HTTPS-enabled web server. Your requests to Api.ai go over HTTPS, so it's much better to host your web interface over HTTPS as well. If you just want to use it as a prototype and don't have an HTTPS-secured web server handy, try Glitch.com! It's a new service that can host snippets of code that include both front-end and back-end (Node.js) code.
For example, my Barry is also hosted on Glitch. Glitch hosting is completely free! It's a great service, and I highly recommend giving it a try.
If you want to make this project bigger, consider using Let’s Encrypt to get a free SSL/TLS certificate or consider purchasing a certificate from your web host.
In Action
If you run the web app with my styles from the GitHub repo, it looks like this:
If you click "Speak" and ask it "How are you?", it initially shows that you're recording:
(You may need to allow Chrome access to your microphone when you click the button. Apparently this happens every time unless you serve the page over HTTPS.)
Then it responds visually like this (also read aloud, which is hard to show in the screenshot):
You can also click the button in the lower right corner to view the JSON response Api.ai gives you in case you want to debug the result:
If you mostly seem to receive the message saying it couldn't hear you and asking you to say it again, check your browser's microphone permissions. If you load the page locally (for example, if your address bar starts with file:///), Chrome doesn't seem to give access to the microphone at all, so you'll end up with this error regardless! You'll need to host it somewhere. (Try Glitch.com, mentioned above.)
Personally, I'm not a fan of some of the default small-talk phrases, such as this one:
I've customized many of them in the small-talk settings we looked at earlier. For example, I found this small-talk statement in the list quite odd, so I decided to customize it like this:
So, start creating your own chatbot! Make it unique and have fun!
Having Trouble?
I've found that, occasionally, if the Web Speech API tries to say something too long, Chrome's speech stops working. If that's the case for you, close the tab and open a new one to try again.
Conclusion
I'm sure you'll see that Api.ai is a truly simple way to build a chatbot-style AI personal assistant.
Want to keep developing your Api.ai bot? There's more to come: this is a whole series I've written here on SitePoint!
If you built your own personal assistant using Api.ai, I would love to hear what you think! Did you name it Barry, too? What issues did you set for it? Please let me know in the comments below or contact me on Twitter via @thatpatrickguy.
Give your AI a human element with sentiment tools. Check out our video tutorial on Microsoft Cognitive Services and the Text Analytics API.
Frequently Asked Questions (FAQs) About Building Your Own AI Assistant Using Api.ai
What are the prerequisites for building an AI assistant using API.AI?
To build an AI assistant using API.AI, you need a basic understanding of programming concepts and languages (especially JavaScript). You also need to be familiar with the Google Cloud service because API.AI is part of Google Cloud. It is also beneficial to understand AI and machine learning concepts. However, API.AI is designed to be easy to use and does not require in-depth AI knowledge.
How to integrate my AI assistant with other platforms?
API.AI provides integrated support for many popular platforms such as Slack, Facebook Messenger, and Skype. You can use the API.AI SDK and API to integrate your AI assistant with these platforms. This process involves setting up a webhook and configuring platform settings in the API.AI console.
Can I customize the behavior of the AI assistant?
Yes, API.AI allows you to customize the behavior of an AI assistant. You can define custom intents and entities that determine how your AI assistant responds to user input. You can also use the fulfillment function to write custom code that is executed when a specific intent is triggered.
How to improve the accuracy of AI assistants?
The accuracy of your AI assistant depends on the quality of its training data. You can improve accuracy by providing a variety of example phrases for each intent. API.AI also provides a feature called "Machine Learning Mode" that automatically improves the model based on user interactions.
Is it possible to build multilingual AI assistants using API.AI?
Yes, API.AI supports multiple languages. You can build multilingual AI assistants by defining intents and entities in different languages. API.AI will automatically detect the language entered by the user.
How to test my AI assistant during development?
API.AI provides a built-in test console where you can interact with your AI assistant. You can enter user phrases and see how your AI assistant responds. This allows you to test and improve your AI assistant during development.
How much does it cost to use API.AI?
API.AI is part of Google Cloud, and its pricing is based on usage. There's a free tier that includes a certain number of requests per minute and per month. Beyond the free tier, you're billed based on the number of requests.
Can I build an AI assistant for my mobile application using API.AI?
Yes, API.AI provides SDKs for Android and iOS. You can use these SDKs to integrate your AI assistant with your mobile applications.
How to use API.AI to handle complex conversations?
API.AI provides a feature called "Context" that allows you to handle complex conversations. The context allows you to control the conversation flow and manage dependencies between intents.
Can I use API.AI to analyze the interaction between users and my AI assistant?
Yes, API.AI provides analytics capabilities that allow you to analyze user interactions. You can view usage statistics, performance of intent, and user satisfaction ratings. This information can help you improve your AI assistant over time.