ChatGPT is truly a dream come true for phone system programmers and designers. The AI is so smart, it feels like you're talking to a real person! Gone are the days of programming in hundreds of canned responses, or worse, paying a voice actor to read a list of prompts.
By using Twilio to build your IVR, ChatGPT as the AI backend, and NextJS for the middleware, you can get an AI-powered IVR set up in less than 20 minutes.
And I'm going to show you how.
Step 1 - Get an OpenAI API Key
You'll need to create an OpenAI account to get an API key. It's currently free, but that may change in the future. Once you're logged in, click on your profile image and select View API Keys.
That's all you need to access ChatGPT.
Step 2 - Create a New NextJS Project
In this tutorial, I'm going to use NextJS solely for its built-in, Express-like API routes, which will be our middleman between Twilio and ChatGPT. The extra hop does add to the delay between asking a question and getting an answer, so in my opinion it would not be suitable for a production environment, but it works!
As always, we begin with
Turn on TypeScript and ESLint if you prefer and then run:
Create a new file in the project root called .env.local and add your OpenAI key:
We'll also need a token that Twilio sends with every request, so that nobody else can abuse our API. Generate a secure random key, then add it to .env.local as well.
Finally, open up your project in VS Code by typing:
Ok, let's create our API route under
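As a rough guide, here is a minimal sketch of the logic such a route needs: reject anything without the shared token, forward the caller's words to OpenAI, and return the reply as JSON. It is a sketch under stated assumptions, not a definitive implementation. The names checkRequest, handleChat and buildChatRequest, the token, prompt and text fields, and the gpt-3.5-turbo model are all hypothetical choices made for illustration. The OpenAI call is passed in as a function so the logic can be tried without a network connection.

```typescript
// Illustrative sketch only: every name and field below is an assumption.

interface ApiRequest {
  method: string;
  body: { token?: string; prompt?: string };
}

interface ApiResult {
  status: number;
  json: { text?: string; error?: string };
}

// Sends a prompt to the AI backend and resolves with its reply text.
type Completer = (prompt: string) => Promise<string>;

// Returns an error result for a bad request, or null when the request is acceptable.
function checkRequest(req: ApiRequest, expectedToken: string): ApiResult | null {
  if (req.method !== "POST") {
    return { status: 405, json: { error: "Method not allowed" } };
  }
  if (!expectedToken || req.body.token !== expectedToken) {
    return { status: 401, json: { error: "Unauthorized" } };
  }
  const prompt = req.body.prompt;
  if (!prompt || !prompt.trim()) {
    return { status: 400, json: { error: "Missing prompt" } };
  }
  return null;
}

// Builds the HTTP request for OpenAI's chat completions endpoint.
function buildChatRequest(prompt: string, openAiKey: string) {
  return {
    url: "https://api.openai.com/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${openAiKey}`,
      },
      body: JSON.stringify({
        model: "gpt-3.5-turbo", // assumed model; use whichever one you prefer
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Validates the request, asks the backend, and shapes the JSON reply.
async function handleChat(
  req: ApiRequest,
  expectedToken: string,
  complete: Completer
): Promise<ApiResult> {
  const rejection = checkRequest(req, expectedToken);
  if (rejection !== null) return rejection;
  try {
    const text = await complete((req.body.prompt ?? "").trim());
    return { status: 200, json: { text } };
  } catch {
    return { status: 502, json: { error: "Upstream request failed" } };
  }
}
```

In a real NextJS route you would call handleChat from the exported handler, read the token and the OpenAI key from your environment variables, and implement the Completer with fetch(url, init) using the values from buildChatRequest. The reply text should then be available at choices[0].message.content in OpenAI's JSON response.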
Now deploy your project to Vercel either by the Vercel command line tool or via GitHub. If you're not sure how to do this, check out this guide.
Remember to add your environment variables under the settings tab of your Vercel project:
Also, you may want to change the default Vercel domain to something more memorable:
Step 3 - Create a Twilio Account
Now head over to Twilio.com and create a free account. You'll need to enter credit card info for your minute usage, but it's super cheap. Also, they give you $10 in credit to play around with, so you can mess around with this without paying any money. If you decide to take this all the way and use this as a solution, then I highly recommend Twilio as they are the most comprehensive and extensible telephony solution on the market. And they are very reasonably priced. I'd say I put $25 on every couple of months for my personal usage.
Once you register your account, you'll need to requisition a phone number. To do this, click on Phone Numbers >> Buy a number. In the search criteria, you can enter your area code to get a local number, or 888 to get a toll-free number (which costs a dollar more per month):
Ok, once you have your number, we need to create our IVR. There are many ways to accomplish this, but we are going to do it the easy way by using Twilio Studio. To get there, type Studio in the search bar at the top of your dashboard. Then click the + icon to create a new Studio flow and name it Chat IVR or something. Then click Create from Scratch.
You should be presented with a blank grid. To create our flow, you simply drag a widget from the right-hand column over to the grid area and then change its parameters. It's stupid easy! The first thing we'll want to do is greet the caller when a call comes in. To do this, we'll use the Say / Play widget. Change the parameters as follows:
You can select whatever voice you like, but I'm using the Polly / Salli voice, which is provided by Amazon's Polly TTS engine. Now click and drag from Incoming Call on your trigger and connect it to your new widget. Now when a call comes in to this flow, the caller will be greeted by this voice. Your flow should look like this:
Next, we'll route the call to another widget called Gather Input on Call. Configure it as follows:
Leave everything else at its default setting and click save. Then connect the Audio Complete node from Greeting to the input node on ReadInput. Your flow should look like this:
Since there is such a long delay from the time your speech is recognized to when the response is received from our API, it's necessary to indicate to the caller that their voice has been recognized and that it's being processed. To do this, we'll simply play a little tone. We'll use the Say / Play widget for this:
You can download and add this file to your own Vercel project if you like, or simply use the one from my project. Click save and then connect the User Said Something node of ReadInput to the input node of Acknowledge.
Now we'll create our HTTP request to our API. To do that, we need the, you guessed it, Make HTTP Request widget:
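To make the settings concrete, here is a hedged sketch of the values you might enter, written as a TypeScript object. It assumes your API route accepts a JSON body with token and prompt fields (hypothetical names, so match them to your own route) and that the Gather widget is named ReadInput as above, so that the recognized speech is available as the Liquid variable widgets.ReadInput.SpeechResult. The URL is a placeholder, and the small expand helper only imitates, very roughly, the substitution Studio performs before it sends the request.

```typescript
// Sketch of the Make HTTP Request widget settings. The URL and the
// "token" / "prompt" field names are placeholders, not required values.
interface HttpWidgetConfig {
  method: "GET" | "POST";
  url: string;
  contentType: string;
  body: string; // may contain Liquid variables that Studio expands at run time
}

function buildRequestWidget(apiUrl: string, token: string): HttpWidgetConfig {
  const body = JSON.stringify({
    token,
    // Liquid variable holding what the caller said to the ReadInput widget.
    prompt: "{{widgets.ReadInput.SpeechResult}}",
  });
  return { method: "POST", url: apiUrl, contentType: "application/json", body };
}

// Rough imitation of variable substitution: replaces {{name}} with vars[name]
// and leaves unknown variables untouched.
function expand(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match: string, name: string) => {
    const value = vars[name];
    return value === undefined ? match : value;
  });
}

const widget = buildRequestWidget("https://your-project.vercel.app/api/chat", "your-secret-token");
const sentBody = expand(widget.body, {
  "widgets.ReadInput.SpeechResult": "What is the capital of France",
});
console.log(sentBody);
```

One caveat with a JSON body: if the recognized speech contains a double quote, a plain substitution like this would produce invalid JSON. Sending the values as form-encoded parameters instead may be the more forgiving option.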
Click save and connect the Audio Complete node from Acknowledge to the input node of RequestAPI.
Next, we need to take the response from our API and say the text back to the caller. Again we'll use the Say / Play widget:
Click save and link the Success node from RequestAPI to its input.
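How the text to say is picked out of the API response can be pictured as a dotted-path lookup. When an HTTP request widget receives JSON, Studio exposes the parsed body under that widget's name, so a reply of the form { "text": "..." } would typically be read in the Say / Play widget as {{widgets.RequestAPI.parsed.text}}. The field name text is an assumption here, so use whatever your API route actually returns. The sketch below mimics that lookup on a hypothetical flow context.

```typescript
// Hypothetical flow context after the RequestAPI widget succeeds.
// "text" is an assumed field name from the API route's JSON reply.
const flowContext = {
  widgets: {
    RequestAPI: {
      parsed: { text: "The capital of France is Paris." },
    },
  },
};

// Resolves a dotted path such as "widgets.RequestAPI.parsed.text".
// Returns undefined as soon as any segment of the path is missing.
function lookup(context: unknown, path: string): unknown {
  return path.split(".").reduce<unknown>((node, key) => {
    if (typeof node === "object" && node !== null && key in node) {
      return (node as Record<string, unknown>)[key];
    }
    return undefined;
  }, context);
}

console.log(lookup(flowContext, "widgets.RequestAPI.parsed.text"));
```

If the path does not match the shape of your JSON, the lookup comes back empty and the caller would hear nothing useful, so it is worth checking the field name against a real response from your API.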
That largely completes our flow. But you'll probably want the conversation to loop, so after the response is read back to the caller, we'll play another tone to indicate that they can continue speaking. Grab another Say / Play widget:
Click save and link Audio Complete from Response to the input of Prompt. Then link Audio Complete from Prompt to the input of ReadInput. That should complete the loop.
Here is what your final flow should look like:
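The wiring described above can also be written down as a small transition table, which makes the loop easy to check. This is only a sketch of the connections named in this tutorial (widget names and events as given in the text). Outputs that the tutorial does not wire up are left out.

```typescript
type Widget =
  | "Trigger"
  | "Greeting"
  | "ReadInput"
  | "Acknowledge"
  | "RequestAPI"
  | "Response"
  | "Prompt";

interface Transition {
  event: string; // the output node that fires
  next: Widget; // the widget it is connected to
}

// One entry per connection made in the steps above.
const flow: Record<Widget, Transition> = {
  Trigger: { event: "Incoming Call", next: "Greeting" },
  Greeting: { event: "Audio Complete", next: "ReadInput" },
  ReadInput: { event: "User Said Something", next: "Acknowledge" },
  Acknowledge: { event: "Audio Complete", next: "RequestAPI" },
  RequestAPI: { event: "Success", next: "Response" },
  Response: { event: "Audio Complete", next: "Prompt" },
  Prompt: { event: "Audio Complete", next: "ReadInput" },
};

// Follows the connections for a given number of steps and records each widget visited.
function walk(start: Widget, steps: number): Widget[] {
  const visited: Widget[] = [start];
  let current = start;
  for (let i = 0; i < steps; i++) {
    current = flow[current].next;
    visited.push(current);
  }
  return visited;
}

console.log(walk("Trigger", 7).join(" > "));
```

After the greeting plays once, every pass through Prompt lands back on ReadInput, which is the loop that keeps the conversation going.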
Click the red Publish button to make your flow active.
I know you're probably as eager as I was to try it out, but there's one more thing we must do: route your new phone number to this Studio flow. Go back to Active Numbers and click Manage. Then click your number. Scroll down until you find Configure With and select Webhook, TwiML Bin, Function, Studio Flow, Proxy Service. Then, under A Call Comes In, select Studio Flow and pick your flow. Then click save.
Now call your number and have a pleasant conversation!
I hope you enjoyed this article. Also, I hope the possibilities of ChatGPT excite you as much as me! For more great information about web dev, systems administration and telephony programming, please visit the Designly Blog.