Sep 4, 2025
How To Build Your First AI Voice Agent On Pipecat
### Build Your First AI Voice Agent with Pipecat and Twilio (Inbound & Outbound Calls)
In my opinion, voice AI has hit a maturity threshold. The tech isn't the bottleneck anymore. Instead, it's about you and how well you can build reliable, human-like agents. That’s exactly why mastering a framework like Pipecat is such a good decision. As companies get more serious about voice AI, they demand the reliability and control that only open-source platforms can deliver.
In this guide, I'm going to show you how to build your first Pipecat AI voice agent and deploy it with Twilio to handle both inbound and outbound calls. I’ve made this process ridiculously easy, and all the code is available in the GitHub repo linked below.
For those who don't know, **Pipecat** is an open-source framework for building real-time, multimodal, and voice conversational agents. It's winning on simple pricing, easy deployment, and incredibly intuitive code. Let's dive in and build something.
### What You'll Need (Prerequisites)
Before we start, make sure you have the following ready:
* **Python 3.10+**
* **An [ngrok](https://ngrok.com/) account** to tunnel your local server to a public URL.
* **A [Twilio](https://www.twilio.com/) Account** with at least one phone number.
* **API Keys** for the AI services we'll use:
    * **Deepgram** (for Speech-to-Text)
    * **Cartesia** (for Text-to-Speech)
    * **Cerebras** (or any OpenAI-compatible LLM provider like Groq, TogetherAI, etc.)
### Quick Start: Let's Get It Running
We'll get the bot running first, then I'll break down how the code works.
#### Step 1: Clone the Repo
First, grab the code from GitHub and navigate into the directory.
```bash
git clone https://github.com/HugoPodworski/first-pipecat-agent.git
cd first-pipecat-agent
```
#### Step 2: Set Up Your API Keys
Create a `.env` file to store your secret keys. You can copy the example file to get started.
```bash
cp env.example .env
```
Now, open the `.env` file and fill in your API keys. It should look like this:
```ini
# .env file
DEEPGRAM_API_KEY=...
CARTESIA_API_KEY=...
CEREBRAS_API_KEY=... # Or OPENAI_API_KEY, GROQ_API_KEY, etc.
TWILIO_ACCOUNT_SID=...
TWILIO_AUTH_TOKEN=...
NGROK_AUTHTOKEN=... # Recommended for stable ngrok sessions
```
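A missing or misspelled key is an easy way for this step to go wrong, so a quick check before launching can save you a confusing failed call. The sketch below is not part of the repo: `REQUIRED_KEYS` and `find_missing_keys` are names I made up for illustration, and it assumes the `.env` values are already loaded into the environment (for example with `python-dotenv`). If you use a different LLM provider, swap `CEREBRAS_API_KEY` for that provider's key.

```python
import os
from typing import Mapping

# Keys from the .env example above. NGROK_AUTHTOKEN is left out because
# it is only recommended, not required.
REQUIRED_KEYS = (
    "DEEPGRAM_API_KEY",
    "CARTESIA_API_KEY",
    "CEREBRAS_API_KEY",
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
)


def find_missing_keys(env: Mapping[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent, empty, or still set to '...'."""
    return [key for key in required if env.get(key, "").strip() in ("", "...")]


if __name__ == "__main__":
    missing = find_missing_keys(os.environ)
    if missing:
        print("Missing or placeholder keys:", ", ".join(missing))
    else:
        print("All required keys are set.")
```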
#### Step 3: Install Dependencies
It's best practice to use a virtual environment. Let's create one and install the required Python packages.
```bash
# Create and activate the virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
```
#### Step 4: Start ngrok and Configure Twilio
I've written a helper script (`setup_ngrok_twilio.py`) to automate the tedious part. This script will:
1. Start an ngrok tunnel to expose your local server.
2. List your available Twilio phone numbers.
3. Automatically update the webhook for the number you choose to point to your ngrok URL.
Run it with this command:
```bash
python setup_ngrok_twilio.py
```
You'll see a prompt asking you to choose which Twilio number you want to use. Type the list index shown next to the number (for example `1`), not the phone number itself, and hit Enter.
```
Available Twilio phone numbers:
1: +15551112222 (Webhook: https://demo.twilio.com/welcome/voice/)
2: +15553334444 (Webhook: https://some-vapi-agent.ngrok.io)
Enter the number of the phone number you want to use: 1
```
Once you select a number, the script will update its webhook and give you the exact command you need to run your bot. It will look something like this:
```
✅ Twilio number +15551112222 updated to webhook https://<your-ngrok-host>.ngrok.io
Next steps:
1. Ensure the bot server is running in a new terminal:
python bot.py --transport twilio --proxy <your-ngrok-host>.ngrok.io
```
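If you're curious what the helper is doing under the hood, listing the numbers and updating the webhook map onto two calls in Twilio's Python SDK. This is a rough sketch, not the code in `setup_ngrok_twilio.py`: the function names `webhook_url` and `point_number_at` are mine, and I'm assuming the webhook is simply the `https://` form of the ngrok host, as in the output above.

```python
import os


def webhook_url(ngrok_host: str) -> str:
    """Build the https webhook URL from a bare ngrok host name."""
    host = ngrok_host.strip()
    # Tolerate a pasted scheme or a trailing slash.
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    return "https://" + host.rstrip("/")


def point_number_at(ngrok_host: str, choice: int) -> str:
    """Update the voice webhook of the chosen number (1-based) and return it."""
    # Imported here so webhook_url() works without the Twilio SDK installed.
    from twilio.rest import Client

    client = Client(os.environ["TWILIO_ACCOUNT_SID"], os.environ["TWILIO_AUTH_TOKEN"])
    numbers = client.incoming_phone_numbers.list()
    selected = numbers[choice - 1]
    client.incoming_phone_numbers(selected.sid).update(
        voice_url=webhook_url(ngrok_host), voice_method="POST"
    )
    return selected.phone_number
```

Doing this by hand in the Twilio console works just as well; the script only saves you the clicking.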
#### Step 5: Run the Bot!
Open a **new terminal** (leaving the `setup_ngrok_twilio.py` script running to keep the tunnel active), activate the virtual environment again, and run the command from the previous step's output.
```bash
# In a new terminal
source .venv/bin/activate
python bot.py --transport twilio --proxy <your-ngrok-host>.ngrok.io
```
That's it! Your AI voice agent is now running locally and connected to your Twilio number.
### Test Your Phone Bot
Time for the fun part. **Call your Twilio phone number** from your cell phone. You should be connected to your AI agent and be able to have a conversation! 🚀
💡 **Tip:** Keep an eye on the terminal where `bot.py` is running. You'll see detailed logs showing Pipecat's pipeline in action, which is super helpful for debugging.
### A Quick Look at the Code
The core logic lives in `bot.py`. It defines a simple Pipecat pipeline:
```python
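# A simplified view of the pipeline: frames flow through the stages top to bottom
pipeline = [
    transport.input(),                 # audio in from the caller (Twilio)
    rtvi,                              # RTVI processor
    stt,                               # speech-to-text (Deepgram)
    llm_context_aggregator,            # adds the user's words to the LLM context
    llm,                               # generates the reply
    tts,                               # text-to-speech (Cartesia)
    transport.output(),                # audio out to the caller
    llm_context_aggregator_assistant,  # records the bot's reply in the context
]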
```
1. **Transport Input:** Receives raw audio from the caller via Twilio.
2. **STT (Speech-to-Text):** Transcribes the audio using Deepgram.
3. **LLM:** Processes the user's text using an LLM (Cerebras, Groq, OpenAI, etc.), decides on a response, and can even call functions (tools).
4. **TTS (Text-to-Speech):** Converts the LLM's text response into audio using Cartesia.
5. **Transport Output:** Streams the synthesized audio back to the caller.
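
To make the "each stage hands its output to the next" idea concrete, here is a toy model in plain Python. It is not Pipecat's API (real processors are asynchronous and pass typed frames around); `fake_stt`, `fake_llm`, `fake_tts`, and `run_pipeline` are stand-ins I made up purely for illustration.

```python
from functools import reduce
from typing import Callable, Sequence

Stage = Callable[[str], str]


def fake_stt(audio: str) -> str:
    # Pretend transcription: strip the "audio:" marker.
    return audio.removeprefix("audio:")


def fake_llm(text: str) -> str:
    # Pretend reasoning: echo the request back.
    return f"You said: {text}"


def fake_tts(text: str) -> str:
    # Pretend synthesis: tag the text as audio again.
    return f"audio:{text}"


def run_pipeline(stages: Sequence[Stage], frame: str) -> str:
    """Feed the frame through every stage, in list order."""
    return reduce(lambda value, stage: stage(value), stages, frame)


print(run_pipeline([fake_stt, fake_llm, fake_tts], "audio:hello"))
# prints "audio:You said: hello"
```

The point of the list-of-stages design is that replacing one stage (say, a different TTS) leaves the rest untouched.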
The code is designed to be easy to read and modify. You can swap out services by changing a few lines. For example, to switch from Cerebras to Groq, you'd only need to update the `llm` service definition:
```python
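import os

# Import path may differ between Pipecat versions
from pipecat.services.openai.llm import OpenAILLMService

# Example of switching LLM providers: point the OpenAI-compatible service at Groq
llm = OpenAILLMService(
    api_key=os.getenv("GROQ_API_KEY"),
    model="llama3-8b-8192",  # or any model ID Groq currently serves
    base_url="https://api.groq.com/openai/v1",
)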
```
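If you expect to switch providers often, one option is to keep the per-provider details in a small lookup table. Treat this as a sketch with assumptions: `PROVIDERS` and `llm_kwargs` are hypothetical names, `TOGETHER_API_KEY` is an environment variable name I picked, the Groq URL is the one used above, and the Cerebras and TogetherAI URLs are their OpenAI-compatible endpoints as I understand them (check each provider's docs). The model name is left for you to supply.

```python
import os

# provider name -> (environment variable holding the key, OpenAI-compatible base URL)
PROVIDERS = {
    "groq": ("GROQ_API_KEY", "https://api.groq.com/openai/v1"),
    "cerebras": ("CEREBRAS_API_KEY", "https://api.cerebras.ai/v1"),
    "together": ("TOGETHER_API_KEY", "https://api.together.xyz/v1"),
}


def llm_kwargs(provider: str, model: str) -> dict:
    """Return keyword arguments for an OpenAI-compatible LLM service."""
    try:
        env_var, base_url = PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider!r}") from None
    return {"api_key": os.getenv(env_var), "model": model, "base_url": base_url}


# Usage, assuming OpenAILLMService is imported as in bot.py:
# llm = OpenAILLMService(**llm_kwargs("groq", "llama3-8b-8192"))
```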
### Making Outbound Calls
This setup doesn't just handle incoming calls; you can also trigger outbound calls.
1. Make sure your bot server (`bot.py`) is running and the ngrok tunnel is active.
2. Use the `outbound.py` script, specifying the number to call (`--to`) and your Twilio number (`--from`).
```bash
python outbound.py --to +15551234567 --from +<YOUR_TWILIO_NUMBER> --proxy <your-ngrok-host>.ngrok.io
```
This will initiate a call from your Twilio number to the destination number. When they pick up, they'll be connected to your running Pipecat agent.
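
An outbound call like this comes down to one request to Twilio's Calls API. The sketch below is my guess at the general shape, not the contents of `outbound.py`: `is_e164` and `place_call` are illustrative names, and I'm assuming Twilio should fetch its call instructions from the same ngrok URL the inbound number uses.

```python
import os
import re

# E.164: a plus sign, a non-zero leading digit, at most 15 digits in total.
E164 = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(number: str) -> bool:
    """True for numbers like +15551234567."""
    return E164.fullmatch(number) is not None


def place_call(to: str, from_: str, proxy_host: str) -> str:
    """Start a call through Twilio and return the call SID."""
    if not (is_e164(to) and is_e164(from_)):
        raise ValueError("Phone numbers must be in E.164 format, e.g. +15551234567")
    # Imported here so the validation above works without the Twilio SDK installed.
    from twilio.rest import Client

    client = Client(os.environ["TWILIO_ACCOUNT_SID"], os.environ["TWILIO_AUTH_TOKEN"])
    call = client.calls.create(to=to, from_=from_, url=f"https://{proxy_host}")
    return call.sid
```

Validating the numbers first is a small design choice: it fails fast on typos like a missing `+` instead of spending a Twilio API request to find out.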
### Troubleshooting: When Things Go Wrong
During my own testing for the video, I ran into an issue: the response latency was really high. I checked the logs and immediately saw the problem:
```
OpenAI LLM service processing time: 5.23s
```
The logs told me that the LLM call to Cerebras was taking over 5 seconds to respond. Because Pipecat makes it so easy to swap components, I switched to Groq in under a minute, restarted the server, and the delay disappeared.
This is the power of an open-source framework: you have the visibility and control to diagnose and fix problems yourself. If you run into issues, start with these checks:
* **Call doesn't connect:** Double-check that your ngrok URL is correctly set in your Twilio webhook.
* **No audio or bot doesn't respond:** Make sure all your API keys in the `.env` file are correct and have active quotas.
* **ngrok tunnel issues:** Free ngrok URLs change every time you restart the tunnel. Remember to re-run the `setup_ngrok_twilio.py` script to update Twilio with the new URL, and pass the new host to `--proxy` when you restart the bot.
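
If you save the bot's logs to a file, spotting slow stages like the one in my latency story can be automated. This is a minimal sketch that assumes timing lines look exactly like the one shown earlier (`<service> processing time: <seconds>s`); real log lines may carry a timestamp or level prefix, which would end up in the service name. `slow_services` and the 1-second default threshold are my own choices.

```python
import re

TIMING = re.compile(r"^(?P<service>.+?) processing time: (?P<seconds>\d+(?:\.\d+)?)s\s*$")


def slow_services(lines, threshold_s=1.0):
    """Return (service, seconds) pairs whose processing time exceeds the threshold."""
    slow = []
    for line in lines:
        match = TIMING.search(line)
        if match and float(match["seconds"]) > threshold_s:
            slow.append((match["service"].strip(), float(match["seconds"])))
    return slow


print(slow_services(["OpenAI LLM service processing time: 5.23s"]))
# prints [('OpenAI LLM service', 5.23)]
```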
### Next Steps
You now have a fully functional AI voice agent connected to the telephone network. From here, you can:
* **Deploy to Production:** To move beyond ngrok, containerize the application with Docker and deploy it to a cloud service. The Pipecat documentation has a great guide on [deploying to their cloud](https://docs.pipecat.ai/deployment/pipecat-cloud), which has incredibly fair pricing.
* **Add More Tools:** Expand the agent's capabilities by defining more functions, like booking appointments or looking up information in a database.
* **Join the Community:** The Pipecat team and community are very active. If you have questions, [join their Discord](https://discord.gg/pipecat)!
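
On the "Add More Tools" idea: a tool is a description the LLM can read plus a Python function that does the work. Here is a framework-agnostic sketch using the OpenAI-style tool format. The `book_appointment` tool, its fields, and the `dispatch` helper are all hypothetical, and Pipecat has its own way of registering handlers with the LLM service, so check its function-calling docs for the actual wiring.

```python
import json

# OpenAI-style tool definition; the LLM reads this and decides when to call it.
BOOK_APPOINTMENT_TOOL = {
    "type": "function",
    "function": {
        "name": "book_appointment",
        "description": "Book an appointment for the caller.",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string", "description": "Caller's name"},
                "date": {"type": "string", "description": "ISO date, e.g. 2025-09-04"},
            },
            "required": ["name", "date"],
        },
    },
}


def book_appointment(name: str, date: str) -> dict:
    """Hypothetical handler; a real one would write to a calendar or database."""
    return {"status": "booked", "name": name, "date": date}


def dispatch(tool_name: str, arguments_json: str) -> dict:
    """Route a tool call (name plus JSON-encoded arguments) to its handler."""
    handlers = {"book_appointment": book_appointment}
    return handlers[tool_name](**json.loads(arguments_json))
```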
Hopefully, you found this guide useful. The goal was to show you just how simple it is to get a powerful, production-ready voice agent up and running.
If you want us to build a custom voice agent for your business, feel free to reach out through the contact form on our website, [Artillery](https://your-website-link.com).
Thanks for reading!
