I Replaced OpenAI with Groq in My CrewAI Project – Here's What Actually Happened
The Problem: OpenAI Costs Add Up
Running multiple agents in a CrewAI project racks up token usage fast, and proprietary models like GPT-4 get expensive quickly when you scale to production workloads.
The Solution: OpenAI-Compatible APIs
CrewAI supports any OpenAI-compatible API endpoint through its native LLM class. This allows you to switch to alternative providers like Groq, Moonshot (Kimi), or local models (Ollama) without rewriting agent code.
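In practice the switch reduces to constructing one `LLM` object that points at the alternative endpoint. Here is a minimal sketch with placeholder values (the model name, URL, and key below are stand-ins, not a specific provider's settings):

```python
from crewai import LLM

# Any OpenAI-compatible provider: only the model name, endpoint, and key change.
llm = LLM(
    model="provider-model-name",             # e.g. llama-3.3-70b-versatile
    base_url="https://provider.example/v1",  # the provider's OpenAI-compatible endpoint
    api_key="your-api-key",
)
```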
Verified Pricing Comparison (Updated Late 2025)
| Provider | Model | Input / 1M tokens | Output / 1M tokens | API Base URL |
|---|---|---|---|---|
| Groq | Llama-3.3-70B Versatile | $0.59 | $0.79 | https://api.groq.com/openai/v1 |
| OpenAI | GPT-5.1 | $1.25 | $10.00 | https://api.openai.com/v1 |
Pricing sourced from Groq Model Cards and independent LLM pricing aggregators as of November–December 2025.
Cost difference example: using the table's rates, a workload of 10M input and 2M output tokens comes to roughly $7.48 on Groq Llama-3.3-70B ($5.90 + $1.58) versus $32.50 on GPT-5.1 ($12.50 + $20.00), so at high volume Groq can be around 4x cheaper; the exact gap depends on your input/output mix and usage scale.
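The same comparison as a quick script, using only the per-1M-token rates from the table above (the token counts are illustrative):

```python
# Back-of-the-envelope cost comparison based on the pricing table.
RATES = {
    "groq-llama-3.3-70b": {"input": 0.59, "output": 0.79},
    "openai-gpt-5.1": {"input": 1.25, "output": 10.00},
}

def monthly_cost(model: str, input_millions: float, output_millions: float) -> float:
    """Cost in USD for the given millions of input/output tokens."""
    r = RATES[model]
    return input_millions * r["input"] + output_millions * r["output"]

# Illustrative workload: 10M input tokens, 2M output tokens per month.
for name in RATES:
    print(f"{name}: ${monthly_cost(name, 10, 2):.2f}")
```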
How to Configure CrewAI with Alternative LLMs
Step 1: Environment Setup
Add these to your project's `.env` file:

```
OPENAI_API_KEY="your-api-key"
OPENAI_API_BASE="https://api.groq.com/openai/v1"
OPENAI_MODEL="llama-3.3-70b-versatile"
```
Step 2: Configure Custom LLM in agents.py
```python
from crewai import Agent, LLM
from decouple import config
import os


class CustomAgents:
    def __init__(self):
        # Read the endpoint, key, and model from .env (falling back to real env vars).
        api_base = config("OPENAI_API_BASE", default=os.getenv("OPENAI_API_BASE", ""))
        api_key = config("OPENAI_API_KEY", default=os.getenv("OPENAI_API_KEY", ""))
        model = config("OPENAI_MODEL", default="gpt-3.5-turbo")

        if api_base and api_key:
            # Point CrewAI's LLM class at the OpenAI-compatible endpoint.
            self.llm = LLM(
                model=model,
                api_key=api_key,
                base_url=api_base,
            )
        else:
            # Fall back to the default OpenAI configuration.
            self.llm = LLM(model="gpt-4")
```
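A quick sanity check of the class, assuming the `.env` from Step 1 is in place:

```python
# Instantiate the agent factory and confirm which model it resolved.
agents = CustomAgents()
print(agents.llm.model)  # e.g. "llama-3.3-70b-versatile" when the Groq config is set
```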
Step 3: Agent Definition (No Changes Required)
```python
    # Defined inside the CustomAgents class from Step 2; the agent simply receives self.llm.
    def research_agent(self):
        return Agent(
            role="Senior Research Analyst",
            backstory="Expert in market analysis with 10+ years experience",
            goal="Provide comprehensive research and insights",
            verbose=True,
            llm=self.llm,
        )
```
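To run the agent end to end, a minimal crew looks like this (the task description and expected output are illustrative placeholders; `CustomAgents` is the class from Step 2):

```python
from crewai import Crew, Task

agents = CustomAgents()
researcher = agents.research_agent()

# Illustrative task: the description and expected_output are placeholders.
research_task = Task(
    description="Summarize the current state of the LLM inference market.",
    expected_output="A short bullet-point summary.",
    agent=researcher,
)

crew = Crew(agents=[researcher], tasks=[research_task])
result = crew.kickoff()
print(result)
```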
Testing Your Configuration
Verify API Connectivity
```bash
python test_api.py
```
(The full `test_api.py` script is listed under Complete Example Setup below.)
Supported Providers
Groq
```
OPENAI_API_BASE="https://api.groq.com/openai/v1"
OPENAI_MODEL="llama-3.3-70b-versatile"
```
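If you prefer not to reuse the `OPENAI_*` variables, recent CrewAI versions route model strings through LiteLLM, so a provider-prefixed model name with the provider's own key variable also works. A sketch, assuming `GROQ_API_KEY` is set in your environment:

```python
from crewai import LLM

# Alternative to the OPENAI_* variables: LiteLLM-style provider prefix.
# Assumes GROQ_API_KEY is exported in the environment.
groq_llm = LLM(model="groq/llama-3.3-70b-versatile")
```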
Moonshot / Kimi
```
OPENAI_API_BASE="https://api.moonshot.cn/v1"
OPENAI_MODEL="moonshot-v1-8k"
```
Local Models (Ollama Example)
```python
from langchain_ollama import OllamaLLM

# LangChain wrapper around a locally served Ollama model.
self.Ollama = OllamaLLM(model="openhermes")
```
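If you would rather stay on CrewAI's native `LLM` class instead of a LangChain wrapper, an `ollama/`-prefixed model name also works in recent CrewAI versions. A sketch, assuming `ollama serve` is running on its default port with the model pulled:

```python
from crewai import LLM

# Assumes a local Ollama server at the default address with openhermes pulled.
local_llm = LLM(
    model="ollama/openhermes",
    base_url="http://localhost:11434",
)
```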
Performance Considerations
- Speed: Groq reports ~276 tokens/sec throughput for Llama-3.3-70B Versatile under typical inference conditions (a quick way to measure this yourself is sketched after this list).
- Latency: Typical 0.5–1.5s end-to-end response for standard workloads.
- Context Length: Large context support (model-card dependent, e.g. up to ~128K tokens).
- Rate Limits: See Groq account dashboard for up-to-date tier limits.
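A rough throughput and latency check against any OpenAI-compatible endpoint. This is a minimal sketch: it assumes the `OPENAI_*` variables from Step 1 and that the provider returns a standard `usage` block in the response.

```python
import time
import requests
from decouple import config

api_base = config("OPENAI_API_BASE")
headers = {"Authorization": f"Bearer {config('OPENAI_API_KEY')}"}
payload = {
    "model": config("OPENAI_MODEL"),
    "messages": [{"role": "user", "content": "Write 200 words about inference speed."}],
}

# Time one chat completion and compute tokens/sec from the usage block.
start = time.time()
resp = requests.post(f"{api_base}/chat/completions", json=payload, headers=headers, timeout=60)
resp.raise_for_status()
elapsed = time.time() - start

completion_tokens = resp.json().get("usage", {}).get("completion_tokens", 0)
print(f"Latency: {elapsed:.2f}s, throughput: {completion_tokens / elapsed:.1f} tokens/sec")
```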
Troubleshooting
Invalid API Key
Check the key's formatting in `.env` and run:

```bash
python test_api.py
```
Model Not Found
Verify the model name against the IDs printed by the test script (on success it lists the first few available models):

```bash
python test_api.py
```
Connection Timeout
Check that the base URL is correct (including the `/openai/v1` or `/v1` suffix) and that you have internet connectivity.
Quality Differences
Output quality and style can differ between providers even with the same prompts. Lowering the sampling temperature usually keeps agent output more focused; a range of roughly 0.1–0.9 is reasonable, with `temperature=0.3` a good starting point for research-style tasks.
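The temperature is set on the `LLM` object itself. A sketch with the Groq values used throughout this post (in the Step 2 class, the same keyword is added to the existing `LLM(...)` call):

```python
from crewai import LLM

# Lower temperature for more deterministic, focused agent output.
llm = LLM(
    model="llama-3.3-70b-versatile",
    base_url="https://api.groq.com/openai/v1",
    api_key="your-groq-api-key",
    temperature=0.3,
)
```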
When to Use Alternative LLMs
Use alternatives when
- You need lower cost & high throughput
- Running many agents or parallel processes
- Reducing vendor lock-in is important
Use proprietary models when
- You require specialized features or alignment
- Your workload is small (<$20/month usage)
Complete Example Setup
test_api.py
```python
#!/usr/bin/env python3
import requests
from decouple import config


def test_api_key():
    """Test if the API key is valid by listing the provider's models."""
    api_key = config('OPENAI_API_KEY')
    api_base = config('OPENAI_API_BASE')
    model = config('OPENAI_MODEL')

    print(f"Testing API key: {api_key[:10]}...")
    print(f"API Base: {api_base}")
    print(f"Model: {model}")

    # Test the API key with a simple authenticated request
    headers = {
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json'
    }

    # Call the models endpoint to verify authentication
    try:
        response = requests.get(f'{api_base}/models', headers=headers, timeout=10)
        print(f'Status Code: {response.status_code}')
        if response.status_code == 200:
            print('SUCCESS: API Key is VALID')
            models = response.json().get('data', [])
            print(f'Found {len(models)} available models')
            if models:
                print('First few models:', [m['id'] for m in models[:3]])
        else:
            print('ERROR: API Key is INVALID')
            print('Response:', response.text)
            return False
    except Exception as e:
        print(f'ERROR: Error testing API: {e}')
        return False
    return True


if __name__ == "__main__":
    test_api_key()
```
Setup and run commands

```bash
pip install crewai langchain-openai python-decouple
echo "OPENAI_API_KEY='your-key'" > .env
echo "OPENAI_API_BASE='https://api.groq.com/openai/v1'" >> .env
echo "OPENAI_MODEL='llama-3.3-70b-versatile'" >> .env
python test_api.py
python main.py
```
Key Takeaways
- CrewAI supports OpenAI-compatible APIs directly via environment configs.
- Switching providers requires no rewrite of agent logic.
- Groq provides high throughput and significantly lower token cost for high-volume workloads.
- Swapping between providers is reversible by modifying `.env`.