Build a conversational AI with RAG, LangChain, OpenAI embeddings, and ChromaDB vector database
This article has been refreshed with the latest RAG best practices, updated embedding models (text-embedding-3), and current LangChain/ChromaDB features. Originally published December 2023.
This tutorial is part of the AI series where we explore building practical AI applications. In this post, I share the source code and video tutorial for using LangChain with ChromaDB to create a conversational AI that can talk to PDF documents.
The concept is known as RAG (Retrieval-Augmented Generation). We use the ChromaDB vector database to store embedding vectors locally, so documents are embedded once rather than on every query; this keeps OpenAI API costs down while enabling fast semantic search.
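To see what "semantic search" means here: the query is embedded into a vector, and stored vectors are ranked by cosine similarity. A minimal, self-contained sketch of that idea, using toy 3-dimensional vectors rather than real OpenAI embeddings (the `cosine_similarity` and `search` helpers and the sample data are illustrative, not from the tutorial's code):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(query_vec, store, top_k=2):
    # Rank stored (text, vector) pairs by similarity to the query vector
    ranked = sorted(store,
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]

# Toy "embeddings" (real ones have 1536+ dimensions)
store = [
    ("invoices are due in 30 days", [0.9, 0.1, 0.0]),
    ("the warranty covers two years", [0.1, 0.9, 0.1]),
    ("payment terms and conditions", [0.8, 0.2, 0.1]),
]
print(search([1.0, 0.0, 0.0], store))
```

ChromaDB does the same ranking at scale, with indexing structures that avoid comparing the query against every stored vector.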
The RAG workflow consists of two main phases: indexing (preparing your documents) and retrieval (answering questions).
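During the indexing phase, documents are split into overlapping chunks before embedding, so each vector covers a coherent piece of text. A rough sketch of that splitting step (the `chunk_text` helper and its parameters are illustrative; LangChain ships its own text splitters for this):

```python
def chunk_text(text, chunk_size=200, overlap=50):
    # Slide a fixed-size window over the text; the overlap preserves
    # context that would otherwise be cut at chunk boundaries.
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

pieces = chunk_text("x" * 500, chunk_size=200, overlap=50)
print(len(pieces))  # 4 chunks for a 500-character document
```

Each chunk is then embedded and stored; at question time, the retrieval phase embeds the question, pulls the most similar chunks, and passes them to the LLM as context.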
The complete source code for this project is available on GitHub. The script loads PDF documents, creates embeddings, stores them in ChromaDB, and enables conversational interaction:
```shell
# Install dependencies
pip install langchain openai chromadb pypdf

# Set your OpenAI API key
export OPENAI_API_KEY="your-api-key"

# Run the script
python talk_to_pdf.py
```
Watch the complete video walkthrough where I demonstrate how to build and use this RAG application:
The RAG landscape has evolved significantly since this tutorial was first published. Here are the key updates:
| Model | Dimensions | Performance | Cost |
|---|---|---|---|
| text-embedding-3-large | Up to 3072 | Best accuracy (MTEB: 64.6%) | $0.00013 / 1k tokens |
| text-embedding-3-small | Up to 1536 | Great for most tasks | $0.00002 / 1k tokens (5x cheaper) |
| text-embedding-ada-002 (legacy) | 1536 | Previous generation | $0.0001 / 1k tokens |
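To put the pricing in perspective, here is a quick back-of-the-envelope calculation using the per-1k-token rates from the table above (the 1M-token corpus size is illustrative):

```python
# Per-1k-token prices from the table above (USD)
PRICES = {
    "text-embedding-3-large": 0.00013,
    "text-embedding-3-small": 0.00002,
    "text-embedding-ada-002": 0.0001,
}

def embedding_cost(model, tokens):
    # Cost in USD to embed `tokens` tokens with `model`
    return PRICES[model] * tokens / 1000

# Embedding a 1M-token document corpus:
for model in PRICES:
    print(f"{model}: ${embedding_cost(model, 1_000_000):.2f}")
```

For a one-million-token corpus that works out to roughly $0.13 for text-embedding-3-large, $0.10 for ada-002, and $0.02 for text-embedding-3-small, which is why the small model is the default choice for most RAG workloads.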
Continue your AI learning journey with these related guides: