Vector databases have become essential infrastructure for AI applications, enabling efficient similarity search and semantic understanding. Let’s explore their role in modern systems.
What are Vector Databases?
Vector databases store and query high-dimensional vectors (embeddings) representing data like text, images, or audio. They enable:
- Semantic search
- Recommendation systems
- Anomaly detection
- Image similarity
- Question answering
How They Work
Embeddings
Transform data into vectors:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
text = "Machine learning is fascinating"
embedding = model.encode(text)
# Returns a 384-dimensional vector, e.g. [0.042, -0.123, 0.891, ...]
```
Similarity Search
Find similar vectors using distance metrics; a short NumPy sketch of each follows the list:
- Cosine similarity
- Euclidean distance
- Dot product
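As a rough illustration, all three metrics can be computed directly with NumPy (a minimal sketch; the example vectors are arbitrary stand-ins for real embeddings):

```python
import numpy as np

a = np.array([0.1, 0.8, 0.3])  # embedding of one item
b = np.array([0.2, 0.7, 0.4])  # embedding of another item

# Cosine similarity: angle between vectors, insensitive to magnitude
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Euclidean distance: straight-line distance (smaller means more similar)
euclidean = np.linalg.norm(a - b)

# Dot product: magnitude-sensitive; matches the cosine ranking for unit-length vectors
dot = np.dot(a, b)
```

For normalized embeddings, cosine similarity and dot product produce the same ranking, which is why many databases let you pick either.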
Popular Vector Databases
Pinecone
Managed service:
```python
import pinecone

# Classic pinecone-client (pre-v3) API; newer clients use `Pinecone(api_key=...)`
pinecone.init(api_key="your-key")
index = pinecone.Index("my-index")

# Upsert vectors with optional metadata
index.upsert(vectors=[
    ("id1", embedding1, {"text": "Original text"}),
    ("id2", embedding2, {"text": "Another text"})
])

# Query for the 5 nearest neighbors
results = index.query(vector=query_vector, top_k=5)
```
Weaviate
Open-source with GraphQL:
```python
import weaviate

# Weaviate Python client v3 style
client = weaviate.Client("http://localhost:8080")

# Add data
client.data_object.create({
    "content": "AI is transforming industries",
    "category": "technology"
}, "Article")

# Semantic search (requires a vectorizer module such as text2vec-* on the class)
result = client.query.get("Article", ["content"]) \
    .with_near_text({"concepts": ["artificial intelligence"]}) \
    .with_limit(5) \
    .do()
```
Qdrant
High-performance Rust-based:
```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

client = QdrantClient("localhost", port=6333)

# Create a collection sized for 384-dimensional embeddings
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE)
)

# Insert a point with a payload
client.upsert(
    collection_name="documents",
    points=[
        PointStruct(id=1, vector=embedding, payload={"text": "Document content"})
    ]
)
```
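For completeness, a nearest-neighbor query against that collection could look like this (a sketch using the client's `search` method; `query_embedding` stands in for a 384-dimensional query vector):

```python
# Retrieve the 5 most similar points along with their payloads
hits = client.search(
    collection_name="documents",
    query_vector=query_embedding,
    limit=5
)
for hit in hits:
    print(hit.id, hit.score, hit.payload["text"])
```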
Chroma
Developer-friendly:
```python
import chromadb

client = chromadb.Client()
collection = client.create_collection("my_collection")

# Add documents (Chroma embeds them with its default embedding function)
collection.add(
    documents=["Document 1", "Document 2"],
    metadatas=[{"source": "web"}, {"source": "api"}],
    ids=["id1", "id2"]
)

# Query
results = collection.query(
    query_texts=["search query"],
    n_results=5
)
```
Use Cases
Semantic Search
```python
# Traditional keyword search (SQL) only matches the literal term:
#   SELECT * FROM articles WHERE title LIKE '%AI%'

# Vector search finds semantically related results even without the keyword
query_embedding = model.encode("artificial intelligence")
results = index.query(vector=query_embedding, top_k=10)
```
RAG Systems
```python
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings

# Index documents
vectorstore = Chroma.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings()
)

# Retrieve relevant context
relevant_docs = vectorstore.similarity_search(query, k=5)
```
Recommendation Engine
```python
# User profile embedding (application-specific helper)
user_vector = create_user_embedding(user_history)

# Find similar items, restricted to the user's preferred categories
recommendations = index.query(
    vector=user_vector,
    top_k=20,
    filter={"category": user_preferences}
)
```
Duplicate Detection
```python
# Image embedding from a vision model (e.g. CLIP)
image_embedding = vision_model.encode(image)

# Find near-duplicates; score-threshold support varies by database,
# otherwise filter the returned matches by score yourself
similar = index.query(
    vector=image_embedding,
    top_k=5,
    score_threshold=0.95  # keep only very high-similarity matches
)
```
Indexing Strategies
HNSW (Hierarchical Navigable Small World)
- Fast approximate search
- Good recall
- Higher memory usage
IVF (Inverted File Index)
- Efficient for large datasets
- Configurable accuracy/speed tradeoff
LSH (Locality-Sensitive Hashing)
- Fast but lower accuracy
- Good for high dimensions
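To make the HNSW trade-offs concrete, here is a small sketch that builds an in-memory HNSW index with the hnswlib library (chosen purely for illustration; the parameter values are typical starting points, not tuned recommendations):

```python
import numpy as np
import hnswlib

dim = 384
vectors = np.random.rand(10_000, dim).astype(np.float32)  # stand-in for real embeddings

# Build the index; M and ef_construction trade memory and build time for recall
index = hnswlib.Index(space='cosine', dim=dim)
index.init_index(max_elements=10_000, ef_construction=200, M=16)
index.add_items(vectors, np.arange(10_000))

# ef controls the accuracy/speed trade-off at query time
index.set_ef(50)

query = np.random.rand(dim).astype(np.float32)
labels, distances = index.knn_query(query, k=5)
```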
Hybrid Search
Combine vector and traditional search:
```python
# Combine semantic similarity with metadata filters
# (filter syntax varies by database; this example uses a Mongo-style filter)
results = index.query(
    vector=query_embedding,
    filter={
        "keywords": {"$contains": "machine learning"},
        "date": {"$gte": "2024-01-01"}
    },
    top_k=10
)
```
Best Practices
- Choose appropriate dimensions: Balance accuracy and performance
- Normalize vectors: Ensure consistent scale
- Batch operations: Improve throughput
- Monitor performance: Track latency and recall
- Use filters wisely: Combine with metadata
- Regular maintenance: Optimize and reindex
- Backup data: Implement disaster recovery
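Normalization and batching are easy to show concretely. The sketch below assumes `ids`, `documents`, and an `embeddings` array already exist and reuses the Chroma collection from earlier; the batch size is arbitrary:

```python
import numpy as np

# Normalize embeddings to unit length so cosine and dot-product scores agree
embeddings = np.asarray(embeddings, dtype=np.float32)
embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)

# Insert in batches rather than one item at a time to improve throughput
batch_size = 256
for start in range(0, len(ids), batch_size):
    end = start + batch_size
    collection.add(
        ids=ids[start:end],
        embeddings=embeddings[start:end].tolist(),
        documents=documents[start:end]
    )
```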
Challenges
- Storage costs: High-dimensional vectors require significant space
- Cold start: Need initial data for recommendations
- Curse of dimensionality: Performance degrades with very high dimensions
- Embedding quality: Results depend on embedding model
Emerging Trends
- Multi-modal embeddings: Text + images + audio
- Sparse vectors: Efficient representation
- GPU acceleration: Faster processing
- Distributed systems: Scalability improvements
- Hybrid architectures: Combine multiple approaches
Comparison Matrix
| Database | Open Source | Managed | Best For |
|---|---|---|---|
| Pinecone | No | Yes | Production, scale |
| Weaviate | Yes | Yes | GraphQL, modules |
| Qdrant | Yes | Yes | Performance |
| Chroma | Yes | No | Development, prototyping |
| Milvus | Yes | Yes | Large scale |
Getting Started
- Choose embedding model
- Generate vectors for your data
- Select vector database
- Create index/collection
- Insert vectors with metadata
- Query and evaluate results
- Optimize based on metrics
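Putting these steps together, a minimal end-to-end sketch might use sentence-transformers with Chroma (one reasonable combination among many):

```python
import chromadb
from sentence_transformers import SentenceTransformer

# Choose an embedding model and generate vectors for your data
model = SentenceTransformer('all-MiniLM-L6-v2')
docs = ["Vector databases store embeddings", "Cats sleep most of the day"]
embeddings = model.encode(docs).tolist()

# Select a database, create a collection, and insert vectors with metadata
client = chromadb.Client()
collection = client.create_collection("quickstart")
collection.add(
    ids=["d1", "d2"],
    embeddings=embeddings,
    documents=docs,
    metadatas=[{"topic": "databases"}, {"topic": "animals"}]
)

# Query, inspect the results, then tune based on what you observe
query_embedding = model.encode(["how do vector databases work"]).tolist()
results = collection.query(query_embeddings=query_embedding, n_results=2)
print(results["documents"], results["distances"])
```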
Conclusion
Vector databases are foundational for modern AI applications. Understanding their capabilities and trade-offs is essential for building effective semantic search, RAG, and recommendation systems.