How to Use Milvus Vector Database with Python for Semantic Search
I tried setting up semantic search for my document search application and hit a wall with vector database configuration. Every tutorial seemed to require Docker, Kubernetes, or some complex deployment setup. I just wanted to store some vectors and search them with cosine similarity.
After installing PyMilvus, I got this error when trying to create a connection:
from pymilvus import MilvusClient
client = MilvusClient("http://localhost:19530")
ConnectionError: Failed to connect to Milvus server at http://localhost:19530
The server wasn’t running because I hadn’t set it up. The documentation assumed I was deploying a full Milvus cluster, but I just needed something for local development.
Then I found Milvus Lite - it’s embedded directly in the Python client and stores data in a local file. No Docker, no server setup, just import and go.
Milvus Lite: Zero-Config Vector Database
The solution was simpler than I expected:
from pymilvus import MilvusClient
# This creates a local file database - no server needed
client = MilvusClient("./documents.db")
# Drop existing collection if it exists
if client.has_collection("documents"):
    client.drop_collection("documents")
# Create collection with 384 dimensions (sentence-transformers)
client.create_collection(
    collection_name="documents",
    dimension=384,
    metric_type="COSINE"
)
That’s it. The database file is created locally and persists between runs. No connection strings, no authentication, no infrastructure setup.
The key insight is that Milvus Lite uses the exact same API as the full Milvus Server. You develop locally with a file, then change one line to connect to a production server:
# Development
client = MilvusClient("./documents.db")
# Production - just change the connection string
client = MilvusClient("http://localhost:19530")
# Or Zilliz Cloud (managed)
client = MilvusClient(uri="https://xxx.api.gcp-us-west1.zillizcloud.com", token="...")
The Complete Semantic Search Flow
Here’s how semantic search works with Milvus:
User Query
    ↓
[Embed Query Vector] - Turn text into 384-dimensional numbers
    ↓
[Milvus Vector DB] - Compare with stored vectors
    ↓
[Cosine Similarity Search] - Find nearest neighbors
    ↓
Top-K Results + Scores - Most similar documents
The vector embeddings capture semantic meaning. “configure Redis” and “Redis setup” end up close to each other in vector space even though they use different words.
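To make the "compare and rank" step concrete, here is a minimal sketch of what a cosine similarity search boils down to, written in plain Python with no Milvus involved. The three-dimensional vectors and the names `stored`, `top_k`, and `cosine_similarity` are made up for illustration; real embeddings would have 384 dimensions and come from an embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, stored, k=2):
    """Brute-force nearest neighbors: score every stored vector, keep the best k."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy "embeddings" (hypothetical values, only the directions matter)
stored = {
    "redis_config": [0.9, 0.1, 0.0],
    "redis_setup": [0.8, 0.3, 0.1],
    "python_caching": [0.1, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

ranked = top_k(query, stored, k=2)
for doc_id, score in ranked:
    print(f"{doc_id}: {score:.4f}")
```

The two Redis vectors point in nearly the same direction as the query, so they rank first. A vector database does the same job, only with indexes that avoid scoring every stored vector.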
Full Working Example
Here’s a complete example showing the full pipeline:
from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer
# Initialize Milvus Lite
client = MilvusClient("./semantic_search.db")
# Setup collection
if client.has_collection("documents"):
    client.drop_collection("documents")
client.create_collection(
    collection_name="documents",
    dimension=384,  # Matches sentence-transformers/all-MiniLM-L6-v2
    metric_type="COSINE"
)
# Sample documents (integer IDs: the quick-setup collection defaults to an integer primary key)
documents = [
    {"id": 1, "content": "How to configure Redis caching"},
    {"id": 2, "content": "Redis setup and installation guide"},
    {"id": 3, "content": "Python caching strategies"},
]
# Create embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')
for doc in documents:
    doc["vector"] = model.encode(doc["content"]).tolist()
# Insert into Milvus
client.insert("documents", documents)
# Load collection (required for search)
client.load_collection("documents")
# Search
query = "Redis configuration"
query_vector = model.encode(query).tolist()
results = client.search(
    collection_name="documents",
    data=[query_vector],
    limit=3,
    output_fields=["content"]
)
# Display results
for hits in results:
    for hit in hits:
        print(f"Score: {hit['distance']:.4f}")
        print(f"Content: {hit['entity']['content']}")
        print()
Output:
Score: 0.8921
Content: How to configure Redis caching
Score: 0.8145
Content: Redis setup and installation guide
Score: 0.4523
Content: Python caching strategies
Note that although the field is called distance, with the COSINE metric Milvus returns the cosine similarity itself: the range is -1 to 1 and higher means more similar, which is why the best match comes first with the highest score. No conversion is needed; 1 - distance would give you the cosine distance instead.
Understanding Metric Types
This was a common pitfall I encountered - choosing the wrong metric type:
| Metric Type | Best For | Range | Interpretation |
|---|---|---|---|
| COSINE | Normalized vectors | -1 to 1 | Higher = more similar |
| L2 | Unnormalized vectors | 0 to infinity | Lower = more similar |
| IP | Normalized vectors, want dot product | -1 to 1 | Higher = more similar |
If your embeddings are normalized (unit length), use COSINE. If they’re not normalized, use L2. IP with normalized vectors equals cosine similarity.
Most modern embedding models (OpenAI’s text-embedding-3-small, sentence-transformers) produce normalized vectors, so COSINE is usually the right choice.
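The relationship between the three metrics is easy to check by hand. The sketch below is plain Python arithmetic, not Milvus code, and the helper names are my own. It shows that for unit-length vectors the inner product equals the cosine similarity, and that the squared L2 distance is just 2 - 2 × cosine, so all three produce the same ranking once vectors are normalized.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

raw_a, raw_b = [3.0, 4.0], [4.0, 3.0]
a, b = normalize(raw_a), normalize(raw_b)  # [0.6, 0.8] and [0.8, 0.6]

cos = cosine_similarity(a, b)
ip = inner_product(a, b)
l2_sq = squared_l2(a, b)

print(f"cosine={cos:.2f} ip={ip:.2f} squared_l2={l2_sq:.2f}")
# Without normalization the inner product depends on vector length:
print(f"raw ip={inner_product(raw_a, raw_b):.1f}")
```

The last line is the reason IP only behaves like cosine similarity when the vectors are normalized: on the raw vectors it also rewards sheer magnitude.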
Common Mistakes I Made
Mistake 1: Wrong Embedding Dimension
I created a collection with 384 dimensions but was using OpenAI’s embeddings which are 1536 dimensions:
# WRONG
client.create_collection("docs", dimension=384)  # For sentence-transformers
embedding = openai.embeddings.create(model="text-embedding-3-small", input="text")  # Returns 1536 dims
client.insert("docs", [{"id": 1, "vector": embedding.data[0].embedding}])  # Dimension mismatch error
# CORRECT
client.create_collection("docs", dimension=1536)  # Match your embedding model
Mistake 2: Forgetting to Load Collection
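Search fails or returns nothing if the collection isn’t loaded. Collections made with the quick setup (dimension=...) or created with index parameters are loaded automatically, but one built from a custom schema without index parameters is not, so an explicit load_collection() before searching is a cheap safeguard:
# WRONG - no results
client.insert("docs", documents)
results = client.search("docs", data=[query_vector])
# CORRECT
client.insert("docs", documents)
client.load_collection("docs")  # Required before search
results = client.search("docs", data=[query_vector])
Mistake 3: Manual IDs with auto_id=True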
# WRONG - Error
client.create_collection("docs", dimension=384, auto_id=True)
client.insert("docs", [{"id": 1, "vector": [...]}])  # Can't specify ID with auto_id
# CORRECT (integer ID, since the quick setup defaults to an integer primary key)
client.create_collection("docs", dimension=384, auto_id=False)
client.insert("docs", [{"id": 1, "vector": [...]}])
Custom Schema for Real-World Data
For production, you’ll want a custom schema with multiple fields:
from pymilvus import MilvusClient, DataType
from sentence_transformers import SentenceTransformer
client = MilvusClient("./products.db")
model = SentenceTransformer('all-MiniLM-L6-v2')  # Same 384-dimensional model as before
# Define schema
schema = client.create_schema(
    auto_id=False,
    enable_dynamic_field=True,
    description="Product catalog"
)
# Add fields
schema.add_field("product_id", DataType.VARCHAR, is_primary=True, max_length=100)
schema.add_field("title", DataType.VARCHAR, max_length=500)
schema.add_field("price", DataType.DOUBLE)
schema.add_field("category", DataType.VARCHAR, max_length=100)
schema.add_field("embedding", DataType.FLOAT_VECTOR, dim=384)
# Create index
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="embedding",
    index_type="AUTOINDEX",
    metric_type="COSINE"
)
# Create collection
client.create_collection(
    collection_name="products",
    schema=schema,
    index_params=index_params
)
# Insert data
products = [
    {
        "product_id": "p1",
        "title": "Wireless Headphones",
        "price": 149.99,
        "category": "Electronics",
        "embedding": model.encode("Wireless Headphones").tolist()
    }
]
client.insert("products", products)
# Search with filter
client.load_collection("products")
results = client.search(
    collection_name="products",
    data=[model.encode("headphones").tolist()],
    filter="price < 200 and category == 'Electronics'",
    limit=5,
    output_fields=["title", "price"]
)
The filter parameter lets you combine semantic search with structured queries - find semantically similar products that also match your price and category criteria.
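If the price and category come from user input, I would rather build the filter string in one place than concatenate it inline. Here is a small sketch of that idea. `build_filter` is a hypothetical helper of my own, not part of PyMilvus, and it simply refuses values containing quotes or backslashes instead of guessing at escaping rules.

```python
def build_filter(max_price=None, category=None):
    """Build a filter expression string for the 'price' and 'category' fields.

    Hypothetical helper: returns "" when no criteria are given.
    """
    clauses = []
    if max_price is not None:
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(max_price, bool) or not isinstance(max_price, (int, float)):
            raise TypeError("max_price must be a number")
        clauses.append(f"price < {max_price}")
    if category is not None:
        if "'" in category or '"' in category or "\\" in category:
            raise ValueError("category must not contain quotes or backslashes")
        clauses.append(f"category == '{category}'")
    return " and ".join(clauses)

expr = build_filter(max_price=200, category="Electronics")
print(expr)
```

The resulting string can be passed as the filter argument of client.search(), and an empty string means no filtering at all.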
Performance Considerations
Milvus Lite works great for development and small datasets (ideally under 100K vectors, with about 1M as the practical ceiling). For larger scale, consider:
| Mode | Best For | Storage | Limits |
|---|---|---|---|
| Milvus Lite | Development, personal projects | Local file | < 1M vectors |
| Milvus Server | Team environments, production | On-premises | Scales to billions |
| Zilliz Cloud | Managed production | Cloud | Auto-scaling |
The code is identical across all three modes - you only change the connection string.
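One way to make that switch without touching code is to read the connection target from the environment. This is only a sketch: the variable names `APP_MILVUS_URI` and `APP_MILVUS_TOKEN` and the helper `milvus_connection_kwargs` are my own invention, and the default falls back to the local Milvus Lite file.

```python
import os

def milvus_connection_kwargs(env=None):
    """Return keyword arguments for MilvusClient based on environment variables.

    APP_MILVUS_URI and APP_MILVUS_TOKEN are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    kwargs = {"uri": env.get("APP_MILVUS_URI", "./documents.db")}
    token = env.get("APP_MILVUS_TOKEN")
    if token:
        # Only managed or authenticated deployments need a token
        kwargs["token"] = token
    return kwargs

local = milvus_connection_kwargs({})
server = milvus_connection_kwargs({"APP_MILVUS_URI": "http://localhost:19530"})
print(local)
print(server)

# With PyMilvus installed, the client would then be created like this:
# from pymilvus import MilvusClient
# client = MilvusClient(**milvus_connection_kwargs())
```

On a laptop nothing is set and you get the local file. In production the deployment sets the two variables and the same script talks to the server or to Zilliz Cloud.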
Quick Reference: Essential Methods
# Connection
client = MilvusClient("./local.db")  # File-based
client = MilvusClient("http://localhost:19530")  # Server
client = MilvusClient(uri="...", token="...")  # Cloud
# Collection Management
client.create_collection(name, dimension, metric_type="COSINE")
client.has_collection(name)
client.drop_collection(name)
client.list_collections()
# Data Operations
client.insert(collection, data)
client.delete(collection, ids)
client.upsert(collection, data)  # Insert or overwrite by primary key
client.get(collection, ids, output_fields)
# Search
client.load_collection(collection)  # Required first!
client.search(collection, data, limit=limit, filter=filter, output_fields=output_fields)
Related Knowledge
Why 384 dimensions? SentenceTransformer’s all-MiniLM-L6-v2 uses 384 dimensions as a balance between accuracy and speed. More dimensions capture more nuance but increase storage and computation. OpenAI’s text-embedding-3-small uses 1536 dimensions for higher quality.
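The storage side of that trade-off is simple arithmetic. A float vector stores one 32-bit float (4 bytes) per dimension, so the raw size is vectors × dimensions × 4 bytes. The sketch below uses helper names of my own and ignores index overhead and the other fields, so treat the numbers as a lower bound.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw size of the vectors alone, assuming 32-bit floats by default."""
    return num_vectors * dim * bytes_per_value

def to_mib(num_bytes):
    return num_bytes / (1024 * 1024)

mini_lm = raw_vector_bytes(100_000, 384)        # all-MiniLM-L6-v2
openai_small = raw_vector_bytes(100_000, 1536)  # text-embedding-3-small

print(f"384 dims:  {to_mib(mini_lm):.1f} MiB")
print(f"1536 dims: {to_mib(openai_small):.1f} MiB")
```

Four times the dimensions means four times the raw storage for the same number of documents.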
Hybrid search: Milvus supports combining multiple vector fields (e.g., title embedding + description embedding) using rankers like RRF (Reciprocal Rank Fusion) to merge results from different searches.
Index types: AUTOINDEX lets Milvus choose the optimal index automatically. FLAT gives exact search (100% recall) but is slower for large datasets. HNSW is faster for approximate search on large datasets.
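For the hybrid search note above, the idea behind Reciprocal Rank Fusion fits in a few lines. This is a sketch of the general formula in plain Python, not Milvus's own implementation: each document gets 1 / (k + rank) from every result list it appears in, and the sums decide the final order. The constant k=60 is a commonly used default, and the product IDs are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked ID lists into one list of (id, score), best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical results from two separate vector searches
title_hits = ["p1", "p2", "p3"]
description_hits = ["p2", "p4", "p1"]

fused = reciprocal_rank_fusion([title_hits, description_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Here p2 wins because it ranks near the top of both lists, even though p1 is first in one of them. RRF only looks at ranks, so the two searches can use different metrics or score scales.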
Final Thoughts
Milvus Lite removes the friction from getting started with vector search. Install PyMilvus, connect to a local file, and you have a fully functional vector database. The same code scales to production - just change the connection string.
The key takeaways: use COSINE for normalized embeddings, always load_collection() before search, match your embedding dimensions, and leverage filters for hybrid semantic + structured queries.
Final Words + More Resources
My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- 👨💻 PyMilvus Documentation
- 👨💻 Milvus Documentation
- 👨💻 Milvus Lite Guide
- 👨💻 Vector Embeddings Guide
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!