Skip to content

How to Use PostgreSQL Extensions (pgvector, PostGIS) with MikroORM

I was building a RAG (Retrieval Augmented Generation) application and wanted to use pgvector for semantic search with MikroORM. I checked the documentation, searched GitHub issues, and even asked on Reddit. The answer was always the same: MikroORM doesn’t have built-in support for PostgreSQL extensions.

That’s when I realized I needed a different approach. Here’s what I learned about integrating pgvector, PostGIS, and other PostgreSQL extensions with MikroORM.

The Problem

PostgreSQL extensions like pgvector, PostGIS, and pg_search provide powerful capabilities that have no equivalent in MikroORM’s ORM abstraction:

  • pgvector: Vector similarity search for AI/ML applications
  • PostGIS: Geospatial queries for location-based features
  • pg_search: Advanced full-text search with ranking

When I tried to use these extensions, I hit three walls:

  1. No native types: MikroORM doesn’t provide TypeScript types for vector, geometry, or tsvector
  2. Missing operators: Extension-specific operators like <=> (cosine distance) aren’t available in QueryBuilder
  3. Sparse documentation: Community knowledge is thin for integration patterns

First Attempt: Using Unknown Type

I started by trying to define a vector column with MikroORM’s unknown type:

Document.ts
import { Entity, PrimaryKey, Property } from '@mikro-orm/core'
@Entity()
export class Document {
@PrimaryKey()
id!: number
@Property()
content!: string
@Property({ type: 'unknown', columnType: 'vector(1536)' })
embedding!: number[]
}

This worked for schema generation, but I couldn’t query with it. The <=> operator for cosine distance doesn’t exist in MikroORM’s QueryBuilder.

failed-query.ts
// This doesn't work - MikroORM doesn't know about vector operators
const similar = await em.find(Document, {
embedding: { $distance: { $lt: 0.2 } } // Nope!
})

I was stuck. The entity could store vectors, but I couldn’t use them.

The Solution: Kysely Integration

MikroORM 7 ships with Kysely as the query runner. This was the key I was missing. I could use Kysely for raw queries while keeping MikroORM for standard CRUD operations.

Setting Up Raw Queries

vector-repository.ts
import { EntityManager } from '@mikro-orm/postgresql'
import { Document } from '../entities/Document'
export class VectorRepository {
constructor(private readonly em: EntityManager) {}
async findSimilar(embedding: number[], limit = 5): Promise<Array&lt;Document & { distance: number }&gt;&gt; {
const knex = this.em.getConnection().getKnex()
const results = await knex.raw(`
SELECT id, content, embedding, embedding <=> ?::vector as distance
FROM documents
ORDER BY distance
LIMIT ?
`, [`[${embedding.join(',')}]`, limit])
return results.rows
}
}

This worked. I could now perform vector similarity searches with the <=> operator while still using MikroORM for entity management.

Why This Approach Works

The hybrid pattern gives me the best of both worlds:

  • MikroORM handles: Entity relationships, migrations, change tracking, standard CRUD
  • Kysely handles: Extension-specific queries, custom operators, raw SQL when needed

No need to abandon the ORM entirely.

Defining Custom Types

To maintain type safety, I created custom TypeScript types for extension columns:

types/vector.ts
// pgvector type - adjust dimensions based on your embedding model
export type Vector = number[]
types/geometry.ts
// PostGIS geometry type
export interface Geometry {
type: 'Point' | 'Polygon' | 'LineString'
coordinates: number[] | number[][]
srid?: number
}

Then in my entities:

Document.ts
import { Entity, PrimaryKey, Property } from '@mikro-orm/core'
import { Vector } from '../types/vector'
@Entity()
export class Document {
@PrimaryKey()
id!: number
@Property()
content!: string
@Property({ type: 'unknown', columnType: 'vector(1536)' })
embedding!: Vector
@Property({ type: 'unknown', columnType: 'tsvector' })
searchVector!: unknown
}
Location.ts
import { Entity, PrimaryKey, Property } from '@mikro-orm/core'
import { Geometry } from '../types/geometry'
@Entity()
export class Location {
@PrimaryKey()
id!: number
@Property()
name!: string
@Property({ type: 'unknown', columnType: 'geometry(Point, 4326)' })
coordinates!: Geometry
}

MikroORM won’t validate these types at runtime, but TypeScript will catch type mismatches during development.

Real-World Examples

Here’s a complete repository for vector similarity search:

VectorRepository.ts
import { EntityManager } from '@mikro-orm/postgresql'
import { Document } from '../entities/Document'
export class VectorRepository {
constructor(private readonly em: EntityManager) {}
async insertWithEmbedding(content: string, embedding: number[]): Promise&lt;Document&gt; {
const doc = this.em.create(Document, { content, embedding })
await this.em.persistAndFlush(doc)
return doc
}
async findSimilar(embedding: number[], limit = 5): Promise<Array&lt;Document & { distance: number }&gt;&gt; {
const knex = this.em.getConnection().getKnex()
const results = await knex.raw(`
SELECT id, content, embedding, embedding <=> ?::vector as distance
FROM documents
ORDER BY distance
LIMIT ?
`, [`[${embedding.join(',')}]`, limit])
return results.rows
}
async searchHybrid(query: string, embedding: number[], limit = 10) {
const knex = this.em.getConnection().getKnex()
// Combine semantic (vector) and keyword (full-text) search
const results = await knex.raw(`
SELECT
id,
content,
embedding <=> ?::vector as semantic_distance,
ts_rank(search_vector, plainto_tsquery(?)) as keyword_rank
FROM documents
WHERE search_vector @@ plainto_tsquery(?)
ORDER BY (semantic_distance * 0.7) + ((1 - keyword_rank) * 0.3)
LIMIT ?
`, [`[${embedding.join(',')}]`, query, query, limit])
return results.rows
}
}

The searchHybrid method combines vector similarity with PostgreSQL’s built-in full-text search. This gives better results than either approach alone.

PostGIS: Location Queries

For geospatial data:

LocationRepository.ts
import { EntityManager } from '@mikro-orm/postgresql'
import { Location } from '../entities/Location'
export class LocationRepository {
constructor(private readonly em: EntityManager) {}
async findNearby(lat: number, lng: number, radiusKm: number): Promise&lt;Location[]&gt; {
const knex = this.em.getConnection().getKnex()
const results = await knex.raw(`
SELECT id, name,
ST_Distance(location, ST_MakePoint(?, ?)) as distance
FROM locations
WHERE ST_DWithin(location, ST_MakePoint(?, ?)::geography, ?)
ORDER BY distance
`, [lng, lat, lng, lat, radiusKm * 1000])
return results.rows
}
}

For advanced text search with ranking:

SearchRepository.ts
import { EntityManager } from '@mikro-orm/postgresql'
export class SearchRepository {
constructor(private readonly em: EntityManager) {}
async searchDocuments(searchTerm: string) {
const knex = this.em.getConnection().getKnex()
const results = await knex.raw(`
SELECT id, title,
ts_rank(search_vector, plainto_tsquery(?)) as rank
FROM documents
WHERE search_vector @@ plainto_tsquery(?)
ORDER BY rank DESC
`, [searchTerm, searchTerm])
return results.rows
}
}

Hybrid ORM Pattern

The best practice I found is using MikroORM for CRUD operations and raw queries for extension-specific operations:

hybrid-pattern.ts
async function updateDocumentWithSearch(
em: EntityManager,
id: number,
content: string
) {
// 1. Use MikroORM for the update
const doc = await em.findOneOrFail(Document, id)
doc.content = content
await em.flush()
// 2. Update search vector via raw query (alternative to triggers)
await em.getConnection().getKnex().raw(`
UPDATE documents
SET search_vector = to_tsvector('english', ?)
WHERE id = ?
`, [content, id])
}

This keeps entity management clean while handling extension-specific operations separately.

Using View Entities

For complex extension queries, I found database views helpful:

create-view.sql
-- Create a view for vector similarity
CREATE VIEW document_similarities AS
SELECT
d1.id as source_id,
d2.id as target_id,
d1.embedding <=> d2.embedding as distance
FROM documents d1
CROSS JOIN documents d2
WHERE d1.id != d2.id;

Then define a MikroORM entity for the view:

DocumentSimilarity.ts
import { Entity, PrimaryKey, Property } from '@mikro-orm/core'
@Entity({ expression: 'document_similarities' })
export class DocumentSimilarity {
@PrimaryKey()
sourceId!: number
@PrimaryKey()
targetId!: number
@Property()
distance!: number
}
// Query the view through MikroORM
const similarDocs = await em.find(DocumentSimilarity, {
sourceId: documentId,
distance: { $lt: 0.2 }
})

This gives type-safe access to extension-generated data.

Migrations for Extensions

Don’t forget to enable extensions in your migrations:

ExtensionSetup.ts
import { Migration } from '@mikro-orm/migrations'
export class ExtensionSetup extends Migration {
async up(): Promise&lt;void&gt; {
// Enable extensions
this.addSql('CREATE EXTENSION IF NOT EXISTS vector;')
this.addSql('CREATE EXTENSION IF NOT EXISTS postgis;')
// Create extension-specific indexes
this.addSql(`
CREATE INDEX documents_embedding_idx
ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
`)
this.addSql(`
CREATE INDEX locations_coords_idx
ON locations
USING GIST (coordinates);
`)
this.addSql(`
CREATE INDEX documents_search_idx
ON documents
USING GIN (search_vector);
`)
}
async down(): Promise&lt;void&gt; {
this.addSql('DROP INDEX IF EXISTS documents_embedding_idx;')
this.addSql('DROP INDEX IF EXISTS locations_coords_idx;')
this.addSql('DROP INDEX IF EXISTS documents_search_idx;')
}
}

I learned this the hard way. I deployed to production without the indexes and queries were 100x slower. Extension-specific indexes (HNSW, IVFFlat, GIST, GIN) are critical for performance.

Common Mistakes to Avoid

1. Ignoring extension indexes

Without indexes, vector similarity searches do full table scans. For a table with 100,000 documents, my query went from 2 seconds to 5ms after adding an IVFFlat index.

2. Type mismatches

Using wrong vector dimensions causes runtime errors. If your embedding model produces 1536 dimensions, your column must be vector(1536).

3. N+1 queries with extensions

Fetching extension data in loops is slow. Use batch queries:

bad-vs-good.ts
// BAD: N+1 queries
for (const doc of documents) {
const similar = await vectorRepo.findSimilar(doc.embedding)
}
// GOOD: Batch query
const embeddings = documents.map(d => d.embedding)
const allSimilar = await knex.raw(`
SELECT * FROM documents
WHERE embedding <=> ANY(?)
`, [embeddings])

4. Forgetting migrations in production

I’ve seen production deployments fail because extensions weren’t enabled. Always include extension setup in migrations, not manual database commands.

5. Abandoning type safety

Using any everywhere defeats the purpose of TypeScript. Define proper types for extension columns.

Why This Matters

PostgreSQL extensions unlock capabilities that pure ORMs can’t touch:

  • AI Applications: pgvector enables RAG pipelines and semantic search
  • Location Services: PostGIS powers mapping and logistics applications
  • Search Quality: pg_search provides sophisticated ranking beyond LIKE queries

The hybrid approach I’ve shown keeps MikroORM’s benefits (migrations, relationships, change tracking) while accessing PostgreSQL’s full power.

When to Use This Approach

This pattern makes sense when:

  • You need vector similarity search for AI/ML features
  • Your application has geospatial requirements
  • Full-text search with ranking is important
  • You’re invested in MikroORM but need PostgreSQL-specific features

It’s overkill for simple applications. But as your requirements grow, knowing how to integrate extensions while keeping your ORM is a valuable tool.

Final Words + More Resources

My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!

Comments